Supporting Newcomers in Software Development Projects. Doctoral Dissertation by Sebastiano Panichella. Under the supervision of: Prof. Massimiliano Di Penta, Prof. Gerardo Canfora. July 2014

Supporting Newcomers in Open Source Software Development Projects

DESCRIPTION

PhD dissertation by Sebastiano Panichella. Thesis title: "Supporting Newcomers in Software Development Projects". Advisors: Massimiliano Di Penta and Gerardo Canfora

1

Supporting Newcomers in Software Development Projects

Doctoral Dissertation by

Sebastiano Panichella

Under the supervision of

Prof. Massimiliano Di Penta
Prof. Gerardo Canfora

July 2014

2

Newcomer Learning Path…

6

Data Extraction

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A sites (e.g., StackOverflow)

Newcomer Learning Path…

7

Data Extraction

Empirical Studies

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A sites (e.g., StackOverflow)

Newcomer Learning Path…

8

Data Extraction

Recommenders

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A sites (e.g., StackOverflow)

Empirical Studies

Newcomer Learning Path…

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders

1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts
c) investigate how newcomers browse software artifacts
d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts
c) investigate how newcomers browse software artifacts
d) investigate how newcomers generate source code summaries

3) Analyze developers
c) social activity
d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers
c) social activity
d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts
c) generate source code summaries
d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features
Bug fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example: Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

27

Developers Overlap between Different Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%

Apache Httpd

Apache Lucene

Samba

Hibernate
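The slides do not state how the overlap percentages are computed. Since "ISSUE and MAIL" and "MAIL and ISSUE" are reported as different values, one plausible reading is a directional overlap: the share of developers seen in the first source who also appear in the second. The sketch below follows that assumption; the function name and the developer names are hypothetical, and it assumes developer identities have already been unified across sources.

```python
def directional_overlap(source_a, source_b):
    """Share of the developers seen in source_a who also appear in source_b.

    Assumption: the measure is |A ∩ B| / |A|, which is not symmetric.
    """
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Hypothetical developer sets, for illustration only.
issue_devs = {"alice", "bob", "carol", "dave"}
mail_devs = {"alice", "bob", "erin"}

issue_and_mail = directional_overlap(issue_devs, mail_devs)
mail_and_issue = directional_overlap(mail_devs, issue_devs)
```

Under this reading, a small source whose members almost all appear in a larger one yields a high value in one direction and a lower value in the other.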

29

During an IRC Chat Meeting

PROJECT Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“but we also need to create the attributes and values in the entity binding”

30

PROJECT Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

1) Brainstorming

“however planning a pure standalone test suite would make things easier”

During an IRC Chat Meeting

31

PROJECT Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

“okay I think it is a bug and I'm going to create a jira first”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)

3) Open an Issue

During an IRC Chat Meeting

34

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders, by source (MAIL, ISSUE, CHAT); axis from 0 to 90; project labels include Apache Lucene and Samba]

37

Analysis of the Evolution of Teams: Why

1) To Better Understand the Reasons Behind the Teams' Reorganization (split/merge of developer teams)

2) To Investigate whether Emerging Teams Evolve with the Aim of Working on More Cohesive Groups of Files, and can thus Support Refactoring/Re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How

39

By using FUZZY CLUSTERING ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How

40

R1

By using FUZZY CLUSTERING ALGORITHMS

R2

Analysis of the Evolution of Teams: How

41

TEAMS SPLIT

TEAMS MERGE

R1

By using FUZZY CLUSTERING ALGORITHMS

R2

Analysis of the Evolution of Teams: How

44

R1

R2

By using FUZZY CLUSTERING ALGORITHMS

Sub-systems Developers are Working on: Sub-system one, Sub-system two

Structural Perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual Perspective: Poshyvanyk et al., CCBC

Analysis of the Evolution of Teams: How

45

                    Apache HTTP       Eclipse JDT       Netbeans          Samba
Period considered   09/1998-03/2012   01/2002-12/2011   01/2001-08/2012   01/2000-09/2011

Releases Considered

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics: Period of Time and Releases Considered

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reasons behind the reorganization of teams

47

Teams Merge in a New Release

- in 20-35% of the cases

Teams Split in a New Release

- in 15-35% of the cases

How do Emerging Collaborations Change across Software Releases?

Teams Disappeared

- 22-45%

Teams Survived

- 50-70%

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

56

PART II – Experiment A
How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B
What code elements are often used by humans when labeling a source code artifact

word1 word2

word3 word4

word5 word6

Two Empirical Studies Aimed at Understanding

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

[Pie chart: share of time participants spent on each kind of artifact]

60

Undergraduate Students

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

How Much Time did Participants Spend on Different Kinds of Artifacts

61

Graduate Students

Graduate students used Class Diagrams significantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
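The pattern notation above reads naturally as regular expressions over the artifact letters. As a minimal sketch, assuming each participant's navigation before reaching source code is encoded as a string of those letters (an encoding chosen here for illustration, not taken from the study), a trace can be labeled as follows. The `classify` function and the `PATTERNS` table are hypothetical names.

```python
import re

# Labels follow the slide notation; the letters encode artifact kinds:
# S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc.
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("U", r"U"),
    ("J", r"J"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
]


def classify(trace):
    """Return the label of the first pattern matching the whole trace."""
    for label, regex in PATTERNS:
        if re.fullmatch(regex, trace):
            return label
    return "Other"
```

For example, `classify("SDSD")` yields `"(SD)+"` and `classify("USD")` yields `"U(SD)+"`. Traces that fit none of the listed patterns fall into `"Other"`.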

63

[Bar chart: frequency of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) for Graduate and Undergraduate participants]

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

[Bar chart: frequency of each navigation pattern for Graduate and Undergraduate participants]

More experienced participants use a more “integrated approach”

Most Frequent Navigation Patterns Before Reaching Source Code

66

[Transition graph; nodes include Source Code, Sequence Diagram and Javadoc]

Transition Graph between Kinds of Software Artifacts

67

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact

word1 word2

word3 word4

word5 word6

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
17 Bachelor Students in CS (Univ. of Molise, second year)
21 Master Students in CS (University of Salerno)

Java Class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B: Context

Terms selected by three subjects for the same Java class:

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
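The oracle rule can be checked against the three term lists shown on this slide. The sketch below is an illustration, not the study's tooling: the function name `build_oracle` and the `min_share` parameter are assumptions, and each list is treated as the selection of one subject.

```python
from collections import Counter


def build_oracle(selections, min_share=0.5):
    """Keep the terms chosen by at least `min_share` of the subjects.

    selections: one collection of terms per subject.
    """
    # Count each term at most once per subject.
    counts = Counter(term for terms in selections for term in set(terms))
    needed = min_share * len(selections)
    return {term for term, n in counts.items() if n >= needed}


subjects = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(subjects)
```

With three subjects the 50% threshold means a term must be picked by at least two of them, so `oracle` contains exactly the eight terms whose counts are listed on the slide.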

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match the mental model of newcomers very well when they describe source code

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when they describe source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

• Small Projects: finding mentors is a trivial problem

• Large Projects: finding mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF (factors observed over Time, starting when Alice joins the project):

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

Timeline labels: “When Alice was a Student”, “When Alice joins the project”

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$$\mathrm{score} = \sum_{i=1}^{5} w_i \, f_i$$
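The weighted-sum aggregation of the five factors can be sketched as below. The slides give neither the weights nor how each factor is normalized, so the weights, the factor values in [0, 1] and the candidate names are all hypothetical placeholders. This illustrates the formula only and is not YODA's actual implementation.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical equal weights for F1..F5 (not taken from the slides).
weights = [0.2, 0.2, 0.2, 0.2, 0.2]

# Hypothetical, already normalized factor values for two candidate mentors.
candidates = {
    "jim": [0.8, 0.5, 0.2, 1.0, 0.4],
    "bob": [0.1, 0.2, 0.2, 0.0, 0.9],
}

# Rank candidate mentors by decreasing score.
ranked = sorted(candidates, key=lambda name: mentor_score(candidates[name], weights), reverse=True)
```

Here `ranked` places "jim" before "bob", since the first candidate scores higher on most factors.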

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

97

Time

t

Recommending Mentors

98

Time

t

Past Project Developers

Recommending Mentors

106

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python and Samba; precision axis from 0 to 100; bar labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

Is it Possible to Recommend Mentors To Project Newcomers?

109

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python and Samba; precision axis from 0 to 100; bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

Results When Both Mails and Issues are Used

110

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors, by source (MAIL, ISSUE, CHAT); axis from 0 to 90; project labels include Apache Lucene and Samba]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:
- the source code itself
- source code descriptions in external artifacts

A Newcomer can find source code descriptions such as:

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index”

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
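Steps 2 and 3 of the approach can be sketched roughly as follows. The slides do not give the exact rules, so both are assumptions made for illustration: paragraphs are taken to be blocks separated by blank lines, and a paragraph is traced onto a method when it mentions the method name followed by an opening parenthesis, optionally qualified by the class name. The function names and the sample message are hypothetical.

```python
import re


def extract_paragraphs(message):
    """Step 2 (assumed rule): paragraphs are blocks separated by blank lines."""
    blocks = re.split(r"\n\s*\n", message)
    return [" ".join(block.split()) for block in blocks if block.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (assumed rule): map each method to the paragraphs that call it by name."""
    traced = {name: [] for name in method_names}
    for paragraph in paragraphs:
        for name in method_names:
            pattern = rf"\b(?:{re.escape(class_name)}\.)?{re.escape(name)}\s*\("
            if re.search(pattern, paragraph):
                traced[name].append(paragraph)
    return traced


# Hypothetical message, loosely modeled on the IndexSplitter example.
message = (
    "Hello all,\n\n"
    "When I call IndexSplitter.split(File destDir, String[] segs)\n"
    "it creates an index with wrong data.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "main"])
```

Only the middle paragraph mentions `split(`, so it is the one candidate description kept for that method. The filtering steps (4 and 5) would then prune such candidates further.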

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

Q&A SITE

ISSUE TRACKER

MAILING LIST

131

Approach: Precision vs Number of Methods Covered

133

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Q&A SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

138

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

CODES: Approach for Mining Method Descriptions

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
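The vote-based discard and the similarity-based filtering could look like the sketch below. The slides do not specify the similarity measure or a threshold. Bag-of-words cosine similarity and the 0.2 cut-off are assumptions chosen for illustration, and all function names are hypothetical rather than part of CODES.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case word tokens; camelCase identifiers are split into words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def cosine(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    ca, cb = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def filter_paragraphs(paragraphs, method_text, threshold=0.2):
    """Keep paragraphs with positive votes that are textually close to the method.

    paragraphs: list of (text, votes) pairs; threshold is an assumed cut-off.
    """
    kept = []
    for text, votes in paragraphs:
        if votes <= 0:
            continue  # discard paragraphs from discussions with 0 votes
        if cosine(text, method_text) >= threshold:
            kept.append(text)
    return kept
```

A paragraph sharing many terms with the method's identifiers passes the filter, while an unrelated one such as a thank-you note, or any paragraph with 0 votes, is dropped.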

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing its precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring
Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring
Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


2–8

Newcomer Learning Path…

Data Extraction

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A sites (e.g., StackOverflow)

Empirical Studies

Recommenders

9–12

Newcomer Training Process

Mentoring
Perform Development Tasks
Team Collaborations

Data Extraction

Studies
2) Analyze Software Artifacts
   c) investigate how newcomers browse software artifacts
   d) investigate how newcomers generate source code summaries
3) Analyze developers
   c) social activity
   d) technical activity

Recommenders
1) Recommend Mentors
2) Supporting Source Code Comprehension and Re-documentation
3) Recommend Refactoring

13

Newcomer Training Process

Mentoring
Perform Development Tasks
Team Collaborations

Data Extraction

Studies
2) Analyze Software Artifacts
   c) generate source code summaries
   d) support maintenance tasks
3) Analyze developers
   c) social activity
   d) technical activity

Recommenders
1) Recommend Mentors
2) Supporting Source Code Comprehension and Re-documentation
3) Recommend Refactoring

14–17

Thesis Structure

• PART I: analyzing data from software repositories to support team work
• PART II: analyzing how developers use software artifacts, to help newcomers in program comprehension tasks
• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features, Bug Fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource/

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example: Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

24–27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%
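
The overlap figures above compare the developers, or the links between them, found in two communication sources. The slides do not define the measure, so the following Python sketch assumes that overlap is the share of items in one source that also appear in the other; the function name and the sample data are hypothetical.

```python
def overlap_pct(source_a, source_b):
    """Percentage of items in source_a that also appear in source_b."""
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0
    return 100.0 * len(a & b) / len(a)


# Hypothetical developer identifiers per channel
issue_devs = {"ann", "bob", "carl", "dina"}
mail_devs = {"ann", "bob", "eve"}

# Social links are unordered pairs, so each one is stored as a frozenset
issue_links = {frozenset(p) for p in [("ann", "bob"), ("bob", "carl")]}
mail_links = {frozenset(p) for p in [("bob", "ann")]}

print(overlap_pct(issue_devs, mail_devs))    # 50.0
print(overlap_pct(issue_links, mail_links))  # 50.0
```

Under this assumption the measure is asymmetric, which would explain why the slides report "ISSUE and MAIL" and "MAIL and ISSUE" separately.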

29–32

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming
"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., testing activities)
"however planning a pure standalone test suite would make things easier"

3) Open an Issue
"okay I think it is a bug and I'm going to create a jira first"

33–34

Similarity Measure of Topics Extracted from Different Communication Channels

Project         issues vs mails   issues vs chat   mails vs chat
Apache Httpd    0.17              0.09             0.06
Apache Lucene   0.08              0.03             0.02
Hibernate       0.11              0.02             0.03
Samba           0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35–36

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders, per project and per source (MAIL, ISSUE, CHAT)]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and thus to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38–44

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations, by using FUZZY CLUSTERING ALGORITHMS

Teams identified in release R1 are compared with teams identified in release R2:
• TEAMS SPLIT
• TEAMS MERGE

Sub-systems that developers are working on (sub-system one, sub-system two) are analyzed from two perspectives:
• Structural perspective: Modularization Quality (MQ), Mancoridis et al.
• Conceptual perspective: Conceptual Coupling Between Classes (CCBC), Poshyvanyk et al.

45

Case Study

• Goal: analyze data from mailing lists, issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reasons behind the reorganization of teams

Systems' characteristics: period of time and releases considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba
Periods considered: 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011
Releases considered: 20220224, 2212241, 3032343642, 3436556972, 233020302535040

46–47

How do Emerging Collaborations Change across Software Releases?

• Teams merge in a new release: in 20-35% of the cases
• Teams split in a new release: in 15-35% of the cases
• Teams disappeared: 22-45%
• Teams survived: 50-70%
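
The four outcomes above (merge, split, disappeared, survived) can be illustrated with a small Python sketch. It is a simplification under stated assumptions: teams are crisp sets of developers rather than the fuzzy clusters used in the study, and the `min_shared` threshold, the function name and the sample teams are hypothetical.

```python
def classify_team_evolution(teams_r1, teams_r2, min_shared=2):
    """Label each R1 team by how its members are distributed over R2 teams.

    teams_r1, teams_r2: dict mapping team name -> set of developer ids.
    min_shared: hypothetical threshold; two teams are related when they
    share at least this many developers.
    """
    # successors[t1] = R2 teams related to the R1 team t1
    successors = {
        t1: [t2 for t2, m2 in teams_r2.items() if len(m1 & m2) >= min_shared]
        for t1, m1 in teams_r1.items()
    }
    # predecessors[t2] = R1 teams related to the R2 team t2
    predecessors = {
        t2: [t1 for t1, succ in successors.items() if t2 in succ]
        for t2 in teams_r2
    }
    labels = {}
    for t1, succ in successors.items():
        if not succ:
            labels[t1] = "disappeared"
        elif len(succ) > 1:
            labels[t1] = "split"
        elif len(predecessors[succ[0]]) > 1:
            labels[t1] = "merged"
        else:
            labels[t1] = "survived"
    return labels


r1 = {
    "A": {"d1", "d2", "d3", "d4"},
    "B": {"d5", "d6"},
    "C": {"d7", "d8"},
    "D": {"d9", "d10"},
}
r2 = {
    "X": {"d1", "d2"},
    "Y": {"d3", "d4"},
    "Z": {"d5", "d6", "d7", "d8"},
}
print(classify_team_evolution(r1, r2))
# {'A': 'split', 'B': 'merged', 'C': 'merged', 'D': 'disappeared'}
```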

48–51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Plots: MQ and CCBC for TEAMS SPLIT and for TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure.

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54–56

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59–61

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart: share of time spent on each kind of artifact]

• Undergraduate students used Source Code and Javadoc significantly more than Graduate students
• Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns
S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns
(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
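
The pattern notation above is already regular-expression syntax, so a navigation trace can be classified with Python's `re` module. This is a minimal sketch, assuming each participant's navigation before reaching source code is encoded as a string over the letters S, D, U and J and that the whole string must match a pattern; the function name and the sample traces are hypothetical.

```python
import re

# Pattern names from the slide; each name doubles as its own regular expression.
PATTERN_NAMES = ["S", "D", "U", "J", "(SD)+", "(US)+", "(DS)+", "U(SD)+"]
COMPILED = {name: re.compile(name) for name in PATTERN_NAMES}


def classify_navigation(sequence):
    """Return the name of the pattern that matches the whole sequence."""
    for name, rx in COMPILED.items():
        if rx.fullmatch(sequence):
            return name
    return "Other"


# Hypothetical traces of artifacts visited before reaching source code
for trace in ["S", "SDSD", "US", "USD", "SUD"]:
    print(trace, "->", classify_navigation(trace))
```

Because `fullmatch` is used and the listed patterns accept disjoint sets of strings, the order in which they are tried does not affect the result.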

63–65

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: frequency of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) for Graduate and Undergraduate participants, where S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc]

More experienced participants use a more "integrated approach"

66–67

Transition Graph between Kinds of Software Artifacts

[Transition graph; nodes include Source Code, Sequence Diagram and Javadoc]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69–70

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)
• Subjects:
  17 Bachelor students in CS (Univ. of Molise, second year)
  21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
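
The oracle construction in the hotel example can be reproduced with a few lines of Python. This is a sketch rather than the authors' tooling: the function name `build_oracle` and its `min_share` parameter are illustrative, while the three term lists are the ones shown on the slide.

```python
from collections import Counter

# Term lists chosen by the three subjects in the slide's example
subjects = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]


def build_oracle(term_lists, min_share=0.5):
    """Keep the terms selected by at least min_share of the subjects."""
    # set(terms) makes sure a subject is counted once per term
    counts = Counter(t for terms in term_lists for t in set(terms))
    n = len(term_lists)
    return {t: c for t, c in counts.items() if c / n >= min_share}


oracle = build_oracle(subjects)
for term, count in sorted(oracle.items(), key=lambda kv: (-kv[1], kv[0])):
    print(term, count)
```

With three subjects, the 50% threshold keeps every term chosen by at least two of them, which yields the eight terms listed on the slide (room and arrival with 3 votes, the others with 2).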

71–74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81–82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice, IF (factors evaluated over time, starting when Alice joins the project):

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$$\sum_{i=1}^{5} w_i \cdot f_i$$
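
The weighted-sum aggregation of the five factors can be sketched in Python as follows. The slides give neither the weights nor the way each factor is measured, so the equal weights, the factor values scaled to [0, 1], the candidate names and the function names below are all assumptions made for illustration.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates, weights, top_k=2):
    """Return the top_k candidate names ordered by decreasing score."""
    scored = sorted(
        ((mentor_score(f, weights), name) for name, f in candidates.items()),
        reverse=True,
    )
    return [name for _, name in scored[:top_k]]


# Hypothetical equal weights for F1..F5
weights = [0.2, 0.2, 0.2, 0.2, 0.2]

# Hypothetical factor values (F1..F5), each already scaled to [0, 1]
candidates = {
    "jim": [0.9, 0.8, 0.5, 0.7, 0.6],
    "bob": [0.2, 0.1, 0.5, 0.0, 0.9],
    "eve": [0.5, 0.5, 0.5, 0.5, 0.5],
}

print(rank_mentors(candidates, weights))  # ['jim', 'eve']
```

Returning the two best candidates mirrors the Top 1 and Top 2 recommendations evaluated later in the deck.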

93–105

Recommending Mentors

[Timeline: Project Developers, Past Project Developers and Past Mentors over Time; the Newcomer joins at time t]

Mentor with Adequate Skills

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106–108

Is it Possible to Recommend Mentors To Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.85    0.81
FreeBSD      0.30    0.24
PostgreSQL   1.00    1.00
Python       0.64    0.77
Samba        0.94    0.82

109–110

Results When Both Mails and Issues are Used

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.80    0.72
FreeBSD      0.67    0.45
PostgreSQL   0.90    0.89
Python       0.91    0.84
Samba        0.92    0.76

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111–112

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors, per project and per source (MAIL, ISSUE, CHAT)]

113–115

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116–124

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127–128

Mining Summaries

In such situations, developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

A newcomer can find source code descriptions there, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
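
Steps 2 and 3 of the approach can be illustrated with a short Python sketch. The slides do not detail the tracing rules, so the heuristic used here (a paragraph is traced onto a method when it names the class and contains either the method name followed by a parenthesis or the phrase "method <name>") is an assumption, as are the function names, the sample message and the candidate method list.

```python
import re


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraph, class_name, method_names):
    """Step 3 (simplified): methods of class_name mentioned in the paragraph."""
    if class_name not in paragraph:
        return []
    hits = []
    for m in method_names:
        name = re.escape(m)
        # Accept 'split(' or 'method split' as a mention of the method
        if re.search(rf"\b{name}\s*\(|\bmethod\s+{name}\b", paragraph):
            hits.append(m)
    return hits


# Hypothetical message with three paragraphs
message = """Hi all,

When I call the method IndexSplitter.split(File destDir, String[] segs) it creates an index with a wrong segments file.

Thanks"""

paragraphs = extract_paragraphs(message)
traced = [trace_to_methods(p, "IndexSplitter", ["split", "listSegments"]) for p in paragraphs]
print(traced)  # [[], ['split'], []]
```

Only the second paragraph survives the tracing step; the filtering of Steps 4 and 5 would then decide whether it is a useful description of `split`.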

130, 134–135

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:
• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131–133

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136–137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139–147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

New Recommenders
• Building better recommenders for software re-modularization or refactoring, based on social interaction between developers
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders
• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives
• Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155–161

PART I

PART II

PART III


3

Newcomer Learning Path…

4

Newcomer Learning Path…

5

Newcomer Learning Path…

6

Data Extraction

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A site (e.g., StackOverflow)

Newcomer Learning Path…

7

Data Extraction

Empirical Studies

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A site (e.g., StackOverflow)

Newcomer Learning Path…

8

Data Extraction

Recommenders

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A site (e.g., StackOverflow)

Empirical Studies

Newcomer Learning Path…

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders

1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts
c) investigate how newcomers browse software artifacts
d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts
c) investigate how newcomers browse software artifacts
d) investigate how newcomers generate source code summaries

3) Analyze developers
c) social activity
d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

• PART I: analyzing data from software repositories to support team work
• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks
• PART III: developing recommenders to concretely support project newcomers


18

PART I

Analysis of Developers’ Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features, Bug fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers’ Social Networks

Bird et al., FSE 2008

22

How Developers’ Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers’ Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example: Hibernate OSS Project

How Developers’ Collaboration Networks Identified from Different Sources Differ

27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%
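The figures above differ depending on which source is named first (ISSUE and MAIL versus MAIL and ISSUE), which suggests a directional overlap: the share of developers seen in the first source who also appear in the second. The Python sketch below illustrates that reading only; the measure itself, the function name and the sample developer names are assumptions and are not taken from the study.

```python
def directional_overlap(first: set[str], second: set[str]) -> float:
    """Share of developers in `first` who also appear in `second` (0.0 to 1.0)."""
    if not first:
        return 0.0
    return len(first & second) / len(first)


# Hypothetical developer identities, for illustration only.
issue_devs = {"alice", "bob", "carol", "dave"}
mail_devs = {"alice", "bob", "erin"}

issue_in_mail = directional_overlap(issue_devs, mail_devs)  # 2 of the 4 issue developers
mail_in_issue = directional_overlap(mail_devs, issue_devs)  # 2 of the 3 mail developers
```

Because the denominator is the first set, the two directions generally give different values, which would be consistent with the asymmetric percentages reported for each pair of sources.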

28

Overlap of Developers’ Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“but we also need to create the attributes and values in the entity binding”

30

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

1) Brainstorming

“however planning a pure standalone test suite would make things easier”

During an IRC Chat Meeting

31

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)

During an IRC Chat Meeting

32

PROJECT: Hibernate

“okay I think it is a bug and I’m going to create a jira first”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)
3) Open an Issue

During an IRC Chat Meeting

34

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

36

Use Issue, Chat and Mail to Identify Leaders

[Figure: bar chart "Precision in Recommending Leaders" using MAIL, ISSUE and CHAT data, for projects including Apache Lucene and Samba; axis from 0 to 90]

37

Analysis of the Evolution of Teams: Why?

1) To Better Understand the Reasons Behind the Reorganization of Teams (split/merge of developer teams)

2) To Investigate whether Emerging Teams Evolve with the Aim of Working on More Cohesive Groups of Files, Then Support Refactoring / Re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

39

Teams Identification from Emergent Collaborations

By using fuzzy clustering algorithms

Analysis of the Evolution of Teams: How?

41

[Figure: teams in R1 and R2, identified by using fuzzy clustering algorithms]

TEAMS SPLIT

TEAMS MERGE

Analysis of the Evolution of Teams: How?

44

[Figure: teams in R1 and R2 and the sub-systems developers are working on (Sub-system one, Sub-system two)]

Structural Perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual Perspective: Poshyvanyk et al., CCBC

Analysis of the Evolution of Teams: How?

45

Case Study

• Goal: analyze data from mailing lists, issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System        Period considered
Apache HTTP   09/1998 - 03/2012
Eclipse JDT   01/2002 - 12/2011
Netbeans      01/2001 - 08/2012
Samba         01/2000 - 09/2011

Releases considered (as extracted): 20220224 2212241 3032343642 3436556972 233020302535040

47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

TEAMS SPLIT: MQ, CCBC

TEAMS MERGED: MQ, CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers’ Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

56

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: time spent on each kind of artifact; values as extracted: 72 and 131032]

60

Undergraduate Students

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

Graduate Students

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
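The pattern notation above reads like regular expressions over the letters S, D, U and J. The following Python sketch classifies a navigation sequence under that reading. It assumes each navigation is encoded as a string of these letters, in the order the artifacts were opened before reaching source code; the encoding and the function name `classify_navigation` are hypothetical and not part of the original study.

```python
import re

# Simple and complex patterns from the slide; a sequence is assigned to the
# pattern that matches it in full.
PATTERNS = [
    ("U(SD)+", re.compile(r"U(SD)+")),
    ("(SD)+", re.compile(r"(SD)+")),
    ("(US)+", re.compile(r"(US)+")),
    ("(DS)+", re.compile(r"(DS)+")),
    ("S", re.compile(r"S")),
    ("D", re.compile(r"D")),
    ("U", re.compile(r"U")),
    ("J", re.compile(r"J")),
]


def classify_navigation(sequence: str) -> str:
    """Return the name of the pattern matching the whole sequence, else "Other"."""
    for name, regex in PATTERNS:
        if regex.fullmatch(sequence):
            return name
    return "Other"


examples = {seq: classify_navigation(seq) for seq in ["S", "SDSD", "USD", "US", "JS"]}
```

With this encoding, "SDSD" falls under (SD)+, "USD" under U(SD)+, and a sequence such as "JS" that fits none of the listed patterns is counted as Other.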

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of how often each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) occurs for Graduate and Undergraduate participants; axis from 0 to 20. Bar labels in extraction order: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more “Integrated approach”

67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between kinds of artifacts (Source Code, Sequence Diagram, Javadoc); edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java Class

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
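The oracle construction can be reproduced on the example term lists above. The Python sketch below counts, for each term, how many subjects chose it and keeps the terms chosen by at least half of them. The splitting of the run-together words into individual terms and the function name `build_oracle` are this sketch's own reading, not something stated on the slide.

```python
from collections import Counter


def build_oracle(labelings: list[list[str]], threshold: float = 0.5) -> set[str]:
    """Return the terms chosen by at least `threshold` of the subjects."""
    # Each subject contributes at most once per term.
    counts = Counter(term for labels in labelings for term in set(labels))
    needed = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]

oracle = build_oracle(labelings)
```

For these three lists the oracle contains room, arrival, book, hotel, reservation, departure, double and card, which matches the term counts shown on the slide.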

74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers’ Communication to Improve Newcomers’ Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small Projects: finding Mentors is a trivial problem
• Large Projects: finding Mentors is not a trivial problem

82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

[Figure: timeline (Time) with the markers "When Alice joins the project" and "When Alice was a Student"]

Identify Past Successful Mentors

Score Computed Aggregating the Factors (F1 to F5) in a Weighted Sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
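The weighted sum over the five factors can be sketched in a few lines of Python. This is only an illustration of the aggregation step: the factor values, their normalization to the range 0 to 1, the equal weights and the function name `mentor_score` are all hypothetical, since the slides do not report how the factors are measured or weighted.

```python
def mentor_score(factors: list[float], weights: list[float]) -> float:
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical, already-normalized values of F1..F5 for one (mentor, newcomer)
# pair, and hypothetical equal weights; neither comes from the original study.
factors = [1.0, 0.5, 0.0, 1.0, 0.5]
weights = [0.2, 0.2, 0.2, 0.2, 0.2]

score = mentor_score(factors, weights)
```

Candidate pairs could then be ranked by this score, with the highest-scoring developer proposed as the likely past mentor of a given newcomer.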

105

Recommending Mentors

[Figure: timeline (Time, t) showing Project Developers, Past Project Developers, Past Mentors, the Newcomer, and a Mentor with Adequate Skills]

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

108

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100. Bar labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues Are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100. Bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

YODA Makes it Possible to Recommend Mentors

112

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" using MAIL, ISSUE and CHAT data, for projects including Apache Lucene and Samba; axis from 0 to 90]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool


125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers’ Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

Newcomer: can find source code descriptions

Example (CLASS: IndexSplitter, METHOD: split):

When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
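Steps 2 and 3 of the approach above can be illustrated with a small Python sketch. It is not the authors' implementation: splitting paragraphs on blank lines, the rule used to link a paragraph to a method (a mention of `Class.method` or of `method(`), and both function names are assumptions made for this example. The heuristic and similarity filters of steps 4 and 5 are not described in enough detail here to sketch.

```python
import re


def extract_paragraphs(message: str) -> list[str]:
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs: list[str], class_name: str,
                     methods: list[str]) -> dict[str, list[str]]:
    """Step 3 (sketch): link a paragraph to a method when it mentions
    `Class.method` or `method(`. This matching rule is an assumption."""
    traced: dict[str, list[str]] = {m: [] for m in methods}
    for paragraph in paragraphs:
        for method in methods:
            qualified = rf"\b{re.escape(class_name)}\.{re.escape(method)}\b"
            call = rf"\b{re.escape(method)}\s*\("
            if re.search(f"{qualified}|{call}", paragraph):
                traced[method].append(paragraph)
    return traced


# Hypothetical message already traced onto the class IndexSplitter (step 1).
message = (
    "The build is green again.\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with wrong data."
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "close"])
```

In this example only the second paragraph is linked to `split`, and no paragraph is linked to the hypothetical method `close`.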

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

133

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers’ discussions

136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html


148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders’ performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 4: Supporting Newcomers in Open Source Software Development Projects

4

Newcomer Learning Path…

5

Newcomer Learning Path…

6

Data Extraction

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A site (e.g., StackOverflow)

Newcomer Learning Path…

7

Data Extraction

Empirical Studies

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A site (e.g., StackOverflow)

Newcomer Learning Path…

8

Data Extraction

Recommenders

• Versioning Systems
• Mailing Lists
• Issue trackers
• Q&A site (e.g., StackOverflow)

Empirical Studies

Newcomer Learning Path…

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders

1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts
c) investigate how newcomers browse software artifacts
d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts
c) investigate how newcomers browse software artifacts
d) investigate how newcomers generate source code summaries

3) Analyze developers
c) social activity
d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers
c) social activity
d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts
c) generate source code summaries
d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

15

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

16

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: presents recommenders to concretely support project newcomers

17

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features / Bug fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014).

23

Example: Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

24

Developers Overlap between Different Sources

25

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

26

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%

27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%
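The overlap figures above depend on which source is taken as the reference, which is why "ISSUE and MAIL" and "MAIL and ISSUE" differ. The slides do not spell out the formula, so the following is only a minimal sketch under the assumption that the overlap is the share of developers of the first source who also appear in the second. The function name and the developer names are hypothetical.

```python
def overlap_percentage(source_a, source_b):
    """Percentage of developers in source_a who also appear in source_b.

    Assumption: the overlap is directional, i.e. normalized by the size
    of the first source. This definition is not stated on the slides.
    """
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0
    return 100.0 * len(a & b) / len(a)


# Hypothetical developer identities extracted from two sources.
issue_devs = {"alice", "bob", "carol", "dave"}
mail_devs = {"alice", "bob", "erin"}

issue_and_mail = overlap_percentage(issue_devs, mail_devs)  # 2 of 4 issue developers
mail_and_issue = overlap_percentage(mail_devs, issue_devs)  # 2 of 3 mail developers
print(issue_and_mail, round(mail_and_issue, 1))
```

Because the denominator changes with the reference source, the two directions generally give different values, which is consistent with the asymmetric percentages reported above.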

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way? dunno, like I said this is brainstorming and I have not given lots of thought to these cases”

“but we also need to create the attributes and values in the entity binding”

30

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way? dunno, like I said this is brainstorming and I have not given lots of thought to these cases”

1) Brainstorming

“however planning a pure standalone test suite would make things easier”

31

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way? dunno, like I said this is brainstorming and I have not given lots of thought to these cases”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)

32

During an IRC Chat Meeting

PROJECT: Hibernate

“okay, I think it is a bug and I'm going to create a jira first”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)
3) Open an Issue

33

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs. mails   issues vs. chat   mails vs. chat
Apache Httpd     0.17               0.09              0.06
Apache Lucene    0.08               0.03              0.02
Hibernate        0.11               0.02              0.03
Samba            0.06               0.02              0.02

34

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs. mails   issues vs. chat   mails vs. chat
Apache Httpd     0.17               0.09              0.06
Apache Lucene    0.08               0.03              0.02
Hibernate        0.11               0.02              0.03
Samba            0.06               0.02              0.02

Values in the first column (issues vs. mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders, with series MAIL, ISSUE and CHAT; recoverable project labels: Apache Lucene, Samba; axis from 0 to 90]

36

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders, with series MAIL, ISSUE and CHAT; recoverable project labels: Apache Lucene, Samba; axis from 0 to 90]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and then to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the Evolution of Emerging Collaborations Relates to Code Changes: An Empirical Study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How?

39

Using fuzzy clustering algorithms

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How?

40

R1

Using fuzzy clustering algorithms

R2

Analysis of the Evolution of Teams: How?

41

TEAMS SPLIT

TEAMS MERGE

R1

Using fuzzy clustering algorithms

R2

Analysis of the Evolution of Teams: How?

42

R1

R2

Using fuzzy clustering algorithms

Analysis of the Evolution of Teams: How?

43

R1

R2

Using fuzzy clustering algorithms

Sub-system one, Sub-system two, Sub-system two

Sub-systems developers are working on

Analysis of the Evolution of Teams: How?

44

R1

R2

Using fuzzy clustering algorithms

Sub-system one, Sub-system two, Sub-system two

Sub-systems developers are working on

Structural Perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual Perspective: Poshyvanyk et al., Conceptual Coupling Between Classes (CCBC)

Analysis of the Evolution of Teams: How?

45

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered

09/1998-03/2012 01/2002-12/2011

01/2001-08/2012 01/2000-09/2011

Releases Considered

20220224

2212241

3032343642

3436556972

233020302535040

Systems' characteristics: Period of Time and Releases Considered

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reasons behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II – Experiment A

PART II – Experiment B

55

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B

56

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Undergraduate Students

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

Graduate Students

Graduate students used Class Diagrams significantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
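The pattern notation on this slide reads naturally as regular expressions over the artifact letters S, D, U and J. As a minimal sketch, assuming each participant's navigation before reaching source code is encoded as a string of those letters (an encoding chosen here for illustration, not taken from the study), a sequence can be assigned to one of the listed patterns as follows. The pattern list includes S(US)+ and SU(SD)+, which appear in the frequency chart of the deck.

```python
import re

# Pattern labels as they appear in the slides; each is used as a regex
# that must match the whole navigation sequence.
PATTERN_LABELS = [
    "S", "D", "U", "J",
    "(SD)+", "(US)+", "(DS)+",
    "U(SD)+", "S(US)+", "SU(SD)+",
]
PATTERNS = [(label, re.compile(label)) for label in PATTERN_LABELS]


def classify(sequence):
    """Return the label of the pattern matching the whole sequence, else 'Other'."""
    for label, regex in PATTERNS:
        if regex.fullmatch(sequence):
            return label
    return "Other"


# Hypothetical navigation sequences (letters visited before source code).
for seq in ["S", "SDSD", "USD", "SUSDSD", "JS"]:
    print(seq, "->", classify(seq))
```

Using `fullmatch` rather than `search` keeps the classes disjoint: a sequence such as `USD` matches only U(SD)+ and not (US)+.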

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: number of occurrences of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other), with series Graduate and Undergraduate. Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc]

64

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: number of occurrences of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other), with series Graduate and Undergraduate; axis from 0 to 20. Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc]

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: number of occurrences of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other), with series Graduate and Undergraduate; axis from 0 to 20. Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc]

More experienced participants use a more “integrated approach”

66

Transition Graph between Kinds of Software Artifacts

[Graph: transitions between Source Code, Sequence Diagram and Javadoc, with numeric labels on the edges]

67

Transition Graph between Kinds of Software Artifacts

[Graph: transitions between Source Code, Sequence Diagram and Javadoc, with numeric labels on the edges]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go on to read Sequence Diagrams; only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
17 Bachelor Students in CS (Univ. of Molise, second year)
21 Master Students in CS (University of Salerno)

Java Class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
17 Bachelor Students in CS (Univ. of Molise, second year)
21 Master Students in CS (University of Salerno)

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
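The oracle construction on this slide (keep the terms chosen by at least half of the subjects) can be illustrated with the three example term lists shown above. This is a minimal sketch; the function name `build_oracle` and the `threshold` parameter are illustrative choices, not names from the study.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms selected by at least `threshold` of the subjects."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for terms in labelings for term in set(terms))
    min_votes = threshold * len(labelings)
    return {term for term, votes in counts.items() if votes >= min_votes}


# The three example term lists from the slide.
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]

oracle = build_oracle(labelings)
print(sorted(oracle))
```

With three subjects, the 50% threshold means a term needs at least two votes, which yields the eight terms with counts 2 or 3 listed on the slide.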

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

72

Experiment B: Comparison of Different Labeling Techniques

73

Experiment B: Comparison of Different Labeling Techniques

74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

82

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: exchanged emails (timeline: when Alice joins the project)

F2: amount of emails

F3: project age

F4: newcomer early emails (timeline: when Alice was a student)

F5: commits

Identify Past Successful Mentors

Score computed by aggregating the factors F1 to F5 in a weighted sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
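The weighted-sum aggregation of the five factors can be sketched as follows. This is an illustration only: the slides do not give the weights or how each factor is normalized, so the equal weights, the factor values and the candidate names below are all hypothetical.

```python
def mentor_score(factors, weights):
    """Aggregate factor values f_1..f_n into a score: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Return candidate names sorted by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical, already-normalized values for F1..F5 per candidate mentor.
candidates = {
    "jim": [1.0, 0.8, 0.9, 0.7, 0.6],
    "bob": [0.2, 0.1, 0.5, 0.0, 0.3],
}
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # assumption: equal weights

ranking = rank_candidates(candidates, weights)
print(ranking, round(mentor_score(candidates["jim"], weights), 2))
```

In practice the weights would have to be calibrated; equal weights are used here only to keep the arithmetic easy to follow.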

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

97

Time t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

98

Time t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

99

Time t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

100

Time t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

101

Time t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100. Data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94 and 0.81, 0.24, 1, 0.77, 0.82]

107

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100. Data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94 and 0.81, 0.24, 1, 0.77, 0.82]

108

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100. Data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94 and 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100. Data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92 and 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors to Project Newcomers

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100. Data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92 and 0.72, 0.45, 0.89, 0.84, 0.76]

YODA makes it possible to recommend mentors

111

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors, with series MAIL, ISSUE and CHAT; recoverable project labels: Apache Lucene, Samba; axis from 0 to 90]

112

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors, with series MAIL, ISSUE and CHAT; recoverable project labels: Apache Lucene, Samba; axis from 0 to 90]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

A newcomer can find source code descriptions

128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

A newcomer can find source code descriptions

CLASS: IndexSplitter, METHOD: split

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index.”

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining Source Code Descriptions from Developer Communications. International Conference on Program Comprehension (IEEE ICPC 2012).
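Steps 2 to 5 of the approach can be sketched as a small filtering pipeline. This is a simplified illustration, not the published implementation: it assumes Step 1 has already produced the messages traced onto a class, uses a single hypothetical length heuristic for Step 4, and uses bag-of-words cosine similarity with an arbitrary threshold for Step 5. All function names, parameters and thresholds are assumptions.

```python
import math
import re
from collections import Counter


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, class_name, method_name):
    """Step 3: trace a paragraph onto a method if it names both class and method."""
    has_method = re.search(r"\b" + re.escape(method_name) + r"\b", paragraph)
    return class_name in paragraph and has_method is not None


def passes_heuristics(paragraph, min_words=8):
    """Step 4 (hypothetical heuristic): keep paragraphs long enough to describe something."""
    return len(paragraph.split()) >= min_words


def tokens(text):
    """Lowercased bag of words, splitting camelCase identifiers."""
    bag = Counter()
    for word in re.findall(r"[A-Za-z]+", text):
        for part in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word):
            bag[part.lower()] += 1
    return bag


def cosine(a, b):
    """Cosine similarity between two term-frequency bags."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mine_descriptions(messages, class_name, method_name, method_source, min_similarity=0.1):
    """Steps 2-5: return paragraphs that plausibly describe the given method."""
    method_bag = tokens(method_source)
    kept = []
    for message in messages:
        for paragraph in extract_paragraphs(message):
            if not mentions_method(paragraph, class_name, method_name):
                continue
            if not passes_heuristics(paragraph):
                continue
            # Step 5: keep only paragraphs textually similar to the method.
            if cosine(tokens(paragraph), method_bag) >= min_similarity:
                kept.append(paragraph)
    return kept


# Hypothetical message traced onto the class IndexSplitter.
message = (
    "When calling IndexSplitter.split with a destination directory it creates "
    "an index with a wrong segments descriptor file.\n\n"
    "Thanks, I will look at it tomorrow."
)
source = "public void split(File destDir, String[] segs) { /* copy segments into destination index */ }"
descriptions = mine_descriptions([message], "IndexSplitter", "split", source)
print(descriptions)
```

The two filters play different roles: the heuristic filter discards paragraphs that cannot be descriptions at all, while the similarity filter discards paragraphs that mention the method but talk about something else.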

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

131

Approach: Precision vs. Number of Methods Covered

132

Approach: Precision vs. Number of Methods Covered

133

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

135

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

136

StackOverflow

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: Mining Source Code Descriptions from Developers Discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring, based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

Page 5: Supporting Newcomers in Open Source Software Development Projects

5

Newcomer Learning Pathhellip

6

Data Extraction

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9-12

Newcomer Training Process

Data Extraction

Studies

2) Analyze Software Artifacts
   c) investigate how newcomers browse software artifacts
   d) investigate how newcomers generate source code summaries
3) Analyze developers
   c) social activity
   d) technical activity

Recommenders

1) Recommend Mentors
2) Supporting Source Code Comprehension and Re-documentation
3) Recommend Refactoring

Mentoring

Perform Development Tasks

Team Collaborations

13

Newcomer Training Process

Data Extraction

Studies

2) Analyze Software Artifacts
   c) generate source code summaries
   d) support maintenance tasks
3) Analyze developers
   c) social activity
   d) technical activity

Recommenders

1) Recommend Mentors
2) Supporting Source Code Comprehension and Re-documentation
3) Recommend Refactoring

Mentoring

Perform Development Tasks

Team Collaborations

14-17

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts, to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features, Bug Fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource/

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

Sources: IRC CHAT LOG, VERSIONING SYSTEM, ISSUE TRACKER, MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

How Developers' Collaboration Networks Identified from Different Sources Differ

Example: Hibernate OSS Project

24-27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: < 35%
ISSUE and MAIL: 56%
MAIL and CHAT: < 50%
MAIL and ISSUE: 86%
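The different values reported for "ISSUE and MAIL" and "MAIL and ISSUE" suggest that the overlap is directional. The following is a minimal sketch under that assumption: the overlap of source A with source B is taken to be the share of A's developers who also appear in B. The function name `overlap` and the developer names are hypothetical and do not come from the study.

```python
def overlap(source_a, source_b):
    """Percentage of developers in source_a who also appear in source_b."""
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0
    return 100.0 * len(a & b) / len(a)


# Hypothetical developer identities per communication channel.
issue_devs = {"ann", "bob", "carl", "dana"}
mail_devs = {"ann", "bob", "erin"}

# Share of ISSUE developers who also write on the mailing list.
issue_in_mail = overlap(issue_devs, mail_devs)
# Share of MAIL developers who also use the issue tracker.
mail_in_issue = overlap(mail_devs, issue_devs)

print(f"ISSUE and MAIL: {issue_in_mail:.1f}%")
print(f"MAIL and ISSUE: {mail_in_issue:.1f}%")
```

Because the denominator changes with the direction, the two figures differ whenever the two channels have a different number of participants.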

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: < 26%
ISSUE and MAIL: 38%
MAIL and CHAT: < 20%
MAIL and ISSUE: 30%

29-32

During an IRC Chat Meeting

PROJECT: Hibernate

1) Brainstorming

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

"but we also need to create the attributes and values in the entity binding"

2) Planning (e.g., testing activities)

"however planning a pure standalone test suite would make things easier"

3) Open an Issue

"okay I think it is a bug and I'm going to create a jira first"

33-34

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs. mails   issues vs. chat   mails vs. chat
Apache Httpd     0.17               0.09              0.06
Apache Lucene    0.08               0.03              0.02
Hibernate        0.11               0.02              0.03
Samba            0.06               0.02              0.02

Values in the first column (issues vs. mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35-36

Use Issue, Chat and Mail to Identify Leaders

[Figure: bar chart "Precision in Recommending Leaders" per data source (MAIL, ISSUE, CHAT); extracted project labels: Apache Lucene, Samba; axis from 0 to 90]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and then to support refactoring/re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38-44

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using FUZZY CLUSTERING ALGORITHMS on each release (R1, R2)

Between R1 and R2: TEAMS SPLIT, TEAMS MERGE

Sub-Systems where Developers are Working on: Sub-system one, Sub-system two

Structural Perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual Perspective: Poshyvanyk et al., CCBC

45

System          Period considered
Apache HTTP     09/1998 - 03/2012
Eclipse JDT     01/2002 - 12/2011
Netbeans        01/2001 - 08/2012
Samba           01/2000 - 09/2011

Releases Considered

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics: Period of Time and Releases Considered

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reasons behind the reorganization of teams

46-47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

48-51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by emerging teams, in two panels: TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure.

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interactions between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54-56

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor Students, 18 Master Students, 4 PhD Students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59-61

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: share of time spent per kind of artifact; chart labels as extracted: 72, 131032]

Undergraduate Students: undergraduate students used Source Code and Javadoc significantly more than graduate students

Graduate Students: graduate students used Class Diagrams significantly more than undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
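The pattern names on this slide read naturally as regular expressions over the letters S, D, U, and J. The sketch below assumes that a participant's navigation is encoded as a string of those letters (one letter per artifact visited before reaching source code) and that a pattern applies only when it matches the whole string. The encoding and the function `classify` are illustrative assumptions, not the tooling used in the study.

```python
import re

# Pattern names taken from the slide; each one is also a valid regular expression.
PATTERNS = ["S", "D", "U", "J", "(SD)+", "(US)+", "(DS)+", "U(SD)+"]


def classify(navigation):
    """Return the pattern that matches the whole navigation string, else 'Other'."""
    for pattern in PATTERNS:
        if re.fullmatch(pattern, navigation):
            return pattern
    return "Other"


# Hypothetical navigation sequences recorded before reaching source code.
for sequence in ["S", "SDSD", "US", "USD", "JS"]:
    print(sequence, "->", classify(sequence))
```

Using `re.fullmatch` rather than `re.search` keeps a sequence such as `USD` from being counted as `(US)+`, since the trailing `D` is not covered by that pattern.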

63-65

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of the number of occurrences (axis from 0 to 20) of the patterns S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, and Other, for Graduate and Undergraduate participants]

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

More experienced participants use a more "integrated approach"

66-67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph with nodes Source Code, Sequence Diagram, and Javadoc, and weighted edges between them]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only after that do they read and write Source Code

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69-70

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Example: terms chosen by three subjects to label the same Java class

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
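The oracle construction can be sketched in a few lines, using the three term lists shown on the slide. The function name `build_oracle` and its `threshold` parameter are hypothetical; the only rule taken from the slide is that a term enters the oracle when at least 50% of the subjects selected it.

```python
from collections import Counter


def build_oracle(term_lists, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for terms in term_lists for term in set(terms))
    minimum = threshold * len(term_lists)
    return {term for term, n in counts.items() if n >= minimum}


# Term lists of the three subjects in the slide example.
subjects = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]

oracle = build_oracle(subjects)
print(sorted(oracle))
```

With three subjects, the 50% threshold means a term must be chosen by at least two of them, which yields the eight terms listed in the slide's term counts.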

71-74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81-82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails
F2: amount of emails
F3: project age
F4 - 1st: newcomer early emails
F5: Commits

Timeline labels: "When Alice was a Student", "When Alice joins the project"

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$score = \sum_{i=1}^{5} w_i \cdot f_i$
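The weighted sum over the five factors F1 to F5 can be sketched as follows. This is only an illustration of the aggregation step: the weights, the factor values, the assumption that factors are normalized to the range 0 to 1, and the names `mentor_score`, `weights`, and `candidates` are all hypothetical and are not taken from YODA.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical equal weights for F1..F5.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]

# Hypothetical normalized factor values (F1..F5) for two candidate mentors of Alice.
candidates = {
    "jim": [1.0, 0.8, 0.5, 0.6, 0.9],
    "bob": [0.2, 0.4, 0.5, 0.1, 0.3],
}

# Rank candidates from the highest to the lowest score.
ranking = sorted(
    candidates,
    key=lambda name: mentor_score(candidates[name], weights),
    reverse=True,
)
print(ranking)
```

In practice the weights would need to be calibrated per project; equal weights are used here only to keep the example readable.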

93-105

Recommending Mentors

[Figure: project timeline; labels: Time, t, Project Developers, Past Project Developers, Past Mentors, Newcomer, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106-108

Is it Possible to Recommend Mentors to Project Newcomers?

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; data labels as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109-110

Results When Both Mails and Issues are Used

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

It is Possible to Recommend Mentors to Project Newcomers

YODA makes it possible to recommend mentors

111-112

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" per data source (MAIL, ISSUE, CHAT); extracted project labels: Apache Lucene, Samba; axis from 0 to 90]

113-115

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116-124

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127-128

Mining Summaries

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understanding source code

A newcomer can find source code descriptions there

Example (CLASS: IndexSplitter, METHOD: split)

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
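Steps 2 and 3 of the approach can be illustrated with a small sketch. It assumes that paragraphs are separated by blank lines and that a paragraph is traced onto a method when it mentions the method name followed by an opening parenthesis, optionally qualified by the class name. Both rules, and the function names `extract_paragraphs` and `trace_to_methods`, are simplifying assumptions made for illustration; the heuristics of the original approach are not detailed on the slide.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): map each method name to the paragraphs that mention it."""
    traced = {}
    for paragraph in paragraphs:
        for method in method_names:
            # Match "method(" or "ClassName.method(" as a whole word.
            pattern = rf"\b(?:{re.escape(class_name)}\.)?{re.escape(method)}\s*\("
            if re.search(pattern, paragraph):
                traced.setdefault(method, []).append(paragraph)
    return traced


# Hypothetical message already traced onto the class IndexSplitter (Step 1).
message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) it creates "
    "an index with a segments descriptor file with wrong data.\n\n"
    "Thanks"
)

paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "listSegments"])
print(traced)
```

The paragraphs kept by this step would still need the heuristic-based and similarity-based filtering of Steps 4 and 5 before being shown to a newcomer.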

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131-135

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136-137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139-147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 6: Supporting Newcomers in Open Source Software Development Projects

6

Data Extraction

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts
c) investigate how newcomers browse software artifacts
d) investigate how newcomers generate source code summaries

3) Analyze developers
c) social activity
d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers
c) social activity
d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts
c) generate source code summaries
d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers’ Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features

Bug fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers’ Social Networks

Bird et al., FSE 2008

22

How Developers’ Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers’ Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developers’ Collaboration Networks Identified from Different Sources Differ

27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%
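The overlap figures differ depending on which source comes first (ISSUE and MAIL versus MAIL and ISSUE), which suggests a directional measure. The slides do not give the formula, so the following is only a minimal sketch under that assumption: the overlap of source A with source B is taken as the share of developers seen in A who also appear in B. The function name and the developer identifiers are hypothetical.

```python
def directional_overlap(source_a, source_b):
    """Share of developers seen in source_a who also appear in source_b."""
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0  # no developers in the first source, nothing to overlap
    return len(a & b) / len(a)


# Hypothetical developer identifiers, for illustration only.
issue_devs = {"ann", "bob", "carl", "dana"}
mail_devs = {"ann", "bob", "erin"}

issue_in_mail = directional_overlap(issue_devs, mail_devs)  # 2 of 4 developers
mail_in_issue = directional_overlap(mail_devs, issue_devs)  # 2 of 3 developers
```

Because the denominator changes with the first argument, the two directions can legitimately give different percentages for the same pair of sources.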

28

Overlap of Developers’ Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“but we also need to create the attributes and values in the entity binding”

31

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)

During an IRC Chat Meeting

32

PROJECT: Hibernate

“okay I think it is a bug and I’m going to create a jira first”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)
3) Open an Issue

During an IRC Chat Meeting

34

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35

Use Issue, Chat and Mail to Identify Leaders

Figure: Precision in Recommending Leaders, per source (MAIL, ISSUE, CHAT). Project labels recovered: Apache Lucene, Samba. Axis: 0 to 90. Bar labels: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80.

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and thus to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

39

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using fuzzy clustering algorithms

41

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms on two releases (R1, R2)

TEAMS SPLIT

TEAMS MERGE

44

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms on two releases (R1, R2)

Sub-systems on which developers are working: Sub-system one, Sub-system two

Structural Perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual Perspective: Poshyvanyk et al., CCBC

45

Case Study

• Goal: analyze data from mailing lists, issue trackers, and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reasons behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System         Period considered
Apache HTTP    09/1998-03/2012
Eclipse JDT    01/2002-12/2011
Netbeans       01/2001-08/2012
Samba          01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases
Teams Split in a New Release: in 15-35% of the cases
Teams Disappeared: 22-45%
Teams Survived: 50-70%

51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

Figure panels: TEAMS SPLIT (MQ, CCBC) and TEAMS MERGED (MQ, CCBC)

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers’ Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

56

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

Figure: share of time spent per kind of artifact. Slice labels: 72, 13, 10, 3, 2.

60

How Much Time did Participants Spend on Different Kinds of Artifacts?

Undergraduate Students

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

How Much Time did Participants Spend on Different Kinds of Artifacts?

Graduate Students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns
S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns
(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
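The pattern notation above reads naturally as regular expressions over the letters S, D, U and J. As a minimal sketch, assuming each participant's visits before reaching source code are encoded as a string of those letters (an encoding not described on the slides), a sequence could be classified as follows. The function name `classify` and the fallback label are illustrative choices.

```python
import re

# One regular expression per complex pattern; letters are artifact kinds.
COMPLEX_PATTERNS = {
    "(SD)+": re.compile(r"(?:SD)+"),
    "(US)+": re.compile(r"(?:US)+"),
    "(DS)+": re.compile(r"(?:DS)+"),
    "U(SD)+": re.compile(r"U(?:SD)+"),
}

SIMPLE_PATTERNS = {"S", "D", "U", "J"}


def classify(sequence):
    """Return the pattern that matches the whole visit sequence, else 'Other'."""
    if sequence in SIMPLE_PATTERNS:
        return sequence
    for name, regex in COMPLEX_PATTERNS.items():
        if regex.fullmatch(sequence):
            return name
    return "Other"


labels = [classify(s) for s in ["S", "SDSD", "US", "USD", "DS", "SJ"]]
```

The four complex patterns describe disjoint sets of strings, so the order in which they are tried does not affect the result.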

63

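Most Frequent Navigation Patterns Before Reaching Source Code

Figure: bar chart of navigation pattern frequency, with two series (legend: Graduate, Undergraduate).
Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other
Bar labels, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4
Bar labels, second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc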

65

Most Frequent Navigation Patterns Before Reaching Source Code

More experienced participants use a more “integrated approach”

67

Transition Graph between Kinds of Software Artifacts

Figure: transition graph among Source Code, Sequence Diagram, and Javadoc. Edge labels: 56, 36, 7717, 78, 18, 1678, 49, 37.

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
17 Bachelor students in CS (Univ. of Molise, second year)
21 Master students in CS (University of Salerno)

Example: three term lists produced for a Java class
List 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
List 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
List 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
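The oracle rule can be checked against the three example term lists on this slide. The sketch below is an illustration only: the function name `build_oracle` and the `min_share` parameter are assumptions, and each term is counted once per subject.

```python
from collections import Counter


def build_oracle(labelings, min_share=0.5):
    """Map each term chosen by at least `min_share` of the subjects to its count."""
    counts = Counter()
    for terms in labelings:
        counts.update(set(terms))  # count each term at most once per subject
    threshold = min_share * len(labelings)
    return {term: n for term, n in counts.items() if n >= threshold}


# The three example term lists shown on the slide.
labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival",
     "departure", "date", "bed", "card", "payment", "spa"],
]

oracle = build_oracle(labelings)
```

With three lists the threshold is 1.5, so every term chosen at least twice is kept, which reproduces the eight terms and counts listed on the slide.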

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures matches the mental model of newcomers very well when they describe source code

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when they describe source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers’ Communication to Improve Newcomers’ Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim “is the mentor of” Alice IF (factors observed over time):

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Timeline labels: When Alice was a Student; When Alice joins the project

Score Computed Aggregating the Factors in a Weighted Sum

$$\mathrm{score} = \sum_{i=1}^{5} w_i \, f_i$$
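The weighted sum of the five factors (F1 to F5) can be sketched in a few lines. The slides give neither the weights nor the scale of the factors, so the weight values, the normalization of every factor to the range 0 to 1, and the names `mentor_score` and `rank_candidates` are all assumptions made for illustration.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Return candidate names ordered by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical weights for F1..F5 and hypothetical, normalized factor values.
weights = [0.3, 0.2, 0.1, 0.2, 0.2]
candidates = {
    "jim": [1.0, 0.8, 0.5, 0.9, 0.7],
    "kim": [0.2, 0.1, 0.5, 0.0, 0.3],
}

ranking = rank_candidates(candidates, weights)
```

Keeping the weights in a plain list makes it easy to try different weightings and compare the resulting rankings against known mentor pairs.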

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

98

Time

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

106

Is it Possible to Recommend Mentors To Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2

Project       Top 1   Top 2
Apache        0.85    0.81
FreeBSD       0.3     0.24
PostgreSQL    1       1
Python        0.64    0.77
Samba         0.94    0.82

109

Results When Both Mails and Issues are Used

Mentor Recommendations: Precision on Top 1 and Top 2

Project       Top 1   Top 2
Apache        0.8     0.72
FreeBSD       0.67    0.45
PostgreSQL    0.9     0.89
Python        0.91    0.84
Samba         0.92    0.76

110

It is Possible to Recommend Mentors To Project Newcomers

YODA Makes it Possible to Recommend Mentors

111

Use Issue, Chat and Mail sources to recommend Mentors

Figure: Precision in Recommending Mentors, per source (MAIL, ISSUE, CHAT). Project labels recovered: Apache Lucene, Samba. Axis: 0 to 90. Bar labels: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers’ Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

Newcomer: can find source code descriptions

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index”

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
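Steps 2 and 3 can be illustrated with a small sketch. The slides do not state the exact tracing rules, so the heuristic below is an assumption: a paragraph is traced onto a method when it mentions `Class.method` or the method name followed by an opening parenthesis. The function names and the sample message are hypothetical.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_method(paragraphs, class_name, method_name):
    """Step 3 (sketch): keep paragraphs that mention the given method."""
    qualified = re.compile(
        rf"\b{re.escape(class_name)}\.{re.escape(method_name)}\b"
    )
    call = re.compile(rf"\b{re.escape(method_name)}\s*\(")
    return [p for p in paragraphs if qualified.search(p) or call.search(p)]


# Hypothetical message, loosely modeled on the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with wrong data.\n\n"
    "Thanks"
)

paragraphs = extract_paragraphs(message)
traced = trace_to_method(paragraphs, "IndexSplitter", "split")
```

A real pipeline would still need the two filtering steps, since a name match alone does not guarantee that the paragraph actually describes what the method does.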

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

133

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers’ discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders’ performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 7: Supporting Newcomers in Open Source Software Development Projects

7

Data Extraction

Empirical Studies

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Newcomer Learning Pathhellip

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features

Bug fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example: Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

24

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: < 35%
ISSUE and MAIL: 56%
MAIL and CHAT: < 50%
MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

ISSUE and CHAT: < 26%
ISSUE and MAIL: 38%
MAIL and CHAT: < 20%
MAIL and ISSUE: 30%

Apache Httpd

Apache Lucene

Samba

Hibernate
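
The overlap figures are asymmetric ("ISSUE and MAIL" differs from "MAIL and ISSUE"), which is consistent with measuring the overlap relative to the first source named. The slides do not state the formula, so the following is only a minimal sketch under that assumption; the function name and the developer identities are hypothetical.

```python
def directed_overlap(source_a, source_b):
    """Fraction of developers in source_a who also appear in source_b.

    Assumption of this sketch: overlap is normalized by the size of the
    first source, which would explain why the measure is not symmetric.
    """
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Hypothetical developer identities, for illustration only.
issue_devs = {"ann", "bob", "carl", "dora"}
mail_devs = {"ann", "bob", "eve"}

issue_and_mail = directed_overlap(issue_devs, mail_devs)  # 2 of 4 issue developers
mail_and_issue = directed_overlap(mail_devs, issue_devs)  # 2 of 3 mail developers
print(f"ISSUE and MAIL: {issue_and_mail:.0%}")
print(f"MAIL and ISSUE: {mail_and_issue:.0%}")
```

The same function could be applied to sets of developer pairs instead of single developers to obtain an overlap of social links.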

29

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming
"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g. Testing activities)
"however planning a pure standalone test suite would make things easier"

3) Open an Issue
"okay I think it is a bug and I'm going to create a jira first"

33

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

34

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders for the MAIL, ISSUE and CHAT sources, one group of bars per project (legible project labels: Apache Lucene, Samba); precision axis from 0 to 90]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and thus to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

• Teams are identified using fuzzy clustering algorithms
• Teams are compared between releases R1 and R2: TEAMS SPLIT, TEAMS MERGE
• Sub-systems on which developers are working: Sub-system one, Sub-system two
• Structural perspective: Mancoridis et al., Modularization Quality (MQ)
• Conceptual perspective: Poshyvanyk et al., CCBC

45

Systems characteristics: Period of Time and Releases Considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered: 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011

Releases Considered:

20220224

2212241

3032343642

3436556972

233020302535040

Case Study

• Goal: analyze data from mailing lists, issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reason behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release
- in 20-35% of the cases

Teams Split in a New Release
- in 15-35% of the cases

Teams Disappeared
- 22-45%

Teams Survived
- 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figures: MQ and CCBC values for TEAMS SPLIT and for TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure.

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A
How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B
What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

• 11 Bachelor Students
• 18 Master Students
• 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Pie chart: share of time per kind of artifact; slice labels as extracted: 72, 13, 10, 3, 2]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students.

Graduate students used Class Diagrams significantly more than Undergraduates.

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
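
The pattern notation reads like regular expressions over the sequence of artifacts a participant visits. As a minimal sketch, assuming each session is encoded as a string of the letters S, D, U and J in visiting order (up to the first access to source code), a session could be classified as follows. The encoding, the `classify` helper and the order in which patterns are tried are assumptions of this example, not details taken from the study.

```python
import re

# Patterns named on the slide, tried from most specific to least specific.
PATTERNS = [
    ("U(SD)+", re.compile(r"U(SD)+")),
    ("(SD)+", re.compile(r"(SD)+")),
    ("(US)+", re.compile(r"(US)+")),
    ("(DS)+", re.compile(r"(DS)+")),
    ("S", re.compile(r"S")),
    ("D", re.compile(r"D")),
    ("U", re.compile(r"U")),
    ("J", re.compile(r"J")),
]


def classify(visits):
    """Return the name of the first pattern matching the whole visit string."""
    for name, regex in PATTERNS:
        if regex.fullmatch(visits):
            return name
    return "Other"


# Hypothetical sessions, for illustration only.
for session in ["S", "SDSD", "USD", "US", "DSDS", "JS"]:
    print(session, "->", classify(session))
```

Using `fullmatch` means a session counts as an instance of a pattern only if the entire visit string fits it; anything else falls into "Other".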

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: frequency of each navigation pattern (axis 0 to 20) for Graduate and Undergraduate participants. Patterns shown: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Data labels in extraction order, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

More experienced participants use a more "integrated approach".

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between kinds of software artifacts; legible node labels: Source Code, Sequence Diagram, Javadoc]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go on to read Sequence Diagrams; only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

• Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
• Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
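
The oracle construction can be sketched in a few lines of Python. The three term lists are the ones shown on the slide; the function name `build_oracle` and the `threshold` parameter are assumptions of this example, not part of the original study material.

```python
from collections import Counter


def build_oracle(term_lists, threshold=0.5):
    """Keep the terms chosen by at least `threshold` of the subjects."""
    # Each subject counts at most once per term.
    counts = Counter(term for terms in term_lists for term in set(terms))
    needed = threshold * len(term_lists)
    return {term: n for term, n in counts.items() if n >= needed}


subjects = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]

oracle = build_oracle(subjects)
for term, n in sorted(oracle.items(), key=lambda kv: (-kv[1], kv[0])):
    print(term, n)
```

With three subjects, a 50% threshold keeps every term chosen by at least two of them, which reproduces the eight terms and counts listed on the slide.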

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures matches very well the mental model of newcomers when describing source code.

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF the following factors, observed over time from when Alice joins the project, support it:

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

[Timeline labels in the figure: "When Alice was a Student", "When Alice joins the project"]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum:

$\sum_{i=1}^{5} w_i f_i$
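
The weighted-sum aggregation of the five factors can be illustrated with a small Python sketch. The slides do not give the weights or say how each factor is scaled, so the equal weights, the factor values in [0, 1] and the helper names below are all hypothetical.

```python
def mentor_score(factors, weights):
    """Aggregate factor values f_1..f_n into a single score: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Sort candidate mentors by decreasing score."""
    scored = {name: mentor_score(f, weights) for name, f in candidates.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical equal weights for F1..F5 and made-up factor values in [0, 1].
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "Jim": [1.0, 0.5, 0.5, 1.0, 0.5],
    "Bob": [0.5, 0.0, 0.5, 0.0, 0.5],
}

ranking = rank_candidates(candidates, weights)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy input the candidate with stronger evidence on the five factors is ranked first; different weights would change how much each factor contributes.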

93

Recommending Mentors

[Figure sequence: a timeline of project developers; a newcomer joins at time t, and a mentor with adequate skills is selected among past mentors and past project developers]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2

         Apache   FreeBSD   PostgreSQL   Python   Samba
Top 1    0.85     0.3       1            0.64     0.94
Top 2    0.81     0.24      1            0.77     0.82

109

Results When Both Mails and Issues are Used

Mentor Recommendations: Precision on Top 1 and Top 2

         Apache   FreeBSD   PostgreSQL   Python   Samba
Top 1    0.8      0.67      0.9          0.91     0.92
Top 2    0.72     0.45      0.89         0.84     0.76

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors for the MAIL, ISSUE and CHAT sources, one group of bars per project (legible project labels: Apache Lucene, Samba); precision axis from 0 to 90]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help in understanding source code.

A newcomer can find a source code description, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter
METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
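
As an illustration of Step 3 only, a paragraph already traced to a class can be linked to the methods it appears to mention by looking for a method name followed by an opening parenthesis. This is a simplified sketch, not the authors' implementation: the helper name `trace_paragraph` is an assumption, and `listSegments` is used purely as a hypothetical second method name.

```python
import re


def trace_paragraph(paragraph, method_names):
    """Return the methods whose name is followed by '(' in the paragraph."""
    traced = []
    for method in method_names:
        # Word boundary before the name avoids matching inside longer identifiers.
        pattern = re.compile(r"\b" + re.escape(method) + r"\s*\(")
        if pattern.search(paragraph):
            traced.append(method)
    return traced


paragraph = (
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with segments descriptor file with wrong data."
)

print(trace_paragraph(paragraph, ["split", "listSegments"]))
```

The later filtering steps would still be needed, since a name match alone does not guarantee that the paragraph actually describes the method.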

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

Page 8: Supporting Newcomers in Open Source Software Development Projects

8

Data Extraction

Recommenders

bull Versioning Systemsbull Mailing Listsbull Issue trackersbull QampA site (eg StackOverflow)

Empirical Studies

Newcomer Learning Pathhellip

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers


18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features, Bug Fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014).

23

Example: Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

24

Developers Overlap between Different Sources


27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%
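The overlap figures above can be read as set overlaps between the developers observed in each communication channel. The following is a minimal sketch, assuming that overlap is the share of one source's developers who also appear in another source; the function name and the developer identities are hypothetical, and the study itself may normalize differently.

```python
def overlap(source_a, source_b):
    """Share of developers in source_a who also appear in source_b (0.0 to 1.0)."""
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Hypothetical developer identities per channel (not data from the study).
issue_devs = {"alice", "bob", "carol", "dave"}
mail_devs = {"alice", "bob", "erin"}

issue_in_mail = overlap(issue_devs, mail_devs)  # 2 of the 4 issue developers
mail_in_issue = overlap(mail_devs, issue_devs)  # 2 of the 3 mail developers
```

Because the measure is normalized by the first source, it is not symmetric, which is consistent with the slide reporting "ISSUE and MAIL" and "MAIL and ISSUE" as separate values.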

29

During an IRC Chat Meeting

PROJECT: Hibernate

"is there a better way? dunno, like I said this is brainstorming and I have not given lots of thought to these cases"

"but we also need to create the attributes and values in the entity binding"

30

PROJECT: Hibernate

"is there a better way? dunno, like I said this is brainstorming and I have not given lots of thought to these cases"

1) Brainstorming

"however, planning a pure standalone test suite would make things easier"

During an IRC Chat Meeting

31

PROJECT: Hibernate

"is there a better way? dunno, like I said this is brainstorming and I have not given lots of thought to these cases"

"however, planning a pure standalone test suite would make things easier"

1) Brainstorming
2) Planning (e.g., testing activities)

During an IRC Chat Meeting

32

PROJECT: Hibernate

"okay, I think it is a bug and I'm going to create a jira first"

"however, planning a pure standalone test suite would make things easier"

1) Brainstorming
2) Planning (e.g., testing activities)
3) Open an Issue

During an IRC Chat Meeting


34

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders (axis 0-90) for the MAIL, ISSUE and CHAT sources, per project; legible project labels: Apache Lucene, Samba; data labels in extraction order: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80]


37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and then to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the Evolution of Emerging Collaborations Relates to Code Changes: An Empirical Study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How?

39

By using fuzzy clustering algorithms

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How?

40

R1

By using fuzzy clustering algorithms

R2

Analysis of the Evolution of Teams: How?

41

TEAMS SPLIT

TEAMS MERGE

R1

By using fuzzy clustering algorithms

R2

Analysis of the Evolution of Teams: How?

42

R1

R2

By using fuzzy clustering algorithms

Analysis of the Evolution of Teams: How?

43

R1

R2

By using fuzzy clustering algorithms

Sub-systems developers are working on: Sub-system one, Sub-system two

Analysis of the Evolution of Teams: How?

44

R1

R2

By using fuzzy clustering algorithms

Sub-systems developers are working on: Sub-system one, Sub-system two

Structural perspective: Modularization Quality (MQ), Mancoridis et al.

Conceptual perspective: Conceptual Coupling Between Classes (CCBC), Poshyvanyk et al.

Analysis of the Evolution of Teams: How?

45

Case Study

Systems characteristics: period of time and releases considered

System         Period considered
Apache HTTP    09/1998-03/2012
Eclipse JDT    01/2002-12/2011
Netbeans       01/2001-08/2012
Samba          01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

• Goal: analyze data from mailing lists, issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reasons behind the reorganization of teams

46

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

How do Emerging Collaborations Change across Software Releases?

47

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

How do Emerging Collaborations Change across Software Releases?

Teams Disappeared: 22-45%

Teams Survived: 50-70%
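The four outcomes reported above (merge, split, disappear, survive) can be illustrated by comparing team membership across two releases. This is only a sketch under stated assumptions: teams are given as plain sets of developers with hard membership (the study identifies teams with fuzzy clustering, which allows overlapping membership), and the rule "shares at least `min_shared` developers" is a hypothetical criterion chosen for illustration.

```python
def classify_team_changes(teams_r1, teams_r2, min_shared=1):
    """Label each R1 team as 'split', 'merged', 'survived' or 'disappeared'."""
    labels = {}
    for name, members in teams_r1.items():
        # R2 teams that inherit at least min_shared members of this R1 team
        heirs = [n for n, m in teams_r2.items() if len(members & m) >= min_shared]
        if not heirs:
            labels[name] = "disappeared"
        elif len(heirs) > 1:
            labels[name] = "split"
        else:
            heir = teams_r2[heirs[0]]
            # other R1 teams feeding the same R2 team indicate a merge
            donors = [n for n, m in teams_r1.items()
                      if n != name and len(m & heir) >= min_shared]
            labels[name] = "merged" if donors else "survived"
    return labels


# Hypothetical teams in two consecutive releases (not data from the study).
r1 = {"T1": {"a", "b", "c", "d"}, "T2": {"e", "f"}, "T3": {"g", "h"},
      "T4": {"i"}, "T5": {"j", "k"}}
r2 = {"U1": {"a", "b"}, "U2": {"c", "d"}, "U3": {"e", "f", "g", "h"},
      "U4": {"j", "k", "l"}}

changes = classify_team_changes(r1, r2)
```

In this toy input, T1 is split across U1 and U2, T2 and T3 are merged into U3, T4 disappears, and T5 survives as U4.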

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Plots (slides 49-51): MQ and CCBC of the files changed by emerging teams, for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure.

52

PART I Summary: Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interactions between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: how such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: what code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart: share of time spent on each kind of artifact; data labels: 72, 13, 10, 3, 2]

60

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
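The pattern names above are written like regular expressions over the letters S, D, U and J, so a browsing session can be classified with ordinary regex matching. The sketch below assumes that a session is encoded as a string with one letter per artifact visited before reaching source code; that encoding, the function name, and the choice to require a full match are assumptions made for illustration, not the procedure used in the study.

```python
import re

# Pattern names come from the slides; each must match the whole session string.
PATTERNS = [
    ("SU(SD)+", re.compile(r"SU(SD)+")),
    ("S(US)+", re.compile(r"S(US)+")),
    ("U(SD)+", re.compile(r"U(SD)+")),
    ("(SD)+", re.compile(r"(SD)+")),
    ("(US)+", re.compile(r"(US)+")),
    ("(DS)+", re.compile(r"(DS)+")),
    ("S", re.compile(r"S")),
    ("D", re.compile(r"D")),
    ("U", re.compile(r"U")),
    ("J", re.compile(r"J")),
]


def classify(session):
    """Return the name of the navigation pattern matching the whole session."""
    for name, regex in PATTERNS:
        if regex.fullmatch(session):
            return name
    return "Other"


# Hypothetical sessions: letters are the artifacts opened before source code.
examples = {s: classify(s) for s in ["SDSD", "USD", "SUSD", "SUS", "J", "JD"]}
```

Using `fullmatch` keeps the categories mutually exclusive: a session such as "JD" fits none of the named patterns and falls into "Other".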

63

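Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: occurrences of each navigation pattern, S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+ and Other, for Graduate and Undergraduate participants; data labels in extraction order: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc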


65

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: occurrences of each navigation pattern (axis 0-20) for Graduate and Undergraduate participants]

More experienced participants use a more "integrated approach"

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

66

Transition Graph between Kinds of Software Artifacts

[Graph: transitions between Source Code, Sequence Diagram and Javadoc; edge labels in extraction order: 56, 36, 7717, 78, 18, 1678, 49, 37]

67

Transition Graph between Kinds of Software Artifacts

[Graph: transitions between Source Code, Sequence Diagram and Javadoc]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead by reading Sequence Diagrams; only afterwards do they read and write Source Code
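A transition graph of this kind can be derived from the ordered list of artifacts each participant opened. The following is a minimal sketch, assuming that an edge weight is the share of moves leaving one artifact kind that go to another; the function name and the sessions are hypothetical, and the edge labels on the slide may have been computed differently.

```python
from collections import Counter


def transition_shares(sessions):
    """Map (source, target) to the share of moves leaving source that reach target."""
    counts = Counter()
    for visits in sessions:
        # count every move between two consecutive, different artifact kinds
        for src, dst in zip(visits, visits[1:]):
            if src != dst:
                counts[(src, dst)] += 1
    totals = Counter()
    for (src, _), n in counts.items():
        totals[src] += n
    return {edge: n / totals[edge[0]] for edge, n in counts.items()}


# Hypothetical browsing sessions (not data from the experiment).
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram", "Source Code"],
    ["Source Code", "Javadoc", "Source Code"],
]
shares = transition_shares(sessions)
```

In this toy input, half of the moves leaving Source Code go to the Sequence Diagram and half to Javadoc, while every move leaving the Sequence Diagram returns to Source Code.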

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Java Class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Terms selected by three subjects for the same Java class:

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term frequencies: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
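The oracle rule stated above (keep the terms chosen by at least 50% of the subjects) can be reproduced on the three term lists shown on the slide. This is a minimal sketch; the function name and the `threshold` parameter are hypothetical names introduced for illustration.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms selected by at least `threshold` of the subjects."""
    # each subject contributes a term at most once
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


# The three term lists shown on the slide, one per subject.
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(labelings)
```

With three subjects the threshold is 1.5 selections, so every term chosen by two or more subjects enters the oracle, which gives the eight terms listed with frequencies 3 and 2 on the slide.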

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.


74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures matches very well the mental model of newcomers when describing source code

75

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects


83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim "is the mentor of" Alice IF factors F1-F5 are observed on the time line that starts when Alice joins the project:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Identify Past Successful Mentors

Score computed aggregating the factors in a weighted sum:

$$\sum_{i=1}^{5} w_i \cdot f_i$$
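The weighted sum of the five factors can be sketched in a few lines. This is an illustration only: it assumes the factor values F1 to F5 have already been normalized to comparable ranges, and both the weights and the factor values below are hypothetical, not those used by YODA.

```python
def mentor_score(factors, weights):
    """Aggregate factor values f_1..f_n into a single score: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical normalized values for F1..F5 of one (mentor, newcomer) pair,
# and hypothetical weights w_1..w_5.
factors = [1.0, 0.5, 0.25, 0.0, 0.5]
weights = [0.5, 0.25, 0.0, 0.0, 0.25]

score = mentor_score(factors, weights)
```

Candidate pairs can then be ranked by this score, with the highest-scoring developer taken as the likely past mentor of the newcomer.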

93

Recommending Mentors

[Diagram sequence (slides 93-105): a time line showing Project Developers, Past Project Developers and Past Mentors, a Newcomer joining at time t, and a Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2 (axis 0-100), for Apache, FreeBSD, PostgreSQL, Python and Samba; data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]


109

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2 (axis 0-100), for Apache, FreeBSD, PostgreSQL, Python and Samba; data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors to Project Newcomers

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba]

YODA makes it possible to recommend mentors

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors (axis 0-90) for the MAIL, ISSUE and CHAT sources, per project; legible project labels: Apache Lucene, Samba; data labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]


113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Architecture diagrams, slides 113-115]

YODA Tool

[Tool screenshots, slides 116-124]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension


128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

There, a newcomer can find source code descriptions

Example (CLASS: IndexSplitter, METHOD: split):

"When calling the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with a segments descriptor file with wrong data. Namely, wrong is the number representing the name of the segment that would be created next in this index."

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining Source Code Descriptions from Developer Communications. International Conference on Program Comprehension (IEEE ICPC 2012).
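Steps 3 and 5 of the approach above can be illustrated with a small sketch. It makes several assumptions that the slides do not confirm: a paragraph is traced onto a method when it mentions the method name followed by "(" or next to the word "method", and similarity is a cosine over raw term counts between the paragraph and a textual representation of the method. All function names, the `min_similarity` threshold and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def mentions_method(paragraph, method_name):
    """Step 3 (assumed rule): the paragraph names the method explicitly."""
    name = re.escape(method_name)
    pattern = rf"\b{name}\s*\(|\bmethod\s+{name}\b|\b{name}\s+method\b"
    return re.search(pattern, paragraph, flags=re.IGNORECASE) is not None


def tokens(text):
    """Lower-cased alphabetic tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(text_a, text_b):
    """Step 5 (assumed measure): cosine similarity over raw term counts."""
    va, vb = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def candidate_descriptions(paragraphs, method_name, method_text, min_similarity=0.1):
    """Keep paragraphs that mention the method and are textually close to it."""
    return [p for p in paragraphs
            if mentions_method(p, method_name)
            and cosine(p, method_text) >= min_similarity]


# Hypothetical inputs, loosely based on the IndexSplitter example in the slides.
paragraphs = [
    "When calling the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with a wrong segments file.",
    "Thanks, I will look at it tomorrow.",
]
method_text = "split index segments destDir segs"
kept = candidate_descriptions(paragraphs, "split", method_text)
```

The heuristic filtering of Step 4 is not sketched here, since the slides do not say which heuristics are applied.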

130

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

131

Approach Precision vs. Number of Methods Covered

[Plot (slides 131-133): precision of the approach vs. number of methods covered]

We mine useful Java method descriptions from developers' discussions

134

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: Mining Source Code Descriptions from Developers' Discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Tool screenshots, slides 139-147]

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring, based on social interactions between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives

• Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings Based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interactions between developers

155

PART I

PART II

PART III


Page 9: Supporting Newcomers in Open Source Software Development Projects

9

Data Extraction

Recommenders

Studies

Newcomer Training Process

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example: Hibernate OSS Project

24

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%
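The overlap figures differ depending on which source is named first (ISSUE and MAIL vs MAIL and ISSUE), which suggests an asymmetric measure. A minimal sketch under that assumption follows: the overlap of A with B is taken as the share of developers in A who also appear in B. The function name and the developer sets are hypothetical, and the slides do not state the exact formula used in the study.

```python
def overlap_percent(source_a: set, source_b: set) -> float:
    """Percentage of developers in source_a who also appear in source_b."""
    if not source_a:
        return 0.0
    return 100.0 * len(source_a & source_b) / len(source_a)


# Hypothetical developer identifiers mined from two channels.
issue_devs = {"alice", "bob", "carol", "dave"}
mail_devs = {"alice", "bob", "erin"}

issue_in_mail = overlap_percent(issue_devs, mail_devs)  # 2 of 4 issue developers
mail_in_issue = overlap_percent(mail_devs, issue_devs)  # 2 of 3 mail developers
print(f"ISSUE and MAIL: {issue_in_mail:.1f}%")
print(f"MAIL and ISSUE: {mail_in_issue:.1f}%")
```

Because the denominator changes with the first source, the two directions generally give different values, as on the slide.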

28

Overlap of Developers Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

1) Brainstorming

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

"but we also need to create the attributes and values in the entity binding"

2) Planning (e.g., testing activities)

"however planning a pure standalone test suite would make things easier"

3) Open an Issue

"okay I think it is a bug and I'm going to create a jira first"

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project         issues vs mails   issues vs chat   mails vs chat
Apache Httpd    0.17              0.09             0.06
Apache Lucene   0.08              0.03             0.02
Hibernate       0.11              0.02             0.03
Samba           0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat
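The slides do not say which similarity measure produced these values. As an illustration only, the sketch below compares the terms extracted from two channels with cosine similarity over term frequencies. The choice of measure, the function name and the sample terms are all assumptions.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between the term-frequency vectors of two term lists."""
    freq_a, freq_b = Counter(terms_a), Counter(terms_b)
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical topic terms from two channels of the same project.
issue_terms = ["bug", "patch", "test"]
chat_terms = ["test", "release"]
print(round(cosine_similarity(issue_terms, chat_terms), 2))
```

A value near 0 means the two channels share few topic terms, which is the situation the table reports for comparisons involving chat.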

35

Use Issue, Chat and Mail to Identify Leaders

[Figure: bar chart, Precision in Recommending Leaders (scale 0 to 90) using MAIL, ISSUE and CHAT, one group of bars per project; the legible project labels read Apache, Lucene, Samba]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and then to support refactoring/re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

Using fuzzy clustering algorithms

Teams in releases R1 and R2 are compared:

TEAMS SPLIT

TEAMS MERGE

Sub-systems where developers are working: Sub-system one, Sub-system two

Structural perspective: Modularization Quality (MQ), Mancoridis et al.

Conceptual perspective: Conceptual Coupling Between Classes (CCBC), Poshyvanyk et al.
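A minimal sketch of how team splits and merges between two releases could be detected once teams are known. It assumes each team has already been reduced to a plain set of developers, whereas fuzzy clustering would give graded memberships. The `min_shared` threshold, the function name and the sample teams are hypothetical and are not taken from the study.

```python
def team_changes(teams_r1, teams_r2, min_shared=2):
    """Return (splits, merges) between two releases.

    teams_r1, teams_r2: dict mapping team name -> set of developer names.
    A team "continues" into another when they share at least min_shared members.
    """
    splits, merges = {}, {}
    for name1, members1 in teams_r1.items():
        successors = [n2 for n2, m2 in teams_r2.items()
                      if len(members1 & m2) >= min_shared]
        if len(successors) > 1:  # one old team feeds several new teams
            splits[name1] = successors
    for name2, members2 in teams_r2.items():
        predecessors = [n1 for n1, m1 in teams_r1.items()
                        if len(m1 & members2) >= min_shared]
        if len(predecessors) > 1:  # several old teams feed one new team
            merges[name2] = predecessors
    return splits, merges


r1 = {"T1": {"a", "b", "c", "d"}, "T2": {"e", "f"}, "T3": {"g", "h"}}
r2 = {"U1": {"a", "b"}, "U2": {"c", "d"}, "U3": {"e", "f", "g", "h"}}
splits, merges = team_changes(r1, r2)
print(splits)  # T1 is spread over U1 and U2
print(merges)  # U3 gathers T2 and T3
```

Teams with no successor under this rule would count as disappeared, and teams with exactly one as survived.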

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System        Period considered
Apache HTTP   09/1998 - 03/2012
Eclipse JDT   01/2002 - 12/2011
Netbeans      01/2001 - 08/2012
Samba         01/2000 - 09/2011

Releases Considered: 20220224 2212241 3032343642 3436556972 233020302535040

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams, shown for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: chart of the time spent per kind of artifact; labels as extracted: 72, 13, 10, 3, 2]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
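The pattern notation on this slide is already close to regular-expression syntax, so a browsing session can be classified by matching it against each pattern in full. The sketch below assumes, hypothetically, that the artifacts a participant visited before reaching source code are encoded as a string over the letters S, D, U and J. The function and constant names are illustrative.

```python
import re

# Patterns from the slide; each one is also a valid regular expression.
NAVIGATION_PATTERNS = ["S", "D", "U", "J", "(SD)+", "(US)+", "(DS)+", "U(SD)+"]


def classify_navigation(sequence: str) -> str:
    """Return the first pattern that matches the whole sequence, else 'Other'."""
    for pattern in NAVIGATION_PATTERNS:
        if re.fullmatch(pattern, sequence):
            return pattern
    return "Other"


for visited in ["S", "SDSD", "USD", "JJ"]:
    print(visited, "->", classify_navigation(visited))
```

`re.fullmatch` is used rather than `re.search` so that a session such as `USD` is reported as U(SD)+ and not as a partial match of (SD)+.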

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart (scale 0 to 20) of how often Graduate and Undergraduate participants followed each navigation pattern: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Bar values as extracted, one series per group: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among Source Code, Sequence Diagram and Javadoc; edge labels as extracted: 56 36 7717 78 18 1678 49 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
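A transition graph of this kind can be derived from the ordered list of artifacts each participant opened. The sketch below counts consecutive pairs and normalizes them per source artifact. It assumes that edge weights are percentages of outgoing transitions, which the slide does not state, and the sessions shown are invented for illustration.

```python
from collections import Counter


def transition_percentages(sessions):
    """Map (source, target) -> percentage of transitions leaving source."""
    counts = Counter()
    for visits in sessions:
        counts.update(zip(visits, visits[1:]))  # consecutive artifact pairs
    outgoing = Counter()
    for (source, _target), n in counts.items():
        outgoing[source] += n
    return {pair: 100.0 * n / outgoing[pair[0]] for pair, n in counts.items()}


# Hypothetical browsing sessions, one list per participant.
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code"],
    ["Source Code", "Sequence Diagram", "Source Code"],
    ["Source Code", "Javadoc", "Source Code", "Sequence Diagram"],
]
graph = transition_percentages(sessions)
for (source, target), pct in sorted(graph.items()):
    print(f"{source} -> {target}: {pct:.0f}%")
```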

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS … (Univ. of Molise, second year); 21 Master students in CS … (University of Salerno)

Example (Java class), terms chosen by three subjects:

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
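The oracle rule stated on the slide (keep the terms chosen by at least 50% of the subjects) can be sketched in a few lines. The three term lists are the ones from the slide's hotel example; the function name and the `threshold` parameter are illustrative, not taken from the study's tooling.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Terms selected by at least `threshold` of the subjects, sorted by name."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    return sorted(term for term, n in counts.items() if n >= needed)


labelings = [
    ["book", "hotel", "room", "reservation", "arrival", "departure",
     "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival", "check", "parking",
     "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure", "date",
     "bed", "card", "payment", "spa"],
]
print(build_oracle(labelings))
```

With three subjects the threshold is 1.5, so every term chosen by two or three subjects enters the oracle, matching the eight counted terms on the slide.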

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies: PART I and PART II

Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF (factors observed over time, from when Alice joins the project):

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Identify Past Successful Mentors

Score computed aggregating the factors in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
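The weighted-sum aggregation of the five factors can be sketched as follows. The slides give neither the weights nor the scale of the factor values, so the equal weights, the assumption that each factor is normalized to [0, 1], and all names in the example are hypothetical.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("one weight is required per factor")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Candidate names ordered from highest to lowest score."""
    return sorted(candidates,
                  key=lambda name: mentor_score(candidates[name], weights),
                  reverse=True)


# Hypothetical values for F1..F5, assumed normalized to [0, 1].
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "jim": [1.0, 0.8, 0.9, 0.7, 0.6],
    "bob": [0.0, 0.1, 0.9, 0.0, 0.3],
}
print(rank_candidates(candidates, weights))
```

In practice the weights would need calibration against known mentor and newcomer pairs; the equal weights here only keep the example readable.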

93

Recommending Mentors

[Figure: timeline of project developers; a newcomer joins at time t, and a mentor with adequate skills is selected among past mentors and past project developers]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart, Mentor Recommendations: Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba. Bar values as extracted, one series per legend entry: 0.85, 0.3, 1, 0.64, 0.94 and 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Figure: bar chart, Mentor Recommendations: Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba. Bar values as extracted, one series per legend entry: 0.8, 0.67, 0.9, 0.91, 0.92 and 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: bar chart, Precision in Recommending Mentors (scale 0 to 90) using MAIL, ISSUE and CHAT, one group of bars per project; the legible project labels read Apache, Lucene, Samba]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

A newcomer can find source code descriptions, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index"

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
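Steps 1 to 3 of the approach can be illustrated with a rough sketch: a message is traced onto a class when it mentions the class name, it is split into paragraphs on blank lines, and a paragraph is traced onto a method when the method name is followed by an opening parenthesis. These matching rules are simplifying assumptions for illustration and are not the heuristics of the published approach. The function names and the sample message are hypothetical.

```python
import re


def extract_paragraphs(message: str):
    """Step 2 (sketch): split a message into non-empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(message: str, class_name: str, method_names):
    """Steps 1 and 3 (sketch): map method name -> paragraphs mentioning it."""
    traced = {}
    if class_name not in message:  # message is not about this class
        return traced
    for paragraph in extract_paragraphs(message):
        for method in method_names:
            # A method mention is assumed to look like "name(".
            if re.search(rf"\b{re.escape(method)}\s*\(", paragraph):
                traced.setdefault(method, []).append(paragraph)
    return traced


message = """Hi all,

When I call IndexSplitter.split(File destDir, String[] segs) it creates an index with a wrong segments file.

Thanks"""

result = trace_to_methods(message, "IndexSplitter", ["split", "listSegments"])
for method, paragraphs in result.items():
    print(method, "->", paragraphs)
```

The remaining steps (heuristic and similarity filtering) would then prune paragraphs that mention a method without actually describing it.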

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131

Approach Precision vs Number of Methods Covered

[Figure: precision of the approach against the number of methods covered]

We mine useful Java method descriptions from developers' discussions


136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

150

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


Page 10: Supporting Newcomers in Open Source Software Development Projects

10

Data Extraction

Recommenders

Mentoring

Studies

1) Recommend Mentors

Newcomer Training Process

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%

MAIL and CHAT: <20%
MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“but we also need to create the attributes and values in the entity binding”

30

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming

31

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)

32

During an IRC Chat Meeting

PROJECT: Hibernate

“okay I think it is a bug and I’m going to create a jira first”

“however planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., Testing activities)
3) Open an Issue

34

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

36

Use Issue, Chat and Mail to Identify Leaders

[Figure: bar chart "Precision in Recommending Leaders" (axis 0 to 90) for the MAIL, ISSUE and CHAT sources; surviving project labels: Apache Lucene, Samba]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and then to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the Evolution of Emerging Collaborations Relates to Code Changes: An Empirical Study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

41

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using FUZZY CLUSTERING ALGORITHMS

Releases: R1, R2

TEAMS SPLIT

TEAMS MERGE

44

Analysis of the Evolution of Teams: How?

By using FUZZY CLUSTERING ALGORITHMS

Releases: R1, R2

Sub-systems on which developers work: Sub-system one, Sub-system two

Structural perspective: Modularization Quality (MQ), Mancoridis et al.

Conceptual perspective: CCBC, Poshyvanyk et al.

45

Case Study

Systems characteristics: Period of Time and Releases Considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered (as listed on the slide): 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011

Releases considered (digits as extracted): 20220224, 2212241, 3032343642, 3436556972, 233020302535040

• Goal: analyze data from mailing lists, issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reasons behind the reorganization of teams

47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

TEAMS SPLIT: MQ, CCBC

TEAMS MERGED: MQ, CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

56

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Figure: chart with surviving labels 121, 121, 667, 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: chart of time spent per kind of artifact; surviving labels: 72, 131032]

60

How Much Time did Participants Spend on Different Kinds of Artifacts?

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

How Much Time did Participants Spend on Different Kinds of Artifacts?

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
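The pattern names on this slide read like regular expressions over the artifact letters S, D, U and J. As a minimal sketch, assuming each participant's browsing before reaching source code is encoded as a string of those letters (an encoding the slides do not spell out), a sequence can be classified by matching it in full against each pattern. The function name `classify` is hypothetical.

```python
import re

# Pattern names taken from the slides; each name is also a valid regular
# expression over the letters S, D, U, J.
PATTERNS = [
    "S", "D", "U", "J",
    "(SD)+", "(US)+", "(DS)+",
    "U(SD)+", "S(US)+", "SU(SD)+",
]


def classify(sequence: str) -> str:
    """Return the pattern that the whole sequence matches, or "Other"."""
    for pattern in PATTERNS:
        if re.fullmatch(pattern, sequence):
            return pattern
    return "Other"


# Hypothetical browsing sequences, for illustration only.
examples = {seq: classify(seq) for seq in ["S", "SDSD", "USD", "SUSD", "JS"]}
```

Because `re.fullmatch` requires the entire sequence to match, the patterns listed here do not overlap, so their order in the list does not affect the result.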

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of pattern frequencies (axis 0 to 20) for Graduate and Undergraduate participants over the patterns S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+ and Other]

More experienced participants use a more “integrated approach”

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between Source Code, Sequence Diagram and Javadoc; surviving edge labels: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only after that do they read and write Source Code

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: three lists of terms chosen to label a Java class

List 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
List 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
List 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
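The oracle rule on this slide (keep the terms selected by at least 50% of the subjects) can be sketched as a simple counting step. The function name `build_oracle` and its `min_share` parameter are hypothetical; the three term lists are the ones shown on the slide.

```python
from collections import Counter


def build_oracle(labelings, min_share=0.5):
    """Return the terms chosen by at least min_share of the subjects."""
    # Count each term once per subject, even if a subject repeated it.
    counts = Counter(term for terms in labelings for term in set(terms))
    threshold = min_share * len(labelings)
    return {term for term, n in counts.items() if n >= threshold}


labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]

oracle = build_oracle(labelings)
```

With three lists the threshold is 1.5, so a term enters the oracle when at least two lists contain it, which reproduces the eight terms counted on the slide.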

74

Experiment B

Comparison of Different Labeling Techniques

Data extracted from the signatures of methods matches very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

[Figure: timeline of the factors F1 to F5; surviving labels: “When Alice was a Student”, “When Alice joins the project”, “F4 - 1st”]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$
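The weighted-sum aggregation of the five factors can be sketched as follows. The slides give neither the weights nor how each factor is normalized, so the equal weights, the factor values and the names `mentor_score` and `rank_candidates` are all assumptions made for illustration.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Sort candidate names by decreasing weighted-sum score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical equal weights and factor values (F1..F5) scaled to [0, 1].
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "jim": [1.0, 1.0, 1.0, 1.0, 1.0],
    "bob": [0.5, 0.0, 0.0, 0.0, 0.0],
}

ranking = rank_candidates(candidates, weights)
```

Keeping the weights as an explicit argument makes it easy to try different settings per project.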

105

Recommending Mentors

[Figure: timeline showing Project Developers, Past Project Developers, Past Mentors and a Newcomer joining at time t]

Mentor with Adequate Skills

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

108

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba; surviving value labels: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

110

Results When Both Mails and Issues are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba; surviving value labels: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

112

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" (axis 0 to 90) for the MAIL, ISSUE and CHAT sources; surviving project labels: Apache Lucene, Samba]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html


116

YODA Tool


125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

In such situations, developers need to infer knowledge from the source code itself or from source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

A newcomer can find source code descriptions such as:

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index”

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining Source Code Descriptions from Developer Communications. International Conference on Program Comprehension (IEEE ICPC 2012)
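Steps 2 and 3 can be illustrated with a small sketch. The slides do not say how paragraphs are delimited or how they are traced onto methods, so the rules used here (paragraphs separated by blank lines, and a paragraph traced to a method when it mentions the method name followed by an opening parenthesis) are simplifying assumptions, and the function names are hypothetical.

```python
import re


def extract_paragraphs(message: str) -> list[str]:
    """Step 2 (assumed rule): split a message on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_method(paragraphs, method_name):
    """Step 3 (assumed rule): keep paragraphs that mention `method_name(`."""
    pattern = re.compile(rf"\b{re.escape(method_name)}\s*\(")
    return [p for p in paragraphs if pattern.search(p)]


# Hypothetical message, loosely modeled on the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "When I call IndexSplitter.split(File destDir, String[] segs) "
    "it creates a wrong segments file.\n\n"
    "Thanks,\nAnn"
)

paragraphs = extract_paragraphs(message)
traced = trace_to_method(paragraphs, "split")
```

The later filtering steps would then prune the traced paragraphs further.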

130

Supporting Software Development

Help Newcomers' Program Comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

133

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions


136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: Mining Source Code Descriptions from Developers Discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
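The vote-based discarding and the similarity-based filtering can be sketched together. The slides do not name the similarity measure or any threshold, so the bag-of-words cosine similarity, the `min_similarity` value of 0.2 and the dictionary layout of a paragraph are assumptions chosen only to make the idea concrete.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphabetic tokens of a text."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_paragraphs(paragraphs, method_text, min_similarity=0.2):
    """Drop 0-vote paragraphs, then keep those similar enough to the method."""
    return [
        p for p in paragraphs
        if p["votes"] > 0 and cosine(p["text"], method_text) >= min_similarity
    ]


# Hypothetical paragraphs and method vocabulary, for illustration only.
method_text = "split index segments destDir"
candidates = [
    {"text": "split creates an index with wrong segments", "votes": 3},
    {"text": "thanks for the quick reply", "votes": 5},
    {"text": "split index segments", "votes": 0},
]

kept = filter_paragraphs(candidates, method_text)
```

In practice the threshold would need tuning against manually validated descriptions.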

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


Page 11: Supporting Newcomers in Open Source Software Development Projects

11

Data Extraction

Recommenders1) Recommend Mentors

Studies

2) Supporting Source Code Comprehension and Re-documentation

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries

Perform Development Tasks Mentoring

Newcomer Training Process

12

1) Recommend Mentors

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) investigate how newcomers browse artifacts software d) investigate how newcomers generate source code summaries3) Analyze developers c) social activity d) technical activity

Newcomer Training Process

13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

15

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

16

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III presents recommenders to support concretely project newcomers

Thesis Structure

17

bull PART I analyzing data from software repositories to support team work

bull rs developers and support the team work

bull PART II analyzing how developers use software artifacts to help newcomers in program comprehension task Si

bull on ta

bull sk

bull PART III developing recommenders to support concretely project newcomers

Thesis Structure

18

PART I

Analysis of Developersrsquo

Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGEAND TECHINCAL SKILLS

New FeaturesBugs fixing

Emerging Teams in Open Source Projects

httpscodegooglecompgource

>

21

Socio-Technical Congruence in Developers Social

Networks

Bird et al - FSE 2008

22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT: Hibernate

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming

"however planning a pure standalone test suite would make things easier"

2) Planning (e.g., Testing activities)

"okay I think it is a bug and I'm going to create a jira first"

3) Open an Issue

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat and Mail to Identify Leaders

Figure: bar chart of the Precision in Recommending Leaders (axis 0-90) obtained from MAIL, ISSUE and CHAT data, per project.

37

Analysis of the Evolution of Teams: Why

1) To Better Understand the Reasons Behind the Teams' Reorganization (split/merge of developers' teams)

2) To Investigate whether Emerging Teams Evolve with the Aim of Working on More Cohesive Groups of Files, and Then Support Refactoring/Re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Analysis of the Evolution of Teams: How

Teams Identification from Emergent Collaborations, by using FUZZY CLUSTER ALGORITHMS

Teams identified in R1 and in R2 are compared: TEAMS SPLIT, TEAMS MERGE

Sub-Systems that Developers are Working on: Sub-system one, Sub-system two

Structural Perspective: Modularization Quality (MQ), Mancoridis et al.

Conceptual Perspective: CCBC, Poshyvanyk et al.
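Once teams are available for two consecutive snapshots, split and merge events can be detected by comparing team memberships. The following is only an illustrative sketch: the membership-share threshold `min_share`, the function names, and the toy teams are assumptions, and the study's own fuzzy clustering and matching criteria may differ.

```python
def share(a, b):
    """Fraction of the members of team a who also belong to team b."""
    return len(a & b) / len(a) if a else 0.0


def classify_evolution(r1, r2, min_share=0.3):
    """Label each R1 team and list the R2 teams that result from a merge.

    r1, r2: dicts mapping a team name to the set of its developers.
    A team may share members with other teams, as fuzzy clustering allows.
    """
    events = {}
    for name, members in r1.items():
        # R2 teams that inherit a relevant share of this team's members.
        heirs = [n for n, m in r2.items() if share(members, m) >= min_share]
        if not heirs:
            events[name] = "disappeared"
        elif len(heirs) > 1:
            events[name] = "split"
        else:
            events[name] = "survived"
    # An R2 team is a merge when it draws relevant shares from several R1 teams.
    merged = [n for n, m in r2.items()
              if sum(share(m, old) >= min_share for old in r1.values()) > 1]
    return events, merged


# Hypothetical teams in two snapshots.
r1 = {"T1": {"a", "b", "c", "d"}, "T2": {"e", "f"}, "T3": {"g", "h"}, "T4": {"x"}}
r2 = {"U1": {"a", "b"}, "U2": {"c", "d"}, "U3": {"e", "f", "g", "h"}}

events, merged = classify_evolution(r1, r2)
print(events)
print(merged)
```

In this sketch a team that flows entirely into a merged team is still labeled "survived"; the separate `merged` list is what reveals the merge.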

45

Case Study

Systems characteristics: Period of Time and Releases Considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered: 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011

Releases Considered

20220224

2212241

3032343642

3436556972

233020302535040

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

TEAMS SPLIT: MQ, CCBC

TEAMS MERGED: MQ, CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand Software Artifacts

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor Students, 18 Master Students, 4 PhD Students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

Figure: share of time spent on each kind of artifact (chart labels: 72%, 13%, 10%, 3%, 2%).

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
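The pattern notation reads naturally as regular expressions over the letters S, D, U and J. A minimal sketch, assuming each participant's browsing before reaching source code is encoded as a string of those letters; the function name `classify` and the sample sequences are hypothetical:

```python
import re

# Patterns named on the slides, used directly as regular expressions.
PATTERNS = ["S", "D", "J", "U",
            "(SD)+", "(US)+", "(DS)+", "U(SD)+", "S(US)+", "SU(SD)+"]


def classify(sequence):
    """Return the first pattern that matches the whole sequence, else 'Other'."""
    for pattern in PATTERNS:
        if re.fullmatch(pattern, sequence):
            return pattern
    return "Other"


# Hypothetical browsing sequences.
for seq in ["S", "SDSD", "USD", "SUSD", "JS"]:
    print(seq, "->", classify(seq))
```

`re.fullmatch` is used so that a sequence is assigned to a pattern only when the pattern accounts for every artifact visited, not just a prefix.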

63

Most Frequent Navigation Patterns Before Reaching Source Code

Figure: bar chart (axis 0-20) of the frequency of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) for Graduate and Undergraduate participants.

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "Integrated approach"

66

Transition Graph between Kinds of Software Artifacts

Figure: transition graph whose nodes include Source Code, Sequence Diagram and Javadoc.

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
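A transition graph of this kind can be derived from the ordered list of artifacts each participant opened. A minimal sketch, assuming browsing sessions are available as lists of artifact names; the sessions below are invented for illustration, and whether the original edge labels are percentages is an assumption:

```python
from collections import Counter


def transition_counts(sessions):
    """Count how often each artifact kind is followed by another one."""
    counts = Counter()
    for visited in sessions:
        # Pair every visited artifact with the one opened right after it.
        counts.update(zip(visited, visited[1:]))
    return counts


def transition_percentages(counts):
    """Express each edge as a percentage of the transitions leaving its source."""
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): 100.0 * n / totals[src] for (src, dst), n in counts.items()}


# Hypothetical browsing sessions.
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code"],
    ["Source Code", "Sequence Diagram", "Source Code"],
    ["Source Code", "Javadoc"],
]
counts = transition_counts(sessions)
pct = transition_percentages(counts)
for edge, value in sorted(pct.items()):
    print(edge, value)
```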

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java Class

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
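The oracle rule can be illustrated on the three example labelings from the slide. A minimal sketch; the function name `build_oracle` and its `min_share` parameter are hypothetical names introduced here:

```python
from collections import Counter


def build_oracle(labelings, min_share=0.5):
    """Return {term: count} for terms chosen by at least min_share of the subjects."""
    # Each subject counts at most once per term.
    counts = Counter(term for terms in labelings for term in set(terms))
    threshold = min_share * len(labelings)
    return {term: n for term, n in counts.items() if n >= threshold}


# The three example labelings shown on the slide.
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]

oracle = build_oracle(labelings)
# Most frequently chosen terms first, ties in alphabetical order.
print(sorted(oracle.items(), key=lambda kv: (-kv[1], kv[0])))
```

With three subjects the 50% threshold keeps every term chosen by at least two of them, which reproduces the term counts listed on the slide.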

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies: PART I and PART II

Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

• Small Projects: finding mentors is a trivial problem

• Large Projects: finding mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects: YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF the following factors, observed over Time from When Alice joins the project, support it:

F1: Exchanged emails

F2: amount of emails

F3: project age

F4: newcomer early emails

F5: Commits

Score Computed Aggregating the Factors in a Weighted Sum:

$$\sum_{i=1}^{5} w_i \cdot f_i$$
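The weighted-sum aggregation of the five factors can be sketched as follows. The weights, the factor values (assumed here to be normalized to the range 0-1), and the names `mentor_score` and `rank_mentors` are illustrative assumptions, not the values or code used by YODA:

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates, weights):
    """Order candidate mentors by decreasing score."""
    return sorted(candidates,
                  key=lambda name: mentor_score(candidates[name], weights),
                  reverse=True)


# Hypothetical weights for F1..F5 and hypothetical factor values per candidate.
weights = [0.3, 0.2, 0.1, 0.2, 0.2]
candidates = {
    "bob": [1.0, 0.2, 1.0, 0.1, 0.3],
    "jim": [1.0, 0.8, 1.0, 0.5, 0.9],
}

print(rank_mentors(candidates, weights))
```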

93

Recommending Mentors

Timeline: Project Developers, Past Project Developers and Past Mentors over Time; the Newcomer joins at time t

Goal: Mentor with Adequate Skills

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

Figure: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python and Samba (precision axis 0-100). Data labels: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.

109

Results When Both Mails and Issues are Used

Figure: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python and Samba (precision axis 0-100). Data labels: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat and Mail Sources to Recommend Mentors

Figure: bar chart of the Precision in Recommending Mentors (axis 0-90) obtained from MAIL, ISSUE and CHAT data, per project.

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:

the source code itself

source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understanding source code

Newcomer: can find source code descriptions

Example (CLASS: IndexSplitter, METHOD: split):

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based Filtering

• Step 5: Similarity-based Filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
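Steps 2 and 3 can be illustrated with a small sketch. The heuristics used here (paragraphs separated by blank lines; a paragraph is traced onto a method when it mentions the class name and the method name followed by an opening parenthesis) are simplifying assumptions for illustration, and the function names are hypothetical; the published approach may use different rules.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def traces_to_method(paragraph, class_name, method_name):
    """Step 3 (sketch): the paragraph names the class and shows the method as a call."""
    has_class = re.search(r"\b" + re.escape(class_name) + r"\b", paragraph) is not None
    has_call = re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph) is not None
    return has_class and has_call


# Hypothetical message built around the example paragraph from the slides.
message = (
    "Hi all,\n\n"
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with segments descriptor file with wrong data.\n\n"
    "Thanks"
)

paragraphs = extract_paragraphs(message)
traced = [p for p in paragraphs if traces_to_method(p, "IndexSplitter", "split")]
print(len(paragraphs), len(traced))
```

Steps 4 and 5 would then filter the traced paragraphs further, which this sketch does not attempt.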

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

131

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134


StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based Filtering

• Step 5: Similarity-based Filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

147

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and then support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using fuzzy clustering algorithms

[Figure: teams identified in R1 and in R2, with examples of TEAMS SPLIT and TEAMS MERGE]

Sub-systems on which developers are working: sub-system one, sub-system two

44

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms on R1 and R2

Sub-systems on which developers are working: sub-system one, sub-system two

Structural perspective: Modularization Quality (MQ), Mancoridis et al.

Conceptual perspective: CCBC, Poshyvanyk et al.

45

Systems characteristics: Period of Time and Releases Considered

                    Apache HTTP       Eclipse JDT       Netbeans          Samba
Period considered   09/1998-03/2012   01/2002-12/2011   01/2001-08/2012   01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

Case Study

• Goal: analyze data from mailing lists, issue trackers, and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

47

How do Emerging Collaborations Change across Software Releases?

Teams merge in a new release: in 20-35% of the cases

Teams split in a new release: in 15-35% of the cases

Teams disappeared: 22-45%

Teams survived: 50-70%
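One way to picture how merge, split, survival, and disappearance could be counted between two releases is sketched below. It is only an illustration under stated assumptions: teams are plain (non-fuzzy) sets of developer names, and the `min_share` threshold, the function name `classify_team_changes`, and the classification rules are hypothetical, not the procedure used in the study.

```python
from typing import Dict, List, Set

Teams = Dict[str, Set[str]]


def classify_team_changes(r1: Teams, r2: Teams, min_share: float = 0.3) -> Dict[str, List[str]]:
    """Label each (non-empty) team of release R1 by what happens to it in R2."""
    # Successors of an R1 team: R2 teams that keep at least min_share of its members.
    successors = {
        old: [new for new, members in r2.items()
              if len(devs & members) / len(devs) >= min_share]
        for old, devs in r1.items()
    }
    events: Dict[str, List[str]] = {"split": [], "merged": [], "survived": [], "disappeared": []}
    for old, succ in successors.items():
        if not succ:
            events["disappeared"].append(old)
        elif len(succ) > 1:
            events["split"].append(old)
        else:
            # One successor: merged if other R1 teams flow into the same R2 team.
            sharing = [o for o, s in successors.items() if succ[0] in s]
            events["merged" if len(sharing) > 1 else "survived"].append(old)
    return events


# Hypothetical teams, for illustration only.
r1 = {"A": {"a", "b", "c", "d"}, "B": {"e", "f"}, "C": {"g", "h"}, "D": {"i"}, "E": {"j", "k"}}
r2 = {"X": {"a", "b"}, "Y": {"c", "d"}, "Z": {"e", "f", "g", "h"}, "W": {"j", "k", "l"}}
events = classify_team_changes(r1, r2)
```

With fuzzy clustering a developer can belong to several teams at once, so a faithful version would work on membership degrees instead of crisp sets.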

51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by emerging teams, one panel for TEAMS SPLIT and one for TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers’ Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

56

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How software documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: chart of the time participants spent on each kind of artifact]

60

How Much Time did Participants Spend on Different Kinds of Artifacts?

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

How Much Time did Participants Spend on Different Kinds of Artifacts?

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
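The pattern notation above reads naturally as regular expressions. The sketch below assumes that a participant's browsing session before reaching source code has already been encoded as a string over the letters S, D, U, and J; that encoding and the function name `classify_navigation` are assumptions made for illustration.

```python
import re

# Patterns named on the slide, written as regular expressions over S, D, U, J.
PATTERNS = {
    "S": re.compile(r"S"),
    "D": re.compile(r"D"),
    "U": re.compile(r"U"),
    "J": re.compile(r"J"),
    "(SD)+": re.compile(r"(SD)+"),
    "(US)+": re.compile(r"(US)+"),
    "(DS)+": re.compile(r"(DS)+"),
    "U(SD)+": re.compile(r"U(SD)+"),
}


def classify_navigation(sequence: str) -> str:
    """Return the name of the pattern that matches the whole sequence, or 'Other'."""
    for name, pattern in PATTERNS.items():
        if pattern.fullmatch(sequence):
            return name
    return "Other"
```

Because `fullmatch` requires the entire sequence to match, the patterns do not overlap and the order of the dictionary does not affect the result.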

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of the number of occurrences of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) for Graduate and Undergraduate participants; x-axis from 0 to 20]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more “integrated approach”

67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among Source Code, Sequence Diagram, and Javadoc, with transition frequencies on the edges]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
  17 Bachelor students in CS (Univ. of Molise, second year)
  21 Master students in CS (University of Salerno)

Example of labels given to a Java class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
  17 Bachelor students in CS (Univ. of Molise, second year)
  21 Master students in CS (University of Salerno)

Example of three label sets given to the same Java class:

  book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
  book, hotel, room, refund, arrival, check, parking, double, suite, group
  confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term frequencies: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
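The oracle construction can be sketched in a few lines, using the three label sets from the slide as input. The function name `build_oracle` and the `threshold` parameter are illustrative choices, not names from the study.

```python
from collections import Counter
from typing import Iterable, List, Set


def build_oracle(label_sets: List[Iterable[str]], threshold: float = 0.5) -> Set[str]:
    """Keep the terms chosen by at least `threshold` of the subjects."""
    # Each subject counts at most once per term.
    counts = Counter(term for labels in label_sets for term in set(labels))
    needed = threshold * len(label_sets)
    return {term for term, count in counts.items() if count >= needed}


label_sets = [
    ["book", "hotel", "room", "reservation", "arrival", "departure",
     "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival", "check",
     "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(label_sets)
```

With three subjects, the 50% threshold keeps exactly the terms chosen by two or more of them, which matches the term frequencies listed on the slide.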

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match the mental model of newcomers very well when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when describing source code elements

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers’ Communication to Improve Newcomers’ Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: ArnetMiner

YODA

Jim is the mentor of Alice IF the following factors support it:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

[Figure: timeline of the interactions between Jim and Alice, with the markers “When Alice was a Student” and “When Alice joins the project”]

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$$

where $f_1, \ldots, f_5$ are the factors F1 to F5 and $w_i$ are their weights
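A weighted sum of this kind takes only a few lines to compute. The sketch below is illustrative: the function names, the candidate names, the factor values, and the weights are all hypothetical, and the way YODA actually measures or normalizes each factor is not shown on the slides.

```python
from typing import Dict, List, Sequence


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates: Dict[str, Sequence[float]],
                    weights: Sequence[float]) -> List[str]:
    """Order candidate mentors from the highest to the lowest score."""
    return sorted(candidates,
                  key=lambda name: mentor_score(candidates[name], weights),
                  reverse=True)


# Hypothetical factor values (F1..F5) and equal weights, for illustration only.
candidates = {"bob": [1, 2, 2, 0, 1], "jim": [3, 10, 2, 1, 5]}
weights = [1, 1, 1, 1, 1]
ranking = rank_candidates(candidates, weights)
```

In practice the factors would need a common scale before being combined, otherwise the factor with the largest raw values dominates the score.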

93

Recommending Mentors

[Figure: timeline with the labels Project Developers, Past Project Developers, Past Mentors, Newcomer (at time t), and Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart “Mentor Recommendations: Precision on Top 1 and Top 2” for Apache, FreeBSD, PostgreSQL, Python, and Samba; precision axis from 0 to 100; data labels: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]
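The slide does not spell out how precision on the Top 1 and Top 2 recommendations is computed. One plausible reading, sketched below as an assumption, counts a recommendation as correct when at least one of the first k recommended developers actually mentored the newcomer. The function name and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def precision_at_k(recommendations: Dict[str, List[str]],
                   actual_mentors: Dict[str, Set[str]],
                   k: int) -> float:
    """Share of newcomers with at least one actual mentor among their top-k recommendations."""
    if not recommendations:
        return 0.0
    hits = sum(
        1 for newcomer, ranked in recommendations.items()
        if set(ranked[:k]) & actual_mentors.get(newcomer, set())
    )
    return hits / len(recommendations)


# Hypothetical ranked recommendations and ground truth, for illustration only.
recommendations = {"alice": ["jim", "bob"], "carl": ["bob", "ann"]}
actual_mentors = {"alice": {"bob"}, "carl": {"zoe"}}
```

Other definitions, such as the fraction of correct names within each top-k list, would give different numbers for the same data.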

109

Results When Both Mails and Issues are Used

[Figure: bar chart “Mentor Recommendations: Precision on Top 1 and Top 2” for Apache, FreeBSD, PostgreSQL, Python, and Samba; precision axis from 0 to 100; data labels: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

[Figure: bar chart “Mentor Recommendations: Precision on Top 1 and Top 2” for Apache, FreeBSD, PostgreSQL, Python, and Samba when both mails and issues are used]

YODA makes it possible to recommend mentors

111

Use Issue, Chat, and Mail sources to recommend Mentors

[Figure: bar charts of the precision in recommending mentors from MAIL, ISSUE, and CHAT data, one panel per project (legible panel labels: Apache Lucene, Samba); x-axis from 0 to 90]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers’ Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

Newcomer: can find source code descriptions

Example (CLASS: IndexSplitter, METHOD: split):

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index”

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
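A rough sketch of Steps 2, 3, and 5 follows. It rests on several assumptions: paragraphs are separated by blank lines, a paragraph is traced onto a method when the method name appears followed by an opening parenthesis, and similarity is a plain cosine over word counts. Step 4 is left out because the slides do not list the heuristics. All function names and the `min_similarity` value are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def extract_paragraphs(message: str) -> List[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3: trace a paragraph onto a method via 'name(' occurrences."""
    return re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph) is not None


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(re.findall(r"[a-z]+", text_a.lower()))
    b = Counter(re.findall(r"[a-z]+", text_b.lower()))
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def candidate_descriptions(message: str, method_name: str, method_text: str,
                           min_similarity: float = 0.1) -> List[str]:
    """Step 5: keep traced paragraphs that are similar enough to the method's own text."""
    return [p for p in extract_paragraphs(message)
            if mentions_method(p, method_name)
            and cosine_similarity(p, method_text) >= min_similarity]


# Hypothetical message and method text, for illustration only.
message = ("The method split(File destDir, String[] segs) creates an index "
           "with wrong segment data.\n\nThanks for the report, I will look into it.")
found = candidate_descriptions(message, "split", "split index segments destDir segs")
```

Here `method_text` stands for whatever words are taken from the method itself, such as its signature and identifiers; how that text is built is a further assumption.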

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

133

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers’ discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders’ performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible, reducing the percentage of false positives, and including a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


13

1) Recommend Mentors

3) Analyze developers c) social activity d) technical activity

Studies

3) Recommend Refactoring

2) Supporting Source Code Comprehension and Re-documentation

Perform Development Tasks Mentoring

Data Extraction

Recommenders

Team Collaborations

2) Analyze Software Artifacts c) generate source code summaries d) support maintenance tasks

Newcomer Training Process

14

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers’ Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New features, bug fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers’ Social Networks

Bird et al., FSE 2008

22

How Do Developers’ Collaboration Networks Identified from Different Sources Differ?

IRC CHAT LOG
VERSIONING SYSTEM
ISSUE TRACKER
MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers’ Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Do Developers’ Collaboration Networks Identified from Different Sources Differ?

27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%
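The different values for “ISSUE and MAIL” and “MAIL and ISSUE” suggest an asymmetric measure, normalized by the first source. The sketch below follows that reading, which is an assumption; the function name and the developer names are hypothetical.

```python
from typing import Set


def overlap(source_a: Set[str], source_b: Set[str]) -> float:
    """Share of the developers active in source_a who are also active in source_b."""
    if not source_a:
        return 0.0
    return len(source_a & source_b) / len(source_a)


# Hypothetical developer sets, for illustration only.
issue_devs = {"ann", "bob", "carl", "dora"}
mail_devs = {"ann", "bob"}
```

With these sets, `overlap(issue_devs, mail_devs)` and `overlap(mail_devs, issue_devs)` differ because the denominator changes with the source taken as reference.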

28

Overlap of Developers’ Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

“but we also need to create the attributes and values in the entity binding”

30

PROJECT: Hibernate

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

1) Brainstorming

“however planning a pure standalone test suite would make things easier”

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

Teams merge in a new release: in 20-35% of the cases

Teams split in a new release: in 15-35% of the cases

Teams disappeared: 22-45%

Teams survived: 50-70%
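The four outcomes above (teams that merge, split, disappear, or survive from one release to the next) can be illustrated with a small sketch. It is not the procedure used in the study: it assumes each team is a plain set of developer names (the fuzzy clustering mentioned earlier can place a developer in more than one team), and the function names and the `min_share` threshold are hypothetical.

```python
def successors(team, next_teams, min_share):
    """Names of next-release teams that contain at least min_share of `team`."""
    return [name for name, members in next_teams.items()
            if len(team & members) / len(team) >= min_share]


def classify_evolution(r1_teams, r2_teams, min_share=0.3):
    """Label every R1 team as survived, split, merged, or disappeared.

    r1_teams / r2_teams map a team name to a non-empty set of developers.
    min_share is an assumed threshold, not a value taken from the slides.
    """
    events = {}
    incoming = {name: [] for name in r2_teams}
    for name, members in r1_teams.items():
        succ = successors(members, r2_teams, min_share)
        for target in succ:
            incoming[target].append(name)
        if not succ:
            events[name] = "disappeared"
        elif len(succ) > 1:
            events[name] = "split"
        else:
            events[name] = "survived"
    # An R2 team fed by several R1 teams marks those teams as merged
    # (teams already labeled as split keep that label).
    for sources in incoming.values():
        if len(sources) > 1:
            for source in sources:
                if events[source] == "survived":
                    events[source] = "merged"
    return events


r1 = {
    "A": {"ann", "bob", "cat", "dan"},
    "B": {"eve", "fay"},
    "C": {"gus", "hal"},
    "D": {"ivy", "jon"},
    "E": {"kim", "lee"},
}
r2 = {
    "X": {"ann", "bob"},
    "Y": {"cat", "dan"},
    "Z": {"eve", "fay", "gus", "hal"},
    "W": {"kim", "lee", "max"},
}
print(classify_evolution(r1, r2))
```

Counting the labels per pair of consecutive releases would give percentages comparable in spirit to the ranges reported on the slide.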

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

51

[Plots, built up over slides 49-51: MQ and CCBC of the files changed by emerging teams, shown separately for teams that split and teams that merged]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary: Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

56

Two Empirical Studies Aimed at Understanding:

PART II - Experiment A: how such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: what code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Pie chart of the time spent on each kind of artifact; extracted slice values: 72, 13, 10, 3, 2]

60

Undergraduate students used Source Code and Javadoc significantly more than graduate students

61

Graduate students used Class Diagrams significantly more than undergraduates

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram

D = Class Diagram

U = Use Case

J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram

(US)+ = Use Case before Sequence Diagram

(DS)+ = Class Diagram before Sequence Diagram

U(SD)+ = Use Case before (SD)+
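The pattern names above read like regular expressions over the artifact letters S, D, U, and J. Under that reading, a navigation trace recorded before the participant reaches source code can be classified with a few lines of Python. This is only a sketch: the encoding of a trace as a string of letters and the `classify` helper are assumptions made for illustration, not part of the original study tooling.

```python
import re

# Pattern names taken from the slides, interpreted as regular expressions.
PATTERNS = ["S", "D", "J", "U",
            "(SD)+", "(US)+", "(DS)+",
            "U(SD)+", "S(US)+", "SU(SD)+"]


def classify(trace):
    """Return the first pattern that matches the whole trace, else 'Other'.

    trace is a string such as "USD": Use Case, then Sequence Diagram,
    then Class Diagram, all visited before opening the source code.
    """
    for pattern in PATTERNS:
        if re.fullmatch(pattern, trace):
            return pattern
    return "Other"


for trace in ["S", "SDSD", "USD", "SUS", "JS"]:
    print(trace, "->", classify(trace))
```

`re.fullmatch` is used so that a trace only counts as an instance of a pattern when the whole sequence matches, not just a prefix.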

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart, repeated on slides 63-65, comparing graduate and undergraduate participants; axis ticks from 0 to 20]

Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other

Extracted bar values, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4

Extracted bar values, second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

65

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Graph, shown on slides 66-67: nodes are kinds of artifacts (legible labels: Source Code, Sequence Diagram, Javadoc) connected by edges with numeric labels]

67

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
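A transition graph of this kind can be derived by counting how often one kind of artifact is opened right after another. The sketch below assumes each participant session is available as an ordered list of artifact kinds; the session data and helper names are hypothetical and serve only to show the counting step.

```python
from collections import Counter


def transition_counts(sessions):
    """Count (from_kind, to_kind) pairs over consecutive artifact visits."""
    counts = Counter()
    for visited in sessions:
        counts.update(zip(visited, visited[1:]))
    return counts


def most_likely_next(counts, source):
    """Artifact kind most often opened right after `source`, or None."""
    candidates = {dst: n for (src, dst), n in counts.items() if src == source}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical sessions, for illustration only.
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code",
     "Sequence Diagram", "Source Code"],
    ["Class Diagram", "Source Code", "Class Diagram"],
]
counts = transition_counts(sessions)
print(counts[("Sequence Diagram", "Source Code")])
print(most_likely_next(counts, "Use Case"))
```

Dividing each count by the total number of transitions leaving the same node would turn the edge weights into transition frequencies.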

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

70

Example (Java class): terms chosen by three subjects

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group

Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term frequencies: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
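The oracle rule on this slide (keep the terms selected by at least 50% of the subjects) can be reproduced on the hotel-booking example. The sketch below is illustrative: the function name `build_oracle` and the `min_share` parameter are assumptions, while the three term lists are the ones shown on the slide.

```python
from collections import Counter


def build_oracle(labelings, min_share=0.5):
    """Terms chosen by at least min_share of the subjects."""
    # set(terms) ensures a subject counts at most once per term.
    counts = Counter(term for terms in labelings for term in set(terms))
    threshold = min_share * len(labelings)
    return {term for term, n in counts.items() if n >= threshold}


labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]
print(sorted(build_oracle(labelings)))
```

With three subjects the threshold is 1.5, so the oracle contains exactly the eight terms that the slide lists with a frequency of 2 or 3.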

71

Experiment B: Comparison of Different Labeling Techniques

74

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

[Diagram: Data Extraction (Software Repositories), then Empirical Studies (PART I and PART II), then Recommenders (PART III)]

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous work: Dagenais et al., ICSE 2010

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

[Diagram, built up over several slides: whether Jim is the mentor of Alice is decided from factors F1 to F5, placed on a timeline that starts when Alice joins the project]

F1: exchanged emails

F2: amount of emails

F3: project age

F4: newcomer early emails

F5: commits

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$\sum_{i=1}^{5} w_i \cdot f_i$
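The weighted-sum aggregation of the five factors can be sketched as follows. The slides do not give the weights or the scale of the factors, so the equal weights, the factor values normalized to [0, 1], and the helper names below are all assumptions chosen for illustration.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates, weights):
    """Candidate names sorted by decreasing score."""
    return sorted(candidates,
                  key=lambda name: mentor_score(candidates[name], weights),
                  reverse=True)


# Hypothetical values for F1..F5 (exchanged emails, amount of emails,
# project age, newcomer early emails, commits), normalized to [0, 1].
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "Jim": [1.0, 0.8, 1.0, 0.5, 0.6],
    "Bob": [0.2, 0.1, 1.0, 0.0, 0.9],
}
print(rank_mentors(candidates, weights))
```

Changing the weights shifts the emphasis between communication-based factors (F1, F2, F4) and activity-based ones (F3, F5).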

93

Recommending Mentors

[Timeline diagram, built up over slides 93-105: a newcomer joins the project at time t; past mentors are identified among the past project developers, and a mentor with adequate skills is recommended to the newcomer]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart, repeated on slides 106-108: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; extracted bar values, in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Bar chart, repeated on slides 109-110: Mentor Recommendations Precision on Top 1 and Top 2 for the same five projects; extracted bar values, in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors to Project Newcomers

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Bar chart, repeated on slides 111-112: Precision in Recommending Mentors from MAIL, ISSUE, and CHAT data for four projects (legible labels: Apache Lucene, Samba); axis ticks from 0 to 90; extracted bar values, in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

[Screenshots of the YODA tool, slides 116-124]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from the source code itself or from source code descriptions in external artifacts

128

Example of a source code description that a newcomer can find in developers' communication:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index"

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
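Steps 2 and 3 of the approach can be illustrated with a minimal sketch. It assumes a message is plain text with paragraphs separated by blank lines, and that a paragraph refers to a method when it contains the method name followed by an opening parenthesis or qualified by the class name. These matching rules and the function names are assumptions for illustration; the heuristics of the published approach are not reproduced here.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): map each method name to the paragraphs mentioning it."""
    traced = {}
    for paragraph in paragraphs:
        for method in method_names:
            # Matches "split(" or "IndexSplitter.split".
            pattern = (rf"\b{re.escape(method)}\s*\("
                       rf"|\b{re.escape(class_name)}\.{re.escape(method)}\b")
            if re.search(pattern, paragraph):
                traced.setdefault(method, []).append(paragraph)
    return traced


# Hypothetical message, loosely modeled on the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with a wrong segments descriptor file.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "main"])
print(traced)
```

The paragraphs collected this way would then go through the heuristic-based and similarity-based filters of steps 4 and 5.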

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from mailing lists, issue trackers, and Q&A sites

131

Approach: Precision vs Number of Methods Covered

[Plot, repeated on slides 131-133]

We mine useful descriptions of Java methods from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Screenshots of the CODES tool, slides 139-147]

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing its precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


Page 14: Supporting Newcomers in Open Source Software Development Projects

14

Thesis Structure

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts, to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers' Communication

19

Emerging Teams in Open Source Projects

[Figure: Team 1, Team 2, ..., Team n share knowledge and technical skills while working on new features and bug fixing]

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How do Developers' Collaboration Networks Identified from Different Sources Differ?

Sources: IRC chat log, versioning system, issue tracker, mailing list

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example: Hibernate OSS Project

24

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%; ISSUE and MAIL: 56%

MAIL and CHAT: <50%; MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

ISSUE and CHAT: <26%; ISSUE and MAIL: 38%

MAIL and CHAT: <20%; MAIL and ISSUE: 30%

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate
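The overlap figures differ depending on which source is named first (ISSUE and MAIL versus MAIL and ISSUE), which suggests a directional measure. The slides do not define it, so the sketch below assumes one plausible reading: the percentage of developers found in the first source who also appear in the second. The function name and the sample data are hypothetical.

```python
def overlap_share(source_a, source_b):
    """Percentage of developers in source_a who also appear in source_b."""
    if not source_a:
        return 0.0
    return 100.0 * len(source_a & source_b) / len(source_a)


# Hypothetical developer sets mined from two communication channels.
issue_developers = {"ann", "bob", "cat", "dan"}
mail_developers = {"ann", "bob", "eve"}

print(overlap_share(issue_developers, mail_developers))
print(round(overlap_share(mail_developers, issue_developers), 1))
```

The same function applies to social links if each link is represented as a pair of developer names, for example a `frozenset` of two names.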

29

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., testing activities)

"however planning a pure standalone test suite would make things easier"

3) Open an issue

"okay I think it is a bug and I'm going to create a jira first"

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat, and Mail to Identify Leaders

[Bar chart, repeated on slides 35-36: Precision in Recommending Leaders from MAIL, ISSUE, and CHAT data for four projects (legible labels: Apache Lucene, Samba); axis ticks from 0 to 90; extracted bar values, in extraction order: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developer teams)

2) Investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, then support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Undergraduate Students

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

Graduate Students

Graduate students used Class Diagrams significantly more than Undergraduate students

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+

Complex Navigation Patterns
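The pattern notation above reads naturally as regular expressions over the artifact letters. The following is a minimal sketch, assuming each browsing session before reaching source code is encoded as a string such as "USD"; the `PATTERNS` table and the `classify` function are hypothetical names used only for illustration, not part of the original study tooling.

```python
import re

# Pattern names follow the slide legend; encoding them as regular
# expressions over the letters S, D, U, J is an assumption of this sketch.
PATTERNS = {
    "S": r"S",
    "D": r"D",
    "J": r"J",
    "U": r"U",
    "(SD)+": r"(SD)+",
    "(US)+": r"(US)+",
    "(DS)+": r"(DS)+",
    "U(SD)+": r"U(SD)+",
    "S(US)+": r"S(US)+",
    "SU(SD)+": r"SU(SD)+",
}


def classify(trace):
    """Return the name of the pattern that matches the whole trace."""
    for name, regex in PATTERNS.items():
        # fullmatch requires the entire trace to fit the pattern
        if re.fullmatch(regex, trace):
            return name
    return "Other"


if __name__ == "__main__":
    for trace in ["S", "SDSD", "USD", "SUS", "JS"]:
        print(trace, "->", classify(trace))
```

Using `re.fullmatch` rather than `re.search` keeps the categories mutually exclusive: a trace is counted under a pattern only if the whole sequence fits it, and anything else falls into "Other".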

63

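[Figure: bar chart of navigation pattern frequencies for Graduate and Undergraduate participants. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Bar values as extracted, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc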

Most Frequent Navigation Patterns Before Reaching Source Code

64

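[Figure: bar chart of navigation pattern frequencies for Graduate and Undergraduate participants, x-axis from 0 to 20. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Bar values as extracted, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc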

Most Frequent Navigation Patterns Before Reaching Source Code

65

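[Figure: bar chart of navigation pattern frequencies for Graduate and Undergraduate participants, x-axis from 0 to 20. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Bar values as extracted, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

More experienced participants use a more "integrated approach"

Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc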

Most Frequent Navigation Patterns Before Reaching Source Code

66

[Figure: transition graph; recoverable nodes: Source Code, Sequence Diagram, Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

Transition Graph between Kinds of Software Artifacts

67

[Figure: transition graph; recoverable nodes: Source Code, Sequence Diagram, Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

Transition Graph between Kinds of Software Artifacts
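A transition graph of this kind can be derived from logged browsing sessions by counting consecutive pairs of artifact kinds. The sketch below assumes each session is a list of artifact names in the order they were opened; the function names and the sample session are hypothetical and only illustrate the counting, not the instrumentation used in the experiment.

```python
from collections import Counter


def transition_counts(session):
    """Count how often each artifact kind is followed by another one."""
    return Counter(zip(session, session[1:]))


def transition_percentages(counts):
    """Express each edge as a percentage of its source node's outgoing edges."""
    totals = Counter()
    for (source, _target), n in counts.items():
        totals[source] += n
    return {
        (source, target): 100 * n / totals[source]
        for (source, target), n in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical session, not taken from the experiment data
    session = ["Use Case", "Sequence Diagram", "Source Code",
               "Sequence Diagram", "Source Code"]
    for edge, pct in transition_percentages(transition_counts(session)).items():
        print(edge, f"{pct:.0f}%")
```

Normalizing per source node makes the edges leaving each artifact sum to 100, which is one common way to label such graphs; whether the slide uses this normalization is not stated.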

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

[Illustration: a source code artifact labeled with the terms word1 to word6]

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

[Illustration: a Java class labeled with the terms book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast]

70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Labels from three subjects for the same class:

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
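The oracle construction on this slide amounts to counting, for each term, how many subjects chose it and keeping the terms that reach the 50% threshold. A minimal sketch follows, using the three example labelings from the slide; the function name `build_oracle` and its `threshold` parameter are assumptions made for illustration.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Keep the terms chosen by at least `threshold` of the subjects."""
    # set() ensures a subject counts at most once per term
    counts = Counter(term for labels in labelings for term in set(labels))
    min_votes = threshold * len(labelings)
    return {term for term, votes in counts.items() if votes >= min_votes}


if __name__ == "__main__":
    labelings = [
        "book hotel room reservation arrival departure smoking double card breakfast".split(),
        "book hotel room refund arrival check parking double suite group".split(),
        "confirmation room reservation arrival departure date bed card payment spa".split(),
    ]
    print(sorted(build_oracle(labelings)))
```

With three subjects, the 50% threshold means a term needs at least two votes, which reproduces the eight terms listed in the count line of the slide.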

71

Experiment B Context: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014


74

Experiment B Context: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects


83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Jim is the mentor of Alice IF the following factors hold

[Figure: timeline of the interactions between Jim and Alice; annotations: "When Alice was a Student", "When Alice joins the project"]

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

$score = \sum_{i=1}^{5} w_i \cdot f_i$

where $f_1, \dots, f_5$ are the factors F1 to F5 and $w_i$ is the weight of factor $f_i$
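The weighted-sum aggregation can be sketched in a few lines. In the sketch below, the weights and the factor values are invented for illustration and are assumed to be normalized to the range 0 to 1; the function names `mentor_score` and `rank_candidates` are hypothetical and do not come from the YODA tool.

```python
def mentor_score(factors, weights):
    """Weighted sum of the five factors F1..F5 for one candidate pair."""
    if len(factors) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Sort candidate mentors by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical equal weights and normalized factor values (F1..F5)
    weights = [0.2, 0.2, 0.2, 0.2, 0.2]
    candidates = {
        "jim": [0.9, 0.8, 1.0, 0.7, 0.6],
        "bob": [0.1, 0.5, 1.0, 0.0, 0.2],
    }
    print(rank_candidates(candidates, weights))
```

Keeping the weights as an explicit parameter makes it easy to try different calibrations; how the weights were actually chosen is not stated on the slide.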

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

97

Timet

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

98

Timet

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors


106

[Figure: mentor recommendation precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision, 0 to 100. Bar labels as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

Is it Possible to Recommend Mentors To Project Newcomers?


109

[Figure: mentor recommendation precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision, 0 to 100. Bar labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

Results When Both Mails and Issues Are Used

110

[Figure: mentor recommendation precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision, 0 to 100. Bar labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA makes it possible to recommend mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat, and Mail sources to recommend Mentors

[Figure: precision in recommending mentors using MAIL, ISSUE, and CHAT data, per project (recoverable project labels: Apache, Lucene, Samba); x-axis from 0 to 90. Bar values as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]


113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five Step-Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
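Steps 2, 3, and 5 of the approach can be illustrated with a small, self-contained sketch. Everything below is an assumption made for illustration: paragraphs are split at blank lines, a paragraph is traced onto a method when it names the class and contains a call-like mention of the method, and similarity is a plain bag-of-words cosine against the method's text. The downloading step and the filtering heuristics of Step 4 are not specified on the slide and are left out.

```python
import math
import re
from collections import Counter


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs at blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def traces_to_method(paragraph, class_name, method_name):
    """Step 3: the paragraph names the class and mentions method_name(."""
    has_class = re.search(rf"\b{re.escape(class_name)}\b", paragraph)
    has_call = re.search(rf"\b{re.escape(method_name)}\s*\(", paragraph)
    return bool(has_class and has_call)


def cosine_similarity(text_a, text_b):
    """Step 5: bag-of-words cosine similarity between two texts."""
    vec_a = Counter(re.findall(r"[a-z]+", text_a.lower()))
    vec_b = Counter(re.findall(r"[a-z]+", text_b.lower()))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a)
    norm = math.sqrt(sum(v * v for v in vec_a.values())) * \
        math.sqrt(sum(v * v for v in vec_b.values()))
    return dot / norm if norm else 0.0


def mine_descriptions(message, class_name, method_name, method_text,
                      min_similarity=0.1):
    """Return the paragraphs that survive tracing and similarity filtering."""
    return [
        p for p in extract_paragraphs(message)
        if traces_to_method(p, class_name, method_name)
        and cosine_similarity(p, method_text) >= min_similarity
    ]


if __name__ == "__main__":
    message = (
        "When calling IndexSplitter.split(File destDir, String[] segs) it "
        "creates an index with a segments descriptor file containing wrong data.\n\n"
        "Thanks,\nBob"
    )
    method_text = "void split(File destDir, String[] segs) creates index segments"
    print(mine_descriptions(message, "IndexSplitter", "split", method_text))
```

The `min_similarity` threshold of 0.1 is arbitrary; in practice it would trade precision against the number of methods covered.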

130

Help newcomers' program comprehension by extracting summaries of code elements from:

Newcomer

Supporting Software Development

Q&A SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful Java method descriptions from developers' discussions

134

Help newcomers' program comprehension by extracting summaries of code elements from:

Newcomer

Q&A SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development


136

StackOverflow

137

StackOverflow

138

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

CODES: Approach for Mining Method Descriptions

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing its precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III


15

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

Thesis Structure

16

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: presents recommenders to concretely support project newcomers

Thesis Structure

17

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

Thesis Structure

18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features, Bug Fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

24

Developers Overlap between Different Sources

25

Developers Overlap between Different Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap between Different Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%

27

Developers Overlap between Different Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%

MAIL and CHAT: <50%
MAIL and ISSUE: 86%
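The figures for "ISSUE and MAIL" and "MAIL and ISSUE" differ, which suggests a directional overlap: the share of developers seen in the first source who also appear in the second. The sketch below works under that assumption; the function name and the developer identifiers are hypothetical, and the exact definition used in the study may differ.

```python
def overlap_percentage(source_a, source_b):
    """Percentage of developers in source_a who also appear in source_b."""
    if not source_a:
        return 0.0
    return 100 * len(source_a & source_b) / len(source_a)


if __name__ == "__main__":
    # Hypothetical developer identifiers, not taken from the studied projects
    issue_devs = {"ann", "bob", "cat", "dan"}
    mail_devs = {"ann", "bob", "eve"}
    print("ISSUE and MAIL:", overlap_percentage(issue_devs, mail_devs))
    print("MAIL and ISSUE:", overlap_percentage(mail_devs, issue_devs))
```

Because the denominator is the size of the first set, swapping the arguments changes the result whenever the two sources have different numbers of participants.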

28

Overlap of Developers' Social Links

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%

MAIL and CHAT: <20%
MAIL and ISSUE: 30%

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

"but we also need to create the attributes and values in the entity binding"

30

PROJECT: Hibernate

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

1) Brainstorming

"however planning a pure standalone test suite would make things easier"

During an IRC Chat Meeting

31

PROJECT: Hibernate

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

"however planning a pure standalone test suite would make things easier"

1) Brainstorming
2) Planning (e.g., testing activities)

During an IRC Chat Meeting

32

PROJECT: Hibernate

"okay I think it is a bug and I'm going to create a jira first"

"however planning a pure standalone test suite would make things easier"

1) Brainstorming
2) Planning (e.g., testing activities)
3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

34

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

[Figure: precision in recommending leaders using MAIL, ISSUE, and CHAT data, per project (recoverable project labels: Apache, Lucene, Samba); x-axis from 0 to 90. Bar values as extracted: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80]

Use Issue, Chat, and Mail to Identify Leaders


37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, which can support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How?

39

By using fuzzy clustering algorithms

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How?

40

[Figure: teams identified in releases R1 and R2]

By using fuzzy clustering algorithms

Analysis of the Evolution of Teams: How?

41

TEAMS SPLIT

TEAMS MERGE

[Figure: teams identified in releases R1 and R2]

By using fuzzy clustering algorithms

Analysis of the Evolution of Teams: How?

42

[Figure: teams identified in releases R1 and R2]

By using fuzzy clustering algorithms

Analysis of the Evolution of Teams: How?

43

[Figure: teams identified in releases R1 and R2]

By using fuzzy clustering algorithms

Sub-system one, Sub-system two, Sub-systems two

Sub-systems that developers are working on

Analysis of the Evolution of Teams: How?

44

[Figure: teams identified in releases R1 and R2]

By using fuzzy clustering algorithms

Sub-system one, Sub-system two, Sub-systems two

Sub-systems that developers are working on

Structural perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual perspective: Poshyvanyk et al., CCBC

Analysis of the Evolution of Teams: How?

45

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered (in the same order): 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011

Releases considered (as extracted):

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics: period of time and releases considered

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

46

Teams Merge in a New Release

- in 20-35% of the cases

Teams Split in a New Release

- in 15-35% of the cases

How do Emerging Collaborations Change across Software Releases?

47

Teams Merge in a New Release

- in 20-35% of the cases

Teams Split in a New Release

- in 15-35% of the cases

How do Emerging Collaborations Change across Software Releases?

Teams Disappeared

22-45%

Teams Survived

50-70%
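The four outcomes on this slide (merged, split, disappeared, survived) can be illustrated by comparing team membership across two releases. The sketch below rests on an assumed definition: a team splits when its members end up in more than one later team, merges when its only successor also receives members from another earlier team, disappears when none of its members remain, and survives otherwise. The study itself derives teams with fuzzy clustering, which this sketch does not reproduce; team and developer names are hypothetical.

```python
def successors(members, other_release):
    """Names of the teams in the other release sharing at least one member."""
    return [name for name, team in other_release.items() if members & team]


def classify_evolution(teams_r1, teams_r2):
    """Label each R1 team as split, merged, survived, or disappeared."""
    result = {}
    for name, members in teams_r1.items():
        next_teams = successors(members, teams_r2)
        if not next_teams:
            result[name] = "disappeared"
        elif len(next_teams) > 1:
            result[name] = "split"
        else:
            # One successor: merged if other R1 teams also feed into it
            sources = successors(teams_r2[next_teams[0]], teams_r1)
            result[name] = "merged" if len(sources) > 1 else "survived"
    return result


if __name__ == "__main__":
    r1 = {"A": {"ann", "bob", "cat"}, "B": {"dan", "eve"},
          "C": {"fay"}, "D": {"gus"}, "E": {"hal"}}
    r2 = {"X": {"ann"}, "Y": {"bob", "cat"},
          "Z": {"dan", "eve", "fay"}, "W": {"gus"}}
    print(classify_evolution(r1, r2))
```

A stricter variant could require a minimum share of common members before two teams count as related, which would reduce spurious links caused by a single developer moving between teams.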

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
17 Bachelor students in CS (Univ. of Molise, second year)
21 Master students in CS (University of Salerno)

Java Class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
17 Bachelor students in CS (Univ. of Molise, second year)
21 Master students in CS (University of Salerno)

Labels chosen by three subjects for the same class:
book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
book, hotel, room, refund, arrival, check, parking, double, suite, group
confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
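The oracle construction can be reproduced on the three example label lists shown on the slide. This is a minimal sketch; the function name `build_oracle` and the `threshold` parameter are illustrative, and only the 50% cut-off comes from the slide.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for labels in labelings for term in set(labels))
    needed = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)
```

With three subjects, the 50% threshold keeps every term chosen by at least two of them, which yields the eight terms listed in the term counts.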

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.


74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects


83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF (factors observed over Time, starting when Alice joins the project):

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails (1st emails)
F5: commits

Identify Past Successful Mentors

Score computed aggregating the factors in a weighted sum:

$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$
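The weighted-sum aggregation of the five factors can be sketched as follows. The weights, the normalized factor values, and the second candidate name are hypothetical and chosen only for illustration; the slides do not give the actual weights used by YODA.

```python
def mentor_score(factors, weights):
    """Aggregate factor values f_1..f_n into one score: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical equal weights for F1..F5.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]

# Hypothetical factor values (F1..F5), each assumed normalized to [0, 1].
candidates = {
    "Jim": [0.9, 0.8, 1.0, 0.7, 0.5],
    "Bob": [0.2, 0.1, 0.6, 0.0, 0.9],
}

# Rank candidate mentors for a newcomer by decreasing score.
ranking = sorted(
    candidates,
    key=lambda name: mentor_score(candidates[name], weights),
    reverse=True,
)
```

Keeping the weights as an explicit argument makes it easy to try different weightings of the factors without touching the scoring logic.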

93

Recommending Mentors

[Timeline figure, built up across slides 93 to 105: a Newcomer joins the project at time t. Labels shown along the time axis: Project Developers, Past Project Developers, Past Mentors, Mentor with Adequate Skills.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.85    0.81
FreeBSD      0.30    0.24
PostgreSQL   1.00    1.00
Python       0.64    0.77
Samba        0.94    0.82

Is it Possible to Recommend Mentors to Project Newcomers?


109

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.80    0.72
FreeBSD      0.67    0.45
PostgreSQL   0.90    0.89
Python       0.91    0.84
Samba        0.92    0.76

Results When Both Mails and Issues are Used

110

YODA makes it possible to recommend mentors

It is Possible to Recommend Mentors to Project Newcomers

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Bar chart: precision in recommending mentors from MAIL, ISSUE, and CHAT data, per project (labels include Apache Lucene and Samba); axis from 0 to 90. Bar labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]


113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html


116

YODA Tool


125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from:
the source code itself
source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

Mining Summaries

Example of a source code description found in developers' communication:

When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with segments descriptor file with wrong data. Namely, wrong is the number representing the name of segment that would be created next in this index.

CLASS: IndexSplitter
METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining Source Code Descriptions from Developer Communications. International Conference on Program Comprehension (IEEE ICPC 2012).
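Steps 2 and 3 of the approach can be illustrated with a small sketch. It assumes paragraphs are separated by blank lines and that a paragraph refers to a method when it contains "Class.method" or "method("; both rules, and the function names, are simplifying assumptions rather than the heuristics of the published approach. Steps 4 and 5 are not covered.

```python
import re


def extract_paragraphs(message: str) -> list[str]:
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    parts = re.split(r"\n\s*\n", message)
    # Collapse internal whitespace and drop empty fragments.
    return [" ".join(p.split()) for p in parts if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): map each method name to the paragraphs that mention it."""
    traced = {}
    for method in method_names:
        pattern = re.compile(
            rf"\b{re.escape(class_name)}\.{re.escape(method)}\b"
            rf"|\b{re.escape(method)}\s*\("
        )
        hits = [p for p in paragraphs if pattern.search(p)]
        if hits:
            traced[method] = hits
    return traced


# Hypothetical message built around the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs)\n"
    "it creates an index with wrong data.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "otherMethod"])
```

Here `otherMethod` is a made-up name included only to show that methods without any matching paragraph are left out of the result.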

130

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

131

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions


136

StackOverflow


138

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

CODES: Approach for Mining Method Descriptions

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: Mining Source Code Descriptions from Developers Discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html


148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



16

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts, to help newcomers in program comprehension tasks

• PART III: presenting recommenders to concretely support project newcomers

Thesis Structure

17

• PART I: analyzing data from software repositories to support team work

• PART II: analyzing how developers use software artifacts, to help newcomers in program comprehension tasks

• PART III: developing recommenders to concretely support project newcomers

Thesis Structure

18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New features / Bug fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How do Developers' Collaboration Networks Identified from Different Sources Differ?

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014).

23

Example Hibernate OSS Project

How do Developers' Collaboration Networks Identified from Different Sources Differ?

24

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%

MAIL and CHAT: <50%
MAIL and ISSUE: 86%
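The different values for "ISSUE and MAIL" and "MAIL and ISSUE" suggest a directional measure. The sketch below assumes the overlap of source A with source B is the share of A's developers who also appear in B; this definition and the developer names are assumptions made for illustration, since the slides do not spell out the formula.

```python
def overlap(source_a, source_b):
    """Percentage of developers in source_a who also appear in source_b."""
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0
    return 100 * len(a & b) / len(a)


# Hypothetical developer identities per communication channel.
issue_devs = {"alice", "bob", "carol", "dave"}
mail_devs = {"alice", "bob", "erin"}

issue_and_mail = overlap(issue_devs, mail_devs)  # share of ISSUE developers also on MAIL
mail_and_issue = overlap(mail_devs, issue_devs)  # share of MAIL developers also on ISSUE
```

Because the denominator changes with the direction, the two values generally differ, as they do on the slide.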

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%

MAIL and CHAT: <20%
MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming
"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., testing activities)
"however planning a pure standalone test suite would make things easier"

3) Open an issue
"okay I think it is a bug and I'm going to create a jira first"

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project         issues vs mails   issues vs chat   mails vs chat
Apache Httpd    0.17              0.09             0.06
Apache Lucene   0.08              0.03             0.02
Hibernate       0.11              0.02             0.03
Samba           0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat, and Mail to Identify Leaders

[Bar chart: precision in recommending leaders from MAIL, ISSUE, and CHAT data, per project (labels include Apache Lucene and Samba); axis from 0 to 90. Bar labels in extraction order: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80.]


37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developers' teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, so as to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the Evolution of Emerging Collaborations Relates to Code Changes: An Empirical Study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Analysis of the Evolution of Teams: How?

Teams identification from emergent collaborations, by using fuzzy clustering algorithms

[Figure: teams identified in release R1 and in release R2; between the two releases, TEAMS SPLIT and TEAMS MERGE]

Sub-systems that developers are working on (Sub-system one, Sub-system two):

Structural perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual perspective: Poshyvanyk et al., CCBC
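How team changes between two releases might be labeled can be sketched with plain set overlap. This is only an illustration under strong assumptions: teams are treated as crisp sets of developer names (the study uses fuzzy clustering), and the rules for "split", "merged", "survived", and "disappeared" are hypothetical definitions, not the ones used in the thesis.

```python
def team_changes(teams_r1, teams_r2):
    """Label every R1 team by what happened to its members in R2."""
    changes = {}
    for name, members in teams_r1.items():
        # R2 teams that share at least one developer with this R1 team.
        successors = [n for n, m in teams_r2.items() if members & m]
        if not successors:
            changes[name] = "disappeared"
        elif len(successors) > 1:
            changes[name] = "split"
        else:
            target = teams_r2[successors[0]]
            # R1 teams that feed the same R2 team.
            sources = [n for n, m in teams_r1.items() if m & target]
            changes[name] = "merged" if len(sources) > 1 else "survived"
    return changes


# Hypothetical teams in two consecutive releases.
r1 = {
    "A": {"ann", "bob", "cat"},
    "B": {"dan", "eve"},
    "C": {"fay"},
    "D": {"gus", "hal"},
    "E": {"ivy"},
}
r2 = {
    "X": {"ann", "bob"},
    "Y": {"cat"},
    "Z": {"dan", "eve", "fay"},
    "W": {"gus", "hal", "new"},
}
changes = team_changes(r1, r2)
```

Counting the labels over all teams of a release gives proportions comparable in spirit to the merge, split, disappeared, and survived percentages reported in the case study.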

45

Systems characteristics: Period of Time and Releases Considered

                    Apache HTTP       Eclipse JDT       Netbeans          Samba
Period considered   09/1998-03/2012   01/2002-12/2011   01/2001-08/2012   01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reasons behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

Teams merge in a new release: in 20-35% of the cases

Teams split in a new release: in 15-35% of the cases

Teams disappeared: 22-45%

Teams survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams that split and by teams that merged]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding:

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

[Figure: a source code artifact labeled with word1, word2, word3, word4, word5, word6]

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor students, 18 Master students, 4 PhD students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Pie chart: proportion of time spent on each kind of artifact]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed by Developers Before Reaching Source Code

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers’ Communication to Improve Newcomers’ Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA: Jim is the mentor of Alice IF the following factors support it

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

[Animated figure; labels legible in the extraction: Time, When Alice joins the project, When Alice was a Student]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum:

$\sum_{i=1}^{5} w_i \cdot f_i$
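The weighted sum of the five factors can be illustrated with a minimal Python sketch. The class name `PairFactors`, the function names, the equal weights and the assumption that each factor is already normalized to [0, 1] are all hypothetical and chosen only for illustration; the slides give neither the weights nor the normalization.

```python
from dataclasses import dataclass


@dataclass
class PairFactors:
    """Factor values f1..f5 for one (candidate mentor, newcomer) pair.

    Assumption: every value is already normalized to [0, 1].
    """
    f1_exchanged_emails: float
    f2_amount_of_emails: float
    f3_project_age: float
    f4_newcomer_early_emails: float
    f5_commits: float

    def as_list(self):
        return [
            self.f1_exchanged_emails,
            self.f2_amount_of_emails,
            self.f3_project_age,
            self.f4_newcomer_early_emails,
            self.f5_commits,
        ]


def mentor_score(factors, weights):
    """Weighted sum: sum over i of w_i * f_i."""
    values = factors.as_list()
    if len(values) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, values))


def rank_candidates(candidates, weights):
    """Return candidate names sorted by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical example: equal weights, two candidate mentors for one newcomer.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "jim": PairFactors(1.0, 0.8, 0.5, 0.9, 0.7),
    "bob": PairFactors(0.2, 0.1, 0.5, 0.0, 0.3),
}
ranking = rank_candidates(candidates, weights)
```

With these illustrative values the candidate with the larger factor values is ranked first; in practice the weights would have to be calibrated on project data.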

93

Recommending Mentors

[Animated figure; labels legible in the extraction: Time axis, Project Developers, Past Project Developers, Past Mentors, Newcomer joining at time t, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba. Data labels as extracted, with the bar-to-project mapping lost: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python and Samba. Data labels as extracted, with the bar-to-project mapping lost: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA makes it possible to recommend mentors to project newcomers

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors per source (MAIL, ISSUE, CHAT); legible project labels: Apache, Lucene, Samba; x-axis 0 to 90. Data labels as extracted, with the bar mapping lost: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool


125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers’ Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

Newcomer: can find source code descriptions

Example (CLASS: IndexSplitter, METHOD: split):

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index.”

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering
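The five steps above can be sketched end to end in Python. This is only an illustration under stated assumptions, not the authors' implementation: tracing is approximated by whole-word name matching, the Step 4 keyword list is hypothetical, and Step 5 uses cosine similarity over bags of words with an arbitrary threshold. All function names are made up for the example, and messages are assumed to be already downloaded as plain strings.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Split text into lowercase words, also splitting camelCase identifiers."""
    return [t.lower() for t in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", text)]


def split_paragraphs(message):
    """Step 2: paragraphs are assumed to be separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions(text, name):
    """Steps 1 and 3: crude tracing by whole-word match of a class or method name."""
    return re.search(rf"\b{re.escape(name)}\b", text) is not None


def passes_heuristics(paragraph, stems=("return", "call", "creat", "method")):
    """Step 4 (hypothetical heuristic): keep paragraphs using description-like words."""
    return any(word.startswith(stem) for word in tokenize(paragraph) for stem in stems)


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mine_descriptions(messages, class_name, method_name, method_text, threshold=0.1):
    """Return paragraphs that survive all filters for one method."""
    method_terms = Counter(tokenize(method_text))
    kept = []
    for message in messages:
        if not mentions(message, class_name):          # Step 1
            continue
        for paragraph in split_paragraphs(message):    # Step 2
            if not mentions(paragraph, method_name):   # Step 3
                continue
            if not passes_heuristics(paragraph):       # Step 4
                continue
            similarity = cosine(Counter(tokenize(paragraph)), method_terms)
            if similarity >= threshold:                # Step 5
                kept.append(paragraph)
    return kept


messages = [
    "When I call IndexSplitter.split(File destDir, String[] segs) it creates "
    "an index with wrong segment data.\n\nThanks for any help.",
    "Unrelated message about the build.",
]
result = mine_descriptions(
    messages, "IndexSplitter", "split", "split File destDir String segs index segment"
)
```

Here `result` keeps only the paragraph that names the method and shares vocabulary with it; the greeting paragraph and the unrelated message are discarded.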

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).

130

Supporting Software Development

Help newcomers’ program comprehension by extracting summaries of code elements from:
• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131

Approach Precision vs Number of Methods Covered

[Plot not recoverable from the extracted text]

We mine useful Java method descriptions from developers’ discussions


136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Tool screenshots, slides 139 to 147]

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders’ performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives
• Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


Page 17: Supporting Newcomers in Open Source Software Development Projects

17

Thesis Structure

• PART I: analyzing data from software repositories to support team work
• PART II: analyzing how developers use software artifacts, to help newcomers in program comprehension tasks
• PART III: developing recommenders to concretely support project newcomers

18

PART I

Analysis of Developers’ Communication

19

Emerging Teams in Open Source Projects

Team 1, Team 2, …, Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features, Bug Fixing

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers’ Social Networks

Bird et al., FSE 2008

22

How Developers’ Collaboration Networks Identified from Different Sources Differ

Sources: IRC CHAT LOG, VERSIONING SYSTEM, ISSUE TRACKER, MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers’ Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014).

Example: Hibernate OSS Project

24

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%        ISSUE and MAIL: 56%
MAIL and CHAT: <50%         MAIL and ISSUE: 86%
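The values for "ISSUE and MAIL" and "MAIL and ISSUE" differ, which suggests a directional overlap measure. One way such a measure could be computed is sketched below in Python, assuming the overlap of A with B is the percentage of people in A who also appear in B. This definition, the function name and the developer names are assumptions for illustration; the slides do not state the formula.

```python
def overlap(source_a, source_b):
    """Percentage of developers in source_a who also appear in source_b."""
    if not source_a:
        return 0.0
    return 100.0 * len(source_a & source_b) / len(source_a)


# Hypothetical developer identities found in two communication channels.
issue_people = {"ann", "bob", "carl", "dana"}
mail_people = {"ann", "bob", "eve"}

issue_and_mail = overlap(issue_people, mail_people)  # share of issue people also on mail
mail_and_issue = overlap(mail_people, issue_people)  # share of mail people also on issues
```

Because the denominator changes with the first argument, the two directions generally give different percentages, as in the figures reported on the slide.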

28

Overlap of Developers’ Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%        ISSUE and MAIL: 38%
MAIL and CHAT: <20%         MAIL and ISSUE: 30%

29

During an IRC Chat Meeting (PROJECT: Hibernate)

1) Brainstorming: “is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

2) Planning (e.g., testing activities): “however planning a pure standalone test suite would make things easier”

3) Open an Issue: “okay I think it is a bug and I’m going to create a jira first”

Further excerpt: “but we also need to create the attributes and values in the entity binding”

33

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders per source (MAIL, ISSUE, CHAT); legible project labels: Apache, Lucene, Samba; x-axis 0 to 90. Data labels as extracted, with the bar mapping lost: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, which can support refactoring/re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Analysis of the Evolution of Teams: How?

Teams identification from emergent collaborations, using fuzzy clustering algorithms, on consecutive releases R1 and R2:
• TEAMS SPLIT
• TEAMS MERGE

Sub-systems on which developers are working (Sub-system one, Sub-system two), analyzed from two perspectives:
• Structural perspective: Modularization Quality (MQ), Mancoridis et al.
• Conceptual perspective: CCBC, Poshyvanyk et al.

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reasons behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba
Periods considered (in extracted order): 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011
Releases considered (digits as extracted; version separators lost): 20220224, 2212241, 3032343642, 3436556972, 233020302535040

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases
Teams Split in a New Release: in 15-35% of the cases
Teams Disappeared: 22-45%
Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Plots of MQ and CCBC for TEAMS SPLIT and TEAMS MERGED; values not recoverable from the extracted text]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary: Analysis of Developers’ Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)
• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Chart data labels as extracted: 121, 121, 667, 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart; data labels as extracted: 72, 131032]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students.

Graduate students used Class Diagrams significantly more than Undergraduates.

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
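The pattern notation reads naturally as regular expressions over the artifact letters. The Python sketch below classifies a navigation session under that reading. It assumes a session is encoded as a string with one letter per artifact visited before reaching source code; the encoding, the function name and the sample sessions are assumptions made for illustration. The labels S(US)+ and SU(SD)+ are taken from the chart categories elsewhere in the deck.

```python
import re

# Pattern labels as they appear in the slides; each label is read as a regex
# over the alphabet S (Sequence Diagram), D (Class Diagram), U (Use Case),
# J (Javadoc).
PATTERNS = [
    "S", "D", "U", "J",                  # simple patterns
    "(SD)+", "(US)+", "(DS)+", "U(SD)+",  # complex patterns
    "S(US)+", "SU(SD)+",
]


def classify(sequence):
    """Return the first pattern label that matches the whole sequence."""
    for label in PATTERNS:
        if re.fullmatch(label, sequence):
            return label
    return "Other"


# Hypothetical sessions, one letter per artifact visited in order.
sessions = ["S", "SDSD", "USD", "DU"]
labels = [classify(s) for s in sessions]
```

`re.fullmatch` is used so that a label only applies when the entire session follows the pattern; anything else falls into the "Other" bucket, mirroring the chart categories.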

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart, Graduate vs Undergraduate, x-axis 0 to 20. Categories: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Data labels as extracted, with the series assignment lost: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more “integrated approach”

66

Transition Graph between Kinds of Software Artifacts

[Graph; node labels legible in the extraction: Source Code, Sequence Diagram, Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
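The oracle construction can be reproduced on the hotel-reservation example from the slide with a short Python sketch. The function name and the `min_fraction` parameter are illustrative choices; the three term lists are the ones shown on the slide.

```python
from collections import Counter


def build_oracle(term_lists, min_fraction=0.5):
    """Return the terms chosen by at least min_fraction of the subjects."""
    # Each subject counts at most once per term, hence set(terms).
    counts = Counter(term for terms in term_lists for term in set(terms))
    needed = min_fraction * len(term_lists)
    return {term for term, count in counts.items() if count >= needed}


subject_terms = [
    ["book", "hotel", "room", "reservation", "arrival", "departure",
     "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival", "check",
     "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(subject_terms)
```

With three subjects, the 50% threshold means a term must be chosen by at least two of them, which yields the eight terms listed with counts 2 or 3 on the slide.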

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart (slides 106-108): Mentor Recommendations Precision on Top 1 and Top 2. x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision, 0-100. Data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues Are Used

[Bar chart (slides 109-110): Mentor Recommendations Precision on Top 1 and Top 2. x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision, 0-100. Data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA makes it possible to recommend mentors

It is Possible to Recommend Mentors to Project Newcomers

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Bar chart (slides 111-112): Precision in Recommending Mentors, per project (legible labels: Apache Lucene, Samba), series MAIL, ISSUE, and CHAT; axis: 0-90. Data labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

[Slides 114-115: YODA Architecture (diagrams)]

[Slides 116-124: YODA Tool (screenshots)]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

A newcomer can find source code descriptions in such messages, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
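Steps 2 to 5 of the approach above can be sketched as a small pipeline. The slides do not give the concrete rules, so everything specific here is an assumption: the tracing rule (a paragraph mentions `method_name(`), the Step 4 heuristic (a minimum word count), the similarity measure (cosine over term frequencies), and both thresholds.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphabetic tokens of a text."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, method_name):
    """Step 3 (assumed rule): the paragraph contains `method_name(`."""
    return re.search(rf"\b{re.escape(method_name)}\s*\(", paragraph) is not None


def mine_descriptions(message, method_name, method_text, min_words=5, min_similarity=0.1):
    """Return the paragraphs of `message` kept as candidate descriptions of the method."""
    kept = []
    for paragraph in extract_paragraphs(message):
        if not mentions_method(paragraph, method_name):
            continue
        if len(tokens(paragraph)) < min_words:  # Step 4 (assumed heuristic)
            continue
        if cosine_similarity(paragraph, method_text) >= min_similarity:  # Step 5
            kept.append(paragraph)
    return kept


message = (
    "When calling split(File destDir, String[] segs) it creates an index "
    "with a segments descriptor file.\n\n"
    "Thanks for the report, I will look into it."
)
# `method_text` stands for the words of the method's signature and body (hypothetical input).
print(mine_descriptions(message, "split", "split destDir segs index segments"))
```

Step 1 (downloading messages and tracing them onto classes) is left out because it depends on the repository and the tracker being mined.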

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site

• Issue tracker

• Mailing list

131

Approach: Precision vs. Number of Methods Covered

[Chart, slides 131-133]

We mine useful Java method descriptions from developers' discussions

136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
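The vote-based discard in Step 3 can be illustrated as follows. This is a minimal sketch: the dictionary layout with `text` and `votes` keys is a hypothetical representation of a traced paragraph, not the format used by the tool, and only discussions with exactly 0 votes are dropped, as the step states.

```python
def discard_unvoted(paragraphs):
    """Keep only paragraphs whose discussion has a non-zero vote count."""
    return [p for p in paragraphs if p["votes"] != 0]


traced = [
    {"text": "split() writes one index per segment list.", "votes": 4},
    {"text": "Not sure, maybe it copies files?", "votes": 0},
]
print([p["text"] for p in discard_unvoted(traced)])
```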

139

CODES Tool

httpwwwingunisannioitspanichellapagesprojectshtml

[Slides 140-147: CODES Tool (screenshots)]

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by raising precision and coverage as high as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


Page 18: Supporting Newcomers in Open Source Software Development Projects

18

PART I

Analysis of Developers' Communication

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features / Bug Fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al. - FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example: Hibernate OSS Project

24

Developers Overlap between Different Sources

ISSUE and CHAT: <35%

ISSUE and MAIL: 56%

MAIL and CHAT: <50%

MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

ISSUE and CHAT: <26%

ISSUE and MAIL: 38%

MAIL and CHAT: <20%

MAIL and ISSUE: 30%

Projects considered: Apache Httpd, Apache Lucene, Samba, Hibernate
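The overlap figures above can be read as directional percentages, which would explain why "ISSUE and MAIL" and "MAIL and ISSUE" differ. The sketch below assumes that reading: the overlap of A with B is the share of A's developers (or links) that also appear in B. The slides do not state the formula, and the names are made up.

```python
def overlap_percentage(source_a, source_b):
    """Percentage of the items of source_a that also appear in source_b."""
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0
    return 100.0 * len(a & b) / len(a)


def link(dev_1, dev_2):
    """An undirected social link between two developers."""
    return frozenset((dev_1, dev_2))


# Made-up developers seen in each source.
issue_devs = {"ann", "bob", "carl", "dora"}
mail_devs = {"ann", "bob", "eve"}
print(overlap_percentage(issue_devs, mail_devs))  # share of issue developers also on the mailing list
print(overlap_percentage(mail_devs, issue_devs))  # share of mailing-list developers also in the tracker

# The same function applies to social links.
issue_links = {link("ann", "bob"), link("bob", "carl")}
mail_links = {link("bob", "ann")}
print(overlap_percentage(issue_links, mail_links))
```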

29

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., testing activities)

"however planning a pure standalone test suite would make things easier"

3) Open an Issue

"okay I think it is a bug and I'm going to create a jira first"

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat, and Mail to Identify Leaders

[Bar chart (slides 35-36): Precision in Recommending Leaders, per project (legible labels: Apache Lucene, Samba), series MAIL, ISSUE, and CHAT; axis: 0-90. Data labels in extraction order: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, then support refactoring/re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using fuzzy clustering algorithms

[Diagram, built up across slides 38-44, with the labels: R1, R2, TEAMS SPLIT, TEAMS MERGE, Sub-system one, Sub-system two, Sub-systems that developers are working on]

• Structural perspective: Mancoridis et al., Modularization Quality (MQ)

• Conceptual perspective: Poshyvanyk et al., CCBC

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System        Period considered
Apache HTTP   09/1998-03/2012
Eclipse JDT   01/2002-12/2011
Netbeans      01/2001-08/2012
Samba         01/2000-09/2011

Releases considered (digits as extracted): 20220224, 2212241, 3032343642, 3436556972, 233020302535040

46

How do Emerging Collaborations Change across Software Releases?

• Teams merge in a new release: in 20-35% of the cases

• Teams split in a new release: in 15-35% of the cases

• Teams disappeared: 22-45%

• Teams survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Charts (slides 49-51): MQ and CCBC for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I

Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Chart data labels as extracted: 121, 121, 667, 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart (slides 59-61); data labels as extracted: 72, 131032]

• Undergraduate students used Source Code and Javadoc significantly more than Graduate students

• Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
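The pattern notation above is close to regular-expression syntax, so a browsing trace can be classified with `re.fullmatch`. This is a sketch under one assumption: each trace is encoded as a string with one letter per artifact visited before the source code is reached. The patterns S(US)+ and SU(SD)+ are included because they appear as categories in the chart of most frequent patterns.

```python
import re

SIMPLE_PATTERNS = {"S", "D", "U", "J"}

# Complex patterns, written exactly as in the slide notation.
COMPLEX_PATTERNS = ["(SD)+", "(US)+", "(DS)+", "U(SD)+", "S(US)+", "SU(SD)+"]


def classify_trace(trace):
    """Return the navigation pattern matched by a trace, or "Other"."""
    if trace in SIMPLE_PATTERNS:
        return trace
    for pattern in COMPLEX_PATTERNS:
        # The slide notation is already a valid regular expression.
        if re.fullmatch(pattern, trace):
            return pattern
    return "Other"


for trace in ["S", "SDSD", "USD", "SUS", "JS"]:
    print(trace, "->", classify_trace(trace))
```

Using `fullmatch` rather than `search` keeps the categories mutually exclusive: a trace either is one of the patterns from start to end or falls into "Other".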

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart (slides 63-65), series Graduate and Undergraduate; axis: 0-20. Patterns in extraction order: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. First series of data labels: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4. Second series of data labels: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Graph (slides 66-67) with the nodes Source Code, Sequence Diagram, and Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
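A transition graph of this kind can be derived from an ordered log of the artifacts a participant opened. The sketch below is an assumption about how such a graph might be computed, not the procedure used in the study: it counts consecutive pairs of visits and normalises them per source artifact. The visit log is made up.

```python
from collections import Counter


def transition_counts(visits):
    """Count transitions between consecutive artifact kinds in a visit log."""
    return Counter(zip(visits, visits[1:]))


def transition_percentages(visits):
    """For each (source, target) edge, the percentage of the source's outgoing transitions."""
    counts = transition_counts(visits)
    outgoing = Counter()
    for (source, _target), n in counts.items():
        outgoing[source] += n
    return {edge: 100.0 * n / outgoing[edge[0]] for edge, n in counts.items()}


# Made-up visit log of one participant.
visits = [
    "Use Case", "Sequence Diagram", "Source Code",
    "Sequence Diagram", "Source Code", "Javadoc",
]
for (source, target), pct in transition_percentages(visits).items():
    print(f"{source} -> {target}: {pct:.0f}%")
```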

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
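The oracle construction can be reproduced with a term counter, using the three example labelings from the slide. The function name and the `min_fraction` parameter are illustrative choices; the 50% threshold is the one stated above.

```python
from collections import Counter


def build_oracle(labelings, min_fraction=0.5):
    """Return the terms chosen by at least `min_fraction` of the subjects, with their counts."""
    # Each subject counts at most once per term.
    counts = Counter(term for labels in labelings for term in set(labels))
    threshold = min_fraction * len(labelings)
    return {term: n for term, n in counts.items() if n >= threshold}


labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(labelings)
print(sorted(oracle.items(), key=lambda item: (-item[1], item[0])))
```

With three subjects the threshold is 1.5, so a term enters the oracle when at least two subjects chose it.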

71

Experiment B: Comparison of Different Labeling Techniques

[Charts, slides 71-74]

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II

Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

[Diagram: "Jim is the mentor of Alice IF ...", on a timeline marked "When Alice joins the project", introducing factor F1]

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from: Q&A SITE, ISSUE TRACKER, MAILING LIST

Supporting Software Development

131-133

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136-137

StackOverflow

138

CODES Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

139-147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155-161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Path...
  • Newcomer Learning Path... (2)
  • Newcomer Learning Path... (3)
  • Newcomer Learning Path... (4)
  • Newcomer Learning Path... (5)
  • Newcomer Learning Path... (6)
  • Newcomer Learning Path... (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future work...
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

19

Team 1

Team 2

Team n

SHARING KNOWLEDGE AND TECHNICAL SKILLS

New Features / Bug fixing

Emerging Teams in Open Source Projects

https://code.google.com/p/gource

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014).

23

Example Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

24-27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%

ISSUE and MAIL: 56%

MAIL and CHAT: <50%

MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%

ISSUE and MAIL: 38%

MAIL and CHAT: <20%

MAIL and ISSUE: 30%

29-32

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming: "is there a better way? dunno, like I said, this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., testing activities): "however, planning a pure standalone test suite would make things easier"

3) Open an Issue: "okay, I think it is a bug and I'm going to create a jira first"

33-34

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35-36

Use Issue, Chat and Mail to Identify Leaders

[Figure: bar chart of precision in recommending leaders per project (labels recovered: Apache Lucene, Samba); one series per source: MAIL, ISSUE, CHAT; axis from 0 to 90]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, which could then support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38-44

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using fuzzy clustering algorithms

[Figure: teams identified in releases R1 and R2; labels: TEAMS SPLIT, TEAMS MERGE]

Sub-systems the developers are working on: Sub-system one, Sub-system two

Structural perspective: Modularization Quality (MQ), Mancoridis et al.

Conceptual perspective: CCBC, Poshyvanyk et al.

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

                     Apache HTTP       Eclipse JDT       Netbeans          Samba
Period considered    09/1998-03/2012   01/2002-12/2011   01/2001-08/2012   01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

46-47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

48-51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure panels: MQ and CCBC for TEAMS SPLIT; MQ and CCBC for TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand Software Artifacts

54-56

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59-61

How Much Time did Participants Spend on Different Kinds of Artifacts

[Figure: chart of time spent per kind of artifact; recovered value labels: 72, 131032]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
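The pattern notation above reads naturally as regular expressions, so a navigation can be classified by matching it against each pattern in turn. The following is a minimal sketch under two assumptions that are not stated in the slides: a navigation is encoded as a string with one letter per artifact visited before reaching source code, and a navigation belongs to a pattern only if the whole string matches it. The function name `classify` is hypothetical.

```python
import re

# Pattern labels from the slide, each used directly as a regular expression.
PATTERNS = ["S", "D", "U", "J", "(SD)+", "(US)+", "(DS)+", "U(SD)+"]


def classify(navigation):
    """Return the first pattern that matches the whole navigation string."""
    for pattern in PATTERNS:
        if re.fullmatch(pattern, navigation):
            return pattern
    return "Other"


# Hypothetical navigations: letters are the artifacts visited, in order.
examples = ["S", "SDSD", "US", "USD", "SJ"]
labels = [classify(nav) for nav in examples]
```

Navigations that fit none of the listed patterns fall into "Other", mirroring the residual category used when reporting pattern frequencies.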

63-65

Most Frequent Navigation Patterns Before Reaching Source Code

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

[Figure: bar chart, two bars per pattern; legend: Graduate, Undergraduate; axis from 0 to 20; values below are listed in series order as extracted]

Pattern    First series   Second series
S          18             12
D          8              16
(SD)+      2              12
(US)+      2              10
U(SD)+     1              4
(DS)+      1              4
J          3              1
U          0              2
S(US)+     1              1
SU(SD)+    0              1
Other      4              5

More experienced participants use a more "integrated approach"

66-67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph; recovered node labels: Source Code, Sequence Diagram, Javadoc; recovered edge labels: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact

69-70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year); 21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
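The oracle construction on this slide can be reproduced in a few lines. The sketch below counts how many subjects chose each term and keeps the terms chosen by at least half of them. The function name `build_oracle` and the `threshold` parameter are illustrative names, not taken from the study; the term lists are the three shown on the slide.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Terms selected by at least `threshold` of the subjects."""
    # set() ensures a subject counts once per term, even if it repeats.
    counts = Counter(term for terms in labelings for term in set(terms))
    min_votes = threshold * len(labelings)
    return {term for term, votes in counts.items() if votes >= min_votes}


# The three example labelings from the slide, one list per subject.
subjects = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(subjects)
```

With three subjects, "at least 50%" means a term needs two or more votes, which yields the eight terms listed in the term counts above.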

71-74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match the mental model of newcomers very well when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match the mental model of newcomers very well when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81-82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: Commits

[Figure: timeline with the markers "When Alice was a Student" and "When Alice joins the project"]

Identify Past Successful Mentors

Score computed by aggregating the factors F1-F5 in a weighted sum:

\[ \mathit{score} = \sum_{i=1}^{5} w_i \cdot f_i \]
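The weighted-sum aggregation of the five factors can be sketched as follows. This is only an illustration of the formula, not YODA's implementation: the factor values, the equal weights, the candidate names, and the function names `mentor_score` and `rank_candidates` are all hypothetical, and the sketch assumes the factor values are already normalized to comparable ranges.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Candidate names sorted by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical normalized values of F1..F5 for two candidate mentors.
candidates = {
    "jim": [1.0, 0.5, 0.5, 1.0, 0.0],
    "bob": [0.0, 0.5, 1.0, 0.0, 0.5],
}
weights = [1.0, 1.0, 1.0, 1.0, 1.0]  # equal weights, an arbitrary choice
ranking = rank_candidates(candidates, weights)
```

Changing the weights changes which factors dominate the ranking, so in practice they would need to be calibrated on pairs with known mentoring relationships.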

93-105

Recommending Mentors

[Figure: timeline; labels: Time, t, Project Developers, Past Project Developers, Past Mentors, Newcomer, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106-108

Is it Possible to Recommend Mentors To Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2

[Figure: bar chart of precision per project; legend: Top 1, Top 2; axis from 0 to 100; values below are listed in series order as extracted]

Project       First series   Second series
Apache        0.85           0.81
FreeBSD       0.3            0.24
PostgreSQL    1              1
Python        0.64           0.77
Samba         0.94           0.82

109-110

Results When Both Mails and Issues are Used

Mentor Recommendations: Precision on Top 1 and Top 2

[Figure: bar chart of precision per project; legend: Top 1, Top 2; axis from 0 to 100; values below are listed in series order as extracted]

Project       First series   Second series
Apache        0.8            0.72
FreeBSD       0.67           0.45
PostgreSQL    0.9            0.89
Python        0.91           0.84
Samba         0.92           0.76

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111-112

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: bar chart of precision in recommending mentors per project (labels recovered: Apache Lucene, Samba); one series per source: MAIL, ISSUE, CHAT; axis from 0 to 90]


  • Slide 1
  • Newcomer Learning Path...
  • Newcomer Learning Path... (2)
  • Newcomer Learning Path... (3)
  • Newcomer Learning Path... (4)
  • Newcomer Learning Path... (5)
  • Newcomer Learning Path... (6)
  • Newcomer Learning Path... (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 20: Supporting Newcomers in Open Source Software Development Projects

21

Socio-Technical Congruence in Developers' Social Networks

Bird et al., FSE 2008

22

How Developers' Collaboration Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKER

MAILING LIST

Sebastiano Panichella, Gabriele Bavota, Massimiliano Di Penta, Gerardo Canfora, Giuliano Antoniol. How Developers' Collaborations Identified from Different Sources Tell us About Code Changes. The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014).

23

Example: Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

24

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%

ISSUE and MAIL: 56%

MAIL and CHAT: <50%

MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%

ISSUE and MAIL: 38%

MAIL and CHAT: <20%

MAIL and ISSUE: 30%
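The slides do not state how the overlap between two sources is computed. A minimal sketch, assuming that the overlap "A and B" is the share of developers active in source A who also appear in source B (an assumption that would explain why "ISSUE and MAIL" and "MAIL and ISSUE" differ). The function name and the developer names are hypothetical.

```python
def overlap_percentage(source_a, source_b):
    """Percentage of the developers in source_a who also appear in source_b."""
    a, b = set(source_a), set(source_b)
    if not a:
        return 0.0  # no developers in the reference source
    return 100.0 * len(a & b) / len(a)


# Hypothetical developer identities, assumed to be already unified across channels.
issue_devs = {"alice", "bob", "carol", "dave"}
mail_devs = {"alice", "bob", "erin"}

# Share of issue-tracker developers who also write on the mailing list.
print(overlap_percentage(issue_devs, mail_devs))
# Share of mailing-list developers who also use the issue tracker.
print(round(overlap_percentage(mail_devs, issue_devs), 1))
```

The same function could be applied to sets of developer pairs instead of single developers to estimate the overlap of social links.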

29

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming: "is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., testing activities): "however planning a pure standalone test suite would make things easier"

3) Open an Issue: "okay I think it is a bug and I'm going to create a jira first"

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project         issues vs mails   issues vs chat   mails vs chat
Apache Httpd    0.17              0.09             0.06
Apache Lucene   0.08              0.03             0.02
Hibernate       0.11              0.02             0.03
Samba           0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders, by source (MAIL, ISSUE, CHAT), for projects including Apache Lucene and Samba; axis from 0 to 90.]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, which would then support refactoring/re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using FUZZY CLUSTER ALGORITHMS on consecutive releases (R1, R2)

TEAMS SPLIT

TEAMS MERGE

Sub-systems on which developers are working (sub-system one, sub-system two)

Structural perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual perspective: Poshyvanyk et al., CCBC

45

Case Study

Systems characteristics: Period of Time and Releases Considered

System        Period considered
Apache HTTP   09/1998-03/2012
Eclipse JDT   01/2002-12/2011
Netbeans      01/2001-08/2012
Samba         01/2000-09/2011

Releases Considered

20220224

2212241

3032343642

3436556972

233020302535040

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Charts: MQ and CCBC of the files changed by emerging teams, shown for TEAMS SPLIT and for TEAMS MERGED.]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor Students, 18 Master Students, 4 PhD Students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart: share of time spent on each kind of artifact.]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
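The pattern notation above reads naturally as regular expressions over the artifact letters. A minimal sketch, assuming that the artifacts a participant visits before reaching source code are encoded as a string of the letters S, D, U and J (this encoding and the function name are assumptions, not taken from the study). The pattern labels are those used in the slides.

```python
import re

# Pattern labels as used in the slides, each paired with an equivalent regular expression.
PATTERNS = [
    ("S", r"S"), ("D", r"D"), ("U", r"U"), ("J", r"J"),
    ("(SD)+", r"(SD)+"), ("(US)+", r"(US)+"), ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"), ("S(US)+", r"S(US)+"), ("SU(SD)+", r"SU(SD)+"),
]


def classify(trace):
    """Return the label of the first pattern that matches the whole trace, or 'Other'."""
    for label, regex in PATTERNS:
        if re.fullmatch(regex, trace):
            return label
    return "Other"


# Hypothetical traces: each string lists the artifacts opened before the source code.
for trace in ["S", "SDSD", "USD", "SUS", "SDS"]:
    print(trace, "->", classify(trace))
```

Using `re.fullmatch` rather than `re.search` ensures that a trace is assigned to a pattern only when the entire navigation sequence follows it.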

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: number of occurrences (axis from 0 to 20) of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) for Graduate and Undergraduate participants. S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc.]

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Graph: transitions between Source Code, Sequence Diagram, Javadoc and the other kinds of artifacts, with labeled edges.]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only after that do they read and write Source Code
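A transition graph of this kind can be derived from the recorded navigation sequences. A minimal sketch, assuming that each session is a string of artifact letters (with C standing for source code, a letter introduced here for illustration) and that an edge is labeled with the percentage of transitions leaving its source artifact. The slides do not state how their edge labels were computed.

```python
from collections import Counter


def transition_percentages(traces):
    """Map each (source, destination) pair to its share of the transitions leaving source."""
    counts = Counter()
    for trace in traces:
        # Consecutive pairs of visited artifacts are the observed transitions.
        counts.update(zip(trace, trace[1:]))
    outgoing = Counter()
    for (source, _destination), n in counts.items():
        outgoing[source] += n
    return {pair: 100.0 * n / outgoing[pair[0]] for pair, n in counts.items()}


# Hypothetical sessions: U = Use Case, S = Sequence Diagram, C = source code.
graph = transition_percentages(["USC", "UC", "USCS"])
for (source, destination), share in sorted(graph.items()):
    print(source, "->", destination, round(share, 1))
```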

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Labels given to a Java class (one line per subject):

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
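The oracle construction can be reproduced on the hotel example from the slide. A minimal sketch, assuming that each subject contributes one list of terms and that a term enters the oracle when at least half of the subjects chose it. The function name `build_oracle` is hypothetical.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects."""
    # A term counts once per subject, even if a subject lists it twice.
    votes = Counter(term for terms in labelings for term in set(terms))
    min_votes = threshold * len(labelings)
    return {term for term, n in votes.items() if n >= min_votes}


labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]

print(sorted(build_oracle(labelings)))
```

With three subjects, the 50% threshold keeps the terms chosen by at least two of them, which are the eight terms with a count of 2 or 3 listed on the slide.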

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies: PART I and PART II

Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small Projects: finding mentors is a trivial problem

• Large Projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF the following factors, observed over time from when Alice joins the project, support it:

F1: Exchanged emails

F2: Amount of emails

F3: Project age

F4: Newcomer early emails

F5: Commits

Score Computed Aggregating the Factors in a Weighted Sum:

$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$
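The weighted-sum aggregation of the five factors F1 to F5 can be sketched as follows. This is an illustration only: the factor values, the equal weights and the function name `mentor_score` are assumptions, since the slides give neither the weights nor the way each factor is normalized.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical, already normalized values of F1..F5 for two candidate mentors.
candidates = {
    "jim": [1.0, 0.8, 0.5, 0.6, 0.9],
    "bob": [1.0, 0.2, 0.5, 0.1, 0.3],
}
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # assumed equal weights, not taken from the slides

# Rank candidates from the highest to the lowest score.
ranking = sorted(candidates, key=lambda name: mentor_score(candidates[name], weights), reverse=True)
print(ranking)
```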

93

Recommending Mentors

[Timeline diagram: a newcomer joins the project at time t; past mentors are identified among the past project developers, and a mentor with adequate skills is recommended.]

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.85    0.81
FreeBSD      0.3     0.24
PostgreSQL   1       1
Python       0.64    0.77
Samba        0.94    0.82

Results When Both Mails and Issues are Used

Project      Top 1   Top 2
Apache       0.8     0.72
FreeBSD      0.67    0.45
PostgreSQL   0.9     0.89
Python       0.91    0.84
Samba        0.92    0.76

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors, by source (MAIL, ISSUE, CHAT), for projects including Apache Lucene and Samba; axis from 0 to 90.]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool


125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

A newcomer can find source code descriptions such as the following one.

CLASS: IndexSplitter, METHOD: split

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
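Steps 2 and 3 of the approach can be illustrated with a short sketch. It assumes that paragraphs are separated by blank lines and that a paragraph is traced onto a method when it mentions the method name followed by an opening parenthesis, optionally qualified by the class name. These rules and the function names are illustrative assumptions, not the heuristics of the original approach. The sample message is hypothetical.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into non-empty, blank-line separated paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): map each method name to the paragraphs that mention a call to it."""
    traced = {}
    for paragraph in paragraphs:
        for method in method_names:
            # Matches "split(" or "IndexSplitter.split(", but not "unsplit(".
            pattern = r"\b(?:%s\.)?%s\s*\(" % (re.escape(class_name), re.escape(method))
            if re.search(pattern, paragraph):
                traced.setdefault(method, []).append(paragraph)
    return traced


# Hypothetical message already traced onto the class IndexSplitter (Step 1).
message = (
    "Calling IndexSplitter.split(File destDir, String[] segs) creates an index "
    "whose segments descriptor file contains wrong data.\n\n"
    "Thanks for looking into this."
)
paragraphs = extract_paragraphs(message)
print(trace_to_methods(paragraphs, "IndexSplitter", ["split", "listSegments"]))
```

Steps 4 and 5 would then filter the traced paragraphs, first with heuristics and then by textual similarity with the method.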

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from: Q&A SITE, ISSUE TRACKER, MAILING LIST

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible while reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



22

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

IRC CHAT LOG

VERSIONING SYSTEM

ISSUE TRACKERMAILING LIST

Sebastiano Panichella Gabriele Bavota Massimiliano Di Penta Gerardo Canfora Giuliano AntoniolHow Developersrsquo Collaborations Identified from Different Sources Tell us About Code Changes The 30th International Conference on Software Maintenance and Evolution (IEEE ICSME 2014)

23

Example Hibernate OSS Project

How Developers' Collaboration Networks Identified from Different Sources Differ

27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

"but we also need to create the attributes and values in the entity binding"

30

During an IRC Chat Meeting

PROJECT: Hibernate

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

"however planning a pure standalone test suite would make things easier"

1) Brainstorming

31

During an IRC Chat Meeting

PROJECT: Hibernate

"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

"however planning a pure standalone test suite would make things easier"

1) Brainstorming
2) Planning (e.g., Testing activities)

32

During an IRC Chat Meeting

PROJECT: Hibernate

"okay I think it is a bug and I'm going to create a jira first"

"however planning a pure standalone test suite would make things easier"

1) Brainstorming
2) Planning (e.g., Testing activities)
3) Open an Issue

34

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs mails   issues vs chat   mails vs chat
Apache Httpd          0.17              0.09             0.06
Apache Lucene         0.08              0.03             0.02
Hibernate             0.11              0.02             0.03
Samba                 0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

36

Use Issue, Chat and Mail to Identify Leaders

[Figure: bar chart of the precision in recommending leaders (axis 0-90) for the MAIL, ISSUE and CHAT sources, grouped by project]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and thus to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. "How the evolution of emerging collaborations relates to code changes: an empirical study". The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

39

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using fuzzy clustering algorithms

40

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

[Diagram labels: R1, R2]

41

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

[Diagram labels: R1, R2, TEAMS SPLIT, TEAMS MERGE]

43

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

Sub-systems developers are working on

[Diagram labels: R1, R2, Sub-system one, Sub-system two]

44

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

Sub-systems developers are working on

[Diagram labels: R1, R2, Sub-system one, Sub-system two]

Structural perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual perspective: Poshyvanyk et al., CCBC

45

Case Study

• Goal: analyze data from mailing lists, issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System         Period considered
Apache HTTP    09/1998-03/2012
Eclipse JDT    01/2002-12/2011
Netbeans       01/2001-08/2012
Samba          01/2000-09/2011

Releases Considered: 20220224 2212241 3032343642 3436556972 233020302535040

47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by emerging teams, shown for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

56

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. "An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks". The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: share of time spent by participants on each kind of artifact]

60

How Much Time did Participants Spend on Different Kinds of Artifacts?

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

How Much Time did Participants Spend on Different Kinds of Artifacts?

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns
S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns
(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
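
The pattern notation above reads naturally as regular expressions over the artifact codes S, D, U and J. The following is a minimal sketch of how a browsing session could be labeled with one of these patterns. It assumes a session is encoded as a string with one code per artifact opened before reaching source code; that encoding, the function name `classify_navigation` and the sample sessions are illustrative assumptions, not part of the original study.

```python
import re

# Pattern names come from the slides; treating them as regular expressions
# over the artifact codes S, D, U, J is an assumption made for illustration.
PATTERN_NAMES = ["S", "D", "U", "J", "(SD)+", "(US)+", "(DS)+",
                 "U(SD)+", "S(US)+", "SU(SD)+"]
PATTERNS = [(name, re.compile(name)) for name in PATTERN_NAMES]


def classify_navigation(visited):
    """Return the name of the pattern matched by the whole string of
    artifact codes visited before reaching source code, or "Other"."""
    for name, regex in PATTERNS:
        if regex.fullmatch(visited):
            return name
    return "Other"


# Hypothetical browsing sessions, one code per artifact opened.
sessions = ["S", "SDSD", "USD", "SUS", "JD"]
labels = [classify_navigation(s) for s in sessions]
```

Because each session must match a pattern in full, a sequence such as "JD" that fits none of the named patterns falls into the "Other" group.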

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: horizontal bar chart (axis 0-20) of how often Graduate and Undergraduate participants followed each navigation pattern before reaching source code: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between Source Code, Sequence Diagram and Javadoc, with numeric edge labels]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only after that do they read and write Source Code

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
  17 Bachelor students in CS (Univ. of Molise, second year)
  21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class
  book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
  book, hotel, room, refund, arrival, check, parking, double, suite, group
  confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
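
The oracle construction can be sketched in a few lines. The example below uses the three term lists shown on the slide and keeps every term chosen by at least half of the subjects. The function name `build_oracle` and the `threshold` parameter are assumptions introduced for illustration; the study's actual tooling is not described in the slides.

```python
from collections import Counter
from math import ceil


def build_oracle(labelings, threshold=0.5):
    """Keep the terms chosen by at least `threshold` of the subjects."""
    # Count each term once per subject, even if a subject repeats it.
    votes = Counter(term for terms in labelings for term in set(terms))
    min_votes = ceil(threshold * len(labelings))
    return {term for term, count in votes.items() if count >= min_votes}


# The three example labelings shown on the slide.
labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival",
     "departure", "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)
```

With three subjects and a 50% threshold, a term needs at least two votes, so the resulting set contains exactly the eight terms listed with counts of 2 or 3 on the slide.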

74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from method signatures matches very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. "Labeling Source Code with Information Retrieval Methods: An Empirical Study". EMSE 2014.

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. "Who is Going to Mentor Newcomers in Open Source Projects?". International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

[Diagram: Jim and Alice placed along a Time axis, with the markers "When Alice was a Student" and "When Alice joins the project"; the factors F1 to F5 are added one per slide]

Identify Past Successful Mentors

Score computed by aggregating the factors F1 to F5 in a weighted sum:

$$\sum_{i=1}^{5} w_i \cdot f_i$$
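
As a rough illustration of the weighted-sum scoring, the sketch below ranks candidate mentors for one newcomer. The slides do not give the weights or how each factor is measured, so the weights, the factor values and the names `mentor_score` and `rank_candidates` are all hypothetical.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Sort candidate names by decreasing score."""
    return sorted(candidates,
                  key=lambda name: mentor_score(candidates[name], weights),
                  reverse=True)


# Hypothetical values for F1..F5 per candidate, and hypothetical equal weights.
weights = [1, 1, 1, 1, 1]
candidates = {
    "jim": [3, 2, 1, 1, 2],
    "bob": [1, 0, 1, 0, 1],
}
ranking = rank_candidates(candidates, weights)
```

Keeping the weights as an explicit argument makes it easy to test how sensitive the ranking is to each factor.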

93

Recommending Mentors

[Diagram sequence, slides 93-105. Labels: Time, t, Project Developers, Past Project Developers, Past Mentors, Newcomer, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

108

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba (precision axis 0-100). Data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba (precision axis 0-100). Data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

YODA makes it possible to recommend mentors

112

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: bar chart of the precision in recommending mentors (axis 0-90) for the MAIL, ISSUE and CHAT sources, grouped by project]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 114-115: YODA Architecture; slides 116-124: YODA Tool (images only)]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code. There, a newcomer can find source code descriptions.

Example (CLASS: IndexSplitter, METHOD: split):

"When calling the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with a segments descriptor file with wrong data. Namely, wrong is the number representing the name of the segment that would be created next in this index."

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. "Mining source code descriptions from developer communications". International Conference on Program Comprehension (IEEE ICPC 2012).
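
Steps 2 and 3 can be illustrated with a rough sketch. It assumes that paragraphs are separated by blank lines and that a paragraph is traced onto a method when it mentions the method name followed by an opening parenthesis. Both rules, the function names and the sample message are assumptions made for illustration; the slides do not spell out the tracing or filtering rules of the approach.

```python
import re


def extract_paragraphs(message):
    """Step 2 (assumed rule): split a message on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, method_names):
    """Step 3 (assumed rule): keep, per method, the paragraphs that mention
    the method name followed by an opening parenthesis."""
    traced = {}
    for name in method_names:
        mention = re.compile(r"\b" + re.escape(name) + r"\s*\(")
        hits = [p for p in paragraphs if mention.search(p)]
        if hits:
            traced[name] = hits
    return traced


# Shortened, hypothetical message based on the IndexSplitter example.
message = (
    "Thanks for looking into this.\n"
    "\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) it\n"
    "creates an index with a segments descriptor file with wrong data.\n"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, ["split", "main"])
```

Only the second paragraph is traced onto `split`; the greeting is dropped, and `main` gets no entry because no paragraph mentions it.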

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:
• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

133

Approach: Precision vs Number of Methods Covered

[Figure: precision of the approach vs number of methods covered]

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. "CODES: mining source code descriptions from developers discussions". BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 140-147: CODES Tool (images only)]

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

• Building better recommenders for software re-modularization or refactoring, based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible, reducing the percentage of false positives

• Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. "Recommending Refactorings based on Team Co-Maintenance Patterns". The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

Page 22: Supporting Newcomers in Open Source Software Development Projects

23

Example Hibernate OSS Project

How Developersrsquo Collaborations Networks Identified from Different Sources Differ

24

Developers Overlap betweenDifferent Sources

25

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

26

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

27

Developers Overlap betweenDifferent Sources

Apache Httpd

Apache Lucene

Samba

Hibernate

ISSUE and CHAT ISSUE and MAIL

lt35 56

MAIL and CHAT MAIL and ISSUE

lt50

86

28

Overlap of Developers Social Links

ISSUE and CHAT ISSUE and MAIL

lt26 38

MAIL and CHAT MAIL and ISSUE

lt20

30

Apache Httpd

Apache Lucene

Samba

Hibernate

29

During an IRC Chat Meeting

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquobut we also need to create the attributes and values in the entity bindingrdquo

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Case Study

Systems characteristics: period of time and releases considered

Period considered
Apache HTTP: 09/1998-03/2012
Eclipse JDT: 01/2002-12/2011
Netbeans: 01/2001-08/2012
Samba: 01/2000-09/2011

Releases Considered: 20220224 2212241 3032343642 3436556972 233020302535040

• Goal: analyze data from mailing lists, issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

47

How do Emerging Collaborations Change across Software Releases?

Teams merge in a new release: in 20-35% of the cases

Teams split in a new release: in 15-35% of the cases

Teams disappeared: 22-45%

Teams survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams, shown for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary: Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: how such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: what code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: share of time spent on each kind of artifact; chart values: 72, 13, 10, 3, 2]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns
S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns
(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
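The pattern notation reads naturally as regular expressions over the artifact letters. The sketch below assumes that each browsing session is encoded as a string such as "USD" (one letter per artifact visited before reaching source code); that encoding and the function name `classify_navigation` are assumptions made for illustration, not part of the study's tooling.

```python
import re

# Complex patterns from the slide, written as regular expressions.
COMPLEX_PATTERNS = {
    "(SD)+": re.compile(r"(SD)+"),
    "(US)+": re.compile(r"(US)+"),
    "(DS)+": re.compile(r"(DS)+"),
    "U(SD)+": re.compile(r"U(SD)+"),
}

# Simple patterns: a single artifact kind visited before the source code.
SIMPLE_PATTERNS = {"S", "D", "U", "J"}


def classify_navigation(sequence: str) -> str:
    """Return the name of the pattern matched by the whole sequence, or "Other"."""
    if sequence in SIMPLE_PATTERNS:
        return sequence
    for name, regex in COMPLEX_PATTERNS.items():
        # fullmatch requires the entire session to follow the pattern
        if regex.fullmatch(sequence):
            return name
    return "Other"


labels = [classify_navigation(s) for s in ["S", "SDSD", "US", "USDSD", "JS"]]
```

Using `fullmatch` rather than `search` keeps a session such as "JS" out of every named pattern, so it falls into "Other".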

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of navigation pattern frequency, Graduate vs Undergraduate, scale 0 to 20; patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other; bar values in extraction order: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between Source Code, Sequence Diagram and Javadoc, with weighted edges]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
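A transition graph of this kind can be derived from logged browsing sessions by counting consecutive pairs of artifact kinds. The following is only a sketch under assumptions: sessions are taken to be lists of artifact-kind names in visiting order, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def transition_counts(sessions: List[List[str]]) -> Counter:
    """Count how often participants move from one kind of artifact to another."""
    counts: Counter = Counter()
    for visited in sessions:
        for src, dst in zip(visited, visited[1:]):
            if src != dst:  # staying on the same kind of artifact is not a transition
                counts[(src, dst)] += 1
    return counts


def transition_shares(counts: Counter) -> Dict[Tuple[str, str], float]:
    """Normalise counts so that the outgoing edges of each artifact kind sum to 1."""
    totals: Counter = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}


sessions = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram", "Source Code"],
    ["Source Code", "Javadoc", "Source Code"],
]
counts = transition_counts(sessions)
shares = transition_shares(counts)
```

The normalised shares correspond to edge weights expressed as fractions of the transitions leaving each node.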

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: term lists chosen to label a Java class
book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
book, hotel, room, refund, arrival, check, parking, double, suite, group
confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
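The oracle construction can be sketched in a few lines, using the three term lists shown on the slide. The function name `build_oracle` and the `min_share` parameter are illustrative names, not taken from the study.

```python
import math
from collections import Counter
from typing import Dict, List


def build_oracle(labels_per_subject: List[List[str]], min_share: float = 0.5) -> Dict[str, int]:
    """Keep the terms chosen by at least min_share of the subjects."""
    # set(labels) makes sure a subject counts at most once per term
    counts = Counter(term for labels in labels_per_subject for term in set(labels))
    needed = math.ceil(min_share * len(labels_per_subject))
    return {term: n for term, n in counts.items() if n >= needed}


subject_labels = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(subject_labels)
```

With three subjects and a 50% threshold a term needs two votes, which yields the eight terms and counts listed on the slide.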

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF, in the period after Alice joins the project:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Score computed by aggregating the factors in a weighted sum:

$\sum_{i=1}^{5} w_i \cdot f_i$
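The weighted sum can be illustrated as follows. This is a sketch only: the factor values, the weights and the function names `mentor_score` and `rank_candidates` are hypothetical, and the slides do not say how the factors are normalised or how the weights are chosen.

```python
from typing import Dict, List, Sequence


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Aggregate the factor values f_1..f_5 with weights w_1..w_5 in a weighted sum."""
    if len(factors) != len(weights):
        raise ValueError("one weight is needed per factor")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates: Dict[str, Sequence[float]], weights: Sequence[float]) -> List[str]:
    """Order candidate mentors by decreasing score."""
    return sorted(candidates, key=lambda name: mentor_score(candidates[name], weights), reverse=True)


# Hypothetical weights and factor values (F1..F5), assumed to be scaled to [0, 1].
weights = [0.5, 0.125, 0.125, 0.125, 0.125]
candidates = {
    "bob": [0.0, 0.5, 0.5, 0.0, 0.0],
    "jim": [1.0, 1.0, 0.5, 1.0, 0.5],
}
ranking = rank_candidates(candidates, weights)
```

Scaling the factors to a common range before weighting is a design choice of this sketch; without it, a factor with large raw values (such as the amount of emails) would dominate the score.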

93

Recommending Mentors

[Figure: timeline of project developers; a newcomer joins at time t and is matched with a mentor with adequate skills, chosen among the past mentors and past project developers]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: Mentor Recommendations, Precision on Top 1 and Top 2; projects: Apache, FreeBSD, PostgreSQL, Python, Samba; precision axis 0 to 100; data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Figure: Mentor Recommendations, Precision on Top 1 and Top 2; projects: Apache, FreeBSD, PostgreSQL, Python, Samba; precision axis 0 to 100; data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

It is possible to recommend mentors to project newcomers: YODA makes it possible

111

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: Precision in Recommending Mentors; bar chart with series MAIL, ISSUE and CHAT; project labels Apache, Lucene, Samba; precision axis 0 to 90; data labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations, developers need to infer knowledge from the source code itself or from source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code: a newcomer can find source code descriptions in them

Example (CLASS: IndexSplitter, METHOD: split)

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely, wrong is the number representing the name of segment that would be created next in this index."

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
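The last step can be illustrated with a minimal sketch of similarity-based filtering. The slides do not specify the similarity measure, so this example swaps in plain bag-of-words cosine similarity; the threshold value and the function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Split camelCase identifiers, then lowercase every alphabetic token."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def filter_paragraphs(paragraphs: List[str], method_text: str, threshold: float = 0.2) -> List[str]:
    """Keep the paragraphs whose similarity with the method text reaches the threshold."""
    return [p for p in paragraphs if cosine(p, method_text) >= threshold]


method_text = "IndexSplitter split File destDir String segs"
paragraphs = [
    "The split method of IndexSplitter creates an index with wrong segments data",
    "Thanks, see you at the conference next week",
]
kept = filter_paragraphs(paragraphs, method_text)
```

Splitting camelCase identifiers before comparison lets a paragraph that mentions "index" or "splitter" match a method of the class `IndexSplitter`.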

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Mailing lists
• Issue trackers
• Q&A sites

131

Approach: Precision vs Number of Methods Covered

[Figure: precision vs number of methods covered]

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible, reducing the percentage of false positives, and including a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

24

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%
ISSUE and MAIL: 56%
MAIL and CHAT: <50%
MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%
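An overlap figure of this kind can be computed with set operations. The sketch below assumes a directional definition (the share of the first source's items that also appear in the second), which is one way "ISSUE and MAIL" could differ from "MAIL and ISSUE"; the slides do not state the exact formula, and the function name `overlap_share` is hypothetical.

```python
from typing import FrozenSet, Set, TypeVar

T = TypeVar("T")


def overlap_share(source_a: Set[T], source_b: Set[T]) -> float:
    """Share of the items in source_a that also appear in source_b."""
    if not source_a:
        return 0.0
    return len(source_a & source_b) / len(source_a)


# Developers seen in each channel (illustrative names).
issue_devs = {"ann", "bob", "cat", "dan"}
mail_devs = {"ann", "bob"}

# Social links are unordered pairs of developers, so frozensets work as set members.
issue_links: Set[FrozenSet[str]] = {frozenset(("ann", "bob")), frozenset(("bob", "cat"))}
mail_links: Set[FrozenSet[str]] = {frozenset(("bob", "ann"))}

devs_issue_in_mail = overlap_share(issue_devs, mail_devs)
devs_mail_in_issue = overlap_share(mail_devs, issue_devs)
links_issue_in_mail = overlap_share(issue_links, mail_links)
```

Representing a link as a `frozenset` makes the pair ("ann", "bob") equal to ("bob", "ann"), which suits undirected social links.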

29

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming
"is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., testing activities)
"however planning a pure standalone test suite would make things easier"

3) Open an Issue
"okay I think it is a bug and I'm going to create a jira first"

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project         issues vs mails   issues vs chat   mails vs chat
Apache Httpd    0.17              0.09             0.06
Apache Lucene   0.08              0.03             0.02
Hibernate       0.11              0.02             0.03
Samba           0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

Most Frequent Navigation Patterns Before Reaching Source Code

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

[Figure: bar chart of pattern frequency (axis 0 to 20) for Graduate and Undergraduate participants. Patterns in chart order: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Extracted bar labels, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between Source Code, Sequence Diagram, and Javadoc; extracted edge labels: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
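A transition graph of this kind can be derived by counting consecutive pairs of artifact kinds in each participant's browsing log. The following is a minimal sketch under that assumption. The log format, the `transition_percentages` helper, and the sample logs are illustrative only and are not the study's data.

```python
from collections import Counter, defaultdict


def transition_percentages(logs):
    """For each artifact kind, return the percentage of moves to each other kind."""
    counts = defaultdict(Counter)
    for log in logs:
        # Pair every visited artifact with the one visited right after it.
        for src, dst in zip(log, log[1:]):
            if src != dst:  # ignore staying on the same kind of artifact
                counts[src][dst] += 1
    graph = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        graph[src] = {dst: 100.0 * n / total for dst, n in outgoing.items()}
    return graph


# Made-up browsing logs, one list per participant.
logs = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram", "Source Code"],
    ["Sequence Diagram", "Source Code", "Javadoc", "Source Code"],
]
graph = transition_percentages(logs)

if __name__ == "__main__":
    for src, outgoing in graph.items():
        print(src, outgoing)
```

Each node's outgoing percentages sum to 100, which matches the usual way such graphs are labeled.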

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Example: three subjects each label a Java class with ten terms

• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

• book, hotel, room, refund, arrival, check, parking, double, suite, group

• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term frequencies: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
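The oracle construction can be reproduced on the hotel-reservation example from the slide. The following is a minimal sketch; the `build_oracle` helper and its `threshold` parameter are names introduced here for illustration.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Keep the terms chosen by at least `threshold` of the subjects."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for terms in labelings for term in set(terms))
    cutoff = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= cutoff}


# The three example labelings shown on the slide.
labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)

if __name__ == "__main__":
    print(sorted(oracle))
```

With three subjects the 50% cutoff is 1.5, so a term enters the oracle when at least two subjects selected it.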

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II

Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small Projects: finding Mentors is a trivial problem

• Large Projects: finding Mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails

F2: amount of emails

F3: project age

F4: newcomer early emails

F5: Commits

[Figure: timeline relating Jim and Alice, with the markers "When Alice was a Student" and "When Alice joins the project"]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
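The weighted-sum aggregation of the five factors can be sketched as follows. This is only an illustration: the factor values, the equal weights, and the helper names `mentor_score` and `rank_candidates` are assumptions made here, since the slides give neither the weights nor how each factor is normalized.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Sort candidate mentors by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical equal weights and factor values in [0, 1] for F1..F5.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "Jim": [1.0, 0.8, 1.0, 0.5, 0.6],
    "Bob": [0.2, 0.1, 1.0, 0.0, 0.3],
}
ranking = rank_candidates(candidates, weights)

if __name__ == "__main__":
    for name in ranking:
        print(name, round(mentor_score(candidates[name], weights), 2))
```

Keeping the weights as an explicit argument makes it easy to try different calibrations per project.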

93

Recommending Mentors

[Figure: timeline of Project Developers; a Newcomer joins at time t, and a Mentor with Adequate Skills is chosen among the Past Mentors and Past Project Developers]

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba (axis 0 to 100). Extracted bar labels: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Figure: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba (axis 0 to 100). Extracted bar labels: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat, and Mail sources to recommend Mentors

[Figure: Precision in Recommending Mentors using MAIL, ISSUE, and CHAT data, per project (axis 0 to 90; recovered project labels: Apache, Lucene, Samba). Extracted bar labels: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

[Figure: a Newcomer can find a source code description in developers' messages]

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based Filtering

• Step 5: Similarity-based Filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
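Steps 2 and 3 of the approach can be illustrated with a small sketch. The matching rule used here (a paragraph is traced onto a method when it mentions the class name and the method name followed by an opening parenthesis) is a simplifying assumption made for this example, as are the helper names; the published approach may use different rules.

```python
import re


def extract_paragraphs(message: str):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_method(paragraphs, class_name: str, method_name: str):
    """Step 3 (sketch): keep paragraphs that mention the class and call the method."""
    call = re.compile(r"\b" + re.escape(method_name) + r"\s*\(")
    return [p for p in paragraphs if class_name in p and call.search(p)]


# Illustrative message built around the IndexSplitter.split example.
message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) it creates "
    "an index with a segments descriptor file with wrong data.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
hits = trace_to_method(paragraphs, "IndexSplitter", "split")

if __name__ == "__main__":
    print(len(paragraphs), "paragraphs,", len(hits), "traced onto split")
```

Greeting and sign-off paragraphs drop out because they mention neither the class nor the method.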

130

Supporting Software Development

Help Newcomer Program Comprehension with the extraction of summaries of code elements from:

• Q&A SITE

• ISSUE TRACKER

• MAILING LIST

Approach Precision vs Number of Methods Covered

[Figure: precision of the approach plotted against the number of methods covered]

We mine useful Java method descriptions from developers' discussions

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based Filtering

• Step 5: Similarity-based Filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
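The similarity-based filtering of Step 5 can be illustrated with a plain term-frequency cosine similarity between a candidate paragraph and the text of the method it was traced onto. This is a sketch under stated assumptions: the tokenization, the raw term-frequency weighting, and the 0.2 threshold are choices made here for illustration, and the tool itself may rely on a different retrieval model.

```python
import math
import re
from collections import Counter


def tokenize(text: str):
    """Split camelCase identifiers, lowercase, and keep alphabetic tokens."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def filter_by_similarity(paragraphs, method_text: str, threshold: float = 0.2):
    """Keep only the paragraphs that are textually close to the method."""
    return [p for p in paragraphs if cosine(p, method_text) >= threshold]


method_text = "IndexSplitter split(File destDir, String[] segs)"
paragraphs = [
    "split writes the selected segs of the index into destDir",
    "Thanks for the quick reply",
]
kept = filter_by_similarity(paragraphs, method_text)

if __name__ == "__main__":
    for p in paragraphs:
        print(round(cosine(p, method_text), 2), p)
```

Splitting camelCase before comparing lets a paragraph that writes `destDir` match the parameter name in the signature.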

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Build better recommenders for software re-modularization or refactoring based on social interaction between developers

• Perform a survey asking developers to validate the social links identified by analyzing different communication channels

• Build recommenders that help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



25

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%

ISSUE and MAIL: 56%

MAIL and CHAT: <50%

MAIL and ISSUE: 86%
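The overlap figures differ depending on which source is named first, which suggests a directional measure. The following is a minimal sketch, assuming the overlap is the share of the first source's developers who also appear in the second source; that reading, the `overlap` helper, and the sample names are assumptions made here, not definitions taken from the slides.

```python
def overlap(source, other):
    """Percentage of developers in `source` who also appear in `other`."""
    if not source:
        return 0.0
    return 100.0 * len(source & other) / len(source)


# Made-up sets of developer identities per communication channel.
issue = {"ann", "bob", "carl", "dana"}
mail = {"ann", "bob", "eve"}

issue_and_mail = overlap(issue, mail)  # relative to issue tracker participants
mail_and_issue = overlap(mail, issue)  # relative to mailing list participants

if __name__ == "__main__":
    print(f"ISSUE and MAIL: {issue_and_mail:.0f}%")
    print(f"MAIL and ISSUE: {mail_and_issue:.0f}%")
```

Dividing by the size of the first set is what makes the two directions differ when the channels have different numbers of participants.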

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%

ISSUE and MAIL: 38%

MAIL and CHAT: <20%

MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

"but we also need to create the attributes and values in the entity binding"

1) Brainstorming: "is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., Testing activities): "however planning a pure standalone test suite would make things easier"

3) Open an Issue: "okay I think it is a bug and I'm going to create a jira first"

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat, and Mail to Identify Leaders

[Figure: Precision in Recommending Leaders using MAIL, ISSUE, and CHAT data, per project (axis 0 to 90; recovered project labels: Apache, Lucene, Samba). Extracted bar labels: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80]

37

Analysis of the Evolution of Teams: Why?

1) To Better Understand the Reasons Behind the Teams' Reorganization (split/merge of developers' teams)

2) To Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files, and can then Support Re-factoring / Re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations, by using FUZZY CLUSTER ALGORITHMS

[Figure: teams identified in releases R1 and R2, with TEAMS SPLIT and TEAMS MERGE between releases, and the Sub-Systems on which Developers are Working]

Structural Perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual Perspective: Poshyvanyk et al., CCBC
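Once teams have been identified in two consecutive releases, splits and merges can be detected by comparing team membership. The following is a minimal sketch under simplifying assumptions: teams are plain (non-fuzzy) sets of developer names, and two teams are considered related when they share at least `min_shared` members. The helper names, the threshold, and the sample teams are hypothetical and are not the criteria used in the study.

```python
def successors(team, next_teams, min_shared=2):
    """Indexes of the teams in the other release sharing enough members."""
    return [i for i, t in enumerate(next_teams) if len(team & t) >= min_shared]


def find_splits(r1, r2, min_shared=2):
    """Teams of R1 whose members end up in two or more teams of R2."""
    return [
        i for i, team in enumerate(r1)
        if len(successors(team, r2, min_shared)) >= 2
    ]


def find_merges(r1, r2, min_shared=2):
    """Teams of R2 that gather members from two or more teams of R1."""
    return find_splits(r2, r1, min_shared)


# Made-up teams in two consecutive releases.
r1 = [{"a", "b", "c", "d"}, {"e", "f"}, {"g", "h"}]
r2 = [{"a", "b"}, {"c", "d"}, {"e", "f", "g", "h"}]

splits = find_splits(r1, r2)
merges = find_merges(r1, r2)

if __name__ == "__main__":
    print("R1 teams that split:", splits)
    print("R2 teams formed by a merge:", merges)
```

A merge is treated as a split seen in the opposite direction, so one helper covers both cases.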

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System          Period considered
Apache HTTP     09/1998 - 03/2012
Eclipse JDT     01/2002 - 12/2011
Netbeans        01/2001 - 08/2012
Samba           01/2000 - 09/2011

Releases Considered: 20220224 2212241 3032343642 3436556972 233020302535040

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams, shown for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I

Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81-82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF the following factors, observed over time, support it:

F1: exchanged emails

F2: amount of emails

F3: project age

F4: newcomer early emails

F5: commits

[Figure: timeline for Jim and Alice with the markers "When Alice was a Student" and "When Alice joins the project".]

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$
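A minimal sketch of the weighted-sum aggregation named on this slide. The field names, the weight values, and the idea that each factor is already a number on a comparable scale are assumptions made for illustration; the slides do not state how YODA normalizes or weights F1 to F5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factors:
    """Hypothetical numeric values for factors F1..F5 of one candidate pair."""
    exchanged_emails: float       # F1
    amount_of_emails: float       # F2
    project_age: float            # F3
    newcomer_early_emails: float  # F4
    commits: float                # F5


def mentor_score(f: Factors, weights: tuple[float, ...]) -> float:
    """Aggregate the five factors in a weighted sum: sum of w_i * f_i."""
    values = (
        f.exchanged_emails,
        f.amount_of_emails,
        f.project_age,
        f.newcomer_early_emails,
        f.commits,
    )
    if len(weights) != len(values):
        raise ValueError("expected one weight per factor")
    return sum(w * v for w, v in zip(weights, values))


def rank_candidates(candidates: dict[str, Factors],
                    weights: tuple[float, ...]) -> list[str]:
    """Return candidate names ordered by decreasing score."""
    return sorted(candidates,
                  key=lambda name: mentor_score(candidates[name], weights),
                  reverse=True)
```

With equal weights, a candidate who scores higher on most factors ranks first; tuning the weights changes which factors dominate the ranking.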

93-105

Recommending Mentors

[Figure: timeline of project developers and past mentors; a newcomer joins at time t, and a mentor with adequate skills is selected among the past mentors.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106-108

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba (scale 0 to 100). Bar labels: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

109-110

Results When Both Mails and Issues Are Used

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba (scale 0 to 100). Bar labels: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers
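The charts report precision on the Top 1 and Top 2 recommendations. A minimal sketch of one common way to compute such a figure follows; the data layout (a ranked list per newcomer and a set of known mentors per newcomer) and the function name are assumptions, and the study may define precision differently.

```python
def precision_at_k(recommendations: dict[str, list[str]],
                   actual_mentors: dict[str, set[str]],
                   k: int) -> float:
    """Share of the top-k recommended mentors that are actual mentors.

    recommendations: newcomer -> ranked candidate mentors (best first)
    actual_mentors:  newcomer -> mentors known from the ground truth
    """
    hits = 0
    total = 0
    for newcomer, ranked in recommendations.items():
        truth = actual_mentors.get(newcomer, set())
        for candidate in ranked[:k]:
            total += 1
            if candidate in truth:
                hits += 1
    return hits / total if total else 0.0
```

Calling the function with k=1 and k=2 on the same rankings gives the two series shown per project.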

111-112

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors from MAIL, ISSUE, and CHAT data, one group of bars per project (recovered axis labels: Apache, Lucene, Samba; scale 0 to 90). Bar labels: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

113-115

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116-124

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

Mining Summaries: Example

"When calling the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with a segments descriptor file with wrong data. Namely, wrong is the number representing the name of the segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
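The steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the published approach: the paragraph splitting rule, the single length-based heuristic, the Jaccard word overlap used as the similarity, and the threshold value are all placeholders chosen here, and every function name is hypothetical. Step 1 (downloading and tracing onto classes) is assumed to have produced the input messages already.

```python
import re


def extract_paragraphs(message: str) -> list[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3: trace a paragraph onto a method if it contains `name(`."""
    return re.search(rf"\b{re.escape(method_name)}\s*\(", paragraph) is not None


def passes_heuristics(paragraph: str, min_words: int = 5) -> bool:
    """Step 4 (placeholder heuristic): keep paragraphs with enough words."""
    return len(paragraph.split()) >= min_words


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))


def similarity(paragraph: str, method_source: str) -> float:
    """Step 5 (placeholder measure): Jaccard overlap of lowercase words."""
    a, b = _tokens(paragraph), _tokens(method_source)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mine_descriptions(messages: list[str], method_name: str,
                      method_source: str, threshold: float = 0.1) -> list[str]:
    """Return paragraphs that survive steps 2 to 5 for one method."""
    kept = []
    for message in messages:
        for paragraph in extract_paragraphs(message):
            if (mentions_method(paragraph, method_name)
                    and passes_heuristics(paragraph)
                    and similarity(paragraph, method_source) >= threshold):
                kept.append(paragraph)
    return kept
```

Raising the threshold trades coverage (fewer methods get a description) for precision (the kept paragraphs share more vocabulary with the method).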

130

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A SITE

• ISSUE TRACKER

• MAILING LIST

Newcomer

Supporting Software Development

131-133

Approach: Precision vs. Number of Methods Covered

[Figure: precision of the approach plotted against the number of methods covered.]

We mine useful Java method descriptions from developers' discussions

134-135

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A SITE

• ISSUE TRACKER

• MAILING LIST

Supporting Software Development

136-137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

139-147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155-161

PART I

PART II

PART III


26-27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%

ISSUE and MAIL: 56%

MAIL and CHAT: <50%

MAIL and ISSUE: 86%
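The slide reports different values for "ISSUE and MAIL" (56) and "MAIL and ISSUE" (86), which suggests a directional measure. A minimal sketch under that assumption: the overlap of source A with source B is the percentage of developers seen in A who also appear in B. The function name is hypothetical, the study's exact definition may differ, and developer identities are assumed to be already unified across sources.

```python
def overlap_percentage(source_a: set[str], source_b: set[str]) -> float:
    """Percentage of developers in source_a who also appear in source_b."""
    if not source_a:
        return 0.0
    return 100.0 * len(source_a & source_b) / len(source_a)
```

Because the denominator is the size of the first source, swapping the arguments generally changes the result, which is consistent with the asymmetric figures on the slide.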

28

Overlap of Developers Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%

ISSUE and MAIL: 38%

MAIL and CHAT: <20%

MAIL and ISSUE: 30%

29

During an IRC Chat Meeting

PROJECT: Hibernate

"is there a better way, dunno, like I said this is brainstorming and I have not given lots of thought to these cases"

"but we also need to create the attributes and values in the entity binding"

30-32

During an IRC Chat Meeting

PROJECT: Hibernate

1) Brainstorming: "is there a better way, dunno, like I said this is brainstorming and I have not given lots of thought to these cases"

2) Planning (e.g., testing activities): "however, planning a pure standalone test suite would make things easier"

3) Open an issue: "okay, I think it is a bug and I'm going to create a jira first"

33-34

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35-36

Use Issue, Chat, and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders from MAIL, ISSUE, and CHAT data, one group of bars per project (recovered axis labels: Apache, Lucene, Samba; scale 0 to 90). Bar labels: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80.]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developers' teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and thus support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38-44

Analysis of the Evolution of Teams: How?

Teams identification from emergent collaborations, by using fuzzy clustering algorithms

[Figure: teams identified in releases R1 and R2; between releases, teams split or teams merge; each team is linked to the sub-systems its developers are working on.]

Structural perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual perspective: Poshyvanyk et al., CCBC

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System         Period considered
Apache HTTP    09/1998-03/2012
Eclipse JDT    01/2002-12/2011
Netbeans       01/2001-08/2012
Samba          01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

46-47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC values of the files changed by teams, shown for TEAMS SPLIT and TEAMS MERGED.]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54-56

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Chart labels: 121 121 667 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59-61

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart: share of time spent on each kind of artifact; extracted labels: 72, 131032.]

Undergraduate students used source code and Javadoc significantly more than graduate students

Graduate students used class diagrams significantly more than undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram

D = Class Diagram

U = Use Case

J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram

(US)+ = Use Case before Sequence Diagram

(DS)+ = Class Diagram before Sequence Diagram

U(SD)+ = Use Case before (SD)+
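The pattern names on this slide read like regular expressions over the artifact letters S, D, U, and J. A minimal sketch of classifying a participant's browsing trace with them follows. Encoding a trace as a string of letters (the artifacts visited before reaching source code) and requiring the whole trace to match are assumptions of this sketch, not details given in the slides.

```python
import re

# Patterns named on the slide, written as regular expressions.
PATTERNS = ["S", "D", "U", "J", "(SD)+", "(US)+", "(DS)+", "U(SD)+"]


def classify_trace(trace: str) -> str:
    """Return the first pattern that matches the whole trace, else 'Other'."""
    for pattern in PATTERNS:
        if re.fullmatch(pattern, trace):
            return pattern
    return "Other"
```

For example, a participant who alternates between sequence and class diagrams twice before opening the code produces the trace "SDSD", which falls under (SD)+.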

63-65

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: navigation patterns for Graduate and Undergraduate participants (scale 0 to 20). Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Bar labels, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5.]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

66-67

Transition Graph between Kinds of Software Artifacts

[Graph: nodes include Source Code, Sequence Diagram, and Javadoc; edge labels: 56, 36, 7717, 78, 18, 1678, 49, 37.]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69-70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: terms chosen for a Java class in three labelings

Labeling 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

Labeling 2: book, hotel, room, refund, arrival, check, parking, double, suite, group

Labeling 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
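A minimal sketch of how an oracle of this kind can be derived: count how many subjects chose each term and keep the terms chosen by at least half of them. The function name and the `min_share` parameter are hypothetical, and the three term lists are the hotel-booking example from the slide, segmented by hand.

```python
from collections import Counter


def build_oracle(labelings: list[list[str]], min_share: float = 0.5) -> set[str]:
    """Keep the terms selected by at least `min_share` of the subjects."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for labeling in labelings for term in set(labeling))
    needed = min_share * len(labelings)
    return {term for term, count in counts.items() if count >= needed}


labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]
print(sorted(build_oracle(labelings)))
```

With three subjects, the 50% threshold means a term must appear in at least two of the three lists.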

71-73

Experiment B Context

Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

A newcomer can find source code descriptions in such messages, for example:

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index”

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
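A minimal sketch of how Steps 2 to 5 could fit together, assuming the messages have already been downloaded and traced onto a class (Step 1). The function names, the cue-word heuristic, the Jaccard measure and its threshold are illustrative assumptions; the slides do not specify the actual filters.

```python
import re

def extract_paragraphs(message: str) -> list[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]

def trace_to_methods(paragraph: str, method_names: list[str]) -> list[str]:
    """Step 3: keep methods whose name is followed by '(' in the paragraph."""
    return [m for m in method_names
            if re.search(r"\b" + re.escape(m) + r"\s*\(", paragraph)]

# Hypothetical cue words suggesting that a paragraph describes behaviour.
DESCRIPTIVE_CUES = {"creates", "returns", "computes", "reads", "writes"}

def words_of(text: str) -> set[str]:
    """Lower-cased alphabetic words of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def passes_heuristic(paragraph: str) -> bool:
    """Step 4 (assumed heuristic): require at least one cue word."""
    return bool(words_of(paragraph) & DESCRIPTIVE_CUES)

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two term sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def passes_similarity(paragraph: str, method_terms: set[str],
                      threshold: float = 0.05) -> bool:
    """Step 5 (assumed measure): term overlap between paragraph and method."""
    return jaccard(words_of(paragraph), method_terms) >= threshold

def mine_descriptions(message, method_names, method_terms):
    """Map each method name to the paragraphs that survive all filters."""
    result = {}
    for paragraph in extract_paragraphs(message):
        if not passes_heuristic(paragraph):
            continue
        for method in trace_to_methods(paragraph, method_names):
            if passes_similarity(paragraph, method_terms[method]):
                result.setdefault(method, []).append(paragraph)
    return result

message = (
    "Hi all, I found a problem.\n\n"
    "When call the method split(File destDir, String[] segs) it creates "
    "an index with segments descriptor file with wrong data.\n\n"
    "Thanks!"
)
found = mine_descriptions(
    message,
    method_names=["split"],
    method_terms={"split": {"index", "segments", "split", "dest", "dir"}},
)
```

Only the middle paragraph survives here: it mentions `split(`, contains a cue word, and shares terms with the method. A real pipeline would need far more careful heuristics than this.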

130

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

[Diagram labels: Newcomer; Supporting Software Development]

131

Approach Precision vs. Number of Methods Covered

[Figure: precision against number of methods covered]

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions, relying on its REST interface, and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


27

Developers Overlap between Different Sources

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <35%    ISSUE and MAIL: 56%
MAIL and CHAT: <50%     MAIL and ISSUE: 86%

28

Overlap of Developers' Social Links

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate

ISSUE and CHAT: <26%    ISSUE and MAIL: 38%
MAIL and CHAT: <20%     MAIL and ISSUE: 30%
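One way to read paired figures such as "ISSUE and MAIL" versus "MAIL and ISSUE" is as a directional overlap: the share of developers seen in the first source who also appear in the second. That reading is an assumption, since the slides do not define the measure, and the developer names below are made up.

```python
def directional_overlap(first: set[str], second: set[str]) -> float:
    """Percentage of developers in `first` who also appear in `second`."""
    if not first:
        return 0.0
    return 100.0 * len(first & second) / len(first)

# Hypothetical developer identities extracted from two sources.
issue_devs = {"ann", "bob", "carl", "dana"}
mail_devs = {"ann", "bob", "eve"}

issue_and_mail = directional_overlap(issue_devs, mail_devs)  # 2 of 4 developers
mail_and_issue = directional_overlap(mail_devs, issue_devs)  # 2 of 3 developers
```

Under this reading the two directions differ whenever the sources have different numbers of participants, which would explain asymmetric values for the same pair of sources.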

29

During an IRC Chat Meeting

PROJECT: Hibernate

“but we also need to create the attributes and values in the entity binding”

1) Brainstorming

“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”

2) Planning (e.g. testing activities)

“however planning a pure standalone test suite would make things easier”

3) Open an Issue

“okay I think it is a bug and I'm going to create a jira first”

33

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs. mails   issues vs. chat   mails vs. chat
Apache Httpd     0.17               0.09              0.06
Apache Lucene    0.08               0.03              0.02
Hibernate        0.11               0.02              0.03
Samba            0.06               0.02              0.02

Values in the first column (issues vs. mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat and Mail to Identify Leaders

[Figure: horizontal bar chart, "Precision in Recommending Leaders". Series: MAIL, ISSUE, CHAT; x-axis: 0 to 90; four project groups, each labelled "Leaders" (project labels garbled in extraction; "Apache Lucene" and "Samba" are legible). Bar labels in extraction order (assignment to bars not recoverable): 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, which could then support refactoring/re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams identification from emergent collaborations, by using fuzzy clustering algorithms

[Diagram: teams identified at two points labelled R1 and R2; between them teams may SPLIT or MERGE. Teams are related to the sub-systems their developers work on (Sub-system one, Sub-system two).]

The files a team works on are assessed from two perspectives:

• Structural perspective: Modularization Quality (MQ), Mancoridis et al.
• Conceptual perspective: CCBC, Poshyvanyk et al.

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: period of time and releases considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba
Periods considered (in extraction order; assignment to systems not recoverable): 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011
Releases considered (separators lost in extraction): 20220224, 2212241, 3032343642, 3436556972, 233020302535040

46

How do Emerging Collaborations Change across Software Releases?

• Teams merge in a new release: in 20-35% of the cases
• Teams split in a new release: in 15-35% of the cases
• Teams disappeared: 22-45%
• Teams survived: 50-70%
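A minimal sketch of how splits and merges between two releases could be detected once teams are known. It simplifies heavily: teams are crisp sets rather than fuzzy clusters, and the "at least two shared members" rule, the team names and the member names are all assumptions made for illustration.

```python
def related_teams(team: set[str], other_release: dict[str, set[str]],
                  min_shared: int = 2) -> list[str]:
    """Names of teams in the other release sharing at least `min_shared` members."""
    return sorted(name for name, members in other_release.items()
                  if len(team & members) >= min_shared)

# Hypothetical teams in two consecutive releases.
r1 = {
    "t1": {"ann", "bob", "carl", "dana"},
    "t2": {"eve", "fred"},
    "t3": {"gus", "hal"},
}
r2 = {
    "a": {"ann", "bob"},
    "b": {"carl", "dana"},
    "c": {"eve", "fred", "gus", "hal"},
}

# A team splits when it has two or more successors in the next release.
split_teams = [name for name, team in r1.items()
               if len(related_teams(team, r2)) >= 2]

# A merge happens when a new team has two or more predecessors.
merged_teams = [name for name, team in r2.items()
                if len(related_teams(team, r1)) >= 2]
```

Here `t1` splits into `a` and `b`, while `t2` and `t3` merge into `c`. Counting such events over all release pairs would give percentages of the kind reported above.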

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figures: MQ and CCBC plots for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)
• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Figure labels as extracted: 121, 121, 667, 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: chart of time spent per kind of artifact; value labels as extracted: 72, 131032]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
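The pattern notation reads like regular expressions over the artifact letters, so a navigation trace can be classified with an ordinary regex engine. The sketch below assumes a trace is encoded as a string of the letters S, D, U and J in visiting order and that a pattern must match the whole trace; both choices are assumptions, not the study's procedure.

```python
import re

# Patterns from the slide, written as regular expressions over the
# artifact letters S, D, U, J.
PATTERNS = {
    "S": re.compile(r"S"),
    "D": re.compile(r"D"),
    "U": re.compile(r"U"),
    "J": re.compile(r"J"),
    "(SD)+": re.compile(r"(SD)+"),
    "(US)+": re.compile(r"(US)+"),
    "(DS)+": re.compile(r"(DS)+"),
    "U(SD)+": re.compile(r"U(SD)+"),
}

def classify(trace: str) -> str:
    """Return the name of the pattern matching the whole trace, else 'Other'."""
    for name, regex in PATTERNS.items():
        # fullmatch succeeds only if the entire trace fits the pattern.
        if regex.fullmatch(trace):
            return name
    return "Other"

examples = {trace: classify(trace) for trace in ["S", "SDSD", "USD", "US", "JS"]}
```

A trace such as `JS` fits none of the listed patterns and falls into the "Other" bucket.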

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: horizontal bar chart; x-axis: 0 to 20; series: Graduate, Undergraduate. Pattern categories in extraction order: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Value labels in extraction order, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5 (assignment of series and bars not recoverable)]

More experienced participants use a more “integrated approach”

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph with nodes Source Code, Sequence Diagram, Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only after that do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: three label sets for a Java class

• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• book, hotel, room, refund, arrival, check, parking, double, suite, group
• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Resulting term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
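The oracle rule can be sketched in a few lines using the three label sets shown on the slide. The function name and the `min_share` parameter are illustrative; only the 50% rule and the example terms come from the slide.

```python
from collections import Counter

def build_oracle(label_sets: list[list[str]], min_share: float = 0.5) -> dict[str, int]:
    """Return the terms chosen by at least `min_share` of the subjects, with counts."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for labels in label_sets for term in set(labels))
    needed = min_share * len(label_sets)
    return {term: n for term, n in counts.items() if n >= needed}

label_sets = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]

oracle = build_oracle(label_sets)
```

With three subjects the threshold is 1.5, so any term picked by two or more subjects enters the oracle, which reproduces the counts listed on the slide.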

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match the mental model of newcomers very well when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match the mental model of newcomers very well when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find past successful mentors

2) Suggest mentors having specific skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF (factors F1 to F5, observed over time from when Alice joins the project):

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

[Diagram residue: timeline starting when Alice joins the project; F2 is drawn with a “>” comparison; F4 is marked “1st”; the Arnetminer analogy is labelled “When Alice was a Student”]

Score computed by aggregating the factors in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i \, f_i$$
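The weighted-sum aggregation of the five factors can be sketched as follows. The weights, the factor values and the candidate names are made up, and the factors are assumed to be already normalized to the range 0 to 1; the slides do not give the actual weights.

```python
def mentor_score(factors: list[float], weights: list[float]) -> float:
    """Weighted sum of factor values: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, factors))

# Hypothetical weights for F1..F5 (not the values used by YODA).
weights = [0.25, 0.25, 0.125, 0.125, 0.25]

# Hypothetical, already normalized factor values for two candidate mentors.
candidates = {
    "jim": [1.0, 0.5, 1.0, 0.0, 0.5],
    "kim": [0.5, 0.5, 0.0, 1.0, 0.0],
}

# Rank candidates from highest to lowest score.
ranking = sorted(candidates,
                 key=lambda name: mentor_score(candidates[name], weights),
                 reverse=True)
```

The candidate with the highest score for a given newcomer would be treated as that newcomer's most likely past mentor.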

93

Recommending Mentors

[Diagram: timeline (Time axis) of project developers; a newcomer joins at time t and needs a mentor with adequate skills. Candidates are drawn from past mentors and past project developers.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



28

Overlap of Developers’ Social Links

ISSUE and CHAT: <26%
ISSUE and MAIL: 38%
MAIL and CHAT: <20%
MAIL and ISSUE: 30%

Projects: Apache Httpd, Apache Lucene, Samba, Hibernate
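A minimal sketch of how the overlap between the social links of two channels could be computed, assuming each channel has already been reduced to a list of developer pairs. The developer names, the sample links and the `overlap` helper are hypothetical; the slide does not give the exact overlap definition used in the study. Dividing by the number of links in the first channel is one possible reading of why "ISSUE and MAIL" and "MAIL and ISSUE" are reported as different values.

```python
def normalize(links):
    """Treat links as undirected: (a, b) and (b, a) are the same link."""
    return {frozenset(pair) for pair in links}


def overlap(source_links, other_links):
    """Share of the links in source_links that also appear in other_links."""
    source = normalize(source_links)
    other = normalize(other_links)
    if not source:
        return 0.0
    return len(source & other) / len(source)


# Hypothetical links mined from an issue tracker and a mailing list.
issue_links = [("ann", "bob"), ("bob", "carl"), ("ann", "dora"), ("carl", "dora")]
mail_links = [("bob", "ann"), ("dora", "carl"), ("eve", "ann")]

issue_vs_mail = overlap(issue_links, mail_links)
mail_vs_issue = overlap(mail_links, issue_links)
print(f"ISSUE and MAIL: {issue_vs_mail:.0%}")
print(f"MAIL and ISSUE: {mail_vs_issue:.0%}")
```

The measure is asymmetric by construction, so the two directions have to be reported separately.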

29

During an IRC Chat Meeting

PROJECT: Hibernate

1) Brainstorming
“is there a better way dunno like I said this is brainstorming and I have not given lots of thought to these cases”
“but we also need to create the attributes and values in the entity binding”

2) Planning (e.g. testing activities)
“however planning a pure standalone test suite would make things easier”

3) Open an Issue
“okay I think it is a bug and I’m going to create a jira first”

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project         issues vs mails   issues vs chat   mails vs chat
Apache Httpd    0.17              0.09             0.06
Apache Lucene   0.08              0.03             0.02
Hibernate       0.11              0.02             0.03
Samba           0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat

35

Use Issue, Chat and Mail to Identify Leaders

[Figure: bar chart of precision in recommending leaders from MAIL, ISSUE and CHAT data, per project; axis from 0 to 90. Individual bar values are not reliably recoverable from the extraction]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the reorganization of teams (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and hence to support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams identification from emergent collaborations, using fuzzy clustering algorithms

Teams are compared between releases R1 and R2: TEAMS SPLIT, TEAMS MERGE

Sub-systems on which developers are working: sub-system one, sub-system two

Structural perspective: Modularization Quality (Mancoridis et al.)
Conceptual perspective: CCBC (Poshyvanyk et al.)

45

Case Study

Systems characteristics: period of time and releases considered

System        Period considered
Apache HTTP   09/1998 - 03/2012
Eclipse JDT   01/2002 - 12/2011
Netbeans      01/2001 - 08/2012
Samba         01/2000 - 09/2011

Releases considered (as extracted): 20220224 2212241 3032343642 3436556972 233020302535040

• Goal: analyze data from mailing lists, issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reason behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

• Teams merge in a new release: in 20-35% of the cases
• Teams split in a new release: in 15-35% of the cases
• Teams disappeared: 22-45%
• Teams survived: 50-70%
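The four outcomes above (merge, split, disappear, survive) can be illustrated with a small sketch. It is not the study's method: the study identifies teams with fuzzy clustering, where a developer may belong to several teams, whereas this example assumes crisp, non-empty sets of developer names. The linking rule (overlap coefficient with a `threshold` of 0.5), the team identifiers and the function name are all hypothetical.

```python
def classify_evolution(teams_r1, teams_r2, threshold=0.5):
    """Label every team of release R1 by what happens to it in release R2.

    teams_r1, teams_r2: dict mapping a team id to a non-empty set of developers.
    An R1 team is linked to an R2 team when the overlap coefficient
    |A & B| / min(|A|, |B|) reaches the (hypothetical) threshold.
    """
    successors = {name: [] for name in teams_r1}
    predecessors = {name: [] for name in teams_r2}
    for old_name, old in teams_r1.items():
        for new_name, new in teams_r2.items():
            shared = len(old & new)
            if shared / min(len(old), len(new)) >= threshold:
                successors[old_name].append(new_name)
                predecessors[new_name].append(old_name)

    labels = {}
    for old_name, linked in successors.items():
        if not linked:
            labels[old_name] = "disappeared"
        elif len(linked) > 1:
            labels[old_name] = "split"
        elif len(predecessors[linked[0]]) > 1:
            labels[old_name] = "merged"
        else:
            labels[old_name] = "survived"
    return labels


# Hypothetical teams in two consecutive releases.
teams_r1 = {
    "t1": {"a", "b", "c", "d"},
    "t2": {"e", "f"},
    "t3": {"g", "h"},
    "t4": {"i", "j"},
    "t5": {"k", "l"},
}
teams_r2 = {
    "u1": {"a", "b"},
    "u2": {"c", "d"},
    "u3": {"e", "f", "g", "h"},
    "u4": {"k", "l", "m"},
}

labels = classify_evolution(teams_r1, teams_r2)
print(labels)
```

Counting the labels over all pairs of consecutive releases would give percentages of the kind reported on the slide.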

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figures: MQ and CCBC of the files changed by emerging teams, shown for TEAMS SPLIT and for TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary: Analysis of Developers’ Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)
• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Figure: chart accompanying the participants; extracted values: 121 121 667 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: chart of time spent per kind of artifact; extracted shares: 72, 13, 10, 3, 2]

• Undergraduate students used Source Code and Javadoc significantly more than Graduate students
• Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
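The pattern notation on this slide reads like regular expressions over the letters S, D, U and J. As an illustration only, the sketch below assumes that the artifacts a participant opened before reaching source code are encoded as a string of those letters, and classifies the string with Python's `re` module. The encoding, the `classify` function and the sample sequences are hypothetical; the labels S(US)+ and SU(SD)+ are taken from the chart of most frequent patterns.

```python
import re

# One compiled expression per pattern label; (?:...) groups without capturing.
PATTERNS = [
    ("SU(SD)+", re.compile(r"SU(?:SD)+")),
    ("S(US)+", re.compile(r"S(?:US)+")),
    ("U(SD)+", re.compile(r"U(?:SD)+")),
    ("(SD)+", re.compile(r"(?:SD)+")),
    ("(US)+", re.compile(r"(?:US)+")),
    ("(DS)+", re.compile(r"(?:DS)+")),
    ("S", re.compile(r"S")),
    ("D", re.compile(r"D")),
    ("U", re.compile(r"U")),
    ("J", re.compile(r"J")),
]


def classify(sequence):
    """Return the label of the pattern matching the whole sequence, else 'Other'."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(sequence):
            return label
    return "Other"


# Hypothetical navigation sequences, one per task.
for seq in ["S", "SDSD", "USDSD", "SUSD", "JD"]:
    print(seq, "->", classify(seq))
```

`fullmatch` is used so that a sequence is assigned to a pattern only when the entire navigation fits it, which keeps the categories mutually exclusive.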

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of pattern frequency for Graduate and Undergraduate participants, axis from 0 to 20. The values below are listed in extraction order; which series corresponds to which group is not recoverable from the extraction]

Pattern    Series 1   Series 2
S          18         12
D          8          16
(SD)+      2          12
(US)+      2          10
U(SD)+     1          4
(DS)+      1          4
J          3          1
U          0          2
S(US)+     1          1
SU(SD)+    0          1
Other      4          5

More experienced participants use a more “integrated approach”

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among Source Code, Sequence Diagram and Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
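The oracle rule (keep the terms selected by at least 50% of the subjects) can be checked against the three term lists shown on the slide. The sketch below uses those lists as input; the function name `build_oracle` and the `min_share` parameter are hypothetical names introduced for illustration.

```python
from collections import Counter

# Terms chosen by the three subjects in the slide's example.
labels = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]


def build_oracle(labels_per_subject, min_share=0.5):
    """Keep the terms chosen by at least min_share of the subjects."""
    # set(terms) ensures a subject counts at most once per term.
    counts = Counter(term for terms in labels_per_subject for term in set(terms))
    needed = min_share * len(labels_per_subject)
    return {term: n for term, n in counts.items() if n >= needed}


oracle = build_oracle(labels)
print(sorted(oracle.items(), key=lambda item: (-item[1], item[0])))
```

With three subjects the threshold is 1.5, so every term chosen by two or three subjects enters the oracle, which reproduces the eight terms and counts listed on the slide.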

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)
Empirical Studies: PART I and PART II
Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers’ Communication to Improve Newcomers’ Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant): Approach for Mentors Identification in Open Source Projects

1) Find past successful mentors

2) Suggest mentors having specific skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF the following factors, observed from the time when Alice joins the project, support it:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

score = \sum_{i=1}^{5} w_i \cdot f_i
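A minimal sketch of the weighted-sum scoring over the five factors F1 to F5. The slides give neither the weights nor how each factor is normalized, so the weights, the factor values in [0, 1], the candidate names and both function names below are hypothetical.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("one weight is required per factor")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates, weights):
    """Order candidate mentors from highest to lowest score."""
    scores = {name: mentor_score(values, weights) for name, values in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical weights for F1..F5 (exchanged emails, amount of emails,
# project age, newcomer early emails, commits).
weights = [0.25, 0.25, 0.125, 0.125, 0.25]

# Hypothetical, already normalized factor values for two candidate mentors.
candidates = {
    "jim": [1.0, 0.5, 1.0, 1.0, 0.5],
    "kate": [0.5, 0.25, 0.0, 1.0, 0.25],
}

ranking = rank_mentors(candidates, weights)
print(ranking, mentor_score(candidates["jim"], weights))
```

Keeping the scoring separate from the ranking makes it easy to experiment with different weightings without touching the factor extraction.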

93

Recommending Mentors

[Figure sequence: a timeline of project developers; a newcomer joins at time t; among the past project developers, past mentors are identified and a mentor with adequate skills is recommended to the newcomer]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors to Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.85    0.81
FreeBSD      0.3     0.24
PostgreSQL   1       1
Python       0.64    0.77
Samba        0.94    0.82

109

Results When Both Mails and Issues Are Used

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.8     0.72
FreeBSD      0.67    0.45
PostgreSQL   0.9     0.89
Python       0.91    0.84
Samba        0.92    0.76

YODA makes it possible to recommend mentors: it is possible to recommend mentors to project newcomers

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: bar chart of precision in recommending mentors from MAIL, ISSUE and CHAT data, per project; axis from 0 to 90. Individual bar values are not reliably recoverable from the extraction]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 114-124: YODA Architecture and YODA Tool, image-only slides with no extractable text]

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers’ Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

A newcomer can find source code descriptions such as:

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index”

CLASS: IndexSplitter
METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
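To make Steps 2 and 3 more concrete, here is a rough sketch under simplifying assumptions: paragraphs are separated by blank lines, and a paragraph is traced onto a method when it mentions `Class.method` or `method(`. These matching rules, the function names, the sample message and the method list are hypothetical; the slides do not describe the actual tracing heuristics, and Steps 1, 4 and 5 are not covered.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): map each method to the paragraphs that mention it."""
    traced = {}
    for method in method_names:
        cls = re.escape(class_name)
        name = re.escape(method)
        # Match either 'Class.method' or 'method(' as a whole identifier.
        pattern = re.compile(rf"\b{cls}\.{name}\b|\b{name}\s*\(")
        hits = [p for p in paragraphs if pattern.search(p)]
        if hits:
            traced[method] = hits
    return traced


# Hypothetical message already traced onto the class IndexSplitter (Step 1).
message = """Hi all,

When calling IndexSplitter.split(File destDir, String[] segs) it creates an index
with a segments descriptor file with wrong data.

Thanks for looking into it."""

paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "listSegments"])
print(traced)
```

Only the second paragraph is kept for `split`; greetings and sign-offs are dropped because they mention no method, which is the kind of noise the later filtering steps would reduce further.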

130

Supporting Software Development

Help newcomers’ program comprehension by extracting summaries of code elements from:
• Q&A site
• Issue tracker
• Mailing list

131

Approach: Precision vs. Number of Methods Covered

[Figure: precision of the approach against the number of methods covered]

We mine useful Java method descriptions from developers’ discussions

136

StackOverflow

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)



29

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way? dunno, like I said, this is brainstorming and I have not given lots of thought to these cases”

“but we also need to create the attributes and values in the entity binding”

30

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way? dunno, like I said, this is brainstorming and I have not given lots of thought to these cases”

“however, planning a pure standalone test suite would make things easier”

1) Brainstorming

31

During an IRC Chat Meeting

PROJECT: Hibernate

“is there a better way? dunno, like I said, this is brainstorming and I have not given lots of thought to these cases”

“however, planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., testing activities)

32

During an IRC Chat Meeting

PROJECT: Hibernate

“okay, I think it is a bug and I’m going to create a jira first”

“however, planning a pure standalone test suite would make things easier”

1) Brainstorming
2) Planning (e.g., testing activities)
3) Open an Issue

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

34

Similarity Measure of Topics Extracted from Different Communication Channels

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35

Use Issue, Chat and Mail to Identify Leaders

[Figure: bar chart "Precision in Recommending Leaders", one bar per source (MAIL, ISSUE, CHAT) for each project, on a 0 to 90 scale]

37

Analysis of the Evolution of Teams: Why?

1) To Better Understand the Reasons Behind the Teams’ Reorganization (split/merge of developers’ teams)

2) To Investigate Whether Emerging Teams Evolve with the Aim of Working on More Cohesive Groups of Files, and Then Support Refactoring/Re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

39

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations

By using fuzzy clustering algorithms

40

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

R1

R2

41

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

R1

R2

TEAMS SPLIT

TEAMS MERGE

42

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

R1

R2

43

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

R1

R2

Sub-systems developers are working on: Sub-system one, Sub-system two

44

Analysis of the Evolution of Teams: How?

By using fuzzy clustering algorithms

R1

R2

Sub-systems developers are working on: Sub-system one, Sub-system two

Structural perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual perspective: Poshyvanyk et al., CCBC
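The slides do not say how a split or a merge between two releases (R1, R2) is detected once teams have been clustered. As a minimal sketch, assuming teams are available as sets of developer names per release, the comparison could look like the following. The function names, the 0.5 threshold and the sample teams are all hypothetical and are not taken from the study.

```python
def share_moved(old_team, new_team):
    """Fraction of old_team's members that also appear in new_team."""
    return len(old_team & new_team) / len(old_team) if old_team else 0.0


def classify_team_evolution(teams_r1, teams_r2, threshold=0.5):
    """Label R1 teams as split, survived or disappeared; list merged R2 teams.

    teams_r1, teams_r2: dict mapping a team id to a set of developer names.
    threshold: assumed minimum share of an R1 team that must move together.
    Sets may overlap, since fuzzy clustering can put a developer in several teams.
    """
    events = {"split": {}, "survived": {}, "disappeared": [], "merged": {}}
    for old_id, old_members in teams_r1.items():
        heirs = [new_id for new_id, new_members in teams_r2.items()
                 if share_moved(old_members, new_members) >= threshold]
        if len(heirs) >= 2:
            events["split"][old_id] = heirs        # members spread over several teams
        elif len(heirs) == 1:
            events["survived"][old_id] = heirs[0]  # members stayed together
        else:
            events["disappeared"].append(old_id)
    for new_id, new_members in teams_r2.items():
        sources = [old_id for old_id, old_members in teams_r1.items()
                   if share_moved(old_members, new_members) >= threshold]
        if len(sources) >= 2:
            events["merged"][new_id] = sources     # several R1 teams flowed into one
    return events


# Hypothetical teams for two consecutive releases.
teams_r1 = {"A": {"ann", "bob", "cat", "dan"}, "B": {"eve", "fay"}, "C": {"gus", "hal"}}
teams_r2 = {"X": {"ann", "bob"}, "Y": {"cat", "dan"}, "Z": {"eve", "fay", "gus", "hal"}}
events = classify_team_evolution(teams_r1, teams_r2)
```

In this sample, team A splits into X and Y, while B and C both flow into Z, which is therefore reported as a merge.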

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reasons behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System        Period considered
Apache HTTP   09/1998 - 03/2012
Eclipse JDT   01/2002 - 12/2011
Netbeans      01/2001 - 08/2012
Samba         01/2000 - 09/2011

Releases considered (digits as extracted, separators lost): 20220224 2212241; 3032343642; 3436556972; 233020302535040

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release
- in 20-35% of the cases

Teams Split in a New Release
- in 15-35% of the cases

47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release
- in 20-35% of the cases

Teams Split in a New Release
- in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

49

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams; panel: TEAMS SPLIT]

50

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams; panels: TEAMS SPLIT, TEAMS MERGED]

51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams; panels: TEAMS SPLIT, TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure.
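For readers unfamiliar with MQ, the Modularization Quality of Mancoridis et al. is usually stated as below. This is the commonly cited formulation, given here as background; the study may apply a variant, and the symbols are the usual ones rather than notation taken from the slides. For a partition into $k$ clusters, where cluster $i$ has $N_i$ nodes and $\mu_i$ internal edges, and $\varepsilon_{i,j}$ edges connect clusters $i$ and $j$:

```latex
A_i = \frac{\mu_i}{N_i^{2}}, \qquad
E_{i,j} = \frac{\varepsilon_{i,j}}{2\,N_i\,N_j} \quad (i \neq j)

MQ =
\begin{cases}
\dfrac{1}{k}\displaystyle\sum_{i=1}^{k} A_i \;-\; \dfrac{1}{k(k-1)/2}\displaystyle\sum_{i<j} E_{i,j} & \text{if } k > 1 \\[2ex]
A_1 & \text{if } k = 1
\end{cases}
```

Higher values indicate groups of files that are strongly connected internally and weakly connected to each other, which is the sense in which the slides speak of cohesive changes.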

52

PART I Summary

Analysis of Developers’ Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II - Experiment A

PART II - Experiment B

55

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B

56

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: chart of the share of time spent on each kind of artifact; data labels as extracted, separators lost: 72 and 131032]

60

How Much Time did Participants Spend on Different Kinds of Artifacts?

Undergraduate Students: used Source Code and Javadoc significantly more than Graduate students

61

How Much Time did Participants Spend on Different Kinds of Artifacts?

Graduate Students: used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
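The pattern names above read naturally as regular expressions over the letters S, D, U and J. As a minimal sketch, assuming each participant's navigation before reaching source code is encoded as a string of those letters (an encoding chosen here for illustration, not one described in the slides), a trace can be assigned to a pattern as follows. The function name and the extra patterns S(US)+ and SU(SD)+ are taken from the chart labels; everything else is hypothetical.

```python
import re

# Pattern label -> regular expression over artifact letters.
# S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc.
PATTERNS = {
    "S": r"S",
    "D": r"D",
    "U": r"U",
    "J": r"J",
    "(SD)+": r"(SD)+",
    "(US)+": r"(US)+",
    "(DS)+": r"(DS)+",
    "U(SD)+": r"U(SD)+",
    "S(US)+": r"S(US)+",
    "SU(SD)+": r"SU(SD)+",
}


def classify_trace(trace):
    """Return the label of the pattern matching the whole trace, else 'Other'."""
    for label, regex in PATTERNS.items():
        if re.fullmatch(regex, trace):
            return label
    return "Other"


examples = ["S", "SDSD", "USD", "SUSUS", "JS"]
labels = [classify_trace(t) for t in examples]
```

Because `re.fullmatch` requires the entire trace to match, no two patterns in the table can claim the same trace, so the lookup order does not matter.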

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: horizontal bar chart of pattern frequency for Graduate and Undergraduate participants. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Data labels in extraction order: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

64

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: the same bar chart, with the x-axis running from 0 to 20]

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: the same bar chart, with the x-axis running from 0 to 20]

More experienced participants use a more “integrated approach”

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among Source Code, Sequence Diagram and Javadoc; edge labels as extracted, separators lost: 56, 36, 7717, 78, 18, 1678, 49, 37]

67

Transition Graph between Kinds of Software Artifacts

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only after that do they read and write Source Code
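A transition graph of this kind can be derived from logged browsing sessions. The sketch below assumes each session is an ordered list of artifact kinds visited by one participant; the function name, the choice to ignore self-transitions and the sample sessions are illustrative assumptions, not details from the experiment.

```python
from collections import Counter, defaultdict


def transition_percentages(sessions):
    """For each artifact kind, the percentage of outgoing moves per destination."""
    counts = defaultdict(Counter)
    for visits in sessions:
        # Pair every visited artifact with the one opened right after it.
        for src, dst in zip(visits, visits[1:]):
            if src != dst:  # assumption: staying on the same artifact is not a move
                counts[src][dst] += 1
    graph = {}
    for src, destinations in counts.items():
        total = sum(destinations.values())
        graph[src] = {dst: round(100 * n / total) for dst, n in destinations.items()}
    return graph


# Hypothetical sessions of two participants.
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram", "Source Code"],
    ["Source Code", "Javadoc", "Source Code"],
]
graph = transition_percentages(sessions)
```

Each entry of `graph` corresponds to the labelled outgoing edges of one node in the diagram.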

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
  17 Bachelor Students in CS (Univ. of Molise, second year)
  21 Master Students in CS (University of Salerno)

Java Class, labeled with: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
  17 Bachelor Students in CS (Univ. of Molise, second year)
  21 Master Students in CS (University of Salerno)

Labels given by three subjects:
1. book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
2. book, hotel, room, refund, arrival, check, parking, double, suite, group
3. confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
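The oracle rule can be reproduced on the three label lists shown on the slide. This is a small sketch; the function name and the `min_share` parameter are names introduced here for illustration.

```python
from collections import Counter


def build_oracle(labelings, min_share=0.5):
    """Keep the terms chosen by at least min_share of the subjects.

    labelings: one list of terms per subject.
    Returns a dict mapping each retained term to the number of subjects using it.
    """
    # set(terms) ensures a subject counts once per term even if it is repeated.
    counts = Counter(term for terms in labelings for term in set(terms))
    cutoff = min_share * len(labelings)
    return {term: n for term, n in counts.items() if n >= cutoff}


labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival",
     "departure", "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)
```

With three subjects the cutoff is 1.5, so a term enters the oracle when at least two subjects selected it, which yields the eight terms listed on the slide.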

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

72

Experiment B: Comparison of Different Labeling Techniques

73

Experiment B: Comparison of Different Labeling Techniques

74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies: PART I and PART II

Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers’ Communication to Improve Newcomers’ Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Timeline annotations on the slides: “When Alice joins the project”, “When Alice was a Student”

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum:

$$\mathit{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
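The weighted sum of the five factors can be sketched as follows. The slides give neither the weight values nor the scale of each factor, so the numbers below are purely hypothetical placeholders, as are the function and variable names.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i over all factors."""
    if len(factors) != len(weights):
        raise ValueError("one weight is required per factor")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical weights for F1..F5 (not the values used by YODA).
weights = [0.25, 0.25, 0.25, 0.125, 0.125]

# Hypothetical, already normalized factor values for two candidate mentors of Alice.
candidates = {
    "jim": [1.0, 0.5, 1.0, 0.0, 0.5],
    "tom": [0.0, 0.5, 0.0, 0.0, 0.0],
}
scores = {name: mentor_score(values, weights) for name, values in candidates.items()}
best = max(scores, key=scores.get)
```

Ranking candidates by this score gives the most likely past mentor for a given newcomer; in practice the weights would need to be calibrated on known mentor pairs.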

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Recommending Mentors

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

97

Recommending Mentors

Time

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

98

Recommending Mentors

Time

t

Past Project Developers

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011
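The slides do not detail how a mentor with adequate skills is matched to a newcomer's topic. As an illustrative stand-in, and explicitly not the technique used by YODA or by Anvik et al., the sketch below ranks past mentors by plain bag-of-words cosine similarity between the newcomer's request and each mentor's past messages. All names and sample texts are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-cased term frequencies of a text, split on whitespace."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_mentors(request, past_messages, top_k=2):
    """Return up to top_k mentor names whose past messages best match the request."""
    query = bag_of_words(request)
    scored = [(cosine(query, bag_of_words(" ".join(messages))), name)
              for name, messages in past_messages.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, then by name
    return [name for score, name in scored[:top_k] if score > 0]


# Hypothetical past messages of three past mentors.
past_messages = {
    "jim": ["the index splitter writes segments", "split index"],
    "ann": ["http proxy configuration"],
    "bob": ["segments descriptor"],
}
ranking = rank_mentors("index segments split fails", past_messages)
```

A real recommender would add stemming, stop-word removal and term weighting; the point here is only the shape of the ranking step.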

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100; data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94 and 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100; data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92 and 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

[Figure: the same bar chart as on the previous slide]

YODA Makes it Possible to Recommend Mentors
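The charts report precision for the Top 1 and Top 2 recommendations. The exact evaluation protocol is not shown on the slides; one common reading, sketched below under that assumption, is the share of the first k recommended mentors that are actual mentors, averaged over newcomers. The sample data is hypothetical.

```python
def precision_at_k(recommended, actual, k):
    """Share of the first k recommended mentors that are actual mentors."""
    top = recommended[:k]
    if not top:
        return 0.0
    return sum(1 for mentor in top if mentor in actual) / len(top)


def mean_precision_at_k(cases, k):
    """Average precision at k over (recommended list, actual mentor set) pairs."""
    return sum(precision_at_k(rec, act, k) for rec, act in cases) / len(cases)


# Hypothetical evaluation cases: ranked recommendations and known mentors.
cases = [
    (["jim", "ann"], {"jim"}),
    (["bob", "jim"], {"jim", "bob"}),
    (["ann", "bob"], {"jim"}),
    (["jim", "bob"], {"jim"}),
]
top1 = mean_precision_at_k(cases, 1)
top2 = mean_precision_at_k(cases, 2)
```

In this sample the Top 1 precision is 0.75 and the Top 2 precision is 0.5, since widening the list admits more wrong candidates than right ones.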

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors", one bar per source (MAIL, ISSUE, CHAT) for each project, on a 0 to 90 scale]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers’ Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understanding source code

Newcomer: can find source code descriptions

128

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understanding source code

Newcomer: can find source code descriptions

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index.”

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
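Steps 2 and 3 can be illustrated with a small sketch. The slides do not give the tracing rules, so the heuristic below (a paragraph is traced onto a method when it mentions `Class.method` or `method(`) is an assumption made for illustration, as are the function names and the sample message.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): map each method name to the paragraphs that mention it."""
    traced = {}
    for paragraph in paragraphs:
        for method in method_names:
            # Assumed heuristic: "Class.method" or "method(" counts as a mention.
            pattern = (rf"\b{re.escape(class_name)}\.{re.escape(method)}\b"
                       rf"|\b{re.escape(method)}\s*\(")
            if re.search(pattern, paragraph):
                traced.setdefault(method, []).append(paragraph)
    return traced


message = (
    "Hi all,\n\n"
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with segments descriptor file with wrong data.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "listSegments"])
```

The paragraphs collected per method would then go through the heuristic-based and similarity-based filters of Steps 4 and 5, which are not sketched here.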

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• MAILING LIST
• ISSUE TRACKER
• Q&A SITE

131

Approach: Precision vs Number of Methods Covered

132

Approach: Precision vs Number of Methods Covered

133

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers’ discussions

134

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• MAILING LIST
• ISSUE TRACKER
• Q&A SITE

135

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• MAILING LIST
• ISSUE TRACKER
• Q&A SITE

136

StackOverflow

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders’ performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

Page 29: Supporting Newcomers in Open Source Software Development Projects

30

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

1) Brainstorming

ldquohowever planning a pure standalone test suite would make things easierrdquo

During an IRC Chat Meeting

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

Maintenance Tasks

• Bug fixing

• Add a new feature

• Improve existing features

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: share of time spent on each kind of artifact]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students.

Graduate students used Class Diagrams significantly more than Undergraduate students.

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram

D = Class Diagram

U = Use Case

J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram

(US)+ = Use Case before Sequence Diagram

(DS)+ = Class Diagram before Sequence Diagram

U(SD)+ = Use Case before (SD)+
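The pattern labels above read naturally as regular expressions over the letters S, D, U, and J. The following is a minimal sketch of how a visit sequence could be classified, assuming each participant's visits before reaching source code are encoded as a string of those letters. The function name `classify` and the string encoding are illustrative assumptions, not part of the original study tooling.

```python
import re

# Pattern labels taken from the slides; each label doubles as a regular expression.
PATTERNS = ["S", "D", "J", "U", "(SD)+", "(US)+", "(DS)+", "U(SD)+", "S(US)+", "SU(SD)+"]


def classify(visits: str) -> str:
    """Return the pattern that matches the whole visit string, or 'Other'.

    `visits` is an assumed encoding: one letter per artifact visited
    (S, D, U, J) before the participant reaches source code.
    """
    for pattern in PATTERNS:
        if re.fullmatch(pattern, visits):
            return pattern
    return "Other"


if __name__ == "__main__":
    for sequence in ["S", "SDSD", "USDSD", "SUSD", "JD"]:
        print(sequence, "->", classify(sequence))
```

Because `re.fullmatch` requires the whole string to match, a sequence such as `JD` that fits none of the listed patterns falls into the "Other" bucket.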

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of pattern occurrences (axis 0 to 20) for Graduate and Undergraduate participants. Patterns in chart order: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. First series of data labels as extracted: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4. Second series of data labels as extracted: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5.]

More experienced participants use a more "integrated approach".

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between kinds of software artifacts (including Source Code, Sequence Diagram, and Javadoc), with weighted edges]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams.

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code.

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only after that do they read and write Source Code.
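A transition graph of this kind can be derived from logged browsing sessions by counting how often one kind of artifact is followed by another. The sketch below assumes each session is an ordered list of artifact kinds and that edge weights are percentages of outgoing transitions. Both the session format and the function name `transition_percentages` are hypothetical, and the slides do not state how the original edge weights were computed.

```python
from collections import Counter, defaultdict


def transition_percentages(sessions):
    """Compute, for each artifact kind, the percentage of moves to each other kind.

    `sessions` is an assumed input format: a list of ordered lists of
    artifact kinds, one list per participant session.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            if src != dst:  # ignore consecutive visits to the same kind
                counts[src][dst] += 1

    result = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        result[src] = {dst: 100.0 * n / total for dst, n in dsts.items()}
    return result


if __name__ == "__main__":
    sessions = [
        ["Use Case", "Sequence Diagram", "Source Code"],
        ["Source Code", "Sequence Diagram", "Source Code"],
    ]
    for src, dsts in transition_percentages(sessions).items():
        print(src, dsts)
```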

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: labels chosen by three subjects for the same Java class

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
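The oracle construction described above (keep the terms chosen by at least 50% of the subjects) can be sketched in a few lines. The function name `build_oracle` and the list-of-lists input format are illustrative assumptions. The sample data are the three hotel-booking label sets shown on the slide.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms selected by at least `threshold` of the subjects.

    `labelings` holds one list of terms per subject; a term counts once
    per subject even if that subject repeated it.
    """
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


if __name__ == "__main__":
    labelings = [
        ["book", "hotel", "room", "reservation", "arrival",
         "departure", "smoking", "double", "card", "breakfast"],
        ["book", "hotel", "room", "refund", "arrival",
         "check", "parking", "double", "suite", "group"],
        ["confirmation", "room", "reservation", "arrival", "departure",
         "date", "bed", "card", "payment", "spa"],
    ]
    print(sorted(build_oracle(labelings)))
```

With three subjects the 50% threshold means a term must be chosen by at least two of them, which yields the eight terms listed in the term counts above.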

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code.

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts.

2) Less experienced newcomers spend a significantly higher proportion of time on source code.

3) More experienced newcomers instead spend more time on class diagrams.

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements.

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies: PART I and PART II

Recommenders: PART III

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

Characteristics of a Good Mentor

• Enough ability to help other people

• Enough expertise about the topic of interest for the newcomer

YODA (Young and newcOmer Developer Assistant): Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA: Jim is the mentor of Alice IF the following factors, observed over time from when Alice joins the project, support it:

F1: exchanged emails

F2: amount of emails (Jim > Alice)

F3: project age

F4: newcomer early emails (the newcomer's first emails)

F5: commits

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum:

$\mathit{score} = \sum_{i=1}^{5} w_i \cdot f_i$
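The weighted sum of the five factors can be illustrated with a short sketch. The factor values, the equal weights, and the names `mentor_score` and `rank_candidates` are hypothetical and chosen only for illustration. The slides do not give the actual weights or the normalization of each factor.

```python
def mentor_score(factors, weights):
    """Aggregate factor values f_1..f_n into one score as sum(w_i * f_i)."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Order candidate mentors by decreasing score.

    `candidates` maps a developer name to an assumed list of five factor
    values in [0, 1] (F1..F5), computed with respect to one newcomer.
    """
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # illustrative equal weights
    candidates = {
        "jim": [1.0, 1.0, 1.0, 1.0, 1.0],
        "bob": [0.0, 1.0, 0.0, 0.0, 0.0],
    }
    print(rank_candidates(candidates, weights))
```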

Recommending Mentors

[Figure: timeline of project developers; when a newcomer joins at time t, a mentor with adequate skills is selected among the past mentors and past project developers]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011).

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba (axis 0 to 100). Data labels as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

Results When Both Mails and Issues are Used

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba (axis 0 to 100). Data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

It is Possible to Recommend Mentors To Project Newcomers: YODA makes it possible to recommend mentors.

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" using MAIL, ISSUE, and CHAT data, per project (axis 0 to 90). Data labels as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Figure: YODA architecture]

YODA Tool

[Screenshots of the YODA tool]

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

Effort in Program Comprehension

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

A newcomer can find a source code description such as the following, which refers to CLASS IndexSplitter, METHOD split:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
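The final similarity-based filtering step can be pictured as keeping only the paragraphs whose text is close enough to the text of the traced method. The sketch below uses plain bag-of-words cosine similarity with a hypothetical threshold of 0.2. The tokenizer, the threshold, and the function names are assumptions for illustration, and the slides do not specify the similarity measure actually used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def filter_paragraphs(paragraphs, method_text, threshold=0.2):
    """Keep paragraphs whose similarity to the method text reaches the threshold."""
    return [p for p in paragraphs if cosine_similarity(p, method_text) >= threshold]


if __name__ == "__main__":
    method_text = "split index segments destDir"  # assumed terms from a method
    paragraphs = [
        "The split method creates an index with segments",
        "Please update the website logo",
    ]
    print(filter_paragraphs(paragraphs, method_text))
```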

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from the mailing list, the issue tracker, and Q&A sites

Approach Precision vs Number of Methods Covered

[Figure: precision of the approach vs number of methods covered]

We mine useful Java method descriptions from developers' discussions.

StackOverflow

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Screenshots of the CODES tool]

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%.

2) CODES identifies relevant descriptions with a precision higher than 79%.

3) Combining mails and issues improves recommenders' performance.

Future Work and Conclusion

Future work…

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible, reducing the percentage of false positives, and including a better classification of discussion content using natural language parsers

Team Based Refactoring: Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

Partition Attributes

Extract Class

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

PART I

PART II

PART III

Page 30: Supporting Newcomers in Open Source Software Development Projects

31

PROJECT Hibernate

ldquois there a better way dunno like I said this is brainstorming and I have not given lots of thought to these casesrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

During an IRC Chat Meeting

32

PROJECT Hibernate

ldquookay I think it is a bug and Irsquom going to create a jira firstrdquo

ldquohowever planning a pure standalone test suite would make things easierrdquo

1) Brainstorming2) Planning (eg Testing activities)

3) Open an Issue

During an IRC Chat Meeting

33

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

34

Similarity Measure of Topics Extracted from Different

Communication Channels

issues vs mails issues vs chat mails vs chatApache Httpd 017 009 006

Apache Lucene

008 003 002

Hibernate 011 002 003

Samba 006 002 002

gtgtgtgt

gtgt

ge

Values in the first column (issues vs mails) are higher than those in the other two columns where issues and mails are compared with the chat

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small Projects: finding Mentors is a trivial problem

• Large Projects: finding Mentors is not a trivial problem

81

Motivation: Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF (factors observed over time, starting when Alice joins the project):

F1: Exchanged emails

F2: Amount of emails

F3: Project age

F4: Newcomer early emails

F5: Commits

Score Computed Aggregating the Factors in a Weighted Sum:

$\sum_{i=1}^{5} w_i \cdot f_i$
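The weighted-sum aggregation of the five factors can be sketched as follows. This is only an illustration under stated assumptions: the factor values, the equal weights, and the candidate names are hypothetical, and the slides do not say how YODA normalizes the factors or chooses the weights.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Return candidate names sorted by decreasing mentor score."""
    scores = {name: mentor_score(f, weights) for name, f in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical, already normalized values for F1..F5 (not taken from the study).
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "Jim": [1.0, 0.8, 0.5, 1.0, 0.6],
    "Bob": [0.2, 0.1, 0.5, 0.0, 0.9],
}
ranking = rank_candidates(candidates, weights)
print(ranking)
```

With equal weights every factor contributes equally; a real setting would presumably tune the weights per project.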

93

Recommending Mentors

[Figure: timeline (Time) of Project Developers and Past Mentors, with the Newcomer joining at time t and a Mentor with Adequate Skills highlighted]

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python and Samba; axis: Precision, 0-100; data labels as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues Are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python and Samba; axis: Precision, 0-100; data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

It is Possible to Recommend Mentors To Project Newcomers

YODA Makes it Possible to Recommend Mentors

111

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors using MAIL, ISSUE and CHAT sources; axis 0-90; recoverable project labels: Apache Lucene, Samba; data labels as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

In these messages a Newcomer can find source code descriptions, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based Filtering

• Step 5: Similarity-based Filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
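Steps 2 to 5 of this kind of pipeline can be sketched on a single message as follows. This is a minimal sketch under assumptions, not the authors' implementation: the word-count heuristic, the bag-of-words cosine similarity, the 0.1 threshold, and all function names are illustrative stand-ins for filters that the slides do not specify.

```python
import re
from collections import Counter
from math import sqrt


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_method(paragraphs, method_name):
    """Step 3: keep paragraphs that mention the method name followed by '('."""
    pattern = re.compile(r"\b" + re.escape(method_name) + r"\s*\(")
    return [p for p in paragraphs if pattern.search(p)]


def heuristic_filter(paragraphs, min_words=8):
    """Step 4 (placeholder heuristic): drop very short paragraphs."""
    return [p for p in paragraphs if len(p.split()) >= min_words]


def tokens(text):
    """Lower-cased bag of alphabetic words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_filter(paragraphs, method_text, threshold=0.1):
    """Step 5: keep paragraphs textually similar to the method's own terms."""
    reference = tokens(method_text)
    return [p for p in paragraphs if cosine(tokens(p), reference) >= threshold]


# Hypothetical message and method terms, used only to exercise the sketch.
message = (
    "Hi all,\n\n"
    "When calling split(File destDir, String[] segs) the index gets a "
    "segments file with wrong data.\n\n"
    "Thanks"
)
method_text = "split destDir segs index segments"

candidates = trace_to_method(extract_paragraphs(message), "split")
descriptions = similarity_filter(heuristic_filter(candidates), method_text)
print(descriptions)
```

Step 1 (downloading messages and tracing them onto classes) is left out because it depends on the repository and tracker being mined.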

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A SITE

• ISSUE TRACKER

• MAILING LIST

131

Approach Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based Filtering

• Step 5: Similarity-based Filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives

• Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


32

PROJECT: Hibernate

During an IRC Chat Meeting

"okay I think it is a bug and I'm going to create a jira first"

"however planning a pure standalone test suite would make things easier"

1) Brainstorming

2) Planning (e.g. Testing activities)

3) Open an Issue

33

Similarity Measure of Topics Extracted from Different Communication Channels

Project          issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat
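The observation about the first column can be checked mechanically. The snippet below is a small worked check, assuming the extracted figures are decimals (for example "017" read as 0.17); the dictionary layout and variable names are illustrative.

```python
# (issues vs mails, issues vs chat, mails vs chat) per project,
# reading the extracted values as decimals.
similarity = {
    "Apache Httpd": (0.17, 0.09, 0.06),
    "Apache Lucene": (0.08, 0.03, 0.02),
    "Hibernate": (0.11, 0.02, 0.03),
    "Samba": (0.06, 0.02, 0.02),
}

# For each project, is "issues vs mails" strictly higher than both chat comparisons?
first_column_highest = {
    project: values[0] > max(values[1], values[2])
    for project, values in similarity.items()
}
print(first_column_highest)
```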

35

Use Issue, Chat and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders using MAIL, ISSUE and CHAT sources; axis 0-90; recoverable project labels: Apache Lucene, Samba; data labels as extracted: 0, 20, 40, 20, 20, 40, 60, 60, 60, 60, 60, 80]

37

Analysis of the Evolution of Teams: Why?

1) To Better Understand the Reasons Behind the Teams Reorganization (split/merge of developer teams)

2) To Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files, and Then Support Refactoring / Re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations, by using FUZZY CLUSTERING ALGORITHMS

[Figure: teams identified in releases R1 and R2, with TEAMS SPLIT and TEAMS MERGE between the two releases, and the Sub-Systems (Sub-system one, Sub-system two) Developers are Working on]

Structural Perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual Perspective: Poshyvanyk et al., CCBC

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System         Period considered
Apache HTTP    09/1998-03/2012
Eclipse JDT    01/2002-12/2011
Netbeans       01/2001-08/2012
Samba          01/2000-09/2011

Releases Considered (digits as extracted): 20220224, 2212241, 3032343642, 3436556972, 233020302535040

46

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams, shown for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary: Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart: share of time spent on each kind of artifact; data labels as extracted: 72, 131032]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram

D = Class Diagram

U = Use Case

J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram

(US)+ = Use Case before Sequence Diagram

(DS)+ = Class Diagram before Sequence Diagram

U(SD)+ = Use Case before (SD)+
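The pattern notation reads like regular expressions over the letters S, D, U and J, so a navigation session can be classified with a full-string regex match. The sketch below assumes a session is encoded as a string with one letter per artifact visited before reaching source code; that encoding and the function name are illustrative, not taken from the study.

```python
import re

# Patterns from the slide, used directly as regular expressions.
PATTERNS = ["S", "D", "U", "J", "(SD)+", "(US)+", "(DS)+", "U(SD)+"]


def classify(sequence):
    """Return the first slide pattern that matches the whole sequence, else 'Other'."""
    for pattern in PATTERNS:
        if re.fullmatch(pattern, sequence):
            return pattern
    return "Other"


# Hypothetical sessions, one letter per artifact visited.
for session in ["S", "SDSD", "US", "USD", "JD"]:
    print(session, "->", classify(session))
```

Any session that matches none of the listed patterns falls into "Other", mirroring the residual category used in the frequency chart.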

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart (axis 0-20): navigation patterns for Graduate and Undergraduate participants. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Data labels as extracted, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

More experienced participants use a more "Integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among Source Code, Sequence Diagram and Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams participants in most cases "go back" to Source Code

3) Starting from a Use Case participants go ahead reading Sequence Diagrams. Only after that do they read and write Source Code

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Java Class, labeled by three subjects:

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Number of subjects selecting each term: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
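The oracle construction can be reproduced with a few lines of code. This is a minimal sketch using the three term lists shown on the slide; the function name and the default threshold parameter are illustrative.

```python
from collections import Counter

# Term lists from the slide, one list per subject.
labels = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]


def build_oracle(labels, threshold=0.5):
    """Return {term: votes} for terms chosen by at least `threshold` of the subjects."""
    votes = Counter(term for subject in labels for term in set(subject))
    min_votes = threshold * len(labels)
    return {term: n for term, n in votes.items() if n >= min_votes}


oracle = build_oracle(labels)
print(sorted(oracle.items(), key=lambda item: -item[1]))
```

With three subjects, the 50% threshold keeps every term chosen by at least two of them, which matches the counts listed on the slide.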

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III


33

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs mails   issues vs chat   mails vs chat
Apache Httpd     0.17              0.09             0.06
Apache Lucene    0.08              0.03             0.02
Hibernate        0.11              0.02             0.03
Samba            0.06              0.02             0.02

34

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are each compared with chat.

35

Use Issue, Chat, and Mail to Identify Leaders

[Figure: bar chart of the precision in recommending leaders (scale 0 to 90) using MAIL, ISSUE, and CHAT data, for projects including Apache, Lucene, and Samba]


37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (splits/merges of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, which can then support refactoring and remodularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Analysis of the Evolution of Teams: How?

Teams are identified from emergent collaborations by using fuzzy clustering algorithms on each release (R1, R2).

Comparing the teams of two releases reveals team splits and team merges.

The sub-systems developers are working on are then analyzed from two perspectives:

• Structural perspective: Modularization Quality (MQ), Mancoridis et al.
• Conceptual perspective: CCBC, Poshyvanyk et al.
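The slides only state that fuzzy clustering algorithms are used to identify teams. As a rough illustration, the following is a minimal sketch of one such algorithm, fuzzy c-means, written in plain Python. The feature vectors (one per developer), the initial centers, and the fuzzifier m=2 are assumptions made for this example and are not taken from the thesis.

```python
import math


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Run fuzzy c-means and return (centers, memberships).

    points: list of equal-length numeric tuples, one per developer
            (hypothetical features, e.g. messages exchanged with two groups).
    centers: initial cluster centers, one tuple per team.
    memberships[i][j] is the degree to which point i belongs to cluster j.
    """
    centers = [tuple(c) for c in centers]
    dims = len(points[0])
    memberships = []
    power = 2.0 / (m - 1.0)
    for _ in range(iterations):
        # Membership update: the closer a center, the higher the degree.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The point sits exactly on a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                memberships.append([h / sum(hits) for h in hits])
            else:
                memberships.append([
                    1.0 / sum((d_j / d_k) ** power for d_k in dists)
                    for d_j in dists
                ])
        # Center update: mean of all points weighted by membership ** m.
        new_centers = []
        for j, old in enumerate(centers):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # keep an empty cluster where it was
                continue
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        centers = new_centers
    return centers, memberships
```

Unlike hard clustering, each developer receives a membership degree for every team, so a developer who collaborates with two groups can belong partly to both.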

45

System         Period considered
Apache HTTP    09/1998-03/2012
Eclipse JDT    01/2002-12/2011
Netbeans       01/2001-08/2012
Samba          01/2000-09/2011

Releases considered

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics: period of time and releases considered

Case Study

• Goal: analyze data from mailing lists, issue trackers, and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reasons behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

• Teams merge in a new release: in 20-35% of the cases
• Teams split in a new release: in 15-35% of the cases
• Teams disappeared: 22-45%
• Teams survived: 50-70%
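One way to label teams as merged, split, disappeared, or survived between two releases is to compare their member sets. The sketch below is only an illustration: the linking rule (two teams are related when they share at least `min_shared` developers), the function name, and the sample team names are assumptions, not the criteria used in the study.

```python
def classify_team_evolution(teams_r1, teams_r2, min_shared=2):
    """Label each team of release R1 by what happened to it in release R2.

    teams_r1, teams_r2: dict mapping a team name to a set of developer names.
    Two teams are considered linked when they share at least `min_shared`
    developers (a hypothetical rule chosen for this example).
    """
    def linked(a, b):
        return len(a & b) >= min_shared

    successors = {
        n1: [n2 for n2, m2 in teams_r2.items() if linked(m1, m2)]
        for n1, m1 in teams_r1.items()
    }
    predecessors = {
        n2: [n1 for n1, m1 in teams_r1.items() if linked(m1, m2)]
        for n2, m2 in teams_r2.items()
    }

    labels = {}
    for name, next_teams in successors.items():
        if not next_teams:
            labels[name] = "disappeared"   # nobody carried the team forward
        elif len(next_teams) > 1:
            labels[name] = "split"         # members ended up in several teams
        elif len(predecessors[next_teams[0]]) > 1:
            labels[name] = "merged"        # the successor absorbed other teams
        else:
            labels[name] = "survived"
    return labels
```

Counting the labels over all pairs of consecutive releases would give percentages comparable in form to the ones reported on the slide.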

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by emerging teams, shown for team splits and for team merges]

The reorganization of developers into teams is reflected in cohesive changes occurring in the system structure.

52

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor students, 18 Master students, 4 PhD students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: share of time spent by participants on each kind of artifact]

• Undergraduate students used Source Code and Javadoc significantly more than graduate students
• Graduate students used Class Diagrams significantly more than undergraduates

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple navigation patterns:
S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex navigation patterns:
(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
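The pattern names above read naturally as regular expressions over the letters S, D, U, and J. The following minimal sketch classifies a sequence of visited artifacts under that reading. Treating `+` as "one or more repetitions", encoding a visit sequence as a string, and the function name are assumptions made for illustration. The two longer patterns, S(US)+ and SU(SD)+, come from the chart on the following slides.

```python
import re

# Pattern labels used in the slides; each label doubles as a regular expression.
PATTERNS = [
    "S", "D", "J", "U",
    "(SD)+", "(US)+", "(DS)+",
    "U(SD)+", "S(US)+", "SU(SD)+",
]


def classify_navigation(visits):
    """Return the pattern label matching the whole visit string, else "Other".

    visits: string of artifact letters visited before reaching source code,
            e.g. "USD" for Use Case, Sequence Diagram, Class Diagram.
    """
    for pattern in PATTERNS:
        if re.fullmatch(pattern, visits):
            return pattern
    return "Other"
```

For instance, `classify_navigation("SDSD")` yields `"(SD)+"`, while a sequence such as `"JD"` that fits none of the listed patterns falls into `"Other"`.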

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of the number of occurrences (scale 0 to 20) of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) for graduate and undergraduate participants]

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

More experienced participants use a more "integrated approach".

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among Source Code, Sequence Diagram, and Javadoc, with numeric labels on the edges]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
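A transition graph of this kind can be derived from the logged sequence of artifacts each participant opened. The sketch below is an illustration under assumptions: the session format (a list of artifact names per participant) and the function name are hypothetical, and the edge weights are computed as the percentage of moves leaving each artifact.

```python
from collections import Counter, defaultdict


def transition_percentages(sessions):
    """Compute, for every artifact kind, where participants went next.

    sessions: list of visit sequences, each a list of artifact names.
    Returns {source: {destination: percentage of transitions from source}}.
    """
    counts = defaultdict(Counter)
    for visits in sessions:
        # Pair each visited artifact with the one opened right after it.
        for src, dst in zip(visits, visits[1:]):
            counts[src][dst] += 1

    graph = {}
    for src, destinations in counts.items():
        total = sum(destinations.values())
        graph[src] = {dst: 100.0 * n / total for dst, n in destinations.items()}
    return graph
```

Each inner dictionary sums to 100, so an edge weight reads as "of all moves leaving this artifact, this share went to that artifact".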

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
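The oracle construction described on this slide (keep the terms chosen by at least 50% of the subjects) can be sketched in a few lines. The function and parameter names are assumptions for illustration; the three term lists are the ones shown on the slide.

```python
from collections import Counter


def build_oracle(labelings, min_share=0.5):
    """Return the set of terms selected by at least `min_share` of the subjects.

    labelings: one list of terms per subject.
    """
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = min_share * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


labelings = [
    ["book", "hotel", "room", "reservation", "arrival", "departure",
     "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival", "check",
     "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)
```

With three subjects the threshold is 1.5, so every term chosen by two or three subjects enters the oracle, which matches the eight terms counted on the slide.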

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code.

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous work: Dagenais et al., ICSE 2010

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects


83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF the following factors, observed over time from when Alice joins the project, support it:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Score computed by aggregating the factors in a weighted sum:

$score = \sum_{i=1}^{5} w_i \cdot f_i$
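The weighted sum that aggregates the five factors into a single score can be sketched as follows. The factor values, the equal weights, and the function names are hypothetical and chosen only for illustration; the slides do not state the weights used by YODA or how each factor is normalized.

```python
def mentor_score(factors, weights):
    """Aggregate factor values f_1..f_n into one score: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Sort candidate mentors by descending score.

    candidates: dict mapping a developer name to the list [F1, ..., F5]
                computed for the pair (developer, newcomer).
    """
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical example: equal weights and factor values scaled to [0, 1].
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "jim": [1.0, 0.8, 0.9, 1.0, 0.7],
    "bob": [0.2, 0.1, 0.9, 0.0, 0.3],
}
ranking = rank_candidates(candidates, weights)
```

Under these assumed values the candidate with the higher score ranks first and would be the one reported as the likely past mentor.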

93

Recommending Mentors

[Figure: project timeline with the project developers, the past mentors, and a newcomer joining at time t]

For a newcomer joining at time t, a mentor with adequate skills is selected among the past mentors.

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011.

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Figure: bar chart of mentor recommendation precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; bar labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]


109

Results When Both Mails and Issues are Used

[Figure: bar chart of mentor recommendation precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors to Project Newcomers
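Precision on Top 1 and Top 2 can be read as the share of correct recommendations among the first one or two mentors suggested for each newcomer. The sketch below computes it under that reading. The data layout, the names, and the definition of a "correct" pair (for example, one confirmed by manual validation) are assumptions made for this example.

```python
def precision_at_k(recommendations, correct_pairs, k):
    """Fraction of the top-k recommended mentors that are correct.

    recommendations: dict mapping a newcomer to a ranked list of mentors.
    correct_pairs: set of (newcomer, mentor) pairs judged to be correct.
    """
    hits = 0
    total = 0
    for newcomer, ranked in recommendations.items():
        for mentor in ranked[:k]:
            total += 1
            if (newcomer, mentor) in correct_pairs:
                hits += 1
    return hits / total if total else 0.0


# Hypothetical example with two newcomers.
recommendations = {
    "alice": ["jim", "bob"],
    "carl": ["dana", "jim"],
}
correct_pairs = {("alice", "jim"), ("alice", "bob"), ("carl", "jim")}
top1 = precision_at_k(recommendations, correct_pairs, 1)
top2 = precision_at_k(recommendations, correct_pairs, 2)
```

In this made-up example one of the two first-ranked mentors is correct, and three of the four mentors in the top two positions are correct.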

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Figure: bar chart of the precision in recommending mentors (scale 0 to 90) using MAIL, ISSUE, and CHAT data, for projects including Apache, Lucene, and Samba]


113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html


116

YODA Tool


125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations, developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help in understanding source code. There, a newcomer can find source code descriptions such as:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
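Steps 2 and 3 of the approach above can be illustrated with a small sketch: a message already traced onto a class is split into paragraphs, and each paragraph is linked to the methods it appears to mention. The splitting rule (blank lines), the matching rule (method name followed by an opening parenthesis), and all names are assumptions for this example and are not the heuristics of the published approach.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs at blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): link paragraphs to the methods they mention.

    A paragraph is traced onto a method when it contains the method name
    followed by "(", as in "split(destDir, segs)".
    Returns {(class_name, method_name): [paragraph, ...]}.
    """
    traced = {}
    for paragraph in paragraphs:
        for method in method_names:
            if re.search(r"\b" + re.escape(method) + r"\s*\(", paragraph):
                traced.setdefault((class_name, method), []).append(paragraph)
    return traced


# Hypothetical message that step 1 already traced onto class IndexSplitter.
message = (
    "Hi all,\n\n"
    "When I call split(destDir, segs) the segments file has wrong data.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "close"])
```

Only the middle paragraph mentions a method call, so it is the only candidate description passed on to the filtering steps.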

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions.


136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
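The vote-based discard and the similarity-based filtering can be illustrated together. The sketch below drops paragraphs from discussions without positive votes and keeps those whose bag-of-words cosine similarity with the method text reaches a threshold. The tokenizer, the cosine measure, the 0.2 threshold, and the function names are assumptions; the slides do not specify the similarity measure or thresholds used by CODES.

```python
import math
import re
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(re.findall(r"[a-z]+", text_a.lower()))
    b = Counter(re.findall(r"[a-z]+", text_b.lower()))
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def filter_paragraphs(paragraphs, method_text, min_similarity=0.2):
    """Keep paragraphs that have votes and resemble the method text.

    paragraphs: list of (text, votes) pairs, votes being the discussion score.
    method_text: words describing the method (e.g. its signature and body).
    """
    kept = []
    for text, votes in paragraphs:
        if votes <= 0:
            # The slide discards 0-vote discussions; treating negative
            # scores the same way is an assumption of this sketch.
            continue
        if cosine_similarity(text, method_text) >= min_similarity:
            kept.append(text)
    return kept
```

A higher threshold trades coverage for precision: fewer methods receive a description, but the surviving paragraphs are more likely to describe the method.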

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html


148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

155

PART I

PART II

PART III



34

Similarity Measure of Topics Extracted from Different Communication Channels

                 issues vs mails   issues vs chat   mails vs chat
Apache Httpd          0.17              0.09             0.06
Apache Lucene         0.08              0.03             0.02
Hibernate             0.11              0.02             0.03
Samba                 0.06              0.02             0.02

Values in the first column (issues vs mails) are higher than those in the other two columns, where issues and mails are compared with the chat.

35

Use Issue, Chat and Mail to Identify Leaders

[Figure: Precision in Recommending Leaders (0-90 scale) for Apache Lucene and Samba, by source: MAIL, ISSUE, CHAT]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind team reorganization (split/merge of developer teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and thus support refactoring/re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38

Analysis of the Evolution of Teams: How?

Team identification from emergent collaborations, using fuzzy clustering algorithms.

Teams identified in release R1 are compared with those identified in release R2 to detect TEAMS SPLIT and TEAMS MERGE events.

[Figure: sub-systems the developers are working on (sub-system one, sub-system two)]

Cohesiveness of the files changed by a team is measured from two perspectives:
• Structural perspective: Modularization Quality (MQ), Mancoridis et al.
• Conceptual perspective: CCBC, Poshyvanyk et al.
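The comparison of teams between two releases can be sketched as follows. This is only an illustration under simplifying assumptions, not the procedure used in the study: each team is a plain, non-empty set of developer names (hard assignments instead of fuzzy memberships), and the function name `classify_team_changes` and the 0.3 overlap threshold are hypothetical.

```python
from typing import Dict, List, Set


def classify_team_changes(
    teams_r1: Dict[str, Set[str]],
    teams_r2: Dict[str, Set[str]],
    min_overlap: float = 0.3,  # hypothetical threshold, not taken from the study
) -> Dict[str, List[str]]:
    """Label each (non-empty) R1 team as split, merged, survived, or disappeared."""

    def successors(members: Set[str]) -> List[str]:
        # R2 teams that inherit at least `min_overlap` of the R1 team's members
        return [
            name
            for name, later in teams_r2.items()
            if len(members & later) / len(members) >= min_overlap
        ]

    succ = {name: successors(members) for name, members in teams_r1.items()}
    result: Dict[str, List[str]] = {
        "split": [], "merged": [], "survived": [], "disappeared": []
    }
    for name, targets in succ.items():
        if not targets:
            result["disappeared"].append(name)
        elif len(targets) > 1:
            result["split"].append(name)
        elif any(other != name and succ[other] == targets for other in succ):
            # another R1 team flows into the same single R2 team
            result["merged"].append(name)
        else:
            result["survived"].append(name)
    return result


# Invented example data: four teams in R1, three teams in R2
r1 = {
    "A": {"ann", "bob", "cid", "dan"},
    "B": {"eve", "fay"},
    "C": {"gus", "hal"},
    "D": {"ida"},
}
r2 = {
    "X": {"ann", "bob"},
    "Y": {"cid", "dan"},
    "Z": {"eve", "fay", "gus", "hal"},
}
changes = classify_team_changes(r1, r2)
```

In this toy example team A splits into X and Y, teams B and C merge into Z, and team D disappears.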

45

Systems characteristics: Period of Time and Releases Considered

System         Period considered
Apache HTTP    09/1998-03/2012
Eclipse JDT    01/2002-12/2011
Netbeans       01/2001-08/2012
Samba          01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reason behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

• Teams merge in a new release: in 20-35% of the cases
• Teams split in a new release: in 15-35% of the cases
• Teams disappeared: 22-45%
• Teams survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams, for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure.

52

PART I Summary: Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding:

PART II - Experiment A: How documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)
• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: share of time spent on each kind of artifact]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students.

Graduate students used Class Diagrams significantly more than Undergraduates.

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns
S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns
(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
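The pattern notation above reads naturally as regular expressions. The sketch below assumes that a navigation is encoded as a string of the letters S, D, U, J (the artifacts visited before reaching source code) and that "+" means "one or more repetitions"; both the encoding and the function name `classify_navigation` are assumptions made for illustration. The patterns S(US)+ and SU(SD)+ are included because they appear in the frequency chart of the study.

```python
import re

# Each pattern must match the whole navigation string, so at most one applies.
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("U", r"U"),
    ("J", r"J"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
    ("S(US)+", r"S(US)+"),
    ("SU(SD)+", r"SU(SD)+"),
]


def classify_navigation(sequence: str) -> str:
    """Return the label of the pattern matching the whole sequence, or "Other"."""
    for label, regex in PATTERNS:
        if re.fullmatch(regex, sequence):
            return label
    return "Other"


# Invented navigations: sequence/class diagrams twice, then use case first
print(classify_navigation("SDSD"))  # (SD)+
print(classify_navigation("USD"))   # U(SD)+
print(classify_navigation("JS"))    # Other
```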

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart (0-20 scale) of the frequency of each navigation pattern, S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other, for Graduate and Undergraduate participants. Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc]

More experienced participants use a more "integrated approach".

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among Source Code, Sequence Diagram, and Javadoc]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only after that do they read and write Source Code
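A transition graph of this kind can be derived by counting consecutive pairs of artifacts in each participant's browsing session. The sketch below is an illustration only: the session data is invented, and the helper names `count_transitions` and `transition_shares` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def count_transitions(sessions: Iterable[List[str]]) -> Counter:
    """Count how often each (from_artifact, to_artifact) pair occurs."""
    counts: Counter = Counter()
    for visited in sessions:
        # pair every artifact with the one opened right after it
        counts.update(zip(visited, visited[1:]))
    return counts


def transition_shares(counts: Counter, source: str) -> Dict[str, float]:
    """Fraction of transitions leaving `source` that reach each destination."""
    total = sum(n for (src, _), n in counts.items() if src == source)
    return {dst: n / total for (src, dst), n in counts.items() if src == source}


# Invented browsing sessions, one list per participant
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram", "Source Code"],
    ["Class Diagram", "Source Code", "Javadoc", "Source Code"],
]
counts = count_transitions(sessions)
shares = transition_shares(counts, "Source Code")
```

The edge weights of the graph are then the values in `shares`, computed once per source artifact.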

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: labels given to a Java class (one list of terms per subject)
• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• book, hotel, room, refund, arrival, check, parking, double, suite, group
• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
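The oracle construction can be reproduced on the example term lists from this slide. The sketch below is a minimal illustration; the function name `build_oracle` is hypothetical, and the 50% threshold is the one stated above.

```python
from collections import Counter
from typing import Iterable, Set


def build_oracle(labelings: Iterable[Iterable[str]], min_share: float = 0.5) -> Set[str]:
    """Keep the terms chosen by at least `min_share` of the subjects."""
    per_subject = [set(terms) for terms in labelings]  # a subject counts once per term
    counts = Counter(term for terms in per_subject for term in terms)
    return {term for term, n in counts.items() if n / len(per_subject) >= min_share}


# The three example labelings shown on the slide
subjects = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(subjects)
print(sorted(oracle))
```

With three subjects, a term needs at least two selections to enter the oracle, which yields the eight terms listed in the term counts above.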

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods matches very well the mental model of newcomers when describing source code.

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

75

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies: PART I and PART II

Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

79

Mentoring of Project Newcomers is Highly Desirable...

Previous work: Dagenais et al., ICSE 2010

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

81

Motivation: Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF the following factors, observed over time from when Alice joins the project, indicate so:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

[Figure: timeline of the Jim-Alice relationship, with the labels "When Alice was a Student" and "When Alice joins the project"]

Score computed by aggregating the factors in a weighted sum:

$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$
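The weighted aggregation of the five factors can be sketched as below. This is an illustration, not YODA's implementation: the factor values are assumed to be already normalized to the range 0-1, and the equal weights, the candidate names, and the function names `mentor_score` and `rank_candidates` are all hypothetical.

```python
from typing import Dict, List, Sequence


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the factor values f1..f5 with weights w1..w5."""
    if len(factors) != len(weights):
        raise ValueError("one weight is needed per factor")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(
    candidates: Dict[str, Sequence[float]], weights: Sequence[float]
) -> List[str]:
    """Order candidate mentors by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical weights and invented, normalized factor values (F1..F5)
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "kim": [1.0, 0.2, 0.5, 0.1, 0.3],
    "jim": [1.0, 0.8, 0.5, 0.9, 0.4],
}
ranking = rank_candidates(candidates, weights)
```

Keeping the weights as a parameter makes it easy to experiment with how much each factor should count.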

93

Recommending Mentors

[Figure: timeline of project developers; a newcomer joins at time t and is matched, among past project developers and past mentors, with a mentor with adequate skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011.

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: Mentor Recommendations: Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba]

Precision values, in chart order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82

109

Results When Both Mails and Issues are Used

[Figure: Mentor Recommendations: Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba]

Precision values, in chart order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76

It is possible to recommend mentors to project newcomers: YODA makes it possible to recommend mentors.

111

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: Precision in Recommending Mentors (0-90 scale) for Apache Lucene and Samba, by source: MAIL, ISSUE, CHAT]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

[Figures: YODA architecture diagram and YODA tool screenshots]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

A newcomer can find source code descriptions such as:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering
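Steps 2 and 3 can be sketched in a few lines. The sketch rests on simplifying assumptions that are not taken from the approach itself: paragraphs are separated by blank lines, and a paragraph is traced onto a method when it contains the method name, optionally qualified by the class name, followed by an opening parenthesis. The function names and the sample message are hypothetical.

```python
import re
from typing import Dict, List


def extract_paragraphs(message: str) -> List[str]:
    """Step 2 (simplified): split a message into blank-line separated paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(
    paragraphs: List[str], class_name: str, method_names: List[str]
) -> Dict[str, List[str]]:
    """Step 3 (simplified): map each method to the paragraphs that mention a call to it."""
    traced: Dict[str, List[str]] = {m: [] for m in method_names}
    for paragraph in paragraphs:
        for method in method_names:
            # method name, optionally prefixed by "ClassName.", followed by "("
            pattern = rf"\b(?:{re.escape(class_name)}\.)?{re.escape(method)}\s*\("
            if re.search(pattern, paragraph):
                traced[method].append(paragraph)
    return traced


# Invented message already traced onto the class IndexSplitter (Step 1)
message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with wrong data.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "listSegments"])
```

The candidate paragraphs collected this way would then go through the heuristic-based and similarity-based filters of Steps 4 and 5.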

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:
• Q&A sites
• Issue trackers
• Mailing lists

Approach: Precision vs Number of Methods Covered

[Figure: precision of the approach vs number of methods covered]

We mine useful Java method descriptions from developers' discussions.

136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Figures: CODES tool screenshots]

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible, reducing the percentage of false positives, and including a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring: Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class


Page 34: Supporting Newcomers in Open Source Software Development Projects

35

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

36

Leaders

Leaders

Leaders

Leaders

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

0

20

40

20

20

40

60

60

60

60

60

80

MAIL ISSUE CHATPrecision in Recommending Leaders

Use Issue Chat and Mail toIdentify Leaders

37

Analysis of the Evolution of Teams Why

1) To Better Understand the ReasonsBehind the Teams Reorganization

(splitmerge of developers teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files Than Support Re-factoring Remodulation

Sebastiano Panichella Gerardo Canfora Massimiliano Di Penta Rocco Oliveto How the evolution of emerging collaborations relates to code changes an empirical study The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding

• PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities
• PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)
• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Pie chart: time spent per kind of artifact; the largest slice is labeled 72]

• Undergraduate students used Source Code and Javadoc significantly more than Graduate students
• Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

• S = Sequence Diagram
• D = Class Diagram
• U = Use Case
• J = Javadoc

Complex Navigation Patterns

• (SD)+ = Sequence Diagram before Class Diagram
• (US)+ = Use Case before Sequence Diagram
• (DS)+ = Class Diagram before Sequence Diagram
• U(SD)+ = Use Case before (SD)+
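The pattern notation on this slide reads like regular expressions over the one-letter artifact codes. As a minimal sketch, assuming that the artifacts a participant opened before reaching source code are recorded as a string of those letters (an illustrative encoding, not necessarily the one used in the study), a trace can be classified as follows. The function name `classify_trace` is hypothetical.

```python
import re

# Patterns named in the slides, written as regular expressions over the codes
# S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc.
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("U", r"U"),
    ("J", r"J"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
    ("S(US)+", r"S(US)+"),
    ("SU(SD)+", r"SU(SD)+"),
]


def classify_trace(trace: str) -> str:
    """Return the name of the pattern that matches the whole trace, else 'Other'."""
    for name, regex in PATTERNS:
        # fullmatch requires the pattern to cover the entire trace
        if re.fullmatch(regex, trace):
            return name
    return "Other"
```

For example, `classify_trace("USD")` yields the `U(SD)+` pattern, while a trace that fits none of the listed patterns falls into `Other`.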

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart comparing Graduate and Undergraduate participants across the navigation patterns S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, and Other; axis range 0-20. Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc]

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among kinds of software artifacts (nodes include Source Code, Sequence Diagram, and Javadoc), with numeric edge labels]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
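A transition graph of this kind can be derived by counting how often participants move from one kind of artifact to another. The following is a minimal sketch under stated assumptions: each trace is a list of artifact kinds in the order they were opened, consecutive visits to the same kind are ignored, and both the traces and the function name `transition_counts` are hypothetical.

```python
from collections import Counter


def transition_counts(traces):
    """Count (source kind, destination kind) moves over all navigation traces."""
    counts = Counter()
    for trace in traces:
        # pair every visited artifact with the one visited right after it
        for src, dst in zip(trace, trace[1:]):
            if src != dst:  # assumption: staying on the same kind is not a transition
                counts[(src, dst)] += 1
    return counts


# Hypothetical traces, one per participant and task.
traces = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram"],
    ["Source Code", "Javadoc", "Source Code"],
]
edges = transition_counts(traces)
```

Each key of `edges` is a directed edge of the graph and each value is the weight that would label it.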

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: labels given to a Java class by three subjects

• Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
• Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
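The oracle construction can be reproduced on the hotel-reservation example from the slide. This is a minimal sketch: the function name `build_oracle` and the `min_share` parameter are illustrative, and the three term lists are the ones shown above.

```python
from collections import Counter


def build_oracle(labelings, min_share=0.5):
    """Return the terms chosen by at least `min_share` of the subjects, with counts."""
    counts = Counter()
    for terms in labelings:
        counts.update(set(terms))  # a subject counts at most once per term
    threshold = min_share * len(labelings)
    return {term: n for term, n in counts.items() if n >= threshold}


# Labels given by three subjects to the same Java class (from the slide).
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(labelings)
```

With three subjects the 50% threshold is 1.5, so only terms chosen by at least two subjects enter the oracle, which matches the term counts listed on the slide.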

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

• Data Extraction (Software Repositories)
• Empirical Studies: PART I and PART II
• Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

• PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects
• PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

[Figure: timeline marking when Alice joins the project]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$$\sum_{i=1}^{5} w_i \, f_i$$

Factors: F1, F2, F3, F4, F5
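The weighted-sum aggregation of the five factors can be sketched as follows. The weights, the factor values (assumed here to be normalized to the range 0 to 1), the second candidate "Bob", and the function names are all illustrative assumptions; the slides do not give the actual weights or normalization used by YODA.

```python
def mentor_score(factors, weights):
    """Weighted sum of the five factor values for one (mentor, newcomer) pair."""
    if len(factors) != 5 or len(weights) != 5:
        raise ValueError("expected five factors and five weights")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Return candidate names sorted by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical values for F1..F5 of two candidate mentors of the newcomer Alice.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "Jim": [1.0, 0.8, 0.5, 0.9, 0.4],
    "Bob": [1.0, 0.1, 0.5, 0.2, 0.3],
}
ranking = rank_candidates(candidates, weights)
```

Equal weights are used only to keep the example simple; any other weighting plugs into the same function.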

93

Recommending Mentors

[Figure: timeline showing the project developers, the past mentors among them, and a newcomer joining at time t who is assigned a mentor with adequate skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision (0-100)]

109

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision (0-100)]

It is Possible to Recommend Mentors To Project Newcomers: YODA makes it possible to recommend mentors

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors (axis 0-90) from the MAIL, ISSUE, and CHAT sources for Apache Lucene and Samba]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

[Screenshots of the YODA tool]

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understanding source code

Newcomer: can find source code descriptions

Example (CLASS: IndexSplitter, METHOD: split):

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
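Steps 2, 3, and 5 of the approach can be illustrated with a small sketch. Everything here is an assumption made for illustration: paragraphs are split on blank lines, a paragraph is traced onto a method when it contains the method name followed by an opening parenthesis, and similarity is a cosine over word counts with an arbitrary 0.1 threshold. The slides do not specify the actual heuristics (Step 4 is omitted for that reason) or the similarity measure used.

```python
import math
import re
from collections import Counter


def extract_paragraphs(message):
    """Step 2 (simplified): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, class_name, method_name):
    """Step 3 (simplified): the paragraph names the method, e.g. 'split(' or 'IndexSplitter.split('."""
    pattern = rf"\b(?:{re.escape(class_name)}\.)?{re.escape(method_name)}\s*\("
    return re.search(pattern, paragraph) is not None


def cosine_similarity(a, b):
    """Step 5 (simplified): cosine similarity between word-count vectors."""
    va = Counter(re.findall(r"[a-z]+", a.lower()))
    vb = Counter(re.findall(r"[a-z]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def candidate_descriptions(message, class_name, method_name, method_text, min_similarity=0.1):
    """Paragraphs that mention the method and are textually similar to it."""
    return [
        p for p in extract_paragraphs(message)
        if mentions_method(p, class_name, method_name)
        and cosine_similarity(p, method_text) >= min_similarity
    ]


message = (
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with segments descriptor file with wrong data.\n\n"
    "Thanks for the report."
)
found = candidate_descriptions(
    message, "IndexSplitter", "split", "split File destDir String segs index segments"
)
```

In this example only the first paragraph survives, because the second one never mentions the method.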

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131

Approach Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Screenshots of the CODES tool]

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives
• Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

36

Use Issue, Chat, and Mail to Identify Leaders

[Bar chart: Precision in Recommending Leaders (axis 0-90) from the MAIL, ISSUE, and CHAT sources for Apache Lucene and Samba]

37

Analysis of the Evolution of Teams: Why?

1) To better understand the reasons behind the teams' reorganization (split/merge of developers' teams)

2) To investigate whether emerging teams evolve with the aim of working on more cohesive groups of files, and then support refactoring and re-modularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

38

Analysis of the Evolution of Teams: How?

• Teams identification from emergent collaborations, by using fuzzy clustering algorithms
• Teams identified in consecutive releases (R1, R2) are compared to detect TEAMS SPLIT and TEAMS MERGE events
• Sub-systems on which developers are working are analyzed from two perspectives:
  - Structural perspective: Modularization Quality (Mancoridis et al.)
  - Conceptual perspective: CCBC (Poshyvanyk et al.)

45

Systems characteristics: Period of Time and Releases Considered

• Apache HTTP: period considered 09/1998-03/2012
• Eclipse JDT: period considered 01/2002-12/2011
• Netbeans: period considered 01/2001-08/2012
• Samba: period considered 01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reasons behind the reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: ArnetMiner

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: Commits

[Timeline diagram built up over several slides; labels: Time, When Alice joins the project, When Alice was a Student]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$

where $f_i$ is the value of factor $F_i$ ($F_1$ to $F_5$) and $w_i$ is its weight.
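The weighted-sum aggregation can be illustrated with a minimal Python sketch. The factor values, the equal weights, and the helper names `mentor_score` and `rank_mentors` are illustrative assumptions; the slides do not state the weights or the normalization actually used by YODA.

```python
from typing import Dict, List, Sequence, Tuple


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of factor values: sum_i w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(
    candidates: Dict[str, Sequence[float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Rank candidate names by decreasing weighted-sum score."""
    scored = [(name, mentor_score(f, weights)) for name, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, already-normalized values for F1..F5 (not taken from the study).
candidates = {
    "jim": [1.0, 0.8, 0.5, 0.9, 0.7],
    "bob": [1.0, 0.2, 0.5, 0.1, 0.9],
}
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # equal weights, purely for illustration
ranking = rank_mentors(candidates, weights)
```

With these made-up inputs the first entry of `ranking` is the candidate with the larger aggregate score; in practice the weights would have to be calibrated on known mentor/newcomer pairs.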

93-105

Recommending Mentors

[Timeline diagram built up over several slides; labels: Time, t, Project Developers, Past Project Developers, Newcomer, Past Mentors, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106-108

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0-100); data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109-110

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0-100); data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111-112

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors using MAIL, ISSUE and CHAT sources; projects: Apache Lucene, Samba; axis 0-90; data labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113-115

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116-124

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127-128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

Newcomer can find: source code description

Example:

When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index.

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based Filtering

• Step 5: Similarity-based Filtering
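Steps 2 to 5 of the approach can be sketched in Python as follows. This is only an illustration under stated assumptions: the tracing rule (class name plus `method(` in the paragraph), the minimum-length heuristic, the bag-of-words cosine similarity, and both thresholds are hypothetical stand-ins, since the slides do not specify the actual heuristics or similarity measure. Step 1 (downloading the messages) is omitted.

```python
import math
import re
from collections import Counter
from typing import List


def extract_paragraphs(message: str) -> List[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, class_name: str, method_name: str) -> bool:
    """Step 3: trace a paragraph onto a method (class name plus 'method(')."""
    has_class = re.search(r"\b" + re.escape(class_name) + r"\b", paragraph)
    has_call = re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph)
    return has_class is not None and has_call is not None


def passes_heuristics(paragraph: str, min_words: int = 8) -> bool:
    """Step 4: placeholder heuristic (assumption) dropping very short paragraphs."""
    return len(paragraph.split()) >= min_words


def tokens(text: str) -> Counter:
    """Lower-case word tokens, splitting camelCase identifiers first."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return Counter(re.findall(r"[a-z]+", spaced.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mine_descriptions(
    message: str,
    class_name: str,
    method_name: str,
    method_source: str,
    min_similarity: float = 0.1,
) -> List[str]:
    """Apply steps 2-5 to one message and one method."""
    method_terms = tokens(method_source)
    kept = []
    for paragraph in extract_paragraphs(message):
        if not mentions_method(paragraph, class_name, method_name):
            continue
        if not passes_heuristics(paragraph):
            continue
        # Step 5: keep paragraphs textually similar to the method (assumed measure).
        if cosine(tokens(paragraph), method_terms) >= min_similarity:
            kept.append(paragraph)
    return kept


message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) it creates "
    "an index with a segments descriptor file with wrong data.\n\n"
    "Thanks"
)
method_source = "public void split(File destDir, String[] segs) { /* copy segments */ }"
result = mine_descriptions(message, "IndexSplitter", "split", method_source)
```

Here only the middle paragraph survives all filters; the greeting and the sign-off never reach the similarity step because they do not mention the method.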

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

Q&A SITE

ISSUE TRACKER

MAILING LIST

131-133

Approach Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136-137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based Filtering

• Step 5: Similarity-based Filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

139-147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155-161

PART I

PART II

PART III

37

Analysis of the Evolution of Teams: Why?

1) To Better Understand the Reasons Behind the Teams Reorganization (split/merge of developer teams)

2) Investigate whether Emerging Teams Evolve with the aim of Working on more Cohesive Groups of Files, Then Support Refactoring/Remodularization

Sebastiano Panichella, Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto. How the evolution of emerging collaborations relates to code changes: an empirical study. The 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

38-44

Analysis of the Evolution of Teams: How?

Teams Identification from Emergent Collaborations, by using fuzzy clustering algorithms

[Diagram built up over several slides; labels: R1, R2, TEAMS SPLIT, TEAMS MERGE, Sub-system one, Sub-system two, Sub-Systems where Developers are Working]

Structural Perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual Perspective: Poshyvanyk et al., CCBC

45

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reasons behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered: 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

46-47

How do Emerging Collaborations Change across Software Releases?

Teams Merge in a New Release: in 20-35% of the cases

Teams Split in a New Release: in 15-35% of the cases

Teams Disappeared: 22-45%

Teams Survived: 50-70%

48-51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Charts: MQ and CCBC for TEAMS SPLIT; MQ and CCBC for TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand Software Artifacts

54-56

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59-61

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart; data labels as extracted: 72, 131032]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
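The pattern notation reads naturally as regular expressions over the artifact letters. The following Python sketch classifies a participant's sequence of visited artifacts, encoded as a string such as "SDSD", into one of the simple or complex patterns. The string encoding, the whole-sequence matching rule, and the function name `classify` are assumptions made for illustration; the study may have assigned sequences to patterns differently.

```python
import re

# Simple and complex patterns from the slide, written as regular expressions.
# Each pattern must match the whole visit sequence (assumption).
PATTERNS = [
    ("U(SD)+", re.compile(r"U(SD)+")),
    ("(SD)+", re.compile(r"(SD)+")),
    ("(US)+", re.compile(r"(US)+")),
    ("(DS)+", re.compile(r"(DS)+")),
    ("S", re.compile(r"S")),
    ("D", re.compile(r"D")),
    ("U", re.compile(r"U")),
    ("J", re.compile(r"J")),
]


def classify(visits: str) -> str:
    """Return the name of the pattern matching the visit sequence, or 'Other'."""
    for name, regex in PATTERNS:
        if regex.fullmatch(visits):
            return name
    return "Other"


# Hypothetical visit sequences recorded before the participant opens source code.
examples = {seq: classify(seq) for seq in ["S", "SDSD", "USD", "US", "JD"]}
```

Because every pattern is matched against the whole sequence, the patterns are mutually exclusive and their order in the list does not affect the result.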

63-65

Most Frequent Navigation Patterns Before Reaching Source Code

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

[Bar chart, axis 0-20; legend: Graduate, Undergraduate; the two data series are listed in extraction order]

Pattern    Series 1   Series 2
S          18         12
D          8          16
(SD)+      2          12
(US)+      2          10
U(SD)+     1          4
(DS)+      1          4
J          3          1
U          0          2
S(US)+     1          1
SU(SD)+    0          1
Other      4          5

More experienced participants use a more "Integrated approach"

66-67

Transition Graph between Kinds of Software Artifacts

[Transition graph; nodes include Source Code, Sequence Diagram, Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69-70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Example: term lists produced for a Java Class

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
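The oracle construction can be reproduced with a few lines of Python, using the three term lists shown on the slide. The function name `build_oracle` and the exact treatment of the threshold (a term counts once per subject and is kept when its count is at least 50% of the number of subjects) are assumptions made for this sketch.

```python
from collections import Counter
from typing import List, Set


def build_oracle(term_lists: List[List[str]], threshold: float = 0.5) -> Set[str]:
    """Return the terms chosen by at least `threshold` of the subjects."""
    # Count each term at most once per subject.
    counts = Counter(term for terms in term_lists for term in set(terms))
    needed = threshold * len(term_lists)
    return {term for term, n in counts.items() if n >= needed}


# The three term lists from the slide example.
subjects = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(subjects)
```

With three subjects, a term needs at least two selections to enter the oracle, which yields exactly the eight terms listed in the counts above.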

71-74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from method signatures match the mental model of newcomers very well when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.


PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III


Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams: How?

• Teams are identified in each release (R1, R2) by using fuzzy clustering algorithms
• Between R1 and R2, teams can split or merge
• Each team is related to the sub-systems its developers are working on
• Structural perspective: Modularization Quality (MQ), Mancoridis et al.
• Conceptual perspective: CCBC, Poshyvanyk et al.
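The comparison of teams between two releases can be sketched as follows. This is only an illustration, not the procedure used in the study: it assumes each team has already been reduced to a non-empty crisp set of developer names, and the `min_share` threshold, the function names, and the definitions of "split", "merged" and "disappeared" are all hypothetical.

```python
from typing import Dict, FrozenSet, List

Team = FrozenSet[str]


def successors(team: Team, next_teams: List[Team], min_share: float) -> List[int]:
    """Indices of next-release teams holding at least min_share of `team`'s members."""
    return [
        i for i, nxt in enumerate(next_teams)
        if len(team & nxt) / len(team) >= min_share
    ]


def classify_evolution(r1: List[Team], r2: List[Team],
                       min_share: float = 0.3) -> Dict[str, List[int]]:
    """Label R1 teams as split or disappeared, and R2 teams as the result of a merge."""
    links = {i: successors(team, r2, min_share) for i, team in enumerate(r1)}
    # Assumed definition: a team splits when its members flow into two or more R2 teams.
    split = [i for i, succ in links.items() if len(succ) >= 2]
    # Assumed definition: a team disappears when no R2 team inherits enough members.
    disappeared = [i for i, succ in links.items() if not succ]
    incoming: Dict[int, List[int]] = {}
    for i, succ in links.items():
        for j in succ:
            incoming.setdefault(j, []).append(i)
    # Assumed definition: an R2 team is a merge when two or more R1 teams flow into it.
    merged_into = sorted(j for j, src in incoming.items() if len(src) >= 2)
    return {"split": split, "disappeared": disappeared, "merged_into": merged_into}


r1 = [frozenset("abcd"), frozenset("ef"), frozenset("gh"), frozenset("x")]
r2 = [frozenset("ab"), frozenset("cd"), frozenset("efgh")]
result = classify_evolution(r1, r2)
```

With fuzzy clustering a developer can belong to several teams at once, so a real implementation would need to decide how membership degrees are turned into sets before a comparison like this one.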

Case Study

• Goal: analyze data from mailing lists, issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

System        Period considered
Apache HTTP   09/1998-03/2012
Eclipse JDT   01/2002-12/2011
Netbeans      01/2001-08/2012
Samba         01/2000-09/2011

Releases considered: 20220224, 2212241, 3032343642, 3436556972, 233020302535040

How do Emerging Collaborations Change across Software Releases?

• Teams merge in a new release in 20-35% of the cases
• Teams split in a new release in 15-35% of the cases
• Teams disappeared: 22-45%
• Teams survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by emerging teams, for teams that split and teams that merged]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

PART I Summary: Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART II

How Developers Browse and Understand Software Artifacts

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

Maintenance Tasks

• Bug fixing
• Add a new feature
• Improve existing features

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: share of time spent by participants on each kind of artifact]

• Undergraduate students used Source Code and Javadoc significantly more than Graduate students

• Graduate students used Class Diagrams significantly more than Undergraduate students

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

• S = Sequence Diagram
• D = Class Diagram
• U = Use Case
• J = Javadoc

Complex Navigation Patterns

• (SD)+ = Sequence Diagram before Class Diagram
• (US)+ = Use Case before Sequence Diagram
• (DS)+ = Class Diagram before Sequence Diagram
• U(SD)+ = Use Case before (SD)+
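The pattern notation reads naturally as regular expressions over a trace of visited artifacts. The sketch below is an illustration only: it assumes a trace is a string with one letter per artifact visited before reaching source code, and the `classify` function and the "Other" fallback are hypothetical names, not part of the original study tooling.

```python
import re

# Pattern names follow the slide; the regular expressions are our own encoding.
# U(SD)+ is listed first so that it is reported under its own name.
COMPLEX_PATTERNS = [
    ("U(SD)+", re.compile(r"U(SD)+")),
    ("(SD)+", re.compile(r"(SD)+")),
    ("(US)+", re.compile(r"(US)+")),
    ("(DS)+", re.compile(r"(DS)+")),
]
SIMPLE_PATTERNS = {"S", "D", "U", "J"}


def classify(trace: str) -> str:
    """Return the name of the navigation pattern that matches the whole trace."""
    if trace in SIMPLE_PATTERNS:
        return trace
    for name, regex in COMPLEX_PATTERNS:
        if regex.fullmatch(trace):
            return name
    return "Other"


labels = [classify(t) for t in ["S", "SDSD", "USD", "US", "DS", "JU"]]
```

Using `fullmatch` means a trace is assigned to a pattern only when the entire sequence fits it; a trace that merely contains a pattern as a sub-sequence falls into "Other".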

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart (axis 0 to 20) of the navigation patterns S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+ and Other, for Graduate and Undergraduate participants; S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc]

More experienced participants use a more "integrated approach"

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between Source Code, Sequence Diagram and Javadoc]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example of terms chosen to label a Java class:

• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• book, hotel, room, refund, arrival, check, parking, double, suite, group
• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
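The oracle construction can be reproduced on the hotel example in a few lines. This is a minimal sketch: the function name `build_oracle` and the `min_share` parameter are hypothetical, and the three term sets are the ones shown on the slide.

```python
from collections import Counter
from typing import Iterable, List, Set


def build_oracle(labelings: Iterable[Set[str]], min_share: float = 0.5) -> List[str]:
    """Keep the terms chosen by at least min_share of the subjects."""
    labelings = list(labelings)
    # Each subject contributes a set, so a term counts at most once per subject.
    counts = Counter(term for terms in labelings for term in terms)
    threshold = min_share * len(labelings)
    return sorted(term for term, count in counts.items() if count >= threshold)


subjects = [
    {"book", "hotel", "room", "reservation", "arrival", "departure",
     "smoking", "double", "card", "breakfast"},
    {"book", "hotel", "room", "refund", "arrival", "check",
     "parking", "double", "suite", "group"},
    {"confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"},
]
oracle = build_oracle(subjects)
```

With three subjects the 50% threshold is 1.5, so a term enters the oracle when at least two subjects chose it, which yields the eight terms counted on the slide.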

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

PART III

Recommenders

• Data Extraction (Software Repositories)
• Empirical Studies: PART I and PART II
• Recommenders: PART III

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

Motivation: Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

Characteristics of a Good Mentor

• Enough ability to help other people

• Enough expertise about the topic of interest for the newcomer

YODA (Young and newcOmer Developer Assistant): Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF the following factors, observed over time from when Alice joins the project, indicate it:

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

Score computed by aggregating the factors in a weighted sum:

$$\sum_{i=1}^{5} w_i \cdot f_i$$
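The weighted sum over the five factors F1 to F5 can be sketched as below. The factor values, the equal weights, the candidate names and the function names are all made up for illustration; the slides do not give the actual weights or how each factor is normalized.

```python
from typing import Dict, List, Sequence


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Aggregate factor values f_1..f_n into one score: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("one weight is needed per factor")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates: Dict[str, Sequence[float]],
                    weights: Sequence[float]) -> List[str]:
    """Order candidate mentors by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical values for F1..F5, assumed to be already normalized to [0, 1].
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "bob": [0.2, 0.1, 0.5, 0.0, 0.3],
    "jim": [1.0, 0.8, 0.5, 0.6, 0.9],
}
ranking = rank_candidates(candidates, weights)
jim_score = mentor_score(candidates["jim"], weights)
```

Keeping the weights as a separate argument makes it easy to try different weightings of the same factors without touching the scoring code.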

Recommending Mentors

[Figure: timeline of project developers; a newcomer joins at time t and a mentor with adequate skills is selected among past mentors]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)

Is it Possible to Recommend Mentors To Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.85    0.81
FreeBSD      0.3     0.24
PostgreSQL   1       1
Python       0.64    0.77
Samba        0.94    0.82

Results When Both Mails and Issues are Used

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.8     0.72
FreeBSD      0.67    0.45
PostgreSQL   0.9     0.89
Python       0.91    0.84
Samba        0.92    0.76

YODA makes it possible to recommend mentors: it is possible to recommend mentors to project newcomers

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: precision in recommending mentors (axis 0 to 90) from the MAIL, ISSUE and CHAT sources, for Apache Lucene and Samba]

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

Effort in Program Comprehension

Mining Summaries

In such situations developers need to infer knowledge from the source code itself or from source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understanding source code: a newcomer can find source code descriptions there

Example (CLASS: IndexSplitter, METHOD: split):

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index"

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
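Steps 3 and 5 of the approach can be illustrated with a small sketch. It is not the published implementation: the tracing rule (method name followed by an opening parenthesis), the bag-of-words cosine similarity, the `min_similarity` threshold and all function names are assumptions made for this example, and the heuristic filtering of Step 4 is left out.

```python
import math
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Lower-cased words, with camelCase identifiers split into their parts."""
    return [word.lower() for word in re.findall(r"[A-Za-z][a-z]*", text)]


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * \
        math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3 (assumed rule): the paragraph names the method followed by '('."""
    pattern = rf"\b{re.escape(method_name)}\s*\("
    return re.search(pattern, paragraph) is not None


def mine_descriptions(paragraphs: List[str], method_name: str,
                      method_text: str, min_similarity: float = 0.3) -> List[str]:
    """Trace paragraphs onto a method, then keep those similar to its text (Step 5)."""
    traced = [p for p in paragraphs if mentions_method(p, method_name)]
    return [p for p in traced if cosine(p, method_text) >= min_similarity]


paragraphs = [
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with wrong segments data.",
    "Thanks, I will look at it tomorrow.",
    "Please do not use split() for this.",
]
method_text = "public void split(File destDir, String[] segs) index segments"
kept = mine_descriptions(paragraphs, "split", method_text)
```

In this example the second paragraph is dropped because it never mentions the method, and the third is dropped by the similarity filter because it shares only one term with the method text.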

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

[Figure: Approach Precision vs Number of Methods Covered]

We mine useful Java method descriptions from developers' discussions

StackOverflow

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

• Partition Attributes
• Extract Class

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

Page 38: Supporting Newcomers in Open Source Software Development Projects

39

By use FUZZYCLUSTER ALGORITHMS

Teams Identification from Emergent Collaborations

Analysis of the Evolution of Teams How

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

51

TEAMS SPLIT: MQ, CCBC

TEAMS MERGED: MQ, CCBC

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

56

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart: share of time per kind of artifact: 72%, 13%, 10%, 3%, 2%]

60

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

How Much Time did Participants Spend on Different Kinds of Artifacts?

61

Graduate students used Class Diagrams significantly more than Undergraduate students

How Much Time did Participants Spend on Different Kinds of Artifacts?

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram

D = Class Diagram

U = Use Case

J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram

(US)+ = Use Case before Sequence Diagram

(DS)+ = Class Diagram before Sequence Diagram

U(SD)+ = Use Case before (SD)+
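The pattern labels on this slide read like regular expressions over the artifact letters S, D, U, and J, so a browsing trace can be matched against them directly. The following is a minimal sketch under that assumption; the function name `classify_trace`, the trace encoding (one letter per visited artifact, in order), and the "Other" fallback are illustrative choices, not the procedure used in the study.

```python
import re

# Pattern labels as written on the slide; each label is also used as a regex.
# Full-string matching keeps the patterns mutually exclusive.
PATTERN_LABELS = [
    "S", "D", "U", "J",                   # simple patterns
    "(SD)+", "(US)+", "(DS)+", "U(SD)+",  # complex patterns
]


def classify_trace(trace: str) -> str:
    """Return the pattern label that matches the whole trace, else 'Other'.

    `trace` is a hypothetical encoding of the artifacts a participant
    visited before reaching source code, e.g. "USD" for
    Use Case -> Sequence Diagram -> Class Diagram.
    """
    for label in PATTERN_LABELS:
        if re.fullmatch(label, trace):
            return label
    return "Other"


if __name__ == "__main__":
    for trace in ["S", "SDSD", "USD", "DS", "JU"]:
        print(trace, "->", classify_trace(trace))
```

Using `re.fullmatch` rather than `re.search` matters here: a trace such as "USD" should count as U(SD)+ only, not also as (SD)+.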

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart, Graduate vs. Undergraduate, axis from 0 to 20. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Values as extracted, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

65

More experienced participants use a more "integrated approach"

Most Frequent Navigation Patterns Before Reaching Source Code

66

Transition Graph between Kinds of Software Artifacts

[Graph: nodes include Source Code, Sequence Diagram, and Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

67

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code

Transition Graph between Kinds of Software Artifacts
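A transition graph of this kind can be derived from logged browsing sessions by counting how often participants move from one kind of artifact to another. The sketch below is only an illustration under assumed inputs: the session format (an ordered list of artifact names per participant) and the function name `transition_percentages` are hypothetical, and the study's actual logging and normalization are not described on the slide.

```python
from collections import Counter, defaultdict


def transition_percentages(sessions):
    """Compute, for each artifact kind, the percentage of moves to each other kind.

    `sessions` is a list of visit sequences, one per participant.
    Consecutive visits to the same artifact kind are not counted as a move.
    """
    counts = defaultdict(Counter)
    for visits in sessions:
        for src, dst in zip(visits, visits[1:]):
            if src != dst:
                counts[src][dst] += 1

    percentages = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        percentages[src] = {
            dst: round(100 * n / total) for dst, n in outgoing.items()
        }
    return percentages


if __name__ == "__main__":
    sessions = [
        ["Use Case", "Sequence Diagram", "Source Code"],
        ["Source Code", "Sequence Diagram", "Source Code"],
        ["Source Code", "Class Diagram"],
    ]
    for src, outgoing in transition_percentages(sessions).items():
        print(src, outgoing)
```

Each node's outgoing percentages sum to roughly 100 (up to rounding), which is the usual way edge labels are read in such a graph.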

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Labels given by three subjects to a Java class:

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
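The oracle rule (keep the terms chosen by at least half of the subjects) can be illustrated with the three example labelings from this slide. This is a minimal sketch; the function name `build_oracle` and the `threshold` parameter are assumptions made for the example.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms selected by at least `threshold` of the subjects.

    `labelings` holds one list of terms per subject; a term is counted
    at most once per subject.
    """
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


if __name__ == "__main__":
    labelings = [
        ["book", "hotel", "room", "reservation", "arrival",
         "departure", "smoking", "double", "card", "breakfast"],
        ["book", "hotel", "room", "refund", "arrival",
         "check", "parking", "double", "suite", "group"],
        ["confirmation", "room", "reservation", "arrival", "departure",
         "date", "bed", "card", "payment", "spa"],
    ]
    print(sorted(build_oracle(labelings)))
```

With three subjects the 50% threshold means a term must be chosen by at least two of them, which yields exactly the eight terms listed with counts of 2 or 3 on the slide.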

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures matches very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies: PART I and PART II

Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF:

F1: Exchanged emails

F2: Amount of emails

F3: Project age

F4: Newcomer early emails (1st)

F5: Commits

[Timeline diagram. Labels: Time, "When Alice joins the project", "When Alice was a Student"]

Score Computed Aggregating the Factors in a Weighted Sum

$\sum_{i=1}^{5} w_i \cdot f_i$
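The weighted-sum aggregation of the five factors (F1 to F5) can be sketched as follows. Everything concrete here is an assumption made for illustration: the function names, the factor values, and the weights are hypothetical, and the slides do not say how the factors are normalized or how the weights are chosen.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i over all factors."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates, weights):
    """Order candidate names by decreasing score.

    `candidates` maps a developer name to the list [f1, ..., f5]
    computed for a given newcomer (values here are made up).
    """
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    weights = [1, 1, 1, 1, 1]  # hypothetical: equal weights
    candidates = {
        "jim": [3, 2, 1, 4, 5],
        "bob": [1, 0, 1, 0, 2],
    }
    for name in rank_mentors(candidates, weights):
        print(name, mentor_score(candidates[name], weights))
```

Keeping the weights as an explicit parameter makes it easy to experiment with how much each factor should contribute to the final ranking.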

93

Recommending Mentors

[Timeline diagram. Labels: Time, Project Developers, Past Project Developers, Past Mentors, Newcomer (joining at time t), Mentor with Adequate Skills]

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Precision values as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Precision values as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors to Project Newcomers

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors for MAIL, ISSUE, and CHAT. Projects: Apache Lucene, Samba. Axis from 0 to 90. Values as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

In such situations, developers need to infer knowledge from the source code itself or from source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

Newcomer: can find source code descriptions

Example: "When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
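Steps 2, 3, and 5 of the approach can be illustrated with a rough sketch. All concrete choices below are assumptions made for the example rather than the published technique: paragraphs are split on blank lines, a paragraph is traced to a method when it mentions the class name and the method name followed by an opening parenthesis, and similarity is a Jaccard overlap of lowercase word sets with a hypothetical threshold.

```python
import re


def extract_paragraphs(message: str) -> list:
    """Step 2 (simplified): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def traces_to_method(paragraph: str, class_name: str, method_name: str) -> bool:
    """Step 3 (simplified): paragraph mentions the class and 'method('."""
    has_class = re.search(rf"\b{re.escape(class_name)}\b", paragraph) is not None
    has_method = re.search(rf"\b{re.escape(method_name)}\s*\(", paragraph) is not None
    return has_class and has_method


def tokens(text: str) -> set:
    """Lowercase alphabetic words of a text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two texts."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def filter_by_similarity(paragraphs, method_text, threshold=0.1):
    """Step 5 (simplified): keep paragraphs similar enough to the method text."""
    return [p for p in paragraphs if jaccard(p, method_text) >= threshold]


if __name__ == "__main__":
    message = (
        "Hi all,\n\n"
        "When call the method IndexSplitter.split(File destDir, String[] segs) "
        "it creates an index with wrong data.\n\n"
        "Thanks"
    )
    traced = [
        p for p in extract_paragraphs(message)
        if traces_to_method(p, "IndexSplitter", "split")
    ]
    print(filter_by_similarity(traced, "split index segs destDir"))
```

A production pipeline would likely use a weighted similarity (for example TF-IDF with cosine similarity) instead of plain word overlap; Jaccard is used here only to keep the sketch dependency-free.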

130

Supporting Software Development

Help Newcomers' Program Comprehension with the extraction of summaries of code elements from:

MAILING LIST

ISSUE TRACKER

Q&A SITE

131

Approach: Precision vs. Number of Methods Covered

133

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers' discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


Page 39: Supporting Newcomers in Open Source Software Development Projects

40

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

41

TEAMS SPLIT

TEAMS MERGE

R1

By use FUZZYCLUSTER ALGORITHMS

R2

Analysis of the Evolution of Teams How

42

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Analysis of the Evolution of Teams How

43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails

F2: Amount of emails

F3: Project age

F4: Newcomer early emails

F5: Commits

[Figure, built up over several slides: a timeline (axis: Time) starting when Alice joins the project, with the factors F1 to F5 placed along it, a ">" comparison next to F2, and "1st" next to F4; one build also carries the label "When Alice was a Student"]

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
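The weighted-sum aggregation of the factors F1 to F5 can be sketched as follows. This is only an illustration, not YODA's implementation: the slides give neither the weights nor how each factor is normalized, so the equal weights, the factor values in [0, 1], and the names `mentor_score` and `rank_candidates` are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum: score = sum_i w_i * f_i over the factors F1..F5."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(
    candidates: Dict[str, Sequence[float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Sort candidate mentors by decreasing score."""
    scored = [(name, mentor_score(f, weights)) for name, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights and factor values for F1..F5 (illustrative only).
WEIGHTS = [0.2, 0.2, 0.2, 0.2, 0.2]
CANDIDATES = {
    "Jim": [1.0, 0.5, 1.0, 1.0, 0.5],
    "Bob": [0.5, 0.5, 0.0, 0.0, 1.0],
}

ranking = rank_candidates(CANDIDATES, WEIGHTS)
print(ranking[0][0])  # name of the highest-scoring candidate
```

With a ranking like this, the candidate at the top of the list would be the one proposed as the likely past mentor of the newcomer.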

Recommending Mentors

[Figure, built up over several slides: a timeline (axis: Time) showing the Project Developers, a Newcomer joining at time t, the Past Project Developers, the Past Mentors, and the Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Is it Possible to Recommend Mentors to Project Newcomers?

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba; y-axis: Precision, 0 to 100; series: Top 1, Top 2. Data labels, in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

Results When Both Mails and Issues are Used

[Figure: same chart layout. Data labels, in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA makes it possible to recommend mentors

It is Possible to Recommend Mentors to Project Newcomers
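One way a "precision on Top 1 and Top 2" value could be computed is sketched below. The definition is an assumption, since the slides do not spell it out: precision is taken as the share of recommended developers, within the top-k list of each newcomer, who really were mentors of that newcomer. All names and data are hypothetical.

```python
from typing import Dict, List, Set


def precision_at_k(
    recommended: Dict[str, List[str]],
    actual: Dict[str, Set[str]],
    k: int,
) -> float:
    """Correct recommendations divided by all recommendations in the top-k lists."""
    made = 0
    correct = 0
    for newcomer, ranked in recommended.items():
        for developer in ranked[:k]:
            made += 1
            if developer in actual.get(newcomer, set()):
                correct += 1
    return correct / made if made else 0.0


# Hypothetical ranked recommendations and ground-truth mentors.
RECOMMENDED = {
    "alice": ["jim", "bob"],
    "carol": ["bob", "dave"],
    "erin": ["jim", "dave"],
    "frank": ["bob", "jim"],
}
ACTUAL = {
    "alice": {"jim"},
    "carol": {"dave"},
    "erin": {"zoe"},
    "frank": {"bob"},
}

top1 = precision_at_k(RECOMMENDED, ACTUAL, k=1)
top2 = precision_at_k(RECOMMENDED, ACTUAL, k=2)
print(top1, top2)
```

Under this definition the Top 2 value can be lower than the Top 1 value, because every second-ranked developer who was not a mentor counts as a wrong recommendation.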

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" for Apache Lucene and Samba; series: MAIL, ISSUE, CHAT; axis: 0 to 90. Data labels, in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

Effort in Program Comprehension

Mining Summaries

In such situations developers need to infer knowledge from the source code itself or from source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code: there a newcomer can find source code descriptions

Example (CLASS: IndexSplitter, METHOD: split):

When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index.

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
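Steps 2 to 5 of the approach above can be sketched as a small filtering pipeline. The concrete rules are simplified stand-ins chosen for this illustration, not the rules of the published approach: paragraphs are split on blank lines, a paragraph is traced to a method when it contains the method name followed by "(", the only heuristic is "mentions at least one parameter", and the similarity step is a plain word-count cosine with a hypothetical threshold. Step 1 (downloading and tracing onto classes) is assumed to have happened already.

```python
import math
import re
from collections import Counter
from typing import List


def extract_paragraphs(message: str) -> List[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def traces_to_method(paragraph: str, method_name: str) -> bool:
    """Step 3: keep paragraphs that mention the method as a call, e.g. 'split('."""
    pattern = r"\b" + re.escape(method_name) + r"\s*\("
    return re.search(pattern, paragraph) is not None


def passes_heuristics(paragraph: str, parameters: List[str]) -> bool:
    """Step 4 (simplified stand-in): require a mention of at least one parameter."""
    words = set(re.findall(r"[A-Za-z_]\w*", paragraph))
    return any(p in words for p in parameters)


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Step 5: cosine similarity between lower-cased word-count vectors."""
    a = Counter(re.findall(r"[a-z]+", text_a.lower()))
    b = Counter(re.findall(r"[a-z]+", text_b.lower()))
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mine_descriptions(message, method_name, parameters, method_text, threshold=0.1):
    """Run steps 2 to 5 on one message that was already traced to the class."""
    kept = []
    for paragraph in extract_paragraphs(message):
        if not traces_to_method(paragraph, method_name):
            continue
        if not passes_heuristics(paragraph, parameters):
            continue
        if cosine_similarity(paragraph, method_text) >= threshold:
            kept.append(paragraph)
    return kept


# Hypothetical message and method vocabulary.
MESSAGE = (
    "Thanks for the report, I will have a look tomorrow.\n\n"
    "The method split(File destDir, String[] segs) creates an index whose "
    "segments file points to destDir.\n\n"
    "Unrelated: the build is broken again."
)
METHOD_TEXT = "split destDir segs index segments copy file"

descriptions = mine_descriptions(MESSAGE, "split", ["destDir", "segs"], METHOD_TEXT)
print(descriptions)
```

Only the middle paragraph survives all three filters here; the other two never mention the method as a call.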

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

MAILING LIST

ISSUE TRACKER

Q&A SITE

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

StackOverflow

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
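The vote rule attached to Step 3 can be sketched as a simple filter over already downloaded posts. The `Post` record and its fields are hypothetical and do not mirror the real Stack Overflow REST response; the example texts are made up. Following the slide literally, only posts with exactly 0 votes are dropped; whether negatively voted posts are also discarded is not stated.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Post:
    """A downloaded discussion post (hypothetical record shape)."""
    post_id: int
    votes: int
    paragraphs: List[str]


def candidate_paragraphs(posts: List[Post]) -> List[str]:
    """Keep paragraphs only from posts whose vote count is not 0."""
    kept: List[str] = []
    for post in posts:
        if post.votes == 0:
            continue  # rule from the slide: discard discussions with 0 votes
        kept.extend(post.paragraphs)
    return kept


# Made-up posts about a made-up method Foo.bar().
POSTS = [
    Post(1, 5, ["Foo.bar() parses the input and returns a token list."]),
    Post(2, 0, ["I think bar() does nothing."]),
    Post(3, 2, ["Call bar() before reading the tokens."]),
]

paragraphs = candidate_paragraphs(POSTS)
print(len(paragraphs))
```

The surviving paragraphs would then go through the heuristic-based and similarity-based filters of Steps 4 and 5.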

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

Future Work and Conclusion

Future work...

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders that help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible, reducing the percentage of false positives, and including a better classification of discussion content using natural language parsers

Team Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

Partition Attributes

Extract Class

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

PART I

PART II

PART III



Analysis of the Evolution of Teams: How?

[Figure, built up over several slides: teams in release R1 and release R2, identified by using fuzzy clustering algorithms; between releases, TEAMS SPLIT or TEAMS MERGE]

Sub-systems developers are working on: sub-system one, sub-system two

Structural perspective: Mancoridis et al., Modularization Quality (MQ)

Conceptual perspective: Poshyvanyk et al., CCBC

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

Systems characteristics: Period of Time and Releases Considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered: 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

How do Emerging Collaborations Change across Software Releases?

Teams merge in a new release: in 20-35% of the cases

Teams split in a new release: in 15-35% of the cases

Teams disappeared: 22-45%

Teams survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure, built up over several slides: MQ and CCBC of the files changed by teams, shown for TEAMS SPLIT and for TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

PART II

How Developers Browse and Understand Software Artifacts

Two Empirical Studies Aimed at Understanding:

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Chart labels: 121 121 667 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Figure: share of time per kind of artifact; chart labels: 72, 13, 10, 3, 2]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduate students

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram

D = Class Diagram

U = Use Case

J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram

(US)+ = Use Case before Sequence Diagram

(DS)+ = Class Diagram before Sequence Diagram

U(SD)+ = Use Case before (SD)+
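Because the pattern names in the legend above are written as regular expressions, a participant's navigation can be classified by matching it against them. In this sketch the visit sequence is assumed to be already encoded as a string over S, D, U and J (the artifacts opened before the source code); how the study encoded or collapsed repeated visits is not given on the slides, and the function name `classify` is hypothetical.

```python
import re
from typing import List, Tuple

# Pattern names and regular expressions, taken from the slide legend.
# They are checked in order and the first full match wins.
PATTERNS: List[Tuple[str, str]] = [
    ("S", r"S"),
    ("D", r"D"),
    ("U", r"U"),
    ("J", r"J"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
]


def classify(sequence: str) -> str:
    """Return the name of the pattern that matches the whole sequence."""
    for name, regex in PATTERNS:
        if re.fullmatch(regex, sequence):
            return name
    return "Other"


# Hypothetical visit sequences.
for seq in ["S", "SDSD", "USUS", "USD", "DSJ"]:
    print(seq, "->", classify(seq))
```

Sequences that fit none of the listed patterns fall into "Other", mirroring the catch-all category used in the frequency chart.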

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure, built up over several slides: horizontal bar chart of pattern frequency (axis: 0 to 20) for Graduate and Undergraduate participants; patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Data labels, in extraction order: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

More experienced participants use a more "integrated approach"

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph among Source Code, Sequence Diagram and Javadoc; edge labels, in extraction order: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go on to read Sequence Diagrams; only afterwards do they read and write Source Code

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Java Class: terms chosen by three subjects

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
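The oracle rule (keep the terms selected by at least 50% of the subjects) can be checked on the three term lists shown on the slide. The function name `oracle_terms` and the `min_share` parameter are hypothetical; the data is copied from the slide.

```python
from collections import Counter
from typing import List, Set


def oracle_terms(labelings: List[List[str]], min_share: float = 0.5) -> Set[str]:
    """Terms chosen by at least `min_share` of the subjects."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for labels in labelings for term in set(labels))
    needed = min_share * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


# The three term lists shown on the slide.
LABELINGS = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]

oracle = oracle_terms(LABELINGS)
print(sorted(oracle))
```

With three subjects the threshold is 1.5, so a term needs two or more votes; the result is the same eight terms whose counts are listed on the slide.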

Experiment B Context: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work and Conclusion

Future work…

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns.

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155-161

PART I

PART II

PART III


42-44

Analysis of the Evolution of Teams: How?

R1, R2: by using fuzzy clustering algorithms

Sub-systems that developers are working on: Sub-system one, Sub-system two

Structural perspective: Modularization Quality (MQ), Mancoridis et al.

Conceptual perspective: CCBC, Poshyvanyk et al.

45

Case Study

Systems characteristics: Period of Time and Releases Considered

Systems: Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered: 09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011

Releases considered: 20220224 2212241 3032343642 3436556972 233020302535040

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reasons behind the reorganization of teams

46-47

How do Emerging Collaborations Change across Software Releases?

Teams merge in a new release: in 20-35% of the cases

Teams split in a new release: in 15-35% of the cases

Teams disappeared: 22-45%

Teams survived: 50-70%

48-51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figures: MQ and CCBC for TEAMS SPLIT and TEAMS MERGED]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I

Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand

Software Artifacts

54-56

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks.

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59-61

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart: time spent per kind of artifact; data labels as extracted: 72, 131032]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
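As a rough illustration of the pattern notation above, the sketch below classifies a browsing session with regular expressions. It assumes a session is encoded as a string over the letters S, D, U, J (the artifacts visited, in order, before reaching source code); this encoding and the name `classify_navigation` are hypothetical and are not taken from the study's tooling.

```python
import re

# Pattern labels as they appear on the slides. Each regex must match the
# whole session string; sessions matching none of them are "Other".
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("J", r"J"),
    ("U", r"U"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
    ("S(US)+", r"S(US)+"),
    ("SU(SD)+", r"SU(SD)+"),
]


def classify_navigation(session):
    """Return the label of the navigation pattern that the session follows."""
    for label, regex in PATTERNS:
        if re.fullmatch(regex, session):
            return label
    return "Other"


if __name__ == "__main__":
    for session in ["S", "SDSD", "USD", "SUSD", "JS"]:
        print(session, "->", classify_navigation(session))
```

Full-string matching keeps the patterns mutually exclusive, so the order of the list does not affect the result.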

63-65

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart, Graduate vs. Undergraduate, axis 0-20. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Data labels as extracted, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

66-67

Transition Graph between Kinds of Software Artifacts

[Graph with nodes Source Code, Sequence Diagram, and Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69-70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year); 21 Master students in CS (University of Salerno)

Example labelings of a Java class (one list of terms per subject):

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
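The oracle construction can be illustrated with a short sketch that counts how many subjects chose each term and keeps the terms chosen by at least half of them. The three term lists are the ones shown on the slide; the function name `build_oracle` and the `threshold` parameter are assumptions introduced for this example.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return {term: votes} for terms chosen by at least `threshold` of the subjects."""
    # Count each term at most once per subject.
    votes = Counter(term for terms in labelings for term in set(terms))
    min_votes = threshold * len(labelings)
    return {term: n for term, n in votes.items() if n >= min_votes}


# The three example labelings shown on the slide.
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]

oracle = build_oracle(labelings)

if __name__ == "__main__":
    for term, n in sorted(oracle.items(), key=lambda item: (-item[1], item[0])):
        print(term, n)
```

With three subjects, the 50% threshold keeps every term chosen by at least two of them, which yields the eight terms listed in the term counts.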

71-74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II

Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81-82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects?

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

[Timeline diagrams; labels as extracted: Time, When Alice joins the project, When Alice was a Student, F2 >, F4 - 1st]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum:

$$\sum_{i=1}^{5} w_i \cdot f_i$$
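The weighted sum of the five factors can be sketched as below. This is only an illustration, not the YODA implementation: the slides do not say how the factors are normalized or which weights are used, so the factor values, the equal weights, and the names `mentor_score` and `rank_candidates` are all assumptions.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Return candidate names ordered by decreasing score."""
    scores = {name: mentor_score(f, weights) for name, f in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical factor values F1..F5, assumed to be normalized to [0, 1].
    candidates = {
        "jim": [1.0, 0.8, 0.5, 1.0, 0.6],
        "bob": [1.0, 0.2, 0.5, 0.0, 0.9],
    }
    weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # assumed equal weights
    print(rank_candidates(candidates, weights))
```

Keeping the weights as an explicit parameter makes it easy to study how much each factor contributes to the ranking.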

93-105

Recommending Mentors

[Timeline diagrams; labels: Time, t, Newcomer, Project Developers, Past Project Developers, Past Mentors, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106-108

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba (axis 0-100). Data labels as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109-110

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba (axis 0-100). Data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111-112

Use Issue, Chat, and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors for Apache Lucene and Samba, by source (MAIL, ISSUE, CHAT), axis 0-90. Data labels as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113-115

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116-124

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127-128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer: can find source code descriptions

Example: "When calling the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with a segments descriptor file with wrong data. Namely, wrong is the number representing the name of the segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications.

International Conference on Program Comprehension (IEEE ICPC 2012)
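Steps 2, 3, and 5 above can be sketched roughly as follows. The slides do not give the details of these steps, so everything here is an assumption made for illustration: paragraphs are split on blank lines, a paragraph is traced onto a method when it mentions the method name, and the similarity filter uses a plain Jaccard overlap of terms with an arbitrary threshold, swapped in here for whatever similarity measure the approach actually uses. Step 1 (downloading) and Step 4 (heuristic filtering) are left out, and all function names are hypothetical.

```python
import re


def extract_paragraphs(message):
    """Step 2 (assumed): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, method_name):
    """Step 3 (assumed): a paragraph is traced onto a method if it names it."""
    return re.search(r"\b" + re.escape(method_name) + r"\b", paragraph) is not None


def terms(text):
    """Lower-cased terms, splitting identifiers on camelCase boundaries."""
    return {w.lower() for w in re.findall(r"[A-Za-z][a-z]*", text)}


def jaccard(a, b):
    """Jaccard overlap between two term sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_descriptions(message, method_name, method_source, min_similarity=0.1):
    """Step 5 (assumed): keep traced paragraphs whose terms overlap the method's."""
    method_terms = terms(method_source)
    return [
        p
        for p in extract_paragraphs(message)
        if mentions_method(p, method_name)
        and jaccard(terms(p), method_terms) >= min_similarity
    ]


if __name__ == "__main__":
    message = (
        "Hi all,\n\n"
        "The method split creates an index with segments in the dest dir.\n\n"
        "Thanks"
    )
    source = "void split(File destDir, String[] segs) { /* index segments */ }"
    print(candidate_descriptions(message, "split", source))
```

In this toy message, only the middle paragraph both names the method and shares enough terms with its source to be kept.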


43

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Analysis of the Evolution of Teams How

44

R1

R2

By use FUZZYCLUSTER ALGORITHMS

Sub-system one Sub-system twoSub-systems two

Sub-Systems where Developers Working on

Mancoridis et al Modul Quality

Poshyvanyk et al CCBC

StructurePersprective

ConceptualPersprective

Analysis of the Evolution of Teams How

45

Apache HTTP Eclipse JDT Netbeans Samba

Period considered

091998-032012 012002-122011

012001-082012 012000-092011

Releases Considere

d

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics Period of Time and Releases Considered

Case Study

bull Goal analyze data from mailing listsissue trackers and versioning systems

bull Purpose observe the reorganization of the teams between releasesbull Quality focus better understand the reason behind the

reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

Two Empirical Studies Aimed at Understanding:

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants

121 121 667 72

11 Bachelor Students, 18 Master Students, 4 PhD Students

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Pie chart: share of time per kind of artifact, with slices of 72%, 13%, 10%, 3%, and 2%]

60

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram

D = Class Diagram

U = Use Case

J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram

(US)+ = Use Case before Sequence Diagram

(DS)+ = Class Diagram before Sequence Diagram

U(SD)+ = Use Case before (SD)+
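The pattern notation on this slide maps directly onto regular expressions over strings of artifact codes. The following is a minimal sketch, assuming each participant's path before reaching source code has been encoded as a string such as `USDSD`. That encoding and the names `PATTERNS` and `classify_path` are illustrative assumptions, not part of the study's tooling. The patterns S(US)+ and SU(SD)+ come from the chart of most frequent patterns.

```python
import re

# Navigation patterns written as regular expressions over artifact codes
# (S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc).
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("J", r"J"),
    ("U", r"U"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
    ("S(US)+", r"S(US)+"),
    ("SU(SD)+", r"SU(SD)+"),
]


def classify_path(path: str) -> str:
    """Return the label of the first pattern matching the whole path, else 'Other'."""
    for label, regex in PATTERNS:
        if re.fullmatch(regex, path):
            return label
    return "Other"
```

Using `re.fullmatch` rather than `re.search` matters here: a path counts as an instance of a pattern only if the entire path has that shape, so anything else falls into the "Other" bucket.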

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: frequency (axis from 0 to 20) of the navigation patterns S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, and Other, for Graduate and Undergraduate participants]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph whose nodes include Source Code, Sequence Diagram, and Javadoc, with numeric edge labels]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
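A transition graph of this kind can be derived from logged browsing sessions by counting consecutive pairs of artifact kinds. The sketch below assumes each session is a list of artifact-kind names in viewing order. The function names are hypothetical, and ignoring consecutive views of the same kind is an assumption, not something stated in the slides.

```python
from collections import Counter


def transition_counts(sessions):
    """Count transitions (source kind, destination kind) over all sessions."""
    counts = Counter()
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            if src != dst:  # assumption: staying on the same kind is not a transition
                counts[(src, dst)] += 1
    return counts


def transition_percentages(counts):
    """Express each edge as a percentage of the outgoing transitions of its source."""
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {edge: 100.0 * n / totals[edge[0]] for edge, n in counts.items()}
```

Normalizing per source node makes edges comparable even when some artifact kinds are visited far more often than others.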

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group

Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
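The oracle construction can be reproduced on the slide's own example. The sketch below assumes each subject's label is a list of terms. The names `build_oracle` and `labelings` are illustrative, and ordering the result by frequency and then alphabetically is a choice made here for determinism.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects."""
    # Count each term once per subject, even if a subject repeats it.
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    selected = [term for term, n in counts.items() if n >= needed]
    # Most frequent first, ties broken alphabetically.
    return sorted(selected, key=lambda term: (-counts[term], term))


labelings = [
    ["book", "hotel", "room", "reservation", "arrival", "departure",
     "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival", "check",
     "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]

oracle = build_oracle(labelings)
```

With three subjects the 50% threshold means a term must appear in at least two lists, which yields exactly the eight terms counted on the slide.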

74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from the signatures of methods matches very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

• Small Projects: finding Mentors is a trivial problem

• Large Projects: finding Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF:

[Diagram, built up over several slides: factors F1 to F5 placed along a Time axis relative to "When Alice joins the project"; the F4 slide also carries the label "When Alice was a Student"]

F1: exchanged emails

F2: amount of emails

F3: project age

F4: newcomer early emails

F5: commits

Score Computed Aggregating the Factors in a Weighted Sum

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
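The weighted sum over the five factors can be sketched as follows. The slides do not say how each factor is measured or normalized, nor which weights are used, so the factor values, the weights, and the names `mentor_score` and `rank_candidates` are hypothetical.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Return candidate names sorted by decreasing score.

    `candidates` maps a developer name to that developer's factor values
    (F1..F5) computed with respect to one newcomer.
    """
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )
```

For example, `rank_candidates({"jim": [1, 0, 1, 0, 1], "bob": [0, 1, 0, 0, 0]}, [1, 2, 3, 4, 5])` puts `jim` first, since his score is 9 against 2 for `bob`.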

93

Recommending Mentors

[Animated diagram, built up over slides 93 to 105 along a Time axis, with the labels: Project Developers, Past Project Developers, Past Mentors, Newcomer, t, and Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

108

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 (y-axis: Precision, 0 to 100) for Apache, FreeBSD, PostgreSQL, Python, and Samba; data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]
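Precision on the top-k recommendations can be computed along the following lines. The exact definition used in the evaluation is not given on the slide; the sketch assumes one common definition, the fraction of the k recommended developers who are acceptable mentors for the newcomer, averaged over newcomers. All names are illustrative.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended mentors that are in the relevant set."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for mentor in top_k if mentor in relevant)
    return hits / len(top_k)


def mean_precision_at_k(cases, k):
    """Average precision@k over (recommended list, relevant set) pairs, one per newcomer."""
    if not cases:
        return 0.0
    return sum(precision_at_k(rec, rel, k) for rec, rel in cases) / len(cases)
```

Reporting both k = 1 and k = 2 shows whether the correct mentor tends to be ranked first or merely near the top.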

109

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 (y-axis: Precision, 0 to 100) for Apache, FreeBSD, PostgreSQL, Python, and Samba; data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

YODA Makes it Possible to Recommend Mentors

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2, same data as the chart for both Mails and Issues]

111

Use Issue, Chat, and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors (axis from 0 to 90) for Apache Lucene and Samba, with series MAIL, ISSUE, and CHAT; data labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Where a Newcomer can find a source code description:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
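Steps 2 and 3 can be illustrated with a small sketch. The slides do not spell out the tracing rules, so the ones used here are assumptions: paragraphs are separated by blank lines, and a paragraph is traced onto a method when it mentions the class name and the method name followed by an opening parenthesis. The function names and the list of method names are hypothetical.

```python
import re


def extract_paragraphs(message: str) -> list[str]:
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraph: str, class_name: str, method_names: list[str]) -> list[str]:
    """Step 3 (sketch): methods of `class_name` that the paragraph appears to mention."""
    if not re.search(rf"\b{re.escape(class_name)}\b", paragraph):
        return []
    return [
        name
        for name in method_names
        if re.search(rf"\b{re.escape(name)}\s*\(", paragraph)
    ]
```

Applied to the bug-report excerpt quoted in the Mining Summaries slide, with class `IndexSplitter` and a candidate method list containing `split`, this traces the paragraph onto `split`, matching the CLASS/METHOD annotation shown there.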

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE

• ISSUE TRACKER

• MAILING LIST

133

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO (StackOverflow) discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
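Step 5 is only named on the slide. One plausible reading, shown here purely as an assumption, is to keep a paragraph only when its text is sufficiently similar to the text of the method it was traced onto. The sketch uses cosine similarity over raw term frequencies; the tokenizer, the threshold value, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphabetic tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(n * n for n in va.values())) * math.sqrt(
        sum(n * n for n in vb.values())
    )
    return dot / norm if norm else 0.0


def keep_paragraph(paragraph, method_text, threshold=0.2):
    """Keep the paragraph if it is similar enough to the method's text."""
    return cosine_similarity(paragraph, method_text) >= threshold
```

Raising the threshold trades coverage (fewer methods end up with a description) for precision (the surviving descriptions are more likely to be about the method).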

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

Page 43: Supporting Newcomers in Open Source Software Development Projects

44

Analysis of the Evolution of Teams: How

[Diagram: releases R1 and R2; sub-systems on which developers are working (sub-system one, sub-system two)]

By using fuzzy clustering algorithms

Structural perspective: Modularization Quality (MQ), Mancoridis et al.

Conceptual perspective: CCBC, Poshyvanyk et al.

45

Apache HTTP, Eclipse JDT, Netbeans, Samba

Period considered

09/1998-03/2012, 01/2002-12/2011, 01/2001-08/2012, 01/2000-09/2011

Releases Considered

20220224

2212241

3032343642

3436556972

233020302535040

Systems characteristics: Period of Time and Releases Considered

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems

• Purpose: observe the reorganization of the teams between releases

• Quality focus: better understand the reason behind the reorganization of teams

46

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

[Timeline figure labels: Time, t, Project Developers, Past Project Developers, Past Mentors, Newcomer, Mentor with Adequate Skills]

Inspired by the Work on Bug Triaging by J. Anvik et al. (TOSEM 2011)

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba. Y-axis: Precision. Bar labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba. Y-axis: Precision. Bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers
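To make "precision on Top 1 and Top 2" concrete, the sketch below computes one possible reading of it: the fraction of newcomers for whom at least one actual mentor appears among the first k recommendations. The slides do not spell out the exact definition, so this reading, the function name `precision_at_top_k`, and all developer names are assumptions.

```python
from typing import Dict, List, Set


def precision_at_top_k(
    rankings: Dict[str, List[str]], true_mentors: Dict[str, Set[str]], k: int
) -> float:
    """Fraction of newcomers with at least one true mentor in their top-k list."""
    if not rankings:
        return 0.0
    hits = sum(
        1
        for newcomer, ranked in rankings.items()
        if set(ranked[:k]) & true_mentors.get(newcomer, set())
    )
    return hits / len(rankings)


# Hypothetical recommendations (best candidate first) and known mentors.
rankings = {
    "alice": ["jim", "bob"],
    "carol": ["dave", "erin"],
    "frank": ["bob", "jim"],
}
true_mentors = {"alice": {"jim"}, "carol": {"erin"}, "frank": {"gina"}}

top1 = precision_at_top_k(rankings, true_mentors, k=1)
top2 = precision_at_top_k(rankings, true_mentors, k=2)
print(top1, top2)
```

Widening the list from Top 1 to Top 2 can only keep or raise this value, since every Top 1 hit is also a Top 2 hit.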

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors with MAIL, ISSUE, and CHAT sources for Apache Lucene and Samba. X-axis: 0 to 90. Bar labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

A Newcomer can find a source code description such as:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based Filtering
• Step 5: Similarity-based Filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
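A rough Python sketch of Steps 2, 3, and 5 follows. The slides only name the steps, so the concrete rules used here are assumptions chosen for illustration: paragraphs are split on blank lines, a paragraph is traced to a method when it names the class and contains the method name followed by "(", and similarity is the cosine between term-frequency vectors. Step 4 (heuristic filtering) is not shown. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def extract_paragraphs(message: str) -> List[str]:
    """Step 2: split a message into paragraphs separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, class_name: str, method_name: str) -> bool:
    """Step 3 (assumed rule): class is named and the method is followed by '('."""
    call = re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph)
    return class_name in paragraph and call is not None


def terms(text: str) -> List[str]:
    """Lower-cased terms, splitting camelCase identifiers into their words."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", text)
    return [p.lower() for p in parts]


def cosine_similarity(a: List[str], b: List[str]) -> float:
    """Step 5 (assumed measure): cosine of the two term-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


message = (
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with segments descriptor file with wrong data.\n\n"
    "Thanks for the report, I will have a look."
)
method_terms = terms("IndexSplitter split File destDir String segs")

paragraphs = extract_paragraphs(message)
traced = [p for p in paragraphs if mentions_method(p, "IndexSplitter", "split")]
scores = [cosine_similarity(terms(p), method_terms) for p in paragraphs]
print(len(traced), scores)
```

A real pipeline would keep only traced paragraphs whose similarity exceeds some threshold; the threshold value is a tuning choice that this sketch leaves open.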

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

[Figure labels: Newcomer, Supporting Software Development]

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based Filtering
• Step 5: Similarity-based Filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Path...
  • Newcomer Learning Path... (2)
  • Newcomer Learning Path... (3)
  • Newcomer Learning Path... (4)
  • Newcomer Learning Path... (5)
  • Newcomer Learning Path... (6)
  • Newcomer Learning Path... (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future work...
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
45

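Systems characteristics: Period of Time and Releases Considered

Apache HTTP: period considered 09/1998-03/2012; releases considered 20220224 2212241
Eclipse JDT: period considered 01/2002-12/2011; releases considered 3.0, 3.2, 3.4, 3.6, 4.2
Netbeans: period considered 01/2001-08/2012; releases considered 3.4, 3.6, 5.5, 6.9, 7.2
Samba: period considered 01/2000-09/2011; releases considered 233020302535040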

Case Study

• Goal: analyze data from mailing lists/issue trackers and versioning systems
• Purpose: observe the reorganization of the teams between releases
• Quality focus: better understand the reason behind the reorganization of teams

46

How do Emerging Collaborations Change across Software Releases?

• Teams Merge in a New Release: in 20-35% of the cases
• Teams Split in a New Release: in 15-35% of the cases
• Teams Disappeared: 22-45%
• Teams Survived: 50-70%

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure panels: TEAMS SPLIT and TEAMS MERGED, each reporting MQ and CCBC]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

Two Empirical Studies Aimed at Understanding:

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)
• Subjects: 33 participants (11 Bachelor Students, 18 Master Students, 4 PhD Students)

[Chart labels as extracted: 121, 121, 667, 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart labels as extracted: 72, 131032]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
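Because the pattern names are written like regular expressions, a sequence of visited artifacts can be classified by matching it against them. The sketch below assumes each participant's trace before reaching source code is encoded as a string over the letters S, D, U, and J; the encoding, the function name `classify_trace`, and the "Other" fallback are assumptions made for illustration.

```python
import re

# Complex patterns, written exactly as named on the slides.
COMPLEX_PATTERNS = [
    "SU(SD)+",
    "S(US)+",
    "U(SD)+",
    "(SD)+",
    "(US)+",
    "(DS)+",
]


def classify_trace(trace: str) -> str:
    """Map a trace such as 'SDSD' to the name of the pattern it follows."""
    if len(trace) == 1 and trace in "SDUJ":
        return trace  # simple pattern: a single kind of artifact
    for pattern in COMPLEX_PATTERNS:
        if re.fullmatch(pattern, trace):
            return pattern
    return "Other"


for example in ["S", "SDSD", "USD", "USUS", "SUS", "JS"]:
    print(example, "->", classify_trace(example))
```

`re.fullmatch` is used so that a trace must follow a pattern from start to end; a trace that merely contains a pattern somewhere in the middle is counted as "Other".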

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart comparing Graduate and Undergraduate participants. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. X-axis: 0 to 20. Bar labels in extraction order: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5. Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc]

More experienced participants use a more "Integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Transition graph. Node labels as extracted: Source Code, Sequence Diagram, Javadoc. Edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only after that do they read and write Source Code
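A transition graph of this kind can be derived from browsing traces by counting how often one kind of artifact is followed by another. The sketch below is a generic illustration, not the procedure used in the study: the traces are invented and the function name `transition_percentages` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def transition_percentages(traces: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """For each artifact kind, the percentage of moves going to each next kind."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for trace in traces:
        for source, target in zip(trace, trace[1:]):
            counts[source][target] += 1
    result: Dict[str, Dict[str, float]] = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        result[source] = {t: 100.0 * n / total for t, n in targets.items()}
    return result


# Invented traces: the ordered kinds of artifacts opened by three participants.
traces = [
    ["Use Case", "Sequence Diagram", "Source Code"],
    ["Source Code", "Sequence Diagram", "Source Code"],
    ["Source Code", "Javadoc", "Source Code", "Sequence Diagram"],
]
graph = transition_percentages(traces)
print(graph["Source Code"])
```

Each inner dictionary sums to 100, so the values can be read directly as edge labels of the outgoing transitions of one node.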

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• book, hotel, room, refund, arrival, check, parking, double, suite, group
• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term frequencies: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
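The oracle construction can be reproduced on the three example labelings from the slide. The sketch below keeps every term chosen by at least half of the subjects; the function name `build_oracle` and its `min_fraction` parameter are hypothetical names introduced for the example.

```python
from collections import Counter
from typing import Iterable, List, Set


def build_oracle(labelings: List[Iterable[str]], min_fraction: float = 0.5) -> Set[str]:
    """Keep the terms chosen by at least `min_fraction` of the subjects."""
    # set(labels) ensures a subject counts at most once per term.
    counts = Counter(term for labels in labelings for term in set(labels))
    threshold = min_fraction * len(labelings)
    return {term for term, n in counts.items() if n >= threshold}


# The three example labelings shown on the slide.
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(labelings)
print(sorted(oracle))
```

With three subjects the 50% threshold is 1.5, so a term enters the oracle when at least two subjects chose it, which yields the eight terms listed with frequency 2 or 3.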

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small Projects: finding Mentors is a trivial problem
• Large Projects: finding Mentors is not a trivial problem


47

How do Emerging Collaborations Change across Software Releases?

• Teams merge in a new release: in 20-35% of the cases
• Teams split in a new release: in 15-35% of the cases
• Teams disappeared: 22-45%
• Teams survived: 50-70%

51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: MQ and CCBC of the files changed by teams that split and by teams that merged]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary: Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

56

Two Empirical Studies Aimed at Understanding:

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)
• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Chart; extracted labels: 121 121 667 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

61

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart; extracted labels: 72, 131032]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns
S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns
(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
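The pattern notation on this slide reads like regular expressions over the artifact codes S, D, U and J. The following is a minimal sketch under that assumption: it supposes each participant's navigation before reaching source code is logged as a string of those codes, and labels it with one of the slide's patterns. The function name `classify_navigation` and the log format are hypothetical and do not come from the study.

```python
import re

# Pattern labels from the slide, checked in order. Each regex must match the
# whole navigation string (the artifacts visited before reaching source code).
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("U", r"U"),
    ("J", r"J"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
]


def classify_navigation(sequence: str) -> str:
    """Return the slide's pattern label for a navigation string, or 'Other'."""
    for label, regex in PATTERNS:
        if re.fullmatch(regex, sequence):
            return label
    return "Other"
```

For example, a participant who alternated twice between a sequence diagram and a class diagram would be logged as `"SDSD"` and labeled `(SD)+`, while any string matching none of the listed patterns falls into `Other`.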

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart, Graduate vs. Undergraduate participants; axis from 0 to 20.
Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other.
First extracted series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4.
Second extracted series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5.]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

67

Transition Graph between Kinds of Software Artifacts

[Graph over Source Code, Sequence Diagram and Javadoc; extracted edge labels: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
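A transition graph of this kind can be derived by counting consecutive pairs of artifact kinds in each participant's navigation log and normalizing per source artifact. The sketch below illustrates that idea only; the sessions are made up, the function name `transition_percentages` is hypothetical, and whether the edge labels on the slide are percentages is an assumption.

```python
from collections import Counter, defaultdict


def transition_percentages(sessions):
    """Percentage of moves from each artifact kind to each other kind."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Count every consecutive (source, target) pair in the session.
        for src, dst in zip(session, session[1:]):
            if src != dst:  # ignore staying on the same artifact kind
                counts[src][dst] += 1
    graph = {}
    for src, targets in counts.items():
        total = sum(targets.values())
        graph[src] = {dst: 100.0 * n / total for dst, n in targets.items()}
    return graph


# Illustrative (made-up) sessions, not data from the experiment.
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram", "Source Code"],
    ["Use Case", "Sequence Diagram", "Class Diagram", "Source Code", "Javadoc"],
]
graph = transition_percentages(sessions)
```

With these two sessions, every move out of a Use Case goes to a Sequence Diagram, and moves out of Source Code are split evenly between Sequence Diagram and Javadoc.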

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

70

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Labels assigned to a Java class (one list per subject):
• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• book, hotel, room, refund, arrival, check, parking, double, suite, group
• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
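The oracle construction on this slide can be sketched in a few lines: count how many subjects chose each term and keep the terms chosen by at least half of them. The three term lists are the ones shown on the slide; the function name `build_oracle` and the `threshold` parameter are hypothetical names introduced for illustration.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects."""
    # set(terms) ensures a subject counts at most once per term.
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival",
     "departure", "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)
```

With three subjects the cut-off is 1.5, so a term enters the oracle when at least two subjects chose it, which yields the eight terms listed with counts of 2 or 3 on the slide.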

74

Experiment B Context: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)
Empirical Studies: PART I and PART II
Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work: Dagenais et al., ICSE 2010

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

82

Motivation: Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant): Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF:

F1: exchanged emails
F2: amount of emails (drawn with ">" comparisons in the diagram)
F3: project age
F4: newcomer early emails (1st)
F5: commits

[Timeline diagram; labels: "When Alice joins the project", "When Alice was a Student"]

Score computed by aggregating the factors in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i f_i$$
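The weighted-sum aggregation of the five factors can be illustrated as follows. This is only a sketch: the factor values, the equal weights and the candidate names other than Jim are hypothetical, and the slides do not say how the factors are normalized or how the weights are chosen.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i over the factors."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical equal weights and normalized factor values f1..f5 for two
# candidate mentors of the same newcomer; none of these come from the slides.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "Jim": [1.0, 0.5, 1.0, 0.5, 0.0],
    "Bob": [0.0, 0.5, 0.5, 0.0, 0.5],
}

# Rank candidates from the highest to the lowest score.
ranking = sorted(
    candidates,
    key=lambda name: mentor_score(candidates[name], weights),
    reverse=True,
)
```

Ranking candidates by this score and keeping the first one or two corresponds to the Top 1 and Top 2 recommendations evaluated later in the deck.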

105

Recommending Mentors

[Timeline diagram: past project developers and past mentors over time; a newcomer joins at time t and is matched with a mentor with adequate skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

108

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2; projects: Apache, FreeBSD, PostgreSQL, Python, Samba; axis from 0 to 100.
First extracted series: 0.85, 0.3, 1, 0.64, 0.94.
Second extracted series: 0.81, 0.24, 1, 0.77, 0.82.]

110

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2; projects: Apache, FreeBSD, PostgreSQL, Python, Samba; axis from 0 to 100.
First extracted series: 0.8, 0.67, 0.9, 0.91, 0.92.
Second extracted series: 0.72, 0.45, 0.89, 0.84, 0.76.]

YODA makes it possible to recommend mentors to project newcomers

112

Use Issue, Chat and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors from MAIL, ISSUE and CHAT sources; projects: Apache Lucene, Samba; axis from 0 to 90.
Extracted values: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

115

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

124

YODA Tool

[Screenshots of the YODA tool]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

128

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

A newcomer can find a source code description there, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter; METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
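Steps 2 and 3 of the approach can be sketched roughly as follows. The slides do not give the actual tracing rules, so the heuristic used here (a paragraph is traced to a method when it contains the method name followed by an opening parenthesis) is an assumption made for illustration, as are the function names `extract_paragraphs` and `trace_to_methods`. The middle paragraph of the sample message is the Lucene example quoted in the deck; the greeting and sign-off are made up.

```python
import re


def extract_paragraphs(message: str):
    """Step 2 (sketch): split a message into non-empty paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, method_names):
    """Step 3 (sketch): map each method name to the paragraphs that mention it."""
    traced = {}
    for name in method_names:
        # Assumed heuristic: the method name followed by an opening parenthesis.
        mention = re.compile(r"\b" + re.escape(name) + r"\s*\(")
        hits = [p for p in paragraphs if mention.search(p)]
        if hits:
            traced[name] = hits
    return traced


message = (
    "Hi all,\n\n"
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with segments descriptor file with wrong data.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, ["split", "merge"])
```

Only the paragraph that mentions `split(` is kept for the method `split`; the filtering of Steps 4 and 5 would then be applied to these candidate paragraphs.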

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:
• Q&A sites
• Issue trackers
• Mailing lists

133

Approach: Precision vs. Number of Methods Covered

[Chart: precision vs. number of methods covered]

We mine useful Java method descriptions from developers' discussions

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Screenshots of the CODES tool]

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

150

Future work...

New Recommenders
• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders
• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers



47

Teams Merge in a New Release

- in 20-35 of the cases

Teams Split in a New Release

- In 15-35 of the cases

How do Emerging Collaborations Change across Software Releases

Teams Disappeared

22-45

Teams Survived

50-70

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

49

TEAMS SPLIT

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when they describe source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

How Developers Browse and Understand Software Artifacts

PART II Summary

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF the following factors, observed over time from when Alice joins the project, support it:

• F1: exchanged emails

• F2: amount of emails

• F3: project age

• F4: newcomer early emails

• F5: commits

Score computed by aggregating the factors in a weighted sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
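The weighted sum above can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the weights, the normalized factor values, the candidate names other than Jim, and the helper names `mentor_score` and `rank_candidates` are hypothetical and are not taken from YODA itself.

```python
def mentor_score(factors, weights):
    """Aggregate the five factor values f_1..f_5 into one score: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Order candidate mentors from highest to lowest score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical weights w_1..w_5 and factor values already scaled to [0, 1].
weights = [0.3, 0.2, 0.1, 0.2, 0.2]
candidates = {
    "Jim": [0.9, 0.8, 0.5, 0.7, 0.6],
    "Bob": [0.2, 0.4, 0.5, 0.1, 0.9],
}

ranking = rank_candidates(candidates, weights)
print(ranking)
```

How the weights are chosen and how each factor is measured are left open in this sketch; only the aggregation step is shown.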

93

Recommending Mentors

[Figure sequence: a timeline of project developers; a newcomer joins at time t; among the past project developers, the past mentors are identified, and a mentor with adequate skills is recommended to the newcomer.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba. Data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

109

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba. Data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

YODA makes it possible to recommend mentors to project newcomers

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors using MAIL, ISSUE, and CHAT data for Apache Lucene and Samba. Data labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Figures: YODA architecture diagrams, followed by YODA Tool screenshots.]

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

A newcomer can find source code descriptions there, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining Source Code Descriptions from Developer Communications. International Conference on Program Comprehension (IEEE ICPC 2012)
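Steps 2 to 5 of the approach above can be sketched as a small Python pipeline. This is a rough illustration under loud assumptions, not the published implementation: the tracing rule (method name followed by an opening parenthesis), the minimum-length heuristic, the bag-of-words cosine similarity, both thresholds, and all function names are hypothetical stand-ins. Step 1 (downloading messages and tracing them onto classes) is not shown.

```python
import math
import re
from collections import Counter


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, method_name):
    """Step 3 (assumed rule): the paragraph contains `name(`."""
    pattern = r"\b" + re.escape(method_name) + r"\s*\("
    return re.search(pattern, paragraph) is not None


def cosine(text_a, text_b):
    """Cosine similarity between the bag-of-words vectors of two texts."""
    a = Counter(re.findall(r"[a-z]+", text_a.lower()))
    b = Counter(re.findall(r"[a-z]+", text_b.lower()))
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mine_descriptions(message, method_name, method_source,
                      min_words=5, min_similarity=0.1):
    """Keep the paragraphs that survive steps 3, 4, and 5."""
    kept = []
    for paragraph in extract_paragraphs(message):
        if not mentions_method(paragraph, method_name):
            continue  # Step 3: not traced onto this method
        if len(paragraph.split()) < min_words:
            continue  # Step 4: placeholder heuristic (too short to describe anything)
        if cosine(paragraph, method_source) < min_similarity:
            continue  # Step 5: too dissimilar from the method's source text
        kept.append(paragraph)
    return kept


# Hypothetical message and method text, loosely modeled on the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "The method split(File destDir, String[] segs) creates an index "
    "with a wrong segments file.\n\n"
    "Thanks"
)
method_source = (
    "public void split(File destDir, String[] segs) "
    "{ /* copy segments into dest index */ }"
)

result = mine_descriptions(message, "split", method_source)
print(result)
```

The greeting and sign-off are dropped at step 3 because they never mention the method, so only the descriptive paragraph is kept.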

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site

• Issue tracker

• Mailing list

131

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions


136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: Mining Source Code Descriptions from Developers' Discussions. Best Tool Award at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Figures: CODES Tool screenshots.]

147

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring: Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring: Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure sequence: cohesiveness of the changed files, measured with MQ and CCBC, when TEAMS SPLIT and when TEAMS MERGED.]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of Developers' Communication

PART I Summary

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand

Software Artifacts

54

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Chart: share of time spent on each kind of artifact. Data labels in extraction order: 72, 13, 10, 3, 2.]

• Undergraduate students used Source Code and Javadoc significantly more than Graduate students

• Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple Navigation Patterns

• S = Sequence Diagram

• D = Class Diagram

• U = Use Case

• J = Javadoc

Complex Navigation Patterns

• (SD)+ = Sequence Diagram before Class Diagram

• (US)+ = Use Case before Sequence Diagram

• (DS)+ = Class Diagram before Sequence Diagram

• U(SD)+ = Use Case before (SD)+
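Because the navigation patterns above are written in a regular-expression-like notation, they can be matched directly with Python's `re` module. The sketch below is an illustration only: it assumes each participant's navigation before reaching source code is encoded as a string over the letters S, D, U, and J (a hypothetical encoding), and the `classify` helper is not part of the original study's tooling.

```python
import re

# Pattern names as shown on the slides, paired with equivalent regular expressions.
PATTERNS = [
    ("SU(SD)+", r"SU(SD)+"),
    ("S(US)+", r"S(US)+"),
    ("U(SD)+", r"U(SD)+"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("S", r"S"),
    ("D", r"D"),
    ("U", r"U"),
    ("J", r"J"),
]


def classify(trace):
    """Return the name of the pattern that matches the whole trace, else "Other"."""
    for name, regex in PATTERNS:
        if re.fullmatch(regex, trace):
            return name
    return "Other"


# Hypothetical traces: each letter is one artifact opened before the source code.
for trace in ["S", "SDSD", "USD", "SUSDSD", "JD"]:
    print(trace, "->", classify(trace))
```

Using `re.fullmatch` means a trace is assigned to a pattern only when the entire sequence fits it, so partial matches fall into "Other".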

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart comparing Graduate and Undergraduate participants across the patterns S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, and Other. Data labels in extraction order: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5.]

More experienced participants use a more "integrated approach"

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
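The vote filter mentioned in Step 3 might look like the sketch below. The record layout (`text`, `votes`), the function name, and the sample paragraphs are hypothetical; the slide only states that paragraphs of discussions with 0 votes are discarded.

```python
from dataclasses import dataclass

@dataclass
class Paragraph:
    text: str
    votes: int  # assumed field: votes of the discussion the paragraph comes from

def discard_unvoted(paragraphs):
    """Keep paragraphs from discussions with at least one vote.

    The slide only mentions discarding 0-vote discussions; dropping
    negatively voted ones as well is an assumption of this sketch.
    """
    return [p for p in paragraphs if p.votes > 0]

# Hypothetical sample data.
sample = [
    Paragraph("This method writes the selected items to the target directory.", 3),
    Paragraph("Has anyone tried this?", 0),
]
kept = discard_unvoted(sample)
```

Filtering on votes before the heuristic and similarity steps keeps the later, more expensive checks away from content the community never endorsed.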

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

Page 48: Supporting Newcomers in Open Source Software Development Projects

49

How Does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: panels MQ and CCBC for TEAMS SPLIT]

50

How Does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: panels MQ and CCBC for TEAMS SPLIT and for TEAMS MERGED]

51

How Does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: panels MQ and CCBC for TEAMS SPLIT and for TEAMS MERGED]

The reorganization of developers into teams is reflected in cohesive changes occurring in the system structure

52

PART I Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II - Experiment A

PART II - Experiment B

55

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B

56

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

[Figure: example labels word1 to word6]

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Figure: chart; data labels as extracted: 121 121 667 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time Did Participants Spend on Different Kinds of Artifacts?

[Figure: chart of time spent per kind of artifact; data labels as extracted: 72, 131032]

60

How Much Time Did Participants Spend on Different Kinds of Artifacts?

[Figure: same chart; label: Undergraduate Students]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

How Much Time Did Participants Spend on Different Kinds of Artifacts?

[Figure: same chart; label: Graduate Students]

Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram

D = Class Diagram

U = Use Case

J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram

(US)+ = Use Case before Sequence Diagram

(DS)+ = Class Diagram before Sequence Diagram

U(SD)+ = Use Case before (SD)+
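The pattern notation reads like regular expressions over the artifact letters S, D, U, and J. A minimal sketch of how a browsing sequence could be assigned to one of these patterns is shown below. The function name and the sample sequences are assumptions, and the two longer patterns S(US)+ and SU(SD)+ are taken from the frequency chart of the deck rather than from this slide.

```python
import re

# Pattern names as used on the slides, each compiled as a regular expression.
# Every check uses fullmatch, so a sequence can fit at most one pattern
# and the order of the list does not matter.
PATTERNS = [
    ("SU(SD)+", re.compile(r"SU(SD)+")),
    ("S(US)+", re.compile(r"S(US)+")),
    ("U(SD)+", re.compile(r"U(SD)+")),
    ("(SD)+", re.compile(r"(SD)+")),
    ("(US)+", re.compile(r"(US)+")),
    ("(DS)+", re.compile(r"(DS)+")),
    ("S", re.compile(r"S")),
    ("D", re.compile(r"D")),
    ("U", re.compile(r"U")),
    ("J", re.compile(r"J")),
]

def classify(sequence):
    """Map a string of artifact letters, visited before reaching source code, to a pattern name."""
    for name, regex in PATTERNS:
        if regex.fullmatch(sequence):
            return name
    return "Other"

# Hypothetical browsing sequences.
labels = [classify(s) for s in ["S", "SDSD", "USD", "SUS", "JJ"]]
```

Anything that fits none of the listed expressions, such as the repeated Javadoc visit `"JJ"`, falls into the "Other" bucket.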

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart comparing Graduate and Undergraduate participants; patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other; data labels as extracted: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

64

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: same bar chart; axis from 0 to 20]

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: same bar chart; axis from 0 to 20]

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph with nodes Source Code, Sequence Diagram, and Javadoc; edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37]

67

Transition Graph between Kinds of Software Artifacts

[Figure: same transition graph]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
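A transition graph of this kind can be derived by counting moves between consecutive artifacts in each participant's browsing log. The sketch below assumes a log is simply an ordered list of artifact kinds; the function name and the sample logs are invented for illustration and do not reproduce the study data.

```python
from collections import Counter, defaultdict

def transition_percentages(logs):
    """For each source artifact, return the percentage of moves going to each target."""
    counts = defaultdict(Counter)
    for log in logs:
        for src, dst in zip(log, log[1:]):
            if src != dst:  # staying on the same artifact is not a transition
                counts[src][dst] += 1
    return {
        src: {dst: 100 * n / sum(targets.values()) for dst, n in targets.items()}
        for src, targets in counts.items()
    }

# Hypothetical browsing logs of two participants.
logs = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram", "Source Code"],
    ["Use Case", "Sequence Diagram", "Class Diagram", "Source Code"],
]
graph = transition_percentages(logs)
```

In this toy data every move out of a Use Case leads to a Sequence Diagram, and two of the three moves out of a Sequence Diagram lead to Source Code.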

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

[Figure: example labels word1 to word6]

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS … (Univ. of Molise, second year); 21 Master students in CS … (University of Salerno)

[Figure: a Java class and the terms book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast]

70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS … (Univ. of Molise, second year); 21 Master students in CS … (University of Salerno)

Example term lists:

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
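The oracle construction can be replayed on the three example term lists shown on the slide. This is a small sketch: the function name is hypothetical, and reading the threshold as "a count of at least half the number of subjects" is an assumption.

```python
from collections import Counter

def build_oracle(labelings, min_share=0.5):
    """Return the terms chosen by at least `min_share` of the subjects, with their counts."""
    # set(terms) makes sure a subject counts at most once per term.
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = min_share * len(labelings)
    return {term: n for term, n in counts.items() if n >= needed}

# The three example term lists from the slide.
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(labelings)
```

With three subjects the threshold is 1.5, so every term picked by two or three of them enters the oracle. The result matches the counts listed on the slide: room 3, arrival 3, and book, hotel, reservation, departure, double, card with 2 each.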

71

Experiment B Context

Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

72

Experiment B Context

Comparison of Different Labeling Techniques

73

Experiment B Context

Comparison of Different Labeling Techniques

74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF

[Diagram, built up over several slides: timeline (Time) for Jim and Alice with the labels "When Alice was a Student" and "When Alice joins the project", annotated with factors F1 to F5; F2 carries a ">" marker and F4 a "1st" marker]

F1: Exchanged emails

F2: Amount of emails

F3: Project age

F4: Newcomer early emails

F5: Commits

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$$\mathrm{score} = \sum_{i=1}^{5} w_i f_i$$
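The weighted-sum aggregation of the five factors F1 to F5 can be sketched as below. The slides give neither the weights nor how the factors are normalized, so the factor values, the equal weights, and the function name are hypothetical placeholders.

```python
def mentor_score(factors, weights):
    """Aggregate factor values f_1..f_n into one score as the weighted sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, factors))

# Hypothetical, already normalized values of F1..F5 for one candidate pair (Jim, Alice).
factors = [1.0, 0.5, 0.25, 0.0, 0.5]
# Hypothetical equal weights; the slides do not state the actual ones.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
score = mentor_score(factors, weights)  # approximately 0.45
```

Ranking all candidate pairs by such a score and keeping the highest ones is one plausible way to identify past mentors; how the threshold is chosen is not stated on the slides.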

93

Recommending Mentors

[Diagram labels: Time, Project Developers]

94

Recommending Mentors

[Diagram labels: Time, Project Developers, Newcomer, t]

95

Recommending Mentors

[Diagram labels: Time, Project Developers, Newcomer, t, Mentor with Adequate Skills]

96

Recommending Mentors

[Diagram labels: Time, Past Mentors, Newcomer, t]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

97

Recommending Mentors

[Diagram labels: Time, t]

98

Recommending Mentors

[Diagram labels: Time, t, Past Project Developers]

99

Recommending Mentors

[Diagram labels: Time, t, Past Project Developers]

100

Recommending Mentors

[Diagram labels: Time, t, Past Project Developers]

101

Recommending Mentors

[Diagram labels: Time, t, Past Project Developers]

102

Recommending Mentors

[Diagram labels: Time, Past Mentors, Newcomer, t]

103

Recommending Mentors

[Diagram labels: Time, Past Mentors, Newcomer, t]

104

Recommending Mentors

[Diagram labels: Time, Past Mentors, Newcomer, t]

105

Recommending Mentors

[Diagram labels: Time, Past Mentors, Newcomer, t]

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; axis: Precision, 0 to 100; data labels as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

107

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: same bar chart]

108

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: same bar chart]

109

Results When Both Mails and Issues Are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; axis: Precision, 0 to 100; data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

[Figure: same bar chart as on slide 109]

YODA Makes it Possible to Recommend Mentors
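One common way to compute precision on the top-k recommendations is sketched below. The slides do not define the metric, so this reading (correct recommendations divided by all recommendations made within the top k) is an assumption, and the names and sample data are hypothetical.

```python
def precision_at_k(recommended, actual, k):
    """Share of top-k recommended mentors that are actual mentors, over all newcomers.

    recommended: newcomer -> ranked list of candidate mentors
    actual:      newcomer -> set of actual mentors
    """
    correct = 0
    total = 0
    for newcomer, ranking in recommended.items():
        top = ranking[:k]
        total += len(top)
        correct += sum(1 for mentor in top if mentor in actual.get(newcomer, set()))
    return correct / total if total else 0.0

# Hypothetical recommendations and ground truth.
recommended = {"alice": ["jim", "bob"], "carl": ["dana", "jim"]}
actual = {"alice": {"jim"}, "carl": {"jim"}}
top1 = precision_at_k(recommended, actual, 1)
top2 = precision_at_k(recommended, actual, 2)
```

In this toy data one of the two top-1 recommendations and two of the four top-2 recommendations are correct, so both values come out at 0.5.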

111

Use Issue, Chat, and Mail sources to recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" for Apache Lucene and Samba; legend: MAIL, ISSUE, CHAT; axis from 0 to 90; data labels as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

112

Use Issue, Chat, and Mail sources to recommend Mentors

[Figure: same bar chart]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer

Can find: Source code description


Page 49: Supporting Newcomers in Open Source Software Development Projects

50

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

51

TEAMS SPLIT

TEAMS MERGED

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams

MQ

CCBC

MQ

CCBC

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure

52

Analysis of DevelopersrsquoCommunication

1) Social network recommenders should not limit their information mining a single source

2) Issue and mail can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to building better recommenders for software re-modularization or refactoring actions

PART I

Summary

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

[Figure: the same bar chart of navigation pattern frequencies for Graduate and Undergraduate participants as on the preceding slides.]

More experienced participants use a more “integrated approach”.

Most Frequent Navigation Patterns Before Reaching Source Code

66

[Figure: transition graph whose nodes include Source Code, Sequence Diagram and Javadoc. Edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37.]

Transition Graph between Kinds of Software Artifacts

67

[Figure: the same transition graph as on the preceding slide.]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams.

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code.

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code.

Transition Graph between Kinds of Software Artifacts
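A transition graph of this kind can be derived from the ordered list of artifacts a participant visited. The sketch below is only an illustration: the function names are hypothetical, and since it is not recoverable whether the slide's edge labels are counts or percentages, it computes both.

```python
from collections import Counter


def transition_counts(visits):
    """Count transitions between consecutive, distinct artifact kinds."""
    counts = Counter()
    for src, dst in zip(visits, visits[1:]):
        if src != dst:  # staying on the same artifact is not a transition
            counts[(src, dst)] += 1
    return counts


def transition_percentages(counts):
    """Turn counts into per-source percentages (each source sums to 100)."""
    totals = Counter()
    for (src, _), n in counts.items():
        totals[src] += n
    return {edge: 100.0 * n / totals[edge[0]] for edge, n in counts.items()}


# Hypothetical browsing session of one participant.
visits = [
    "Use Case",
    "Sequence Diagram",
    "Source Code",
    "Sequence Diagram",
    "Source Code",
]
counts = transition_counts(visits)
percentages = transition_percentages(counts)
```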

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

[Figure: a source code artifact labeled with the words word1 to word6.]

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

[Figure: a Java class labeled with the terms book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast.]

70

[Example: terms chosen by three subjects to label the same class]

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term frequencies: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
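The oracle construction can be reproduced with a few lines of code. This is a minimal sketch using the three term lists shown on the slide; the function name `build_oracle` and the `threshold` parameter are illustrative, not names from the study.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects."""
    # Each subject counts at most once per term, hence the set().
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    return {term: n for term, n in counts.items() if n >= needed}


labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(labelings)
```

With three subjects the 50% threshold means a term must be chosen by at least two of them, which yields exactly the eight terms and frequencies listed on the slide.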

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.


74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures matches very well the mental model of newcomers when describing source code.

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

[Diagram: Data Extraction (Software Repositories); Empirical Studies (PART I and PART II); Recommenders (PART III)]

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers’ Communication to Improve Newcomers’ Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects


83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF the following factors hold (built up one factor per slide):

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

[Diagram residue: a timeline ("Time") marks when Alice joins the project; comparison symbols (">") accompany F2, the label "1st" accompanies F4, and the annotation "When Alice was a Student" appears on the F4 slide.]

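Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$

where f_1, ..., f_5 are the values of the factors F1 to F5 and w_i are their weights.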
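The weighted sum over the five factors can be sketched as follows. The slides give neither the weights nor the way each factor is normalized, so the function names, the example weights and the factor values below are purely illustrative assumptions.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values F1..F5 for one (mentor, newcomer) pair."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Sort candidate names by descending score; `candidates` maps name -> factor list."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical equal weights and factor values scaled to [0, 1].
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "Jim": [1.0, 1.0, 1.0, 1.0, 1.0],
    "Bob": [0.0, 0.5, 1.0, 0.0, 0.0],
}
ranking = rank_candidates(candidates, weights)
```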

93

Recommending Mentors

[Diagram sequence, slides 93 to 105: on a timeline ("Time"), a Newcomer joins the project at time t. Among the Project Developers active before t (Past Project Developers, Past Mentors), a Mentor with Adequate Skills is recommended.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011.

106

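Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart, "Mentor Recommendations Precision on Top 1 and Top 2"; y-axis: Precision, 0 to 100. Values are listed in extraction order with decimal points restored; which series corresponds to Top 1 and which to Top 2 did not survive extraction.]

Project      Series 1   Series 2
Apache       0.85       0.81
FreeBSD      0.30       0.24
PostgreSQL   1.00       1.00
Python       0.64       0.77
Samba        0.94       0.82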
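The slides do not spell out how precision on the Top 1 and Top 2 recommendations is computed. One plausible reading, assumed here for illustration only, is the share of the top-k suggested developers, over all newcomers, who actually mentored that newcomer. The function name and the data layout are hypothetical.

```python
def precision_at_k(recommendations, actual_mentors, k):
    """Share of top-k suggestions, over all newcomers, that are actual mentors.

    recommendations: newcomer -> ranked list of suggested developers
    actual_mentors:  newcomer -> set of developers who really mentored them
    """
    suggested = 0
    correct = 0
    for newcomer, ranked in recommendations.items():
        top_k = ranked[:k]
        suggested += len(top_k)
        mentors = actual_mentors.get(newcomer, set())
        correct += sum(1 for dev in top_k if dev in mentors)
    return correct / suggested if suggested else 0.0


# Hypothetical data for two newcomers.
recommendations = {"alice": ["jim", "bob"], "carl": ["bob", "jim"]}
actual_mentors = {"alice": {"jim"}, "carl": {"jim"}}
```

Under this reading the Top 2 value can be lower or higher than the Top 1 value, since adding a second suggestion enlarges both the number of hits and the number of suggestions.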


109

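Results When Both Mails and Issues are Used

[Figure: bar chart, "Mentor Recommendations Precision on Top 1 and Top 2"; y-axis: Precision, 0 to 100. Values are listed in extraction order with decimal points restored; which series corresponds to Top 1 and which to Top 2 did not survive extraction.]

Project      Series 1   Series 2
Apache       0.80       0.72
FreeBSD      0.67       0.45
PostgreSQL   0.90       0.89
Python       0.91       0.84
Samba        0.92       0.76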

110

[Figure: the same precision chart as on the preceding slide.]

YODA makes it possible to recommend mentors.

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: bar chart, "Precision in Recommending Mentors", for Apache Lucene and Samba; series: MAIL, ISSUE, CHAT; axis 0 to 90. Bar values as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60. Their assignment to projects and sources did not survive extraction.]


113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html


116

YODA Tool

[Screenshots of the YODA tool, slides 116 to 124.]

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers’ Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

[Diagram: a Newcomer can find source code descriptions in these messages.]

128

Mining Summaries: example of a mined description

CLASS: IndexSplitter
METHOD: split

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index.”

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
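The tracing part of the five-step approach (the class tracing of Step 1, then Steps 2 and 3) could be sketched as below. The slides only name the steps, so the concrete rules used here (whole-word class match, blank-line paragraph splitting, method name followed by an opening parenthesis) are assumptions made for illustration, and the function names are hypothetical.

```python
import re


def mentions_class(message, class_name):
    """Step 1 (tracing part): does the message mention the class as a whole word?"""
    return re.search(r"\b" + re.escape(class_name) + r"\b", message) is not None


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_method(paragraphs, method_name):
    """Step 3: keep paragraphs that mention the method as a call, e.g. `split(`."""
    call = re.compile(r"\b" + re.escape(method_name) + r"\s*\(")
    return [p for p in paragraphs if call.search(p)]


# Shortened version of the bug report quoted on the slides.
message = (
    "Hi all,\n\n"
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with segments descriptor file with wrong data.\n\n"
    "Thanks"
)
paragraphs = extract_paragraphs(message)
traced = trace_to_method(paragraphs, "split")
```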

130

Supporting Software Development

Help newcomers’ program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131

Approach Precision vs Number of Methods Covered

[Figure: plot of the approach's precision against the number of methods covered; the plotted values did not survive extraction.]

We mine useful Java method descriptions from developers’ discussions.


136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
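The vote filter of Step 3 and the similarity-based filtering of Step 5 might look like the sketch below. The slides do not say which similarity measure or threshold CODES uses; cosine similarity over raw term frequencies, the 0.1 threshold and all function names are assumptions chosen only to illustrate the idea.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; camelCase identifiers are split into their parts."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)
    return re.findall(r"[a-z]+", spaced.lower())


def cosine(a, b):
    """Cosine similarity between two token lists using raw term frequencies."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(n * n for n in va.values()))
    norm_b = math.sqrt(sum(n * n for n in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_paragraphs(paragraphs, method_text, min_votes=1, min_similarity=0.1):
    """Drop zero-vote paragraphs, then keep those similar enough to the method.

    paragraphs: list of (text, votes) pairs
    method_text: the method signature (or body) to compare against
    """
    method_tokens = tokenize(method_text)
    return [
        text
        for text, votes in paragraphs
        if votes >= min_votes
        and cosine(tokenize(text), method_tokens) >= min_similarity
    ]


# Hypothetical paragraphs with their vote counts.
paragraphs = [
    ("The split method writes each segment into destDir", 3),
    ("The split method writes each segment into destDir", 0),
    ("Thanks for the quick reply", 5),
]
kept = filter_paragraphs(paragraphs, "split(File destDir, String[] segs)")
```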

CODES Tool

[Screenshots of the CODES tool, slides 139 to 147.]

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders’ performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by raising precision and coverage as high as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

155

PART I

PART II

PART III



51

How does the Evolution of DNs Relate to the Cohesiveness of Files Changed by Emerging Teams?

[Figure: two panels, TEAMS SPLIT and TEAMS MERGED, each reporting the metrics MQ and CCBC.]

The re-organization of developers into teams is reflected in cohesive changes occurring in the system structure.

52

PART I Summary

Analysis of Developers’ Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54

Two Empirical Studies Aimed at Understanding:

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

[Figure: a source code artifact labeled with the words word1 to word6.]

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor Students, 18 Master Students, 4 PhD Students)

[Figure: chart of the participant distribution. Labels as extracted: 121, 121, 667, 72.]

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

Example of a description found in developer communication:

"When calling the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with a segments descriptor file with wrong data. Namely, wrong is the number representing the name of the segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications.

International Conference on Program Comprehension (IEEE ICPC 2012)
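The five steps listed above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the slides name the steps but not their details, so the tracing rule (class-name mention, method name followed by an opening parenthesis), the heuristic filter (minimum paragraph length), the cosine similarity measure, the threshold, and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens; camelCase identifiers are split into words."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mine_descriptions(message, class_name, methods, min_similarity=0.1):
    """Return {method_name: [paragraphs]} mined from one message.

    `methods` maps a method name to the text of its signature/body.
    All filtering rules below are hypothetical stand-ins for Steps 1-5.
    """
    # Step 1 (assumed rule): trace the message onto the class by name mention.
    if class_name not in message:
        return {}
    # Step 2: extract paragraphs, taken here as blank-line separated blocks.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]
    result = {}
    for paragraph in paragraphs:
        for name, source in methods.items():
            # Step 3 (assumed rule): method name followed by "(".
            if not re.search(r"\b" + re.escape(name) + r"\s*\(", paragraph):
                continue
            # Step 4 (assumed heuristic): discard very short paragraphs.
            if len(paragraph.split()) < 5:
                continue
            # Step 5: keep paragraphs textually similar to the method.
            if cosine(tokens(paragraph), tokens(source)) >= min_similarity:
                result.setdefault(name, []).append(paragraph)
    return result
```

Applied to a message that mentions `IndexSplitter` and contains a paragraph calling `split(...)`, the sketch returns that paragraph as a candidate description of `split`; greetings and signatures are dropped because they do not reference a method.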

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns.

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Path…
  • Newcomer Learning Path… (2)
  • Newcomer Learning Path… (3)
  • Newcomer Learning Path… (4)
  • Newcomer Learning Path… (5)
  • Newcomer Learning Path… (6)
  • Newcomer Learning Path… (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future work…
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

52

PART I

Summary

Analysis of Developers' Communication

1) Social network recommenders should not limit their information mining to a single source

2) Issues and mails can be used to identify leaders with high accuracy

3) Social interaction between developers can be used to build better recommenders for software re-modularization or refactoring actions

53

PART II

How Developers Browse and Understand Software Artifacts

54

Two Empirical Studies Aimed at Understanding

PART II – Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II – Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks.

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time Did Participants Spend on Different Kinds of Artifacts?

[Figure: chart of the time participants spent on each kind of artifact.]

• Undergraduate students used Source Code and Javadoc significantly more than Graduate students

• Graduate students used Class Diagrams significantly more than Undergraduate students

62

Navigation Patterns Followed by Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
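The pattern notation above reads naturally as regular expressions over the artifact letters S, D, U, and J. The following Python sketch classifies a navigation sequence under that reading. Interpreting "+" as "one or more repetitions", encoding a sequence as a string of letters, and the names `PATTERNS` and `classify` are assumptions made for illustration, not part of the original study.

```python
import re

# Pattern labels from the slide, each paired with a regular expression.
# Reading "+" as "one or more repetitions" is an assumption.
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("J", r"J"),
    ("U", r"U"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
]


def classify(sequence):
    """Return the label of the first pattern matching the whole sequence.

    `sequence` is a string such as "SDSD": the artifacts a participant
    visited, in order, before reaching the source code.
    """
    for label, regex in PATTERNS:
        if re.fullmatch(regex, sequence):
            return label
    return "Other"
```

For example, `classify("SDSD")` yields `(SD)+` and `classify("USD")` yields `U(SD)+`; any sequence that fits none of the listed patterns falls into `Other`.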

63

Most Frequent Navigation Patterns Before Reaching Source Code

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

[Figure: horizontal bar chart with two data series (legend: Graduate, Undergraduate); axis from 0 to 20. Values per pattern, in the order the two series appear:]

Pattern    First series   Second series
S          18             12
D          8              16
(SD)+      2              12
(US)+      2              10
U(SD)+     1              4
(DS)+      1              4
J          3              1
U          0              2
S(US)+     1              1
SU(SD)+    0              1
Other      4              5

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between kinds of software artifacts (nodes include Source Code, Sequence Diagram, and Javadoc), with numeric labels on the edges.]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
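A transition graph of this kind can be derived from logged navigation sessions by counting how often one artifact kind is followed by another. The Python sketch below shows one way to do so; the session format (an ordered list of artifact names per participant) and the function name `transition_percentages` are assumptions made for illustration, and the slides do not state how the original graph was computed.

```python
from collections import Counter, defaultdict


def transition_percentages(sessions):
    """Build {source: {destination: percentage}} from navigation sessions.

    Each session is the ordered list of artifact kinds one participant
    opened. Percentages are relative to all transitions leaving a source.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair every visited artifact with the one visited right after it.
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    graph = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        graph[src] = {dst: 100.0 * n / total for dst, n in outgoing.items()}
    return graph
```

With sessions such as `["Use Case", "Sequence Diagram", "Source Code"]`, the edge from "Use Case" to "Sequence Diagram" receives the share of all transitions that leave "Use Case".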

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
  17 Bachelor students in CS (Univ. of Molise, second year)
  21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

  book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
  book, hotel, room, refund, arrival, check, parking, double, suite, group
  confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
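The oracle rule stated above (keep the terms selected by at least 50% of the subjects) can be reproduced on the hotel-reservation example from the slide. The Python sketch below is illustrative only; the function name `build_oracle` and the `min_share` parameter are hypothetical.

```python
from collections import Counter


def build_oracle(labelings, min_share=0.5):
    """Return the terms chosen by at least `min_share` of the subjects.

    `labelings` holds one list of terms per subject; a term counts once
    per subject even if that subject repeated it.
    """
    counts = Counter(term for terms in labelings for term in set(terms))
    threshold = min_share * len(labelings)
    return {term for term, n in counts.items() if n >= threshold}


# The three example labelings shown on the slide.
labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival",
     "departure", "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)
```

With three subjects the threshold is 1.5, so a term needs at least two votes. The resulting set contains room, arrival, book, hotel, reservation, departure, double, and card, matching the term counts listed on the slide.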

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II

Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects?

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

[Figure: timeline for the pair Jim and Alice (labels: "When Alice was a Student", "When Alice joins the project"), built up one factor per slide from F1 to F5.]

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$score = \sum_{i=1}^{5} w_i \cdot f_i$
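The weighted sum of the five factors described above can be sketched in a few lines of Python. The factor values, the weights, the candidate names, and the function names `mentor_score` and `rank_mentors` are hypothetical; the slides do not state how the factors are normalized or how the weights are chosen.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: score = sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates, weights):
    """Sort candidate names by decreasing score.

    `candidates` maps a developer name to the values of F1..F5 measured
    for the pair (developer, newcomer).
    """
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical factor values (F1..F5) and equal weights.
weights = [1, 1, 1, 1, 1]
candidates = {"Jim": [4, 2, 3, 1, 5], "Bob": [1, 0, 2, 0, 1]}
ranking = rank_mentors(candidates, weights)
```

Changing the weights shifts the relative importance of the factors, for instance giving more influence to exchanged emails (F1) than to commits (F5).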

93

Recommending Mentors

[Figure: project timeline (axis: Time) showing Project Developers, Past Project Developers, Past Mentors, the Newcomer joining at time t, and the Mentor with Adequate Skills.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.85    0.81
FreeBSD      0.30    0.24
PostgreSQL   1.00    1.00
Python       0.64    0.77
Samba        0.94    0.82
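To make the "precision on Top 1 and Top 2" measure concrete, the Python sketch below computes one possible reading of it. The slides do not define the measure, so the definition used here is an assumption: a newcomer counts as correctly served when at least one actual mentor appears among the first k recommended developers, and precision is the share of such newcomers. All names and data are hypothetical.

```python
def precision_at_k(recommendations, actual_mentors, k):
    """Share of newcomers with an actual mentor in their top-k list.

    `recommendations` maps a newcomer to a ranked list of developers;
    `actual_mentors` maps a newcomer to the set of known mentors.
    This definition is an assumption made for illustration.
    """
    if not recommendations:
        return 0.0
    hits = 0
    for newcomer, ranked in recommendations.items():
        # A hit means the top-k list shares at least one name with the oracle.
        if set(ranked[:k]) & actual_mentors.get(newcomer, set()):
            hits += 1
    return hits / len(recommendations)


# Hypothetical recommendations and oracle.
recommendations = {"alice": ["jim", "bob"], "carl": ["bob", "dana"]}
actual_mentors = {"alice": {"jim"}, "carl": {"dana"}}
top1 = precision_at_k(recommendations, actual_mentors, 1)
top2 = precision_at_k(recommendations, actual_mentors, 2)
```

In this toy data only Alice's first recommendation is an actual mentor, while both newcomers have a mentor within their first two recommendations.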

109

Results When Both Mails and Issues Are Used

Mentor Recommendations: Precision on Top 1 and Top 2

Project      Top 1   Top 2
Apache       0.80    0.72
FreeBSD      0.67    0.45
PostgreSQL   0.90    0.89
Python       0.91    0.84
Samba        0.92    0.76


Page 52: Supporting Newcomers in Open Source Software Development Projects

53

PART II

How Developers Browse and Understand

Software Artifacts

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

More experienced participants use a more "integrated approach".

66

[Graph: transitions between kinds of software artifacts; nodes include Source Code, Sequence Diagram, and Javadoc; edge weights not recoverable]

Transition Graph between Kinds of Software Artifacts

67

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams.

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code.

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code.

Transition Graph between Kinds of Software Artifacts
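A transition graph of this kind can be derived from the ordered list of artifacts each participant opened. The sketch below is illustrative only: it assumes each session is a list of artifact-kind names and that edges are weighted by the share of outgoing transitions, which the slides do not state explicitly. The function names are hypothetical.

```python
from collections import Counter


def transition_counts(sessions):
    """Count moves from one kind of artifact to a different kind."""
    counts = Counter()
    for visited in sessions:
        for source, target in zip(visited, visited[1:]):
            if source != target:
                counts[(source, target)] += 1
    return counts


def transition_percentages(counts):
    """Express each edge as a percentage of its source's outgoing transitions."""
    totals = Counter()
    for (source, _target), n in counts.items():
        totals[source] += n
    return {edge: 100.0 * n / totals[edge[0]] for edge, n in counts.items()}


# Made-up sessions, for illustration only.
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code"],
    ["Source Code", "Sequence Diagram", "Source Code"],
]
counts = transition_counts(sessions)
percentages = transition_percentages(counts)
```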

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor Students in CS (Univ. of Molise, second year) and 21 Master Students in CS (University of Salerno)

Example of terms chosen to label a Java class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Terms chosen by three subjects to label the same Java class:

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term frequencies: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
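The oracle rule can be reproduced on the three example labelings from the slide. This is a small sketch; the function name `build_oracle` and the `threshold` parameter are illustrative choices, not part of the original study material.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Keep the terms chosen by at least `threshold` of the subjects."""
    # Each subject counts at most once per term.
    counts = Counter(term for terms in labelings for term in set(terms))
    minimum = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= minimum}


# The three labelings shown on the slide.
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(labelings)
```

With three subjects, the 50% threshold keeps every term chosen by at least two of them, which matches the eight terms listed with frequencies 2 and 3.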

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures match the mental model of newcomers very well when describing source code.

75

PART II

Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts.

2) Less experienced newcomers spend a significantly higher proportion of time on source code.

3) More experienced newcomers instead spend more time on class diagrams.

4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when describing source code elements.

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects?

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

[Timeline diagram, built up over several slides: the factors are shown along a Time axis starting when Alice joins the project]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum:

$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$

where $f_i$ is the value of factor $F_i$ (F1 to F5) and $w_i$ is its weight.
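The weighted sum over the five factors can be sketched as follows. The weights and factor values here are invented for illustration (the slides do not report them), and the names `mentor_score` and `rank_candidates` are hypothetical.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Order candidate mentors by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical weights for F1..F5 and hypothetical normalized factor values.
weights = [0.3, 0.3, 0.1, 0.2, 0.1]
candidates = {
    "Jim": [0.9, 0.8, 0.5, 1.0, 0.7],
    "Bob": [0.2, 0.1, 0.5, 0.0, 0.9],
}
ranking = rank_candidates(candidates, weights)
```

Normalizing each factor to a common range before weighting keeps a single large raw count (for example, the number of emails) from dominating the score; whether YODA does this is not stated on the slides.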

93

Recommending Mentors

[Timeline diagram, built up over slides 93-105. Labels: Time, t, Newcomer, Project Developers, Past Project Developers, Past Mentors, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision, 0 to 100]

Is it Possible to Recommend Mentors To Project Newcomers?
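One plausible way to compute a Top 1 / Top 2 precision is sketched below. The definition is an assumption: a recommendation counts as correct when at least one actual mentor of the newcomer appears among the first k suggestions. The slides do not spell out the exact definition, and the data here is made up.

```python
def precision_at_k(recommendations, actual_mentors, k):
    """Fraction of newcomers with an actual mentor among the top-k suggestions."""
    if not recommendations:
        return 0.0
    hits = sum(
        1
        for newcomer, ranked in recommendations.items()
        if actual_mentors.get(newcomer, set()) & set(ranked[:k])
    )
    return hits / len(recommendations)


# Hypothetical ranked suggestions and ground truth.
recommendations = {"alice": ["jim", "bob"], "carl": ["bob", "dana"]}
actual_mentors = {"alice": {"jim"}, "carl": {"dana"}}
top1 = precision_at_k(recommendations, actual_mentors, 1)
top2 = precision_at_k(recommendations, actual_mentors, 2)
```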

109

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision, 0 to 100]

Results When Both Mails and Issues are Used

110

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat, and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors for Apache Lucene and Samba, by source (MAIL, ISSUE, CHAT); axis from 0 to 90]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 113-115: YODA Architecture]

YODA Tool

[Slides 116-124: YODA Tool]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

[Diagram: a Newcomer can find source code descriptions]

128

Mining Summaries

Example of a source code description found in a developer discussion:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications.

International Conference on Program Comprehension (IEEE ICPC 2012)
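Steps 2 to 5 above can be sketched on a single message as follows. This is a rough illustration, not the authors' implementation: the tracing rule (class name plus method name followed by an opening parenthesis), the minimum-length heuristic, and the use of Jaccard similarity between paragraph and method text are all assumptions standing in for heuristics the slides do not detail.

```python
import re


def split_paragraphs(message):
    """Step 2: split a message on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, class_name, method_name):
    """Step 3 (assumed rule): the paragraph names the class and 'method('."""
    has_class = re.search(r"\b" + re.escape(class_name) + r"\b", paragraph)
    has_method = re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph)
    return bool(has_class and has_method)


def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def similarity(paragraph, method_text):
    """Step 5 (assumed measure): Jaccard similarity of word sets."""
    a, b = tokens(paragraph), tokens(method_text)
    return len(a & b) / len(a | b) if a and b else 0.0


def mine_descriptions(message, class_name, method_name, method_text,
                      min_words=5, min_similarity=0.1):
    """Return paragraphs that survive tracing and both filters."""
    kept = []
    for paragraph in split_paragraphs(message):
        if not mentions_method(paragraph, class_name, method_name):
            continue  # step 3: not traced onto the method
        if len(paragraph.split()) < min_words:
            continue  # step 4: hypothetical heuristic, drop very short text
        if similarity(paragraph, method_text) < min_similarity:
            continue  # step 5: too dissimilar from the method
        kept.append(paragraph)
    return kept


message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) it creates "
    "an index with a wrong segments descriptor file.\n\n"
    "Thanks"
)
method_text = "public void split(File destDir, String[] segs) throws IOException"
found = mine_descriptions(message, "IndexSplitter", "split", method_text)
```

The thresholds `min_words` and `min_similarity` are placeholders; in practice they would be tuned against the trade-off between precision and the number of methods covered.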

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131

Approach: Precision vs. Number of Methods Covered

[Chart: precision against the number of methods covered; data not recoverable]

We mine useful Java method descriptions from developers' discussions.

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
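As a rough illustration of Step 1 and of the vote filter in Step 3, the sketch below builds a query URL for the public Stack Exchange API search endpoint and drops posts without positive votes. It is not the CODES implementation: the choice of query parameters, the helper names, and the reading of "0 votes" as a non-positive `score` are assumptions, and no network request is made here.

```python
from urllib.parse import urlencode

# Public Stack Exchange API search endpoint (version 2.3).
API_ROOT = "https://api.stackexchange.com/2.3/search"


def search_url(class_name):
    """Build a search URL for questions whose title mentions the class."""
    params = {
        "order": "desc",
        "sort": "votes",
        "intitle": class_name,
        "site": "stackoverflow",
    }
    return API_ROOT + "?" + urlencode(params)


def keep_voted(posts):
    """Assumed filter: keep only posts with a positive score."""
    return [post for post in posts if post.get("score", 0) > 0]


url = search_url("IndexSplitter")
# Hypothetical posts, shaped like API items that carry a "score" field.
posts = [{"title": "a", "score": 3}, {"title": "b", "score": 0}]
voted = keep_voted(posts)
```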

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 139-147: CODES Tool]

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns.

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

155

PART I

PART II

PART III

Page 53: Supporting Newcomers in Open Source Software Development Projects

54

PART II ndash Experiment A

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

55

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment B

Two Empirical Studies Aimed at Understanding

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora: Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
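The later steps of the approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: the slide does not say which heuristics or which similarity measure are used, so the minimum-length heuristic (`min_words`), the Jaccard token overlap and the threshold `min_sim` below are assumptions chosen only for illustration, as are all function names.

```python
import re


def split_paragraphs(message: str) -> list[str]:
    """Step 2: split a message body into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3: trace a paragraph onto a method when its name occurs as a whole word."""
    return re.search(rf"\b{re.escape(method_name)}\b", paragraph) is not None


def tokens(text: str) -> set[str]:
    """Lower-cased word tokens; camelCase identifiers are split first."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)
    return {t.lower() for t in re.findall(r"[A-Za-z]+", spaced)}


def similarity(paragraph: str, method_text: str) -> float:
    """Jaccard overlap between paragraph and method vocabulary (assumed measure)."""
    a, b = tokens(paragraph), tokens(method_text)
    return len(a & b) / len(a | b) if a | b else 0.0


def mine_descriptions(message, method_name, method_text, min_words=5, min_sim=0.1):
    """Return the paragraphs of `message` that survive steps 3 to 5."""
    kept = []
    for paragraph in split_paragraphs(message):
        if not mentions_method(paragraph, method_name):
            continue  # step 3: not traced onto the method
        if len(paragraph.split()) < min_words:
            continue  # step 4: hypothetical heuristic, drop very short paragraphs
        if similarity(paragraph, method_text) < min_sim:
            continue  # step 5: too dissimilar from the method's own vocabulary
        kept.append(paragraph)
    return kept
```

Here `method_text` stands for whatever text represents the method (for instance its signature and identifiers); which representation works best is left open.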

130

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

Supporting Software Development

131

Approach Precision vs Number of Methods Covered (slides 131 to 133)

We mine useful Java method descriptions from developers' discussions


136

StackOverflow (slides 136 to 137)

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora: CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
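The vote-based discard mentioned in Step 3 can be sketched in a few lines of Python. This is only an illustration: the `Discussion` record and its fields are hypothetical and do not reflect the actual data model of CODES or of the StackOverflow REST interface, and the slide does not say how negatively voted discussions are treated (they are kept here).

```python
from dataclasses import dataclass


@dataclass
class Discussion:
    """Hypothetical record for one downloaded discussion post."""
    votes: int
    body: str


def paragraphs_with_votes(discussions):
    """Collect paragraphs, skipping every discussion that has exactly 0 votes."""
    kept = []
    for discussion in discussions:
        if discussion.votes == 0:
            continue  # discarded, as stated in Step 3
        kept.extend(p.strip() for p in discussion.body.split("\n\n") if p.strip())
    return kept
```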

139

CODES Tool (slides 139 to 147)

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary: Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders that help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing its precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora: Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



55

Two Empirical Studies Aimed at Understanding

PART II - Experiment A: How such documentation is browsed by developers to perform maintenance activities

PART II - Experiment B: What code elements are often used by humans when labeling a source code artifact

57

Experiment A: Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)
• Subjects: 33 participants (11 Bachelor students, 18 Master students, 4 PhD students)

[Chart; extracted labels: 121, 121, 667, 72]

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella: An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts (slides 59 to 61)

[Figure: share of time spent per kind of artifact; extracted labels: 72, 13, 10, 3, 2; the assignment of labels to artifact kinds is not recoverable from the extracted text]

Undergraduate students used Source Code and Javadoc significantly more than graduate students.

Graduate students used Class Diagrams significantly more than undergraduates.

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

• S = Sequence Diagram
• D = Class Diagram
• U = Use Case
• J = Javadoc

Complex Navigation Patterns

• (SD)+ = Sequence Diagram before Class Diagram
• (US)+ = Use Case before Sequence Diagram
• (DS)+ = Class Diagram before Sequence Diagram
• U(SD)+ = Use Case before (SD)+
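Since the pattern names use regular-expression notation, a navigation sequence can be classified with a short Python sketch. The one-letter encoding (S, D, U, J) follows the legend of the slides; the function name `classify` and the idea of encoding one session as a string of letters are assumptions made for illustration, not a description of the tooling used in the study.

```python
import re

# Pattern names as they appear on the slides; each one is also a valid regex.
PATTERNS = [
    "SU(SD)+", "S(US)+", "U(SD)+",
    "(SD)+", "(US)+", "(DS)+",
    "S", "D", "U", "J",
]


def classify(sequence: str) -> str:
    """Return the pattern that matches the whole sequence, or "Other"."""
    for pattern in PATTERNS:
        if re.fullmatch(pattern, sequence):
            return pattern
    return "Other"
```

For example, a participant who opened a use case, then a sequence diagram, then a class diagram before reaching the source code would be encoded as "USD" and classified as U(SD)+.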

63

Most Frequent Navigation Patterns Before Reaching Source Code (slides 63 to 65)

[Figure: horizontal bar chart of pattern frequencies for Graduate and Undergraduate participants (x-axis: 0 to 20). Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Extracted bar values, in order: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5; the assignment of the two series to the groups is not recoverable from the extracted text]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts (slides 66 to 67)

[Figure: transition graph with nodes including Source Code, Sequence Diagram and Javadoc; extracted edge labels: 56, 36, 7717, 78, 18, 1678, 49, 37]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
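A transition graph of this kind can be derived from logged browsing sessions. The following Python sketch shows one plausible way to do it, assuming each session is available as an ordered list of artifact kinds; the function names and the choice to ignore consecutive visits to the same kind are illustrative assumptions, not details given on the slides.

```python
from collections import Counter


def transition_counts(sessions):
    """Count how often participants move from one artifact kind to another."""
    counts = Counter()
    for visited in sessions:
        for src, dst in zip(visited, visited[1:]):
            if src != dst:  # assumption: staying on the same kind is not a transition
                counts[(src, dst)] += 1
    return counts


def transition_shares(counts):
    """Normalize counts into the share of outgoing transitions per source kind."""
    totals = Counter()
    for (src, _dst), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}
```

The shares computed this way correspond to edge labels of a transition graph: for each source node, they sum to 1.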

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact

69

Experiment B: Context (slides 69 to 70)

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: term lists chosen by three subjects to label a Java class

• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• book, hotel, room, refund, arrival, check, parking, double, suite, group
• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Resulting term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
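The oracle construction can be reproduced on the hotel-reservation example with a few lines of Python. The function name `build_oracle` and the default threshold parameter are illustrative; only the rule itself (keep the terms chosen by at least 50% of the subjects) comes from the slide.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects."""
    # set(terms) makes sure a subject counts at most once per term.
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    return {term for term, n in counts.items() if n >= needed}


labelings = [
    ["book", "hotel", "room", "reservation", "arrival", "departure",
     "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival", "check",
     "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)
```

With three subjects, a term needs at least two votes, so the oracle contains exactly the eight terms listed with counts 2 or 3 on the slide.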

71

Experiment B: Comparison of Different Labeling Techniques (slides 71 to 74)

Data extracted from method signatures match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella: Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary: How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

81

Motivation (slides 81 to 82)

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant): Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella: Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

[Diagram sequence: on a timeline, Jim "is the mentor of" Alice IF the factors below hold; extracted labels: "When Alice joins the project", "When Alice was a Student"]

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

Identify Past Successful Mentors

Score computed aggregating the factors in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
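The weighted-sum aggregation of the five factors can be illustrated with a small Python sketch. The slides give neither the weights nor the way each factor is normalized, so the factor values and the equal weights in the usage example are made-up placeholders, and `mentor_score` and `rank_mentors` are hypothetical names.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates, weights):
    """Order candidate names by decreasing score.

    `candidates` maps a developer name to its list of factor values.
    """
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )
```

A usage example with placeholder values: `rank_mentors({"Jim": [1.0, 0.8, 0.5, 1.0, 0.6], "Bob": [0.0, 0.2, 0.5, 0.0, 0.9]}, [0.2] * 5)` puts Jim first, because his weighted sum is larger.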

93

Recommending Mentors (slides 93 to 105)

[Diagram sequence: a timeline of project developers; a newcomer joins at time t and a mentor with adequate skills is selected among past mentors and past project developers]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers? (slides 106 to 108)

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba (y-axis: Precision, 0 to 100). Extracted bar labels, in order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82; the assignment of labels to the Top 1 and Top 2 series is not recoverable from the extracted text]

109

Results When Both Mails and Issues are Used (slides 109 to 110)

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python and Samba (y-axis: Precision, 0 to 100). Extracted bar labels, in order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76; the assignment of labels to the Top 1 and Top 2 series is not recoverable from the extracted text]

YODA makes it possible to recommend mentors

It is Possible to Recommend Mentors To Project Newcomers

Page 55: Supporting Newcomers in Open Source Software Development Projects

56

PART II ndash Experiment AHow such documentation is browsed by developers to perform maintenance activities

PART II ndash Experiment BWhat code elements are often used by humans when labeling a source code artifact

word1word2

word3word4

word5word6

Two Empirical Studies Aimed at Understanding

57

Experiment A Context

bull Object software artifacts from SMOS a school automation system developed by graduate students at the University of Salerno (Italy)

bull Subjects 33 participants

121 121 667 72

11 Bachelor Students 18 Master Students 4 PhD Students

G Bavota G Canfora M Di Penta ROliveto Sebastiano PanichellaAn Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks

The 29th International Conference on Software Maintenance (ICSM 2013)

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

word1, word2, word3, word4, word5, word6

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
17 Bachelor Students in CS … (Univ. of Molise, second year)
21 Master Students in CS … (University of Salerno)

Java Class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B Context

Terms chosen by three subjects to label the same Java class:

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
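The oracle construction on this slide can be reproduced with a few lines of code. This is a minimal sketch using the three term lists shown on the slide; the function name `build_oracle` and the `threshold` parameter are illustrative assumptions.

```python
from collections import Counter

def build_oracle(labelings, threshold=0.5):
    """Keep the terms chosen by at least `threshold` of the subjects."""
    # set(terms) makes sure a subject counts at most once per term
    counts = Counter(term for terms in labelings for term in set(terms))
    cutoff = threshold * len(labelings)
    return {term: n for term, n in counts.items() if n >= cutoff}

labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure",
     "date", "bed", "card", "payment", "spa"],
]

oracle = build_oracle(labelings)
for term, n in sorted(oracle.items(), key=lambda item: (-item[1], item[0])):
    print(term, n)
```

With three subjects the 50% cutoff is 1.5, so a term enters the oracle when at least two subjects chose it, which matches the term counts listed on the slide.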

71

Experiment B Context

Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from method signatures match very well the mental model of newcomers when describing source code.

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts.

2) Less experienced newcomers spend a significantly higher proportion of time on source code.

3) More experienced newcomers instead spend more time on class diagrams.

4) Heuristics based on data extracted from method signatures match very well the mental model of newcomers when describing source code elements.

76

PART III

Recommenders

Data Extraction (Software Repositories)
Empirical Studies: PART I and PART II
Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers’ Communication to Improve Newcomers’ Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small Projects: finding Mentors is a trivial problem

• Large Projects: finding Mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF:

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

Timeline markers: “When Alice was a Student”, “When Alice joins the project”

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$\sum_{i=1}^{5} w_i \cdot f_i$

Factors: F1, F2, F3, F4, F5
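The weighted-sum aggregation can be sketched as follows. The weights and factor values below are invented for illustration only; the slides do not state how the factors are normalised or which weights were used, and the function names are assumptions.

```python
def mentor_score(factors, weights):
    """Aggregate the factor values f_1..f_n into a single score: sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))

def rank_candidates(candidates, weights):
    """Sort candidate (mentor, newcomer) pairs by decreasing score."""
    scored = {pair: mentor_score(values, weights) for pair, values in candidates.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

# Hypothetical weights and factor values (F1..F5), assumed already scaled to [0, 1]
weights = [0.3, 0.2, 0.1, 0.2, 0.2]
candidates = {
    ("Jim", "Alice"): [0.9, 0.8, 1.0, 0.7, 0.6],
    ("Bob", "Alice"): [0.2, 0.4, 1.0, 0.1, 0.9],
}
for pair, score in rank_candidates(candidates, weights):
    print(pair, round(score, 3))
```

Keeping the weights as a separate argument makes it easy to try different weightings of the five factors against the same candidate pairs.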

93

Recommending Mentors

[Figure: project timeline showing Project Developers, the Newcomer joining at time t, Past Project Developers, Past Mentors, and the Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

Mentor Recommendations: Precision on Top 1 and Top 2 (y-axis: Precision, 0 to 100)

Project      Series 1   Series 2
Apache       0.85       0.81
FreeBSD      0.3        0.24
PostgreSQL   1          1
Python       0.64       0.77
Samba        0.94       0.82

Series legend: Top 1, Top 2
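The slides report precision on the Top 1 and Top 2 recommendations without spelling out the computation. The sketch below uses the standard definition (correct recommendations divided by recommendations made, counting only the first k per newcomer); whether the study computed it exactly this way is an assumption, and all names and data are hypothetical.

```python
def precision_at_k(recommendations, actual_mentors, k):
    """Fraction of the top-k recommended mentors that are actual mentors."""
    hits = 0
    total = 0
    for newcomer, ranked in recommendations.items():
        for candidate in ranked[:k]:
            total += 1
            if candidate in actual_mentors.get(newcomer, set()):
                hits += 1
    return hits / total if total else 0.0

# Hypothetical ranked recommendations and ground truth
recommendations = {"alice": ["jim", "bob"], "carl": ["dana", "jim"]}
actual_mentors = {"alice": {"jim"}, "carl": {"jim"}}

print(precision_at_k(recommendations, actual_mentors, 1))
print(precision_at_k(recommendations, actual_mentors, 2))
```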

109

Results When Both Mails and Issues are Used

Mentor Recommendations: Precision on Top 1 and Top 2 (y-axis: Precision, 0 to 100)

Project      Series 1   Series 2
Apache       0.8        0.72
FreeBSD      0.67       0.45
PostgreSQL   0.9        0.89
Python       0.91       0.84
Samba        0.92       0.76

Series legend: Top 1, Top 2

110

It is Possible to Recommend Mentors To Project Newcomers

YODA Makes it Possible to Recommend Mentors

111

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors for Apache Lucene and Samba, with series MAIL, ISSUE, and CHAT; axis 0 to 90; bar labels as listed: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers’ Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

Newcomer: can find source code descriptions

128

Mining Summaries

Example of a source code description found in a developer message:

“When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index.”

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based Filtering
• Step 5: Similarity-based Filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
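Steps 2, 3, and 5 above can be illustrated with a small sketch. It is not the authors' implementation: the blank-line paragraph split, the "class name plus method( mention" tracing rule, and the Jaccard word overlap used as the similarity measure are all stand-in assumptions chosen to keep the example short.

```python
import re

def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]

def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): keep paragraphs naming the class and a call-like 'method('."""
    traced = {}
    for paragraph in paragraphs:
        if class_name not in paragraph:
            continue
        for method in method_names:
            if re.search(r"\b" + re.escape(method) + r"\s*\(", paragraph):
                traced.setdefault(method, []).append(paragraph)
    return traced

def jaccard(text_a, text_b):
    """Word-overlap similarity, used here as a stand-in similarity measure."""
    words_a = set(re.findall(r"[a-z]+", text_a.lower()))
    words_b = set(re.findall(r"[a-z]+", text_b.lower()))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

def similarity_filter(traced, method_text, threshold=0.1):
    """Step 5 (sketch): drop paragraphs too dissimilar from the method's own text."""
    kept = {}
    for method, paragraphs in traced.items():
        similar = [p for p in paragraphs
                   if jaccard(p, method_text.get(method, "")) >= threshold]
        if similar:
            kept[method] = similar
    return kept
```

A typical use would call `extract_paragraphs` on each message already traced to a class, pass the result to `trace_to_methods` with that class's method names, and then apply `similarity_filter` with text taken from each method (for example, its identifiers and comments).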

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:
• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131

Approach Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers’ discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based Filtering
• Step 5: Similarity-based Filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
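The vote-based discard mentioned in Step 3 can be sketched as a simple filter. The dictionary layout (`votes`, `paragraphs`) is a hypothetical representation of a downloaded discussion, and only a vote count of exactly 0 is discarded, as the slide states; how negative scores are treated is not specified there.

```python
def discard_unvoted(discussions):
    """Keep paragraphs only from discussions whose vote count is not 0."""
    kept = []
    for discussion in discussions:
        # The slide only mentions discarding discussions with 0 votes;
        # the handling of negative scores is not specified in the source.
        if discussion["votes"] == 0:
            continue
        kept.extend(discussion["paragraphs"])
    return kept

# Hypothetical discussions already traced onto a class
discussions = [
    {"votes": 0, "paragraphs": ["The split method might do something."]},
    {"votes": 3, "paragraphs": ["split(File, String[]) copies the given segments.",
                                "It writes a new segments file."]},
]
print(discard_unvoted(discussions))
```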

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders’ performance

149

Future Work and Conclusion

Future work…

New Recommenders
• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders
• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing its precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

Page 56: Supporting Newcomers in Open Source Software Development Projects

57

Experiment A Context

• Object: software artifacts from SMOS, a school automation system developed by graduate students at the University of Salerno (Italy)

• Subjects: 33 participants (11 Bachelor Students, 18 Master Students, 4 PhD Students)

121 121 667 72

G. Bavota, G. Canfora, M. Di Penta, R. Oliveto, Sebastiano Panichella. An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks. The 29th International Conference on Software Maintenance (ICSM 2013).

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts?

[Pie chart: share of time per kind of artifact; slice labels 72, 13, 10, 3, 2]

60

How Much Time did Participants Spend on Different Kinds of Artifacts?

Undergraduate Students

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

61

How Much Time did Participants Spend on Different Kinds of Artifacts?

Graduate Students

Graduate students used Class Diagrams significantly more than Undergraduates


PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 57: Supporting Newcomers in Open Source Software Development Projects

58

Maintenance Tasks

Bug Fixing

Add a new feature

Improve existing features

59

How Much Time did Participants Spend on Different Kinds of Artifacts

[Figure: pie chart of the time spent on each kind of artifact]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
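The pattern notation on these slides reads like regular expressions over the artifact letters. A minimal Python sketch, assuming that a navigation is recorded as a string of the letters S, D, U, J visited before reaching source code, and that "+" means one or more repetitions; the names `PATTERNS` and `classify_navigation` are hypothetical and not taken from the study:

```python
import re

# Patterns named on the slides; each one must match the whole sequence.
# Assumption: "(SD)+" means one or more repetitions of S followed by D.
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("U", r"U"),
    ("J", r"J"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
    ("S(US)+", r"S(US)+"),
    ("SU(SD)+", r"SU(SD)+"),
]


def classify_navigation(sequence: str) -> str:
    """Return the name of the first pattern that matches the whole sequence."""
    for name, regex in PATTERNS:
        if re.fullmatch(regex, sequence):
            return name
    # Anything not covered by a named pattern falls into the "Other" bucket.
    return "Other"
```

For instance, `classify_navigation("SDSD")` falls under (SD)+, while a sequence that matches none of the named patterns is counted as Other.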

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart of navigation pattern frequencies for the patterns S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other; legend: Graduate, Undergraduate; axis from 0 to 20. Bar values as extracted: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more “integrated approach”

66

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph whose nodes include Source Code, Sequence Diagram, and Javadoc; the edge labels did not survive extraction]

1) From Source Code, participants in most cases “go back” to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases “go back” to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams; only afterwards do they read and write Source Code
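A transition graph of this kind can be derived by counting consecutive pairs of artifacts in each participant's navigation log. A minimal Python sketch, assuming each log is a list of artifact names in the order they were visited; the function name and the sample log are hypothetical and are not data from the study:

```python
from collections import Counter, defaultdict


def transition_percentages(logs):
    """For each source artifact, return the percentage of transitions to each target."""
    counts = defaultdict(Counter)
    for log in logs:
        # Count every consecutive (source, target) pair in the log.
        for source, target in zip(log, log[1:]):
            counts[source][target] += 1
    return {
        source: {t: 100.0 * n / sum(targets.values()) for t, n in targets.items()}
        for source, targets in counts.items()
    }


# Hypothetical navigation log of one participant (illustration only).
example_log = ["Use Case", "Sequence Diagram", "Source Code",
               "Sequence Diagram", "Source Code", "Javadoc"]
graph = transition_percentages([example_log])
```

Each entry of `graph` corresponds to the outgoing edges of one node, with the edge weights expressed as percentages of the transitions leaving that node.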

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

[Figure: a source code artifact labeled with word1, word2, word3, word4, word5, word6]

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example: terms chosen by three subjects to label a Java class

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
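The oracle construction can be illustrated with the three label lists shown on the slide. A minimal Python sketch, assuming each subject contributes one list of terms and a term enters the oracle when at least half of the subjects chose it; the function name `build_oracle` is hypothetical:

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects, sorted alphabetically."""
    counts = Counter()
    for terms in labelings:
        counts.update(set(terms))  # each subject counts at most once per term
    minimum = threshold * len(labelings)
    return sorted(term for term, n in counts.items() if n >= minimum)


# The three label lists from the slide example.
labelings = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
oracle = build_oracle(labelings)
```

With three subjects the 50% threshold keeps every term chosen by at least two of them, which gives the eight terms listed with their counts on the slide.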

71

Experiment B: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF the following factors hold:

F1: exchanged emails

F2: amount of emails

F3: project age

F4: newcomer early emails

F5: commits

[Figure, built up over several slides: timeline (Time) with the markers "When Alice was a Student" and "When Alice joins the project", annotated with the factors F1 to F5]

Identify Past Successful Mentors

Score computed by aggregating the factors F1 to F5 in a weighted sum:

$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$
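The weighted sum of the five factors can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how each factor is turned into a number or which weights are used, so the normalized factor values, the equal weights, and the names `FACTORS` and `mentor_score` are all hypothetical:

```python
# Factor names follow the slides (F1..F5); the identifiers themselves are hypothetical.
FACTORS = ("f1_exchanged_emails", "f2_amount_of_emails", "f3_project_age",
           "f4_newcomer_early_emails", "f5_commits")


def mentor_score(factor_values, weights):
    """Weighted sum of the five factor values: sum of w_i * f_i for i = 1..5."""
    if len(factor_values) != len(FACTORS) or len(weights) != len(FACTORS):
        raise ValueError("expected one value and one weight per factor")
    return sum(w * f for w, f in zip(weights, factor_values))


# Hypothetical normalized factor values for one (mentor, newcomer) pair, with equal weights.
example_values = [1.0, 0.5, 0.25, 1.0, 0.0]
equal_weights = [0.2] * 5
score = mentor_score(example_values, equal_weights)
```

Candidate pairs can then be ranked by `score`; choosing the weights is a separate calibration question that this sketch does not address.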

93

Recommending Mentors

[Figure, built up over slides 93-105: timeline (Time, t) showing Project Developers, Past Project Developers, Past Mentors, the Newcomer joining at time t, and the Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision, from 0 to 100. Values as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision, from 0 to 100. Values as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers
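The slides report precision on the top 1 and top 2 recommendations without spelling out the formula. A minimal Python sketch, assuming the usual definition (the fraction of the top-k recommended mentors, over all newcomers, that are actual mentors); the definition, the function name, and the sample data are assumptions made for illustration and may differ from the evaluation used in the thesis:

```python
def precision_at_k(recommendations, actual_mentors, k):
    """Fraction of the top-k recommended mentors, over all newcomers, that are actual mentors."""
    recommended = 0
    correct = 0
    for newcomer, ranked in recommendations.items():
        top_k = ranked[:k]
        recommended += len(top_k)
        # A recommendation is correct when it names one of the newcomer's actual mentors.
        correct += sum(1 for m in top_k if m in actual_mentors.get(newcomer, set()))
    return correct / recommended if recommended else 0.0


# Hypothetical ranked recommendations and ground truth (illustration only).
recommendations = {"alice": ["jim", "bob"], "carol": ["dave", "jim"]}
actual_mentors = {"alice": {"jim"}, "carol": {"jim"}}
top1 = precision_at_k(recommendations, actual_mentors, 1)
top2 = precision_at_k(recommendations, actual_mentors, 2)
```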

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" for Apache Lucene and Samba, with the series MAIL, ISSUE, and CHAT; axis from 0 to 90. Values as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 113-115: YODA architecture diagram]

116

YODA Tool

[Slides 116-124: screenshots of the YODA tool]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer: can find source code descriptions

Example:

“When calling the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with a segments descriptor file with wrong data. Namely, wrong is the number representing the name of the segment that would be created next in this index.”

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
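Steps 2, 3 and 5 of the approach can be illustrated with a small Python sketch. Everything below is an assumption made for illustration rather than the published technique: paragraphs are split on blank lines, a paragraph is traced onto a method when it contains the method name followed by an opening parenthesis, and similarity is the cosine between term-frequency vectors with a hypothetical threshold. The heuristic filtering of Step 4 is not sketched.

```python
import math
import re
from collections import Counter


def extract_paragraphs(message: str):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3 (sketch): trace a paragraph onto a method if it contains 'name('."""
    return re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph) is not None


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two texts."""
    a = Counter(re.findall(r"[a-z]+", text_a.lower()))
    b = Counter(re.findall(r"[a-z]+", text_b.lower()))
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def candidate_descriptions(message, method_name, method_text, threshold=0.1):
    """Steps 2, 3 and 5 (sketch): keep traced paragraphs similar enough to the method text."""
    return [p for p in extract_paragraphs(message)
            if mentions_method(p, method_name)
            and cosine_similarity(p, method_text) >= threshold]
```

Here `method_text` stands for whatever textual representation of the method is compared against the paragraph (for example its identifiers and comments), which is again an assumption of this sketch.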

130

Help Newcomers' Program Comprehension by extracting summaries of code elements from:

• Q&A site

• Issue tracker

• Mailing list

Newcomer

Supporting Software Development

131

Approach Precision vs. Number of Methods Covered

[Slides 131-133: chart of the approach's precision against the number of methods covered]

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 139-147: screenshots of the CODES tool]

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


Page 58: Supporting Newcomers in Open Source Software Development Projects

59

How Much Time did Participants Spend on Different Kinds of Artifacts

72

131032

60

72

131032

Undergraduates students used Source Code and Javadoc significantly

more than Graduate students

UndergraduateStudents

How Much Time did Participants Spend on Different Kinds of Artifacts

61

72

131032

GraduateStudents

Graduate students used Class Diagramssignificantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA: Jim is the mentor of Alice IF

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

[Diagram built up over several slides: a timeline with the labels "When Alice was a Student" and "When Alice joins the project", plus comparison marks (>) between factors that are not recoverable from the extracted text.]

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$score = \sum_{i=1}^{5} w_i \cdot f_i$

where f_1 … f_5 are the factors F1 to F5 and w_i are their weights.
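The weighted-sum aggregation on this slide can be illustrated with a minimal Python sketch. The slides do not give the weights or how the factors are scaled, so the candidate names, the factor values (assumed normalised to [0, 1]) and the equal weights below are purely hypothetical.

```python
from typing import Dict, List, Sequence


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates: Dict[str, Sequence[float]],
                    weights: Sequence[float]) -> List[str]:
    """Return candidate names ordered from highest to lowest score."""
    scores = {name: mentor_score(f, weights) for name, f in candidates.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical, normalised factor values (F1..F5) for two candidate mentors.
candidates = {
    "jim": [0.9, 0.8, 0.5, 0.7, 0.6],
    "bob": [0.2, 0.1, 0.5, 0.0, 0.9],
}
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # equal weights, an illustrative assumption
ranking = rank_candidates(candidates, weights)
```

With these made-up values the candidate with stronger email-based factors ranks first; different weights would shift the balance between communication factors and commits.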

93

Recommending Mentors

[Diagram built up over slides 93 to 105. Labels: Time, Project Developers, Newcomer (joining at time t), Mentor with Adequate Skills, Past Mentors, Past Project Developers.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python and Samba. Values in extraction order, assignment to projects and series not recoverable: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

109

Results When Both Mails and Issues are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2 for Apache, FreeBSD, PostgreSQL, Python and Samba. Values in extraction order, assignment to projects and series not recoverable: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

110

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors (axis 0 to 90) using MAIL, ISSUE and CHAT data for Apache Lucene and Samba. Values in extraction order, assignment to projects and sources not recoverable: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

Example of a source code description found in developer communication:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
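Steps 2, 3 and 5 of the approach above can be sketched roughly as follows. The slides do not spell out the tracing rules, the heuristics of Step 4 or the similarity measure, so this sketch makes assumptions: paragraphs are split on blank lines, a paragraph is traced onto a method when it names the class and shows the method name followed by "(", and a plain term-overlap ratio stands in for the similarity filter. All function names and the threshold are hypothetical; Step 4 is omitted.

```python
import re
from typing import List


def extract_paragraphs(message: str) -> List[str]:
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    parts = re.split(r"\n\s*\n", message)
    return [" ".join(p.split()) for p in parts if p.strip()]


def mentions_method(paragraph: str, class_name: str, method_name: str) -> bool:
    """Step 3 (sketch): the paragraph names the class and calls the method."""
    has_class = re.search(rf"\b{re.escape(class_name)}\b", paragraph) is not None
    has_call = re.search(rf"\b{re.escape(method_name)}\s*\(", paragraph) is not None
    return has_class and has_call


def term_overlap(paragraph: str, method_terms: List[str]) -> float:
    """Step 5 (stand-in): share of the method's terms found in the paragraph."""
    words = set(re.findall(r"[a-z]+", paragraph.lower()))
    terms = {t.lower() for t in method_terms}
    return len(words & terms) / len(terms) if terms else 0.0


def mine_descriptions(message: str, class_name: str, method_name: str,
                      method_terms: List[str], threshold: float = 0.5) -> List[str]:
    """Keep paragraphs traced onto the method that pass the overlap filter."""
    return [p for p in extract_paragraphs(message)
            if mentions_method(p, class_name, method_name)
            and term_overlap(p, method_terms) >= threshold]


# Hypothetical message, loosely modelled on the IndexSplitter example.
example_message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) it creates "
    "an index with a wrong segments file.\n\n"
    "Thanks"
)
mined = mine_descriptions(example_message, "IndexSplitter", "split",
                          ["split", "index", "segments"])
```

Only the middle paragraph survives: the greeting and sign-off never mention the class or the method, so they are dropped at the tracing step.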

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

Newcomer

Supporting Software Development

131

Approach Precision vs Number of Methods Covered

[Plot shown on slides 131 to 133; not recoverable from the extracted text.]

We mine useful Java method descriptions from developers' discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

Newcomer

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

CODES: Approach for Mining Method Descriptions

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Path…
  • Newcomer Learning Path… (2)
  • Newcomer Learning Path… (3)
  • Newcomer Learning Path… (4)
  • Newcomer Learning Path… (5)
  • Newcomer Learning Path… (6)
  • Newcomer Learning Path… (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future work…
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 59: Supporting Newcomers in Open Source Software Development Projects

60

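[Chart: Undergraduate Students. Extracted values "72" and "131032"; segment boundaries and labels were lost in extraction.]

Undergraduate students used Source Code and Javadoc significantly more than Graduate students

How Much Time did Participants Spend on Different Kinds of Artifacts?

61

[Chart: Graduate Students. Extracted values "72" and "131032"; segment boundaries and labels were lost in extraction.]

Graduate students used Class Diagrams significantly more than Undergraduates

How Much Time did Participants Spend on Different Kinds of Artifacts?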

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

• S = Sequence Diagram
• D = Class Diagram
• U = Use Case
• J = Javadoc

Complex Navigation Patterns

• (SD)+ = Sequence Diagram before Class Diagram
• (US)+ = Use Case before Sequence Diagram
• (DS)+ = Class Diagram before Sequence Diagram
• U(SD)+ = Use Case before (SD)+
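The pattern notation on this slide reads like regular expressions over the letters S, D, U and J, so a navigation trace can be classified with a few lines of Python. This is only a sketch: the slides do not say how traces were matched, so full-match semantics, the pattern order and the function name are assumptions; S(US)+ and SU(SD)+ are taken from the chart on the following slides.

```python
import re
from typing import List

# Pattern labels as they appear on the slides; each is also a valid regex.
PATTERNS: List[str] = [
    "S", "D", "U", "J",              # simple patterns
    "(SD)+", "(US)+", "(DS)+",       # complex patterns
    "U(SD)+", "S(US)+", "SU(SD)+",
]


def classify_navigation(trace: str) -> str:
    """Return the first pattern that matches the whole trace, else 'Other'.

    `trace` is the sequence of artifacts visited before reaching source
    code, encoded one letter per artifact (e.g. "USD").
    """
    for pattern in PATTERNS:
        if re.fullmatch(pattern, trace):
            return pattern
    return "Other"


labels = [classify_navigation(t) for t in ["S", "SDSD", "US", "USD", "SUSD", "JD"]]
```

For example, the trace "USD" (use case, then sequence diagram, then class diagram) is labelled U(SD)+, while a trace such as "JD" falls into the Other bucket.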

63

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart, axis 0 to 20, legend: Graduate, Undergraduate. Values are listed in extraction order; which series belongs to which group is not recoverable from the extracted text.]

Pattern     First series   Second series
S           18             12
D           8              16
(SD)+       2              12
(US)+       2              10
U(SD)+      1              4
(DS)+       1              4
J           3              1
U           0              2
S(US)+      1              1
SU(SD)+     0              1
Other       4              5

S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Graph with nodes including Source Code, Sequence Diagram and Javadoc. Edge labels as extracted, assignment to edges not recoverable: 56, 36, 7717, 78, 18, 1678, 49, 37.]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

word1 word2 word3 word4 word5 word6

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example labels given to a Java class by three subjects:

• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• book, hotel, room, refund, arrival, check, parking, double, suite, group
• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Resulting term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
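The oracle rule can be reproduced on the three example labelings from this slide with a short Python sketch. The function name and the `min_share` parameter are hypothetical; only the 50% threshold and the term lists come from the slide.

```python
from collections import Counter
from typing import Dict, List


def build_oracle(labelings: List[List[str]], min_share: float = 0.5) -> Dict[str, int]:
    """Keep the terms chosen by at least `min_share` of the subjects.

    Each inner list holds the terms one subject used to label the class;
    a term is counted once per subject even if it is repeated.
    """
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = min_share * len(labelings)
    return {term: n for term, n in counts.items() if n >= needed}


# The three example labelings shown on the slide.
labelings = [
    ["book", "hotel", "room", "reservation", "arrival", "departure",
     "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival", "check", "parking",
     "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival", "departure", "date",
     "bed", "card", "payment", "spa"],
]
oracle = build_oracle(labelings)
```

With three subjects the 50% threshold means a term needs at least two votes, which yields the eight terms and counts listed on the slide.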

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

72

Experiment B: Comparison of Different Labeling Techniques

Page 60: Supporting Newcomers in Open Source Software Development Projects

61

How Much Time did Participants Spend on Different Kinds of Artifacts

Graduate students used Class Diagrams significantly more than Undergraduates

62

Navigation Patterns Followed By Developers Before Reaching Source Code

Simple Navigation Patterns

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc

Complex Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram
(US)+ = Use Case before Sequence Diagram
(DS)+ = Class Diagram before Sequence Diagram
U(SD)+ = Use Case before (SD)+
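The pattern notation above reads like a regular expression, so a navigation trace can be classified with one. The sketch below is only an illustration under stated assumptions: each participant's path before reaching source code is encoded as a string with one letter per artifact visited, and `+` is taken to mean "one or more repetitions". The `PATTERNS` table and the `classify` function are hypothetical names, not part of the original study tooling.

```python
import re

# Assumed encoding: one letter per artifact visited before the first
# visit to source code (S, D, U, J as in the slide legend).
# Each entry pairs a pattern label with the regex used to recognise it.
PATTERNS = [
    ("S", r"S"),
    ("D", r"D"),
    ("J", r"J"),
    ("U", r"U"),
    ("(SD)+", r"(SD)+"),
    ("(US)+", r"(US)+"),
    ("(DS)+", r"(DS)+"),
    ("U(SD)+", r"U(SD)+"),
    ("S(US)+", r"S(US)+"),
    ("SU(SD)+", r"SU(SD)+"),
]


def classify(trace: str) -> str:
    """Return the label of the first pattern matching the whole trace."""
    for label, regex in PATTERNS:
        if re.fullmatch(regex, trace):
            return label
    return "Other"


if __name__ == "__main__":
    for trace in ["S", "SDSD", "USD", "SUSD", "JJ"]:
        print(trace, "->", classify(trace))
```

Matching the whole trace with `re.fullmatch` keeps the categories mutually exclusive; anything that fits no pattern falls into "Other", mirroring the last category on the slides.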

63

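Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: occurrences of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) for Graduate and Undergraduate participants. Bar values in extraction order, one series per line:
18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4
12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5]

S = Sequence Diagram
D = Class Diagram
U = Use Case
J = Javadoc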

65

Most Frequent Navigation Patterns Before Reaching Source Code

[Bar chart: occurrences (axis 0 to 20) of each navigation pattern (S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other) for Graduate and Undergraduate participants]

More experienced participants use a more "integrated approach"

66

Transition Graph between Kinds of Software Artifacts

[Transition graph: nodes include Source Code, Sequence Diagram, and Javadoc; the numeric edge labels are not recoverable from the extracted text]

67

Transition Graph between Kinds of Software Artifacts

[Transition graph: nodes include Source Code, Sequence Diagram, and Javadoc; the numeric edge labels are not recoverable from the extracted text]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code
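A transition graph of this kind can be derived from logged browsing sessions by counting how often participants move from one kind of artifact to another. The following is a minimal sketch, assuming each session is an ordered list of artifact kinds and that edge labels are percentages of outgoing transitions; both the input format and the function name `transition_percentages` are assumptions for illustration, not the instrumentation used in the experiment.

```python
from collections import Counter, defaultdict


def transition_percentages(sessions):
    """Count moves between artifact kinds and normalise per source kind.

    `sessions` is a list of visit sequences, e.g.
    ["Use Case", "Sequence Diagram", "Source Code"].
    Returns {source: {target: percentage of moves leaving source}}.
    """
    counts = defaultdict(Counter)
    for visits in sessions:
        # Pair each visit with the next one to obtain the transitions.
        for src, dst in zip(visits, visits[1:]):
            if src != dst:  # ignore staying on the same kind of artifact
                counts[src][dst] += 1
    return {
        src: {dst: round(100 * n / sum(targets.values()))
              for dst, n in targets.items()}
        for src, targets in counts.items()
    }


if __name__ == "__main__":
    example = [
        ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram"],
        ["Source Code", "Class Diagram", "Source Code"],
    ]
    for src, targets in transition_percentages(example).items():
        print(src, "->", targets)
```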

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

word1 word2
word3 word4
word5 word6

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Java Class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example term lists for the same Java class:

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
book, hotel, room, refund, arrival, check, parking, double, suite, group
confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
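The oracle construction can be sketched in a few lines: count how many subjects chose each term and keep the terms chosen by at least half of them. The code below is an illustrative sketch using the three example term lists from the slide; the function name `build_oracle` and the `threshold` parameter are assumptions introduced here.

```python
from collections import Counter


def build_oracle(labels_per_subject, threshold=0.5):
    """Keep the terms chosen by at least `threshold` of the subjects."""
    n_subjects = len(labels_per_subject)
    # Use set() so a subject repeating a term is still counted once.
    counts = Counter(term for labels in labels_per_subject
                     for term in set(labels))
    return {term for term, n in counts.items() if n / n_subjects >= threshold}


if __name__ == "__main__":
    subjects = [
        ["book", "hotel", "room", "reservation", "arrival", "departure",
         "smoking", "double", "card", "breakfast"],
        ["book", "hotel", "room", "refund", "arrival", "check", "parking",
         "double", "suite", "group"],
        ["confirmation", "room", "reservation", "arrival", "departure",
         "date", "bed", "card", "payment", "spa"],
    ]
    print(sorted(build_oracle(subjects)))
```

With these three lists the oracle contains exactly the eight terms whose counts appear on the slide (room, arrival, book, hotel, reservation, departure, double, card).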

71

Experiment B Context

Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from method signatures matches the mental model of newcomers very well when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable...

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

[Diagram built up over several slides: Jim and Alice on a timeline, with the labels "When Alice was a Student" and "When Alice joins the project"]

Jim is the mentor of Alice IF:

F1: Exchanged emails
F2: Amount of emails
F3: Project age
F4: Newcomer early emails
F5: Commits

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$

where f_1, ..., f_5 are the factors F1 to F5 and w_1, ..., w_5 are their weights
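The weighted sum can be illustrated with a short sketch. Everything concrete below is an assumption made for the example: the factor values are taken to be already normalised to [0, 1], the weights are arbitrary, and the names `mentor_score` and `rank_candidates` are hypothetical rather than taken from YODA.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Order candidate mentors by decreasing score.

    `candidates` maps a developer name to the list [F1, ..., F5].
    """
    return sorted(candidates,
                  key=lambda name: mentor_score(candidates[name], weights),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical, equal weights and made-up factor values.
    weights = [0.2, 0.2, 0.2, 0.2, 0.2]
    candidates = {
        "jim": [1.0, 1.0, 1.0, 0.0, 1.0],
        "bob": [1.0, 0.0, 0.0, 0.0, 1.0],
    }
    print(rank_candidates(candidates, weights))
```

Keeping the weights as a parameter makes it easy to try different weightings of the five factors without touching the scoring logic.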

93

Recommending Mentors

[Timeline diagram built up across slides 93 to 105; labels: Time, t, Project Developers, Past Project Developers, Newcomer, Past Mentors, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba; precision axis from 0 to 100. Bar labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109

Results When Both Mails and Issues Are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba; precision axis from 0 to 100. Bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2, for Apache, FreeBSD, PostgreSQL, Python, and Samba; precision axis from 0 to 100. Bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors

111

Use Issue, Chat, and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors (axis 0 to 90) using MAIL, ISSUE, and CHAT data, for Apache Lucene and Samba. Bar labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

Newcomer: can find source code descriptions

128

Mining Summaries

Example of a source code description found in developer communication:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
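The five steps can be pictured as a small filtering pipeline. The sketch below is a simplified illustration, not the authors' implementation: tracing is approximated by name matching, the heuristic filter is reduced to a minimum paragraph length, and the similarity filter uses Jaccard overlap between paragraph words and method identifiers. All function names and thresholds are assumptions made for this example.

```python
import re


def extract_paragraphs(message: str) -> list[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method: str) -> bool:
    """Step 3 (assumed rule): the method name is followed by '('."""
    return re.search(rf"\b{re.escape(method)}\s*\(", paragraph) is not None


def tokens(text: str) -> set[str]:
    """Lower-cased words, with camelCase identifiers split into parts."""
    return {w.lower() for w in re.findall(r"[A-Za-z][a-z]*", text)}


def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0


def mine_descriptions(message, class_name, method, method_source,
                      min_words=5, min_similarity=0.1):
    """Return the paragraphs of `message` that plausibly describe `method`."""
    if class_name not in message:          # Step 1: trace message onto class
        return []
    kept = []
    for paragraph in extract_paragraphs(message):
        if not mentions_method(paragraph, method):
            continue
        if len(paragraph.split()) < min_words:   # Step 4: placeholder heuristic
            continue
        if jaccard(tokens(paragraph), tokens(method_source)) >= min_similarity:
            kept.append(paragraph)               # Step 5: similarity filter
    return kept


if __name__ == "__main__":
    mail = ("Hi all,\n\n"
            "When I call IndexSplitter.split(File destDir, String[] segs) "
            "it creates an index with a wrong segments file.\n\n"
            "Thanks")
    source = ("public void split(File destDir, String[] segs) "
              "{ /* copy segments into a new index */ }")
    print(mine_descriptions(mail, "IndexSplitter", "split", source))
```

The greeting and sign-off paragraphs are dropped because they never name the method, which is the main effect of tracing paragraphs before applying the two filters.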

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

Page 61: Supporting Newcomers in Open Source Software Development Projects

62

Navigation Patterns Followed By Developers Before Reaching Source Code

S = Sequence Diagram

D = Class DiagramU = Use CaseJ = Javadoc

Simple Navigation Patterns

(SD)+ = Sequence Diagram before Class Diagram (US)+ = Use Case before Sequence Diagram(DS)+ = Class Diagram before Sequence DiagramU(SD)+ = Use Case before (SD)+

Complex Navigation Patterns

63

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114–115

YODA Architecture

116–124

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer can find: source code descriptions

128

Mining Summaries

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
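Steps 3 and 5 of the approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the name-mention rule used for tracing, the cosine similarity over term counts, the threshold of 0.2, and all function names are hypothetical choices made for the example.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Split text into lowercase terms, breaking camelCase identifiers."""
    return [w.lower() for w in re.findall(r"[A-Za-z][a-z]*", text)]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def traces_to_method(paragraph: str, method_name: str) -> bool:
    """Step 3 (assumed rule): the paragraph mentions the method name as a word."""
    return re.search(r"\b" + re.escape(method_name) + r"\b", paragraph) is not None


def filter_by_similarity(paragraphs: List[str], signature: str,
                         method_name: str, threshold: float = 0.2) -> List[str]:
    """Step 5 (assumed measure): keep traced paragraphs similar to the signature."""
    method_terms = Counter(tokenize(signature))
    kept = []
    for paragraph in paragraphs:
        if not traces_to_method(paragraph, method_name):
            continue
        if cosine(Counter(tokenize(paragraph)), method_terms) >= threshold:
            kept.append(paragraph)
    return kept


SIGNATURE = "IndexSplitter.split(File destDir, String[] segs)"
PARAGRAPHS = [
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with wrong data",
    "Please split the patch into smaller pieces",
    "Thanks for the report",
]
KEPT = filter_by_similarity(PARAGRAPHS, SIGNATURE, "split")
```

In this toy input, the second paragraph mentions the word "split" but shares almost no other terms with the method signature, so the similarity filter discards it; the third paragraph is never traced to the method at all.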

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131–133

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134–135

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from: Q&A SITE, ISSUE TRACKER, MAILING LIST

136–137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

139–147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring, based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by raising precision and coverage as high as possible while reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155–161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Path…
  • Newcomer Learning Path… (2)
  • Newcomer Learning Path… (3)
  • Newcomer Learning Path… (4)
  • Newcomer Learning Path… (5)
  • Newcomer Learning Path… (6)
  • Newcomer Learning Path… (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When Both Mails and Issues Are Used
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future work…
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

63–65

Most Frequent Navigation Patterns Before Reaching Source Code

[Figure: bar chart with one group of bars per navigation pattern: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other; series: Graduate and Undergraduate; axis ticks from 0 to 20. Data labels as extracted: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4 and 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5.]

Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

More experienced participants use a more "integrated approach"
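The pattern labels on the chart read like regular expressions over artifact codes, so a navigation trace could be assigned to a pattern as sketched below. Treating the labels as regular expressions, encoding a trace as a string of the codes S, D, U and J, and the function name `classify` are assumptions made for this illustration; the slides do not state how traces were matched.

```python
import re

# Pattern labels taken from the chart, in the order they appear there.
PATTERNS = ["S", "D", "(SD)+", "(US)+", "U(SD)+", "(DS)+",
            "J", "U", "S(US)+", "SU(SD)+"]


def classify(trace: str) -> str:
    """Return the chart label whose expression matches the whole trace.

    `trace` is the sequence of artifacts visited before reaching source
    code, one letter per artifact: S (sequence diagram), D (class diagram),
    U (use case), J (Javadoc). Anything unmatched is reported as "Other".
    """
    for pattern in PATTERNS:
        if re.fullmatch(pattern, trace):
            return pattern
    return "Other"
```

For example, a participant who alternates between a sequence diagram and a class diagram twice before opening the code has the trace `"SDSD"`, which falls under `(SD)+`.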

66–67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph; nodes recovered from the extraction: Source Code, Sequence Diagram, Javadoc. Edge labels as extracted: 56, 36, 7717, 78, 18, 1678, 49, 37.]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go on to read Sequence Diagrams; only afterwards do they read and write Source Code

68

PART II – Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

word1, word2, word3, word4, word5, word6

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)
• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Java Class → book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B Context

Example term lists:

• book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• book, hotel, room, refund, arrival, check, parking, double, suite, group
• confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
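The oracle construction on this slide can be reproduced with a short sketch. The function name `oracle_terms` and the choice to count each subject at most once per term are assumptions made for the illustration; the three term lists are the ones shown on the slide.

```python
from collections import Counter
from typing import Iterable, List, Set


def oracle_terms(labelings: List[Iterable[str]], threshold: float = 0.5) -> Set[str]:
    """Return the terms chosen by at least `threshold` of the subjects."""
    counts: Counter = Counter()
    for terms in labelings:
        counts.update(set(terms))  # one vote per subject and term
    needed = threshold * len(labelings)
    return {term for term, votes in counts.items() if votes >= needed}


LABELINGS = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]
ORACLE = oracle_terms(LABELINGS)
```

With three subjects, a term needs at least two votes, which yields the eight terms listed with counts of 2 or 3 on the slide.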

71–74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

81–82

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Jim "is the mentor of" Alice IF (factors introduced one per slide, along a timeline):

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

Timeline labels: "When Alice was a Student", "When Alice joins the project"

Identify Past Successful Mentors

Score computed by aggregating the factors F1 to F5 in a weighted sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
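The weighted-sum aggregation of the five factors can be sketched as follows. The factor keys, the example weights and values, and the function names are hypothetical; the slides give only the form of the score (a weighted sum over F1 to F5), not the weights or how each factor is normalized.

```python
from typing import Dict, List, Tuple

# Hypothetical keys mirroring the factors F1..F5 named on the slides.
FACTORS = (
    "f1_exchanged_emails",
    "f2_amount_of_emails",
    "f3_project_age",
    "f4_newcomer_early_emails",
    "f5_commits",
)


def mentor_score(factors: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum over the five factors: sum of w_i * f_i."""
    return sum(weights[name] * factors[name] for name in FACTORS)


def rank_candidates(candidates: Dict[str, Dict[str, float]],
                    weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort candidate mentors by descending score."""
    scored = [(name, mentor_score(values, weights))
              for name, values in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative inputs only; equal weights are an assumption.
WEIGHTS = {name: 1.0 for name in FACTORS}
CANDIDATES = {
    "jim": dict(zip(FACTORS, (1.0, 2.0, 3.0, 4.0, 5.0))),
    "bob": dict(zip(FACTORS, (1.0, 1.0, 1.0, 1.0, 1.0))),
}
RANKING = rank_candidates(CANDIDATES, WEIGHTS)
```

Keeping the weights in a separate mapping makes it easy to try different weightings against the same factor values.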

93–105

Recommending Mentors

[Diagram built up over slides 93–105. Labels: Time, t, Newcomer, Project Developers, Past Project Developers, Past Mentors, Mentor with Adequate Skills.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Page 63: Supporting Newcomers in Open Source Software Development Projects

64

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

65

S

D

(SD)+

(US)+

U(SD)+

(DS)+

J

U

S(US)+

SU(SD)+

Other

0 2 4 6 8 10 12 14 16 18 20

18

8

2

2

1

1

3

0

1

0

4

12

16

12

10

4

4

1

2

1

1

5

GraduateUndegraduate

More experienced participants use a more ldquoIntegrated approachrdquo

S= Sequence Diagram

D= Class Diagram

U= Use Case

J= Javadoc

Most Frequent Navigation Patterns Before Reaching Source Code

66

Source Code

Sequence Diagram

Javadoc

56

36

7717

78

18

1678

49

37

Transition Graph between Kinds of Software Artifacts

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when they describe source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF, based on factors F1 to F5:

• F1: exchanged emails

• F2: amount of emails

• F3: project age

• F4: newcomer's early emails

• F5: commits

Timeline markers on the slides: "When Alice was a Student", "When Alice joins the project"

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
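The weighted sum over the five factors can be illustrated with a small Python sketch. It is not YODA's implementation: the factor values, the equal weights, and the candidate names other than Jim are hypothetical, and the factors are assumed to be already normalized to the range 0 to 1.

```python
def mentor_score(factors, weights):
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("need exactly one weight per factor")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Order candidate mentors by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical, already-normalized values for F1..F5 per candidate.
candidates = {
    "jim": [0.8, 0.5, 1.0, 1.0, 0.25],
    "bob": [0.1, 0.2, 1.0, 0.0, 0.5],
}
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # assumption: equal weights

ranking = rank_candidates(candidates, weights)
print(ranking)
```

Equal weights are used here only to keep the arithmetic easy to follow; the slides do not state which weights were used.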

93

Recommending Mentors

[Figure sequence, slides 93–105. Diagram labels: Time, t, Project Developers, Past Project Developers, Past Mentors, Newcomer, Mentor with Adequate Skills.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Series: Top 1, Top 2. Data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

109

Results When Both Mails and Issues Are Used

[Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Series: Top 1, Top 2. Data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

It is Possible to Recommend Mentors to Project Newcomers

YODA makes it possible to recommend mentors

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Bar chart: Precision in Recommending Mentors. Projects: Apache Lucene, Samba. Series: MAIL, ISSUE, CHAT. Four category groups labelled "Mentors". Horizontal axis 0–90. Data labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 113–115: YODA architecture diagrams. Slides 116–124: YODA Tool screenshots.]

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

A newcomer can find source code descriptions, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
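Steps 2 and 3 of the approach above can be illustrated with a minimal Python sketch. The slides do not say how paragraphs are extracted or traced, so the two rules used here (split on blank lines, match a method name followed by an opening parenthesis) are assumptions, and `listSegments` is a hypothetical method name added only to show a non-matching case.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): map (class, method) to paragraphs that mention `method(`."""
    traced = {}
    for paragraph in paragraphs:
        for method in method_names:
            if re.search(r"\b" + re.escape(method) + r"\s*\(", paragraph):
                traced.setdefault((class_name, method), []).append(paragraph)
    return traced


# A message already traced onto the class IndexSplitter in Step 1.
message = (
    "Hi all,\n\n"
    "When call the method IndexSplitter.split(File destDir, String[] segs) "
    "it creates an index with segments descriptor file with wrong data.\n\n"
    "Thanks"
)

paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "listSegments"])
for (cls, method), texts in traced.items():
    print(cls, method, len(texts))
```

Steps 4 and 5 would then filter the traced paragraphs further; they are left out because the slides give no details on the heuristics or the similarity measure.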

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site

• Issue tracker

• Mailing list

131

Approach Precision vs. Number of Methods Covered

[Plot, slides 131–133: not recoverable from the extraction.]

We mine useful Java method descriptions from developers' discussions

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
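The vote-based discard in Step 3 amounts to a simple filter. The sketch below is hypothetical: the `Paragraph` type and its fields are invented for illustration, and treating negatively voted posts as discarded too is an assumption, since the slide only mentions 0 votes.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    text: str
    votes: int  # votes of the discussion post the paragraph was taken from


def discard_unvoted(paragraphs):
    """Keep only paragraphs from posts with a positive vote count.

    Assumption: posts with negative votes are dropped as well.
    """
    return [p for p in paragraphs if p.votes > 0]


paragraphs = [
    Paragraph("split() writes a new segments file for the target index.", 4),
    Paragraph("Does anyone know how this works?", 0),
]

kept = discard_unvoted(paragraphs)
print([p.text for p in kept])
```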

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 139–147: CODES Tool screenshots.]

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

• Perform a survey asking developers to validate the social links identified by analyzing different communication channels

New Recommenders

• Build better recommenders for software re-modularization or refactoring based on social interaction between developers

• Build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

65

Most Frequent Navigation Patterns Before Reaching Source Code

Legend: S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc

[Bar chart, series Graduate and Undergraduate, horizontal axis 0–20. Patterns: S, D, (SD)+, (US)+, U(SD)+, (DS)+, J, U, S(US)+, SU(SD)+, Other. Data labels in extraction order, first series: 18, 8, 2, 2, 1, 1, 3, 0, 1, 0, 4; second series: 12, 16, 12, 10, 4, 4, 1, 2, 1, 1, 5.]

More experienced participants use a more "integrated approach"
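The pattern names on this slide read like regular expressions over the legend letters, so a navigation log can be bucketed with a short Python sketch. The encoding of a participant's navigation as a string of letters, and the example logs, are assumptions made for illustration; the slides do not describe how the patterns were actually matched.

```python
import re
from collections import Counter

# Pattern names as they appear on the slide
# (S = Sequence Diagram, D = Class Diagram, U = Use Case, J = Javadoc).
PATTERNS = ["S", "D", "(SD)+", "(US)+", "U(SD)+", "(DS)+", "J", "U", "S(US)+", "SU(SD)+"]


def classify(navigation):
    """Return the first slide pattern that matches the whole navigation string."""
    for pattern in PATTERNS:
        if re.fullmatch(pattern, navigation):
            return pattern
    return "Other"


# Hypothetical navigation logs: artifacts visited before reaching source code.
logs = ["S", "SDSD", "USD", "SUS", "DJ", "S"]

histogram = Counter(classify(log) for log in logs)
print(histogram)
```

Running the same histogram once per group (graduate, undergraduate) would give the kind of counts plotted in the chart.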

66

Transition Graph between Kinds of Software Artifacts

[Graph figure. Recoverable node labels: Source Code, Sequence Diagram, Javadoc. Edge labels as extracted (some fused): 56, 36, 7717, 78, 18, 1678, 49, 37.]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code
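A transition graph of this kind can be derived from the ordered lists of artifacts each participant visited. The Python sketch below assumes that edge labels are percentages of outgoing transitions per artifact kind, which the slides do not confirm, and the two sessions are invented for illustration.

```python
from collections import Counter, defaultdict


def transition_percentages(sessions):
    """For each artifact kind, the percentage of moves going to each next kind."""
    counts = defaultdict(Counter)
    for visited in sessions:
        # Pair every visited artifact with the one visited right after it.
        for src, dst in zip(visited, visited[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: 100.0 * n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical browsing sessions (one list per participant).
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram"],
    ["Sequence Diagram", "Source Code", "Class Diagram", "Source Code"],
]

graph = transition_percentages(sessions)
for src, dsts in graph.items():
    print(src, dsts)
```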

68

PART II – Experiment B

What Code Elements Are Often Used by Humans When Labeling a Source Code Artifact?

[Illustration: label words word1 to word6.]

69

Experiment B: Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS … (Univ. of Molise, second year); 21 Master students in CS … (University of Salerno)

[Illustration: a Java class labeled with the words book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast.]

66

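Transition Graph between Kinds of Software Artifacts

[Figure: transition graph whose nodes are the artifact kinds Source Code, Sequence Diagram, and Javadoc. Edge labels in extraction order: 56, 36, 7717, 78, 18, 1678, 49, 37.]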

67

Transition Graph between Kinds of Software Artifacts

[Figure: transition graph between Source Code, Sequence Diagram, and Javadoc, repeated from slide 66.]

1) From Source Code, participants in most cases "go back" to Sequence and Class Diagrams.

2) From Sequence and Class Diagrams, participants in most cases "go back" to Source Code.

3) Starting from a Use Case, participants go ahead reading Sequence Diagrams. Only afterwards do they read and write Source Code.
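A transition graph of this kind can be derived from logged navigation sessions. The following is a minimal sketch, assuming each session is recorded as the ordered list of artifact kinds a participant opened. The session data and the function name `transition_frequencies` are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict


def transition_frequencies(sessions):
    """Return, for each artifact kind, the relative frequency of the next kind visited."""
    counts = defaultdict(Counter)
    for visited in sessions:
        # Pair every artifact with the one opened right after it.
        for src, dst in zip(visited, visited[1:]):
            if src != dst:  # ignore re-opening the same kind of artifact
                counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical sessions, for illustration only.
sessions = [
    ["Use Case", "Sequence Diagram", "Source Code", "Sequence Diagram", "Source Code"],
    ["Use Case", "Sequence Diagram", "Class Diagram", "Source Code", "Class Diagram"],
]

graph = transition_frequencies(sessions)
for src, dsts in graph.items():
    for dst, share in sorted(dsts.items()):
        print(f"{src} -> {dst}: {share:.2f}")
```

Each edge weight is the share of transitions leaving a given artifact kind, so the weights on the outgoing edges of one node sum to 1.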

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

[Figure: a source code artifact labeled with the keywords word1, word2, word3, word4, word5, word6]

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects: 17 Bachelor students in CS (Univ. of Molise, second year) and 21 Master students in CS (University of Salerno)

Example labels for a Java class: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B Context

Term lists chosen to label the same Java class:

• List 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
• List 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
• List 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
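The oracle construction can be sketched in a few lines. This is an illustrative sketch, assuming each subject's labeling is a list of terms and that the threshold is applied to the number of subjects who picked a term. The function name `build_oracle` is hypothetical; the three term lists are the ones shown on the slide.

```python
from collections import Counter


def build_oracle(labelings, threshold=0.5):
    """Keep the terms chosen by at least `threshold` of the subjects."""
    # set(terms) ensures a subject is counted once per term.
    counts = Counter(term for terms in labelings for term in set(terms))
    needed = threshold * len(labelings)
    oracle = {term for term, n in counts.items() if n >= needed}
    return oracle, counts


labelings = [
    ["book", "hotel", "room", "reservation", "arrival",
     "departure", "smoking", "double", "card", "breakfast"],
    ["book", "hotel", "room", "refund", "arrival",
     "check", "parking", "double", "suite", "group"],
    ["confirmation", "room", "reservation", "arrival",
     "departure", "date", "bed", "card", "payment", "spa"],
]

oracle, counts = build_oracle(labelings)
print(sorted(oracle))
```

With three lists, the 50% threshold keeps every term chosen at least twice, which reproduces the eight terms counted on the slide.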

71

Experiment B: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

74

Experiment B: Comparison of Different Labeling Techniques

Data extracted from method signatures matches very well the mental model of newcomers when describing source code.

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts.

2) Less experienced newcomers spend a significantly higher proportion of time on source code.

3) More experienced newcomers instead spend more time on class diagrams.

4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when describing source code elements.

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies: PART I and PART II

Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

Approach for Mentors Identification in Open Source Projects

YODA (Young and newcOmer Developer Assistant)

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

[Figure: timeline relating Jim and Alice ("Jim is the mentor of Alice IF ..."), annotated with "When Alice was a Student" and "When Alice joins the project".]

YODA: Jim is the mentor of Alice IF, based on the following factors:

• F1: exchanged emails
• F2: amount of emails (drawn with a ">" comparison in the figure)
• F3: project age
• F4: newcomer early emails (marked "1st" in the figure)
• F5: commits

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

score = \sum_{i=1}^{5} w_i \cdot f_i
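The weighted-sum aggregation of the five factors can be illustrated with a short sketch. It assumes that each factor has already been measured and normalized to the range 0 to 1 for every candidate pair. The factor keys, the equal weights, and the candidate values are hypothetical and are not the weights or data used by YODA.

```python
# Keys mirror the factor labels F1..F5 on the slides; the names are illustrative.
FACTORS = (
    "f1_exchanged_emails",
    "f2_amount_of_emails",
    "f3_project_age",
    "f4_newcomer_early_emails",
    "f5_commits",
)


def mentor_score(factors, weights):
    """Weighted sum of the five factor values: sum of w_i * f_i."""
    return sum(weights[name] * factors[name] for name in FACTORS)


def rank_candidates(candidates, weights):
    """Order candidate mentors by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


# Hypothetical equal weights and normalized factor values.
weights = {name: 0.2 for name in FACTORS}
candidates = {
    "jim": dict(zip(FACTORS, (0.9, 0.8, 0.7, 0.6, 0.5))),
    "bob": dict(zip(FACTORS, (0.2, 0.4, 0.9, 0.1, 0.3))),
}

ranking = rank_candidates(candidates, weights)
print(ranking)
```

Keeping the weights in a separate mapping makes it easy to try different weightings without touching the scoring code.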

93

Recommending Mentors

[Figure, built up over slides 93-105: a timeline of project developers; a newcomer joins at time t; past project developers and past mentors are considered, and a mentor with adequate skills is recommended to the newcomer.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2", y-axis Precision, one pair of bars per project.]

Project       Top 1   Top 2
Apache        0.85    0.81
FreeBSD       0.3     0.24
PostgreSQL    1       1
Python        0.64    0.77
Samba         0.94    0.82

109

Results When Both Mails and Issues are Used

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2", y-axis Precision, one pair of bars per project.]

Project       Top 1   Top 2
Apache        0.8     0.72
FreeBSD       0.67    0.45
PostgreSQL    0.9     0.89
Python        0.91    0.84
Samba         0.92    0.76

110

It is Possible to Recommend Mentors To Project Newcomers

YODA makes it possible to recommend mentors
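Precision on the Top 1 and Top 2 recommendations can be computed as in the following sketch. The exact definition used in the evaluation is not given on the slides, so the sketch assumes the common one: the share of the top-k recommended developers who were actual mentors, averaged over newcomers. All names and data here are hypothetical.

```python
def precision_at_k(recommended, actual, k):
    """Share of the top-k recommended developers who are actual mentors."""
    top = recommended[:k]
    if not top:
        return 0.0
    hits = sum(1 for developer in top if developer in actual)
    return hits / len(top)


def mean_precision_at_k(cases, k):
    """Average precision at k over all (recommended, actual) pairs."""
    return sum(precision_at_k(rec, act, k) for rec, act in cases) / len(cases)


# One entry per newcomer: ranked recommendations and the set of actual mentors.
cases = [
    (["jim", "bob"], {"jim"}),
    (["ann", "jim"], {"jim"}),
    (["bob", "ann"], {"bob"}),
]

top1 = mean_precision_at_k(cases, 1)
top2 = mean_precision_at_k(cases, 2)
print(f"Top 1: {top1:.2f}, Top 2: {top2:.2f}")
```

Under this definition, Top 2 precision can be lower than Top 1 precision when each newcomer has a single actual mentor, because the second recommendation is then often a miss.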

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: horizontal bar chart "Precision in Recommending Mentors" for Apache Lucene and Samba, with series MAIL, ISSUE, and CHAT on an x-axis from 0 to 90. Bar values in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 113-115: YODA architecture diagram]

116

YODA Tool

[Slides 116-124: screenshots of the YODA tool]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

[Figure: a newcomer can find source code descriptions in these messages]

128

Mining Summaries

Example of a source code description found in a developer message:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
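Steps 2 and 3 can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes paragraphs are separated by blank lines and that a paragraph refers to a method when the method name is followed by an opening parenthesis, optionally prefixed by the class name. The sample message and the function names are made up for the example.

```python
import re


def extract_paragraphs(message):
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs, class_name, method_names):
    """Step 3 (sketch): map each method to the paragraphs that mention a call to it."""
    traced = {}
    for paragraph in paragraphs:
        for method in method_names:
            # Matches "split(" or "IndexSplitter.split(", but not "resplit(".
            pattern = rf"\b(?:{re.escape(class_name)}\.)?{re.escape(method)}\s*\("
            if re.search(pattern, paragraph):
                traced.setdefault(method, []).append(paragraph)
    return traced


# Made-up message, loosely modeled on the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "When I call IndexSplitter.split(destDir, segs) it creates an index "
    "with a wrong segments descriptor file.\n\n"
    "Thanks"
)

paragraphs = extract_paragraphs(message)
traced = trace_to_methods(paragraphs, "IndexSplitter", ["split", "listSegments"])
print(traced)
```

Matching on the name plus an opening parenthesis is a deliberately strict choice: it misses prose mentions of a method but avoids tracing paragraphs that merely use the method name as an ordinary word.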

130

Supporting Software Development

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Mailing list
• Issue tracker
• Q&A site

131

Approach Precision vs Number of Methods Covered

[Figure, built up over slides 131-133: precision of the approach plotted against the number of methods covered]

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions through its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
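The vote-based discard of Step 3 and a similarity-based filter in the spirit of Step 5 can be sketched together. The slides do not state which similarity measure or threshold CODES uses, so this sketch assumes cosine similarity over word counts and a hypothetical threshold of 0.2. The paragraphs and the method text are made up.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words vector over lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def filter_paragraphs(paragraphs, method_text, min_similarity=0.2):
    """Drop paragraphs without votes, then keep those similar to the method text."""
    method_vec = term_vector(method_text)
    kept = []
    for text, votes in paragraphs:
        # The slides discard 0-vote discussions; also dropping negative
        # vote counts is an assumption of this sketch.
        if votes <= 0:
            continue
        if cosine(term_vector(text), method_vec) >= min_similarity:
            kept.append(text)
    return kept


# Made-up data: (paragraph text, votes of the discussion it comes from).
description = (
    "The split method copies the selected segments of an index "
    "into a destination directory"
)
paragraphs = [
    (description, 3),
    (description, 0),
    ("Thanks a lot for the quick answer", 5),
]
method_text = "split index segments into destination directory"

kept = filter_paragraphs(paragraphs, method_text)
print(kept)
```

The two checks are independent: votes act as a cheap quality signal from the community, while the similarity test removes well-voted paragraphs that do not actually talk about the method.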

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Slides 139-147: screenshots of the CODES tool]

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 66: Supporting Newcomers in Open Source Software Development Projects

67

Source Code

Sequence Diagram

56

36

7717

78

18

1678

49

37

Javadoc

1) From Source Code participants in most cases ldquogo backrdquo to

Sequence and Class Diagrams

2) From Sequence and Class Diagrams

participants in most cases ldquogo backrdquo to

Source Code

3) Starting from a Use Case participants go

ahead reading Sequence Diagrams Only after

they reading and writing Source Code

Transition Graph between Kinds of Software Artifacts

68

PART II ndash Experiment B

What Code Elements are Often Used by Humans When

Labeling a Source Code Artifact

word1word2

word3word4

word5word6

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

68

PART II - Experiment B

What Code Elements are Often Used by Humans When Labeling a Source Code Artifact?

(Figure: a source code artifact labeled with the keywords word1, word2, word3, word4, word5, word6)

69

Experiment B Context

• Object: eXVantage (industrial test data generation tool)
• Subjects:
  17 Bachelor's students in CS (Univ. of Molise, second year)
  21 Master's students in CS (University of Salerno)

Java Class, labeled with: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

70

Experiment B Context

Terms chosen by three subjects for the same Java class:

Subject 1: book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast
Subject 2: book, hotel, room, refund, arrival, check, parking, double, suite, group
Subject 3: confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
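The oracle construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three term lists are the ones shown on the slide, while the function name `build_oracle` and the parameter `min_share` are hypothetical and not taken from the study.

```python
from collections import Counter

# Term lists chosen by three subjects (taken from the slide).
labels = [
    "book hotel room reservation arrival departure smoking double card breakfast".split(),
    "book hotel room refund arrival check parking double suite group".split(),
    "confirmation room reservation arrival departure date bed card payment spa".split(),
]


def build_oracle(subject_labels, min_share=0.5):
    """Return {term: count} for terms chosen by at least min_share of the subjects."""
    # set() ensures that a subject who repeats a term is counted only once.
    counts = Counter(term for terms in subject_labels for term in set(terms))
    threshold = min_share * len(subject_labels)
    return {term: n for term, n in counts.items() if n >= threshold}


oracle = build_oracle(labels)
# Print terms by decreasing count, then alphabetically.
print(sorted(oracle.items(), key=lambda kv: (-kv[1], kv[0])))
```

With three subjects, a 50% threshold keeps every term chosen by at least two of them, which reproduces the eight terms and counts listed on the slide.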

71

Experiment B Context: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

74

Experiment B Context: Comparison of Different Labeling Techniques

Data extracted from method signatures match the mental model of newcomers very well when they describe source code.

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts
2) Less experienced newcomers spend a significantly higher proportion of time on source code
3) More experienced newcomers instead spend more time on class diagrams
4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when they describe source code elements

76

PART III: Recommenders

Data Extraction (Software Repositories)
Empirical Studies: PART I and PART II
Recommenders: PART III

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

1) Find Past Successful Mentors
2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA: Identify Past Successful Mentors

(Diagram, built up over several slides: "Jim is the mentor of Alice IF ...", with conditions over the factors F1 to F5 along a time axis. Time axis labels: "When Alice joins the project", "When Alice was a Student".)

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Score Computed Aggregating the Factors in a Weighted Sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
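The weighted sum named on the slide can be illustrated with a minimal Python sketch. The factor values, the weights and the candidate names below are hypothetical; the slides do not report the actual weights or how each factor is normalized.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("need exactly one weight per factor")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical, normalized values of F1..F5 for two candidate mentors of one newcomer.
candidates = {
    "Jim": [1.0, 0.5, 0.5, 1.0, 0.0],
    "Bob": [0.0, 0.5, 1.0, 0.0, 1.0],
}
weights = [0.25, 0.25, 0.25, 0.125, 0.125]  # hypothetical weights w1..w5

# Rank candidates by decreasing score.
ranking = sorted(
    candidates,
    key=lambda name: mentor_score(candidates[name], weights),
    reverse=True,
)
print(ranking)
```

Under these made-up numbers Jim scores 0.625 and Bob 0.5, so Jim is ranked first as the more likely past mentor.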

93

Recommending Mentors

(Timeline diagram, built up over slides 93 to 105. Labels: Time, t, Newcomer, Project Developers, Past Project Developers, Past Mentors, Mentor with Adequate Skills.)

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors To Project Newcomers?

(Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Precision axis: 0 to 100. Bar labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82)
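The slides do not spell out how precision on Top 1 and Top 2 is computed. One common reading, assumed here, is the share of newcomers for whom an actual mentor appears among the first k recommended names. The function name and all data in this sketch are hypothetical.

```python
def precision_at_top_k(recommendations, actual_mentors, k):
    """Share of newcomers with at least one actual mentor among the first k recommended names."""
    if not recommendations:
        return 0.0
    hits = sum(
        1
        for newcomer, ranked in recommendations.items()
        if set(ranked[:k]) & actual_mentors.get(newcomer, set())
    )
    return hits / len(recommendations)


# Hypothetical data: ranked recommendations and manually validated mentors.
recommendations = {
    "alice": ["jim", "bob"],
    "carol": ["bob", "dave"],
    "erin": ["jim", "dave"],
    "frank": ["bob", "jim"],
}
actual_mentors = {
    "alice": {"jim"},
    "carol": {"dave"},
    "erin": {"bob"},
    "frank": {"bob"},
}

print(precision_at_top_k(recommendations, actual_mentors, 1))
print(precision_at_top_k(recommendations, actual_mentors, 2))
```

Under this reading the Top 2 value can never be lower than the Top 1 value, because widening the list can only add hits.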

109

Results When Both Mails and Issues are Used

(Bar chart: Mentor Recommendations, Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Precision axis: 0 to 100. Bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76)

110

It is Possible to Recommend Mentors To Project Newcomers

YODA makes it possible to recommend mentors

(Bar chart repeated from slide 109: Mentor Recommendations, Precision on Top 1 and Top 2, when both mails and issues are used.)

111

Use Issue, Chat and Mail Sources to Recommend Mentors

(Bar chart: Precision in Recommending Mentors, by source: MAIL, ISSUE, CHAT. Projects: Apache Lucene, Samba. Axis: 0 to 90. Bar labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60)

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:
• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

Mining Summaries

Example of a source code description found in a developer communication:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
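Steps 2, 3 and 5 of the approach can be sketched roughly as follows. The slides give no implementation details, so everything here is an assumption made for illustration: paragraphs are split at blank lines, a paragraph is traced onto a method when it contains the method name followed by an opening parenthesis, and similarity is plain cosine similarity over term frequencies with a hypothetical threshold `min_similarity`. Step 4 (heuristic-based filtering) is left out because the heuristics are not described in this part of the deck.

```python
import math
import re
from collections import Counter


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs at blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, method_name):
    """Step 3: trace a paragraph onto a method if it contains 'name('."""
    return re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph) is not None


def tokens(text):
    """Lowercased word tokens; camelCase identifiers are split into their parts."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def cosine(a, b):
    """Cosine similarity between two token lists, using term frequencies."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def mine_descriptions(message, method_name, method_text, min_similarity=0.1):
    """Steps 2, 3 and 5: keep traced paragraphs that are textually similar to the method."""
    method_tokens = tokens(method_text)
    return [
        p
        for p in extract_paragraphs(message)
        if mentions_method(p, method_name)
        and cosine(tokens(p), method_tokens) >= min_similarity
    ]


# Hypothetical message and method text, loosely modeled on the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "When I call split(File destDir, String[] segs) it creates an index "
    "with a wrong segments descriptor file.\n\n"
    "Thanks"
)
method_text = "public void split(File destDir, String[] segs) // split index segments"
descriptions = mine_descriptions(message, "split", method_text)
print(descriptions)
```

Only the middle paragraph survives: the greeting and the sign-off never mention the method, so they are dropped at the tracing step before similarity is even computed.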

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

Newcomer: Supporting Software Development

131

Approach: Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions through its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
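The vote-based discard mentioned in Step 3 could look like the sketch below. It works on already downloaded records and does not call the StackOverflow REST interface. The record layout (the keys `id`, `score` and `body`) and the function name are assumptions made for this example, not the format used by CODES.

```python
def paragraphs_from_voted_discussions(discussions):
    """Return (discussion_id, paragraph) pairs, skipping discussions with 0 votes."""
    kept = []
    for d in discussions:
        # The slide only mentions discarding 0-vote discussions;
        # how negative scores are treated is not stated there.
        if d["score"] == 0:
            continue
        for paragraph in d["body"].split("\n\n"):
            if paragraph.strip():
                kept.append((d["id"], paragraph.strip()))
    return kept


# Hypothetical records.
discussions = [
    {"id": 1, "score": 3, "body": "split() writes the segments file.\n\nIt takes a target directory."},
    {"id": 2, "score": 0, "body": "Not sure what split() does."},
]
kept = paragraphs_from_voted_discussions(discussions)
print(kept)
```

Filtering before the paragraph split keeps the later tracing and similarity steps from spending work on text the community has not endorsed.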

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%
2) CODES identifies relevant descriptions with a precision higher than 79%
3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

150

Future work…

New Recommenders
• Build better recommenders for software re-modularization or refactoring based on social interaction between developers
• Build recommenders that help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

• Survey developers to validate the social links identified by analyzing different communication channels

Improve Existing Recommenders
• Improve the mentor recommender (YODA) by considering factors that better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives
• Include a better classification of discussion content using natural language parsers

151

Team-Based Refactoring: Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

Page 68: Supporting Newcomers in Open Source Software Development Projects

69

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

book hotel room reservation arrival departure smoking double card breakfastJava Class

70

Experiment B Context

bull Object

eXVantage (industrial test data generation tool)

bull Subjects17 Bachelor Student CS

hellip(Univ of Molise second year)

21 Master Student in CShellip

(University of Salerno)

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

room 3arrival 3book 2hotel 2reservation 2departure 2double 2card 2

bookhotelroomreservationarrivaldeparturesmokingdoublecardbreakfast

bookhotelroomrefundarrivalcheckparkingdoublesuitegroup

confirmationroomreservationarrivaldeparturedatebedcardpaymentspa

ORACLE terms selected by at least 50 of the subjects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

Example of a source code description found in a developer communication:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter. METHOD: split.

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
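
To make Steps 2 to 5 above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the "method name followed by a parenthesis" tracing rule, the minimum-length heuristic, and the cosine-similarity threshold are all hypothetical choices. Step 1 (downloading messages and tracing them onto classes) is left out.

```python
import math
import re
from collections import Counter


def split_paragraphs(message):
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, method_name):
    """Step 3 (assumed rule): the paragraph contains 'name(' as a whole word."""
    pattern = rf"\b{re.escape(method_name)}\s*\("
    return re.search(pattern, paragraph) is not None


def tokens(text):
    """Lowercased alphabetic tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mine_descriptions(message, method_name, method_text,
                      min_words=8, min_similarity=0.1):
    """Return (paragraph, similarity) pairs that survive Steps 2 to 5."""
    method_terms = Counter(tokens(method_text))
    kept = []
    for paragraph in split_paragraphs(message):
        if not mentions_method(paragraph, method_name):
            continue  # Step 3: not traced onto the method
        if len(paragraph.split()) < min_words:
            continue  # Step 4: hypothetical heuristic, drop very short text
        similarity = cosine(Counter(tokens(paragraph)), method_terms)
        if similarity >= min_similarity:
            kept.append((paragraph, similarity))  # Step 5
    return kept


if __name__ == "__main__":
    email = (
        "Hi all,\n\n"
        "When I call split(File destDir, String[] segs) it creates an index "
        "with a wrong segments descriptor file.\n\n"
        "Thanks for any help."
    )
    source = ("public void split(File destDir, String[] segs) "
              "throws IOException { /* copy segments to destDir index */ }")
    for paragraph, similarity in mine_descriptions(email, "split", source):
        print(round(similarity, 2), paragraph)
```

The thresholds are placeholders; in practice they would have to be tuned against a manually validated sample.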

130

Help Newcomers' Program Comprehension by Extracting Summaries of Code Elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

Newcomer

Supporting Software Development

131

Approach: Precision vs. Number of Methods Covered

132

Approach: Precision vs. Number of Methods Covered

133

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134

Help Newcomers' Program Comprehension by Extracting Summaries of Code Elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

Newcomer

Supporting Software Development


136

StackOverflow

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading Stack Overflow (SO) discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
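
The vote-based discarding in Step 3 can be sketched as follows. This is a hedged illustration, not the CODES implementation: the `Discussion` record, its `votes` and `body` fields, and the "method name followed by a parenthesis" tracing rule are assumptions made for the example, and the data would in practice come from the Q&A site's REST interface rather than from literals.

```python
import re
from dataclasses import dataclass


@dataclass
class Discussion:
    """Hypothetical record for one downloaded discussion."""
    votes: int
    body: str


def candidate_paragraphs(discussions, method_name):
    """Trace paragraphs onto a method, skipping discussions with 0 votes."""
    pattern = re.compile(rf"\b{re.escape(method_name)}\s*\(")
    kept = []
    for discussion in discussions:
        if discussion.votes == 0:
            continue  # Step 3: discard paragraphs of unvoted discussions
        for paragraph in re.split(r"\n\s*\n", discussion.body):
            paragraph = paragraph.strip()
            if paragraph and pattern.search(paragraph):
                kept.append(paragraph)
    return kept


if __name__ == "__main__":
    sample = [
        Discussion(3, "Use split(dir, segs) to divide an index.\n\nHope it helps."),
        Discussion(0, "split(dir, segs) is broken."),
    ]
    print(candidate_paragraphs(sample, "split"))
```

The slide only states that discussions with 0 votes are discarded; how negatively voted discussions are treated is not specified, so the sketch leaves them in.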

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We aim to build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible while reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



70

Experiment B Context

• Object: eXVantage (industrial test data generation tool)

• Subjects:
  17 Bachelor students in CS (Univ. of Molise, second year)
  21 Master students in CS (University of Salerno)

Example term lists (one per subject):

book, hotel, room, reservation, arrival, departure, smoking, double, card, breakfast

book, hotel, room, refund, arrival, check, parking, double, suite, group

confirmation, room, reservation, arrival, departure, date, bed, card, payment, spa

Term counts: room 3, arrival 3, book 2, hotel 2, reservation 2, departure 2, double 2, card 2

ORACLE: terms selected by at least 50% of the subjects
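
The oracle construction shown on this slide can be reproduced with a short Python sketch. The function name `build_oracle` and its `threshold` parameter are hypothetical; the three term lists are the ones from the slide.

```python
from collections import Counter


def build_oracle(selections, threshold=0.5):
    """Return the terms chosen by at least `threshold` of the subjects.

    `selections` holds one list of terms per subject; a term counts at most
    once per subject.
    """
    counts = Counter(term for terms in selections for term in set(terms))
    needed = threshold * len(selections)
    return {term for term, n in counts.items() if n >= needed}


if __name__ == "__main__":
    selections = [
        ["book", "hotel", "room", "reservation", "arrival", "departure",
         "smoking", "double", "card", "breakfast"],
        ["book", "hotel", "room", "refund", "arrival", "check", "parking",
         "double", "suite", "group"],
        ["confirmation", "room", "reservation", "arrival", "departure",
         "date", "bed", "card", "payment", "spa"],
    ]
    print(sorted(build_oracle(selections)))
```

With three subjects, the 50% threshold means a term must appear in at least two lists, which yields exactly the eight terms in the count line above.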

71

Experiment B Context: Comparison of Different Labeling Techniques

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.


74

Experiment B Context: Comparison of Different Labeling Techniques

Data extracted from method signatures matches the mental model of newcomers very well when they describe source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

75

PART II

Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from method signatures match the mental model of newcomers very well when they describe source code elements

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects


83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

[Figure, built up over several slides: "Jim is the mentor of Alice IF ...", with factors F1 to F5 placed along a Time axis. Labels include "When Alice was a Student" and "When Alice joins the project".]

F1: exchanged emails

F2: amount of emails

F3: project age

F4: newcomer early emails

F5: commits

YODA

Identify Past Successful Mentors

Score computed by aggregating the factors F1 to F5 in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
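
As a rough illustration of the weighted-sum aggregation, the following Python sketch scores and ranks candidate mentors. The weights, the factor values, and the developer names are invented for the example; the source does not state how the weights are chosen or how the factors are normalized.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum of w_i * f_i over all factors."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Return candidate names ordered by decreasing score."""
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical weights for F1..F5 and hypothetical normalized factor values.
    weights = [0.3, 0.3, 0.1, 0.2, 0.1]
    candidates = {
        "jim": [1.0, 0.8, 0.5, 0.9, 0.4],
        "bob": [0.2, 0.1, 0.5, 0.0, 0.9],
    }
    for name in rank_candidates(candidates, weights):
        print(name, round(mentor_score(candidates[name], weights), 2))
```

Keeping the weights as an explicit parameter makes it easy to try different weightings of the five factors.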

93

Recommending Mentors

[Figure, built up over slides 93 to 105: a Time axis on which a Newcomer joins at time t. Labels: Project Developers, Past Project Developers, Past Mentors, Newcomer, Mentor with Adequate Skills.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Series: Top 1, Top 2. Precision axis: 0 to 100. Bar labels as extracted (decimal points lost): 085, 03, 1, 064, 094, 081, 024, 1, 077, 082.]

Is it Possible to Recommend Mentors to Project Newcomers?

107

[Same bar chart as on slide 106.]

Is it Possible to Recommend Mentors to Project Newcomers?

Page 70: Supporting Newcomers in Open Source Software Development Projects

71

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

72

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Results When Both Mails and Issues Are Used

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2". X axis: Apache, FreeBSD, PostgreSQL, Python, Samba. Y axis: Precision (ticks 0-100). Series: Top 1, Top 2. Bar labels, in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

110

It Is Possible to Recommend Mentors to Project Newcomers

YODA makes it possible to recommend mentors.

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Figure, slides 111-112: bar chart "Precision in Recommending Mentors". Projects: Apache Lucene, Samba. Axis ticks: 0-90. Series: MAIL, ISSUE, CHAT. Bar labels, in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Figure, slides 113-115: YODA architecture diagram.]

116

YODA Tool

[Figure, slides 116-124: YODA tool screenshots.]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

A newcomer can find a source code description in a developer message, for example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
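Steps 2, 3, and 5 of the approach above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes paragraphs are separated by blank lines, that a paragraph is traced to a method when it mentions the method name followed by an opening parenthesis, and that similarity is a plain cosine over term counts. Step 4 (heuristic-based filtering) is omitted because the slides do not list the heuristics. All function names and the threshold value are hypothetical.

```python
import math
import re
from collections import Counter


def split_paragraphs(message):
    """Step 2 (sketch): treat blank lines as paragraph separators."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, class_name, method_name):
    """Step 3 (sketch): the paragraph contains 'method(' or 'Class.method('."""
    pattern = r"\b(?:%s\.)?%s\s*\(" % (re.escape(class_name), re.escape(method_name))
    return re.search(pattern, paragraph) is not None


def tokenize(text):
    """Lowercased terms, with camelCase identifiers split into words."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        terms.extend(part.lower() for part in parts)
    return terms


def cosine_similarity(a_terms, b_terms):
    """Cosine similarity between two bags of terms."""
    a, b = Counter(a_terms), Counter(b_terms)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def candidate_descriptions(message, class_name, method_name, method_source, threshold=0.1):
    """Step 5 (sketch): keep traced paragraphs that are textually similar to the method.

    The threshold is an arbitrary illustrative value.
    """
    kept = []
    for paragraph in split_paragraphs(message):
        if not mentions_method(paragraph, class_name, method_name):
            continue
        similarity = cosine_similarity(tokenize(paragraph), tokenize(method_source))
        if similarity >= threshold:
            kept.append((paragraph, similarity))
    return kept


# Illustrative input loosely based on the IndexSplitter example in the slides.
msg = ("Hi all,\n\n"
       "When I call IndexSplitter.split(File destDir, String[] segs) "
       "it creates an index with wrong segments data.\n\n"
       "Thanks")
src = "public void split(File destDir, String[] segs) { /* copy segments into a new index */ }"
found = candidate_descriptions(msg, "IndexSplitter", "split", src)
```

Only the middle paragraph survives: the greeting and sign-off never mention the method, so they are dropped at the tracing step before any similarity is computed.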

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131

Approach Precision vs. Number of Methods Covered

[Figure, slides 131-133: precision of the approach versus number of methods covered.]

We mine useful Java method descriptions from developers' discussions.


136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Figure, slides 139-147: CODES tool screenshots.]

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%
2) CODES identifies relevant descriptions with a precision higher than 79%
3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We aim to build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

155

PART I

PART II

PART III


Page 71: Supporting Newcomers in Open Source Software Development Projects

72

Experiment B Context: Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match the mental model of newcomers very well when they describe source code.

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella. Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014.

75

PART II Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts
2) Less experienced newcomers spend a significantly higher proportion of time on source code
3) More experienced newcomers instead spend more time on class diagrams
4) Heuristics based on data extracted from the signatures of methods match the mental model of newcomers very well when they describe source code elements

76

PART III

Recommenders

[Diagram: Data Extraction (Software Repositories), then Empirical Studies (PART I and PART II), then Recommenders (PART III).]

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers Is Highly Desirable…

Previous work: Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

1) Find Past Successful Mentors
2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: Arnetminer

YODA

Jim is the mentor of Alice IF the following factors, observed over time from when Alice joins the project, support it:

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

[Figure: timeline diagrams introducing factors F1-F5 one at a time for the pair Jim (mentor) and Alice (newcomer). Diagram labels: "When Alice joins the project", "When Alice was a Student", "F4 - 1st".]

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$$\text{score} = \sum_{i=1}^{5} w_i \, f_i$$
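The weighted-sum aggregation above can be illustrated with a short sketch. It assumes each factor has already been turned into a number (for instance normalized to the range 0 to 1), which the slides do not specify. The class, the function names, and the weights are hypothetical and are not taken from the YODA tool.

```python
from dataclasses import dataclass


@dataclass
class MentorFactors:
    """Hypothetical container for factors F1..F5 of one (candidate, newcomer) pair."""
    f1_exchanged_emails: float
    f2_amount_of_emails: float
    f3_project_age: float
    f4_newcomer_early_emails: float
    f5_commits: float

    def as_list(self):
        return [
            self.f1_exchanged_emails,
            self.f2_amount_of_emails,
            self.f3_project_age,
            self.f4_newcomer_early_emails,
            self.f5_commits,
        ]


def mentor_score(factors, weights):
    """Weighted sum of the five factors: sum of w_i * f_i for i = 1..5."""
    values = factors.as_list()
    if len(weights) != len(values):
        raise ValueError("expected one weight per factor")
    return sum(w * f for w, f in zip(weights, values))


def rank_candidates(candidates, weights):
    """Return candidate names ordered by decreasing score."""
    return sorted(candidates, key=lambda name: mentor_score(candidates[name], weights), reverse=True)


# Illustrative values only; equal weights are an arbitrary choice.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "jim": MentorFactors(1.0, 1.0, 1.0, 1.0, 1.0),
    "bob": MentorFactors(0.0, 0.0, 0.0, 0.0, 0.0),
}
ranking = rank_candidates(candidates, weights)
```

Keeping the weights as a separate argument makes it easy to experiment with different weightings of the factors without touching the factor extraction.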

93

Recommending Mentors

[Figure, slides 93-97: timeline (axis: Time) of project developers; the newcomer joins at time t, and a mentor with adequate skills is looked for among past mentors. Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011).]

Page 72: Supporting Newcomers in Open Source Software Development Projects

73

Experiment B ContextComparison of Different Labeling

Techniques

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

74

Experiment B ContextComparison of Different Labeling

Techniques

Data extracted from signature of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia Massimiliano Di Penta Rocco Oliveto Annibale Panichella Sebastiano Panichella Labeling Source Code with Information Retrieval Methods An Empirical Study EMSE 2014

75

How Developers Browse andUnderstand Software Artifacts

1) Newcomers spend more time to analyze low-level artifacts as compared to high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code 3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted form signature of methods are able to match very well the mental model of newcomers when describing source code elements

PART II

Summary

76

PART III

Recommenders

Data Extraction(Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Results When Both Mails and Issues Are Used

[Figure (slides 109-110): bar chart "Mentor Recommendations: Precision on Top 1 and Top 2"; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0-100); series: Top 1, Top 2; data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

110

YODA Makes It Possible to Recommend Mentors

It Is Possible to Recommend Mentors to Project Newcomers

111

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Figure (slides 111-112): horizontal bar chart "Precision in Recommending Mentors"; projects: Apache Lucene, Samba; x-axis: 0-90; series: MAIL, ISSUE, CHAT; data labels as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Figures, slides 113-115: YODA Architecture.]

116

YODA Tool

[Figures, slides 116-124: YODA Tool.]

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

[Figure: the newcomer can find source code descriptions in these messages.]

128

Example of a source code description (CLASS: IndexSplitter, METHOD: split):

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora: Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
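The five steps can be sketched as a small Python pipeline. This is only an illustration under stated assumptions, not the authors' implementation: tracing is approximated by whole-word name matching, the Step 4 heuristic is reduced to a minimum paragraph length, and Step 5 uses cosine similarity between the words of the paragraph and the words of the method source. The function names, thresholds, and sample messages are all hypothetical.

```python
import math
import re
from collections import Counter


def split_paragraphs(message: str) -> list:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions(text: str, name: str) -> bool:
    """Steps 1 and 3 (assumed): trace text to a name by whole-word match."""
    return re.search(r"\b" + re.escape(name) + r"\b", text) is not None


def tokens(text: str) -> Counter:
    """Lower-cased word counts, splitting camelCase identifiers first."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return Counter(re.findall(r"[a-z]+", spaced.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mine_descriptions(messages, class_name, method_name, method_source,
                      min_words=8, min_similarity=0.1):
    """Return the paragraphs that survive all five steps."""
    results = []
    for message in messages:
        if not mentions(message, class_name):            # Step 1
            continue
        for paragraph in split_paragraphs(message):      # Step 2
            if not mentions(paragraph, method_name):     # Step 3
                continue
            if len(paragraph.split()) < min_words:       # Step 4 (assumed heuristic)
                continue
            similarity = cosine(tokens(paragraph), tokens(method_source))
            if similarity < min_similarity:              # Step 5
                continue
            results.append(paragraph)
    return results


# Hypothetical input data (illustration only).
messages = [
    "Hi all,\n\nWhen I call IndexSplitter.split(destDir, segs) it creates an index "
    "with a segments descriptor file containing wrong data.\n\nThanks",
    "Unrelated mail about the release schedule.",
]
source = (
    "public void split(File destDir, String[] segs) "
    "{ /* copies segments to destDir index */ }"
)
found = mine_descriptions(messages, "IndexSplitter", "split", source)
```

Only the middle paragraph of the first message survives: the greeting and sign-off never mention the method, and the second message never mentions the class.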

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131

Approach: Precision vs. Number of Methods Covered

[Figure (slides 131-133): precision of the approach against the number of methods covered.]

We mine useful Java method descriptions from developers' discussions.

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora: CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
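The vote-based discard in Step 3 can be illustrated with a short Python sketch. The `Paragraph` record and the `discard_unvoted` function are hypothetical names, not part of CODES. The slide only mentions discussions with 0 votes, so dropping negatively voted ones as well is an assumption of this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Paragraph:
    text: str
    votes: int  # votes of the discussion the paragraph comes from


def discard_unvoted(paragraphs: List[Paragraph]) -> List[Paragraph]:
    """Keep only paragraphs from discussions with a positive vote count.

    Assumption: discussions with negative votes are discarded as well.
    """
    return [p for p in paragraphs if p.votes > 0]


# Hypothetical paragraphs already traced onto a method (illustration only).
traced = [
    Paragraph("split() copies the selected segments into a new index.", 3),
    Paragraph("I think split might do something with files?", 0),
]
kept = discard_unvoted(traced)
```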

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

[Figures, slides 139-147: CODES Tool.]

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

Building better recommenders for software re-modularization or refactoring, based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora: Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



74

Experiment B Context

Comparison of Different Labeling Techniques

Data extracted from the signatures of methods match very well the mental model of newcomers when describing source code

Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella: Labeling Source Code with Information Retrieval Methods: An Empirical Study. EMSE 2014

75

PART II

Summary

How Developers Browse and Understand Software Artifacts

1) Newcomers spend more time analyzing low-level artifacts than high-level artifacts

2) Less experienced newcomers spend a significantly higher proportion of time on source code

3) More experienced newcomers instead spend more time on class diagrams

4) Heuristics based on data extracted from the signatures of methods match very well the mental model of newcomers when describing source code elements

76

PART III

Recommenders

[Diagram labels: Data Extraction (Software Repositories); Empirical Studies (PART I and PART II); Recommenders (PART III).]

77

Two Recommenders to Support Project Newcomers

PART III - A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III - B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella: Who is Going to Mentor Newcomers in Open Source Projects? International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

YODA

Factors considered to decide whether Jim is the mentor of Alice:

F1: Exchanged emails

F2: Amount of emails

F3: Project age

F4: Newcomer early emails

F5: Commits

[Figure, built up over several slides: timeline (Time) relating Jim and Alice, annotated with F1-F5; labels: "When Alice was a Student", "When Alice joins the project".]

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
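The weighted sum of the five factors can be sketched in a few lines of Python. This is a minimal illustration, not YODA's implementation: the factor values, the equal weights, and the names `mentor_score` and `rank_mentors` are assumptions made for the example. The slides only state that the factors are aggregated in a weighted sum.

```python
from typing import Dict, List, Sequence, Tuple


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Aggregate factor values f_1..f_n into one score: sum(w_i * f_i)."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(
    candidates: Dict[str, Sequence[float]], weights: Sequence[float]
) -> List[Tuple[str, float]]:
    """Return (candidate, score) pairs sorted by descending score."""
    scored = [(name, mentor_score(f, weights)) for name, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, already-normalised values for F1..F5 (illustration only).
candidates = {
    "jim": [1.0, 0.8, 0.5, 0.9, 0.7],
    "bob": [1.0, 0.2, 0.5, 0.1, 0.3],
}
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # assumed equal weights

ranking = rank_mentors(candidates, weights)
```

With these made-up numbers, "jim" ranks first. In practice the weights would have to be chosen or calibrated per project, which the slides do not detail.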

93

Recommending Mentors

[Figure, built up over slides 93-97: timeline (Time) with the newcomer arriving at time t; labels: Project Developers, Newcomer, Mentor with Adequate Skills, Past Mentors.]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)


133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 75: Supporting Newcomers in Open Source Software Development Projects

76

PART III

Recommenders

Data Extraction (Software Repositories)

Empirical Studies

PART I and PART II

Recommenders

PART III

77

Two Recommenders to Support Project Newcomers

PART III – A) Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III – B) Mining Source Code Descriptions from Developers' Communication to Improve Newcomers' Program Comprehension

78

PART III – A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

• Small Projects: finding mentors is a trivial problem

• Large Projects: finding mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

82

Motivation

https://community.apache.org/mentoringprogramme.html

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. Who is Going to Mentor Newcomers in Open Source Projects?

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

Source of Inspiration: Arnetminer

Jim is the mentor of Alice IF:

F1: exchanged emails

F2: amount of emails

F3: project age

F4: newcomer early emails

F5: commits

[Diagram: Time axis marking "When Alice was a Student" and "When Alice joins the project"; annotations "F2 >" and "F4 - 1st"]

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$, where $f_1, \dots, f_5$ are the factors F1 to F5 and $w_i$ are their weights
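As a rough illustration of the weighted sum on this slide, the sketch below scores and ranks candidate mentors from five factor values. The factor values, the equal weights, and the names `mentor_score` and `rank_mentors` are hypothetical; the slides do not state the actual weights or how YODA normalizes the factors.

```python
from typing import Dict, List, Sequence, Tuple


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the factor values f_1..f_n with weights w_1..w_n."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates: Dict[str, Sequence[float]],
                 weights: Sequence[float]) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, mentor_score(f, weights)) for name, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, already-normalized factor values (F1..F5) for two candidates.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # assumed equal weights
candidates = {
    "Jim": [1.0, 0.8, 0.5, 1.0, 0.6],
    "Bob": [1.0, 0.2, 0.1, 0.0, 0.3],
}
ranking = rank_mentors(candidates, weights)
print(ranking[0][0])  # name of the highest-scoring candidate
```

With different weights the ranking can change, so the weights are the main tuning knob in a scheme of this kind.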

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

97

Time

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

98

Time

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

99

Time

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

100

Time

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

101

Time

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired by the Work on Bug Triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

106

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2"; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0 to 100); series: Top 1, Top 2; data labels as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]
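The sketch below shows one way precision on the Top 1 and Top 2 recommendations could be computed. It assumes the common "precision at k" definition (correct recommendations divided by all recommendations made); the slides do not state the exact definition used in the evaluation, and the names and data are hypothetical.

```python
from typing import Dict, List, Set


def precision_at_k(recommended: Dict[str, List[str]],
                   actual: Dict[str, Set[str]],
                   k: int) -> float:
    """Fraction of top-k recommendations that are true mentors, over all newcomers."""
    hits = 0
    total = 0
    for newcomer, ranked in recommended.items():
        top_k = ranked[:k]
        total += len(top_k)
        hits += sum(1 for m in top_k if m in actual.get(newcomer, set()))
    return hits / total if total else 0.0


# Hypothetical ranked recommendations and ground-truth mentors.
recommended = {"alice": ["jim", "bob"], "carol": ["dave", "jim"]}
actual = {"alice": {"jim"}, "carol": {"jim"}}
top1 = precision_at_k(recommended, actual, 1)
top2 = precision_at_k(recommended, actual, 2)
print(top1, top2)
```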

109

Results When Both Mails and Issues are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2"; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0 to 100); series: Top 1, Top 2; data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2"; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0 to 100); series: Top 1, Top 2; data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

YODA Makes it Possible to Recommend Mentors

111

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors"; projects: Apache Lucene, Samba; series: MAIL, ISSUE, CHAT; axis: 0 to 90; data labels as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer

Can find

Source code description

128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer

Can find

Source code description

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications.

International Conference on Program Comprehension (IEEE ICPC 2012)
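Steps 2 to 5 of the approach above can be sketched as a toy pipeline. This is only an illustration under stated assumptions: the minimum-length heuristic, the Jaccard word-overlap measure, the 0.2 threshold, and all function names are hypothetical stand-ins, since the slides do not say which heuristics or similarity measure the approach uses.

```python
import re
from typing import Dict, List


def split_paragraphs(message: str) -> List[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def trace_to_methods(paragraphs: List[str], methods: List[str]) -> Dict[str, List[str]]:
    """Step 3: link a paragraph to a method when it mentions 'name('."""
    traced: Dict[str, List[str]] = {m: [] for m in methods}
    for p in paragraphs:
        for m in methods:
            if re.search(r"\b" + re.escape(m) + r"\s*\(", p):
                traced[m].append(p)
    return traced


def passes_heuristics(paragraph: str, min_words: int = 8) -> bool:
    """Step 4 (assumed heuristic): drop very short paragraphs."""
    return len(paragraph.split()) >= min_words


def jaccard(a: str, b: str) -> float:
    """Step 5 (assumed measure): word-set overlap between two texts."""
    wa = set(re.findall(r"[a-z]+", a.lower()))
    wb = set(re.findall(r"[a-z]+", b.lower()))
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


# Hypothetical message and hypothetical terms taken from the method's code.
message = (
    "Thanks for the patch.\n\n"
    "When calling split(destDir, segs) it creates an index with a "
    "segments descriptor file containing wrong data."
)
method_terms = "split index segments destDir segs"

traced = trace_to_methods(split_paragraphs(message), ["split"])
descriptions = [
    p for p in traced["split"]
    if passes_heuristics(p) and jaccard(p, method_terms) >= 0.2
]
print(descriptions)
```

Only the second paragraph survives: it mentions `split(`, is long enough, and shares vocabulary with the method's terms.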

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

Q&A SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs. Number of Methods Covered

132

Approach Precision vs. Number of Methods Covered

133

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Q&A SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Q&A SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

CODES: Approach for Mining Method Descriptions

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
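The vote-based discard in Step 3 can be sketched as a simple filter. The `Post` record and its field names are hypothetical, and no call to the actual Stack Overflow REST interface is shown; the sketch only assumes that each downloaded post carries a vote count.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Post:
    votes: int  # hypothetical field: vote count of the discussion post
    body: str   # post text, paragraphs separated by blank lines


def paragraphs_with_votes(posts: List[Post]) -> List[str]:
    """Keep paragraphs only from posts whose vote count is not 0."""
    kept: List[str] = []
    for post in posts:
        if post.votes == 0:
            continue  # discard paragraphs of posts with 0 votes
        kept.extend(p.strip() for p in post.body.split("\n\n") if p.strip())
    return kept


# Hypothetical posts: the second one has 0 votes and is dropped.
posts = [
    Post(votes=3, body="split() writes a new index.\n\nIt copies the listed segments."),
    Post(votes=0, body="Not sure, maybe try again."),
]
kept = paragraphs_with_votes(posts)
print(kept)
```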

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We aim to build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns.

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Path…
  • Newcomer Learning Path… (2)
  • Newcomer Learning Path… (3)
  • Newcomer Learning Path… (4)
  • Newcomer Learning Path… (5)
  • Newcomer Learning Path… (6)
  • Newcomer Learning Path… (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future work…
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 76: Supporting Newcomers in Open Source Software Development Projects

77

Two Recommenders to SupportProject Newcomers

PART III - A)Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

PART III ndash B)Mining Source Code Descriptions from Developersrsquo Communication to Improve Newcomersrsquo Program Comprehension

78

PART III ndash A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirablehellip

Previous Work

Dagenais et al ICSE 2010

MENTOR

80

bull Small Projects find Mentors is a trivial problem

bull Large Projects find Mentors is not a trivial problem

When a Newcomer Joins a Project

81

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions relying on its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. "CODES: mining source code descriptions from developers discussions." BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
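The vote-based discard in Step 3 can be illustrated as follows. This is a hedged sketch, not the CODES implementation: it assumes the discussions have already been downloaded into dictionaries with `score` and `body` keys (hypothetical field names chosen for the example), treats any post with a score of 0 or less as unvoted, and uses a simple name match to trace paragraphs onto a method.

```python
import re
from html import unescape


def strip_html(body: str) -> str:
    """Turn an HTML post body into plain text, one blank line per paragraph."""
    text = re.sub(r"</p\s*>", "\n\n", body, flags=re.I)
    text = re.sub(r"<[^>]+>", "", text)
    return unescape(text)


def voted_paragraphs(posts: list[dict], method_name: str) -> list[str]:
    """Return paragraphs that mention `method_name` from posts with votes."""
    kept = []
    for post in posts:
        # Assumption: a score of 0 or less means the post received no
        # positive votes, so its paragraphs are discarded.
        if post.get("score", 0) <= 0:
            continue
        for paragraph in re.split(r"\n\s*\n", strip_html(post["body"])):
            paragraph = paragraph.strip()
            if paragraph and re.search(
                rf"\b{re.escape(method_name)}\b", paragraph
            ):
                kept.append(paragraph)
    return kept
```

The paragraphs returned here would then go through the heuristic-based and similarity-based filters of Steps 4 and 5.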

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring: Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. "Recommending Refactorings based on Team Co-Maintenance Patterns."

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring: Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Path…
  • Newcomer Learning Path… (2)
  • Newcomer Learning Path… (3)
  • Newcomer Learning Path… (4)
  • Newcomer Learning Path… (5)
  • Newcomer Learning Path… (6)
  • Newcomer Learning Path… (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 77: Supporting Newcomers in Open Source Software Development Projects

78

PART III - A)

Suggest Appropriate Mentors to Help Newcomers in Open Source Projects

79

Mentoring of Project Newcomers is Highly Desirable…

Previous Work

Dagenais et al., ICSE 2010

MENTOR

80

When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem

• Large projects: finding mentors is not a trivial problem

81

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

82

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

YODA (Young and newcOmer Developer Assistant)

Approach for Mentors Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. "Who is Going to Mentor Newcomers in Open Source Projects?"

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration: Arnetminer

[Diagram, built up over several slides: on a timeline, "Jim is the mentor of Alice IF" factors F1 to F5 hold. Timeline labels: "When Alice was a Student", "When Alice joins the project".]

YODA factors:

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails (1st)
• F5: commits

Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$\sum_{i=1}^{5} w_i \cdot f_i$
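The weighted-sum aggregation of the five factors can be sketched as below. This is a minimal illustration under stated assumptions, not the YODA implementation: the `PairFactors` record, the function names, the equal weights, and the factor values in the usage note are hypothetical, and the slides do not say how each factor is measured or normalized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairFactors:
    """Factor values F1..F5 for one (candidate mentor, newcomer) pair."""
    mentor: str
    newcomer: str
    factors: tuple[float, ...]


def mentoring_score(factors, weights) -> float:
    """Weighted sum of the factors: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def top_mentors(pairs, weights, newcomer: str, k: int = 2) -> list[str]:
    """Rank the candidate mentors of `newcomer` by score and keep the top k."""
    scored = [
        (mentoring_score(p.factors, weights), p.mentor)
        for p in pairs
        if p.newcomer == newcomer
    ]
    # Highest score first; ties are broken alphabetically by mentor name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [mentor for _, mentor in scored[:k]]
```

For example, with equal weights of 0.2 and hypothetical factor values `(1, 1, 1, 1, 0)` for Jim, `(1, 0, 1, 0, 0)` for Carol, and `(0, 1, 0, 0, 0)` for Bob, `top_mentors(pairs, weights, "Alice")` ranks Jim first and Carol second.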

93

Recommending Mentors

[Diagram, built up over slides 93 to 105. Labels: Time, t, Project Developers, Past Project Developers, Past Mentors, Newcomer, Mentor with Adequate Skills.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Bar chart, shown on slides 106 to 108: Mentor Recommendations Precision on Top 1 and Top 2. X-axis: Apache, FreeBSD, PostgreSQL, Python, Samba. Y-axis: Precision (0 to 100). Series: Top 1, Top 2. Data labels as extracted: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

109

Results When Both Mails and Issues Are Used

[Bar chart, shown on slides 109 and 110: Mentor Recommendations Precision on Top 1 and Top 2. X-axis: Apache, FreeBSD, PostgreSQL, Python, Samba. Y-axis: Precision (0 to 100). Series: Top 1, Top 2. Data labels as extracted: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

YODA Makes it Possible to Recommend Mentors

It is Possible to Recommend Mentors to Project Newcomers

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Bar chart, shown on slides 111 and 112: Precision in Recommending Mentors. Projects: Apache Lucene, Samba. Series: MAIL, ISSUE, CHAT. Axis: 0 to 90. Data labels as extracted: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:

the source code itself; source code descriptions in external artifacts

Newcomer

Can find: source code description

128

Mining Summaries

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with segments descriptor file with wrong data. Namely, wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split



When a Newcomer Joins a Project

• Small projects: finding mentors is a trivial problem
• Large projects: finding mentors is not a trivial problem

Motivation

Identifying Mentors in Software Projects

https://community.apache.org/mentoringprogramme.html

Characteristics of a Good Mentor

• Enough ability to help other people
• Enough expertise about the topic of interest for the newcomer

YODA (Young and newcOmer Developer Assistant)

Approach for Mentor Identification in Open Source Projects

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

Gerardo Canfora, Massimiliano Di Penta, Rocco Oliveto, Sebastiano Panichella. "Who is Going to Mentor Newcomers in Open Source Projects?" International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012).

Source of Inspiration: ArnetMiner

YODA

[Figure sequence: Jim "is the mentor of" Alice IF the factors below hold, shown on a timeline that starts when Alice joins the project. Other labels in the figures: "When Alice was a Student", "F4 - 1st".]

F1: exchanged emails
F2: amount of emails
F3: project age
F4: newcomer early emails
F5: commits

Identify Past Successful Mentors

Score computed by aggregating the factors F1 to F5 in a weighted sum:

$$\sum_{i=1}^{5} w_i \cdot f_i$$
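The weighted-sum aggregation of the five factors can be sketched as follows. This is a minimal illustration, not the YODA implementation: the equal weights, the normalization of every factor to [0, 1], the threshold, and all names (`PairFactors`, `mentoring_score`, `likely_mentors`) are assumptions made for the example, since the slides give no concrete values.

```python
from dataclasses import dataclass

# Hypothetical weights, one per factor F1..F5; the slides do not give values.
WEIGHTS = (0.2, 0.2, 0.2, 0.2, 0.2)


@dataclass(frozen=True)
class PairFactors:
    """Factor values for one (candidate mentor, newcomer) pair."""
    mentor: str
    newcomer: str
    factors: tuple  # (f1, ..., f5), each assumed to be normalized to [0, 1]


def mentoring_score(factors, weights=WEIGHTS):
    """Weighted sum of the factors: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("expected one weight per factor")
    return sum(w * f for w, f in zip(weights, factors))


def likely_mentors(pairs, threshold=0.5):
    """Return (mentor, newcomer, score) for pairs scoring at least threshold, best first."""
    scored = [(p.mentor, p.newcomer, mentoring_score(p.factors)) for p in pairs]
    kept = [s for s in scored if s[2] >= threshold]
    return sorted(kept, key=lambda s: s[2], reverse=True)
```

For example, `likely_mentors([PairFactors("Jim", "Alice", (1, 1, 1, 1, 1))])` keeps the Jim/Alice pair, while a pair whose factors are all zero is dropped. In practice the weights and the threshold would have to be calibrated on pairs that are known mentoring relationships.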

Recommending Mentors

[Figure sequence: a timeline of project developers. A newcomer joins at time t and is matched with a mentor with adequate skills, chosen among the past mentors and past project developers.]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)
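One way to pick a mentor "with adequate skills" among past mentors is to rank them by how similar their past messages are to what the newcomer is asking about. The sketch below assumes a plain term-frequency cosine similarity; the slides do not say which measure is used, and the function names and inputs are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies over lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_mentors(newcomer_text, past_mentor_texts, top_k=2):
    """Rank past mentors by similarity to the newcomer's text and keep the top_k.

    past_mentor_texts maps a mentor name to the concatenation of that
    mentor's past messages (an assumption of this sketch).
    """
    query = term_vector(newcomer_text)
    ranked = sorted(
        ((cosine(query, term_vector(text)), name)
         for name, text in past_mentor_texts.items()),
        reverse=True,
    )
    return [name for score, name in ranked[:top_k] if score > 0]
```

With `top_k=1` and `top_k=2` this mirrors the "Top 1" and "Top 2" recommendation lists evaluated in the study.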

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: Mentor Recommendations Precision on Top 1 and Top 2. Bar chart; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision; series: Top 1, Top 2. Bar labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

Results When Both Mails and Issues are Used

[Figure: Mentor Recommendations Precision on Top 1 and Top 2. Bar chart; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision; series: Top 1, Top 2. Bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

It is Possible to Recommend Mentors To Project Newcomers

YODA makes it possible to recommend mentors

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: Precision in Recommending Mentors. Bar chart; projects: Apache Lucene, Samba; series: MAIL, ISSUE, CHAT; axis from 0 to 90. Bar labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

Effort in Program Comprehension

Mining Summaries

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

[Figure: a newcomer can find source code descriptions in these messages.]

Example (CLASS: IndexSplitter, METHOD: split):

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. "Mining source code descriptions from developer communications." International Conference on Program Comprehension (IEEE ICPC 2012).
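Steps 2, 3 and 5 of the approach above can be illustrated with a small sketch. It is not the authors' tool: paragraphs are assumed to be separated by blank lines, a paragraph is traced onto a method when it contains the method name followed by an opening parenthesis, and the similarity filter is assumed to be a Jaccard overlap between the paragraph and the method's source text. The heuristic filters of Step 4 are not described on the slides and are left out. All function names and the threshold are hypothetical.

```python
import re


def extract_paragraphs(message):
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph, method_name):
    """Step 3: the paragraph contains the method name followed by '('."""
    pattern = r"\b" + re.escape(method_name) + r"\s*\("
    return re.search(pattern, paragraph) is not None


def jaccard(text_a, text_b):
    """Word-level Jaccard similarity between two texts (case-insensitive)."""
    words_a = set(re.findall(r"\w+", text_a.lower()))
    words_b = set(re.findall(r"\w+", text_b.lower()))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def mine_descriptions(message, method_name, method_source, min_similarity=0.1):
    """Return paragraphs that mention the method and overlap with its source text."""
    candidates = [p for p in extract_paragraphs(message)
                  if mentions_method(p, method_name)]
    # Step 5: keep only paragraphs textually close enough to the method.
    return [p for p in candidates if jaccard(p, method_source) >= min_similarity]
```

Applied to a bug report such as the IndexSplitter example, the paragraph that discusses `split(File destDir, String[] segs)` would be kept, while greetings and signatures would be discarded because they do not mention the method.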

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Mailing list
• Issue tracker
• Q&A site

[Figure: Approach precision vs. number of methods covered.]

We mine useful Java method descriptions from developers' discussions.

StackOverflow

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. "CODES: mining source code descriptions from developers discussions." Best Tool Award at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
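The vote-based discarding mentioned in Step 3 can be sketched as a simple filter over already-downloaded posts. The dictionary keys `score` and `body` and the function name are assumptions of this example, not the CODES data model, and the sketch also drops posts with negative scores, which the slide does not mention.

```python
def keep_voted_paragraphs(posts, min_votes=1):
    """Collect paragraphs only from posts that received at least min_votes votes.

    Each post is assumed to be a dict with a numeric "score" and a
    plain-text "body" whose paragraphs are separated by blank lines.
    """
    kept = []
    for post in posts:
        if post["score"] < min_votes:
            continue  # discard paragraphs of posts with 0 (or fewer) votes
        kept.extend(p.strip() for p in post["body"].split("\n\n") if p.strip())
    return kept
```

The surviving paragraphs would then go through the heuristic-based and similarity-based filters of Steps 4 and 5.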

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

Future Work and Conclusion

Future work…

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives
• Include a better classification of discussion content using natural language parsers

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. "Recommending Refactorings based on Team Co-Maintenance Patterns." The 29th International Conference on Automated Software Engineering (ASE 2014).

Partition Attributes

Extract Class

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

PART I

PART II

PART III

Page 81: Supporting Newcomers in Open Source Software Development Projects

82

Motivation

httpscommunityapacheorgmentoringprogrammehtml

httpscommunityapacheorgmentoringprogrammehtml

Identifying Mentors in Software Projects

83

Characteristics of a Good Mentor

Enough ability to help other people

Enough expertise about the topic of interest for the newcomer

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

YODA: Identify Past Successful Mentors

Jim is the mentor of Alice IF the following factors, observed along the project timeline, support it:

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

Timeline markers shown in the figures: "When Alice was a Student", "When Alice joins the project".

Score computed by aggregating the factors in a weighted sum:

$$\text{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
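The weighted-sum aggregation of the five factors can be sketched as follows. This is a minimal illustration, not the YODA implementation: the class `PairFactors`, the functions `mentoring_score` and `rank_candidates`, the equal weights, and the 0/1 factor values are all assumptions made for the example; the slides only state that the factors F1 to F5 are combined in a weighted sum.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class PairFactors:
    """Hypothetical container for the factor values of one (candidate mentor, newcomer) pair."""
    f1_exchanged_emails: float
    f2_amount_of_emails: float
    f3_project_age: float
    f4_newcomer_early_emails: float
    f5_commits: float


def mentoring_score(factors: PairFactors, weights: tuple[float, ...]) -> float:
    """Aggregate the factor values as sum_i w_i * f_i."""
    values = astuple(factors)
    if len(weights) != len(values):
        raise ValueError("one weight per factor is required")
    return sum(w * f for w, f in zip(weights, values))


def rank_candidates(
    candidates: dict[str, PairFactors], weights: tuple[float, ...]
) -> list[tuple[str, float]]:
    """Return (candidate, score) pairs, highest score first."""
    scored = [(name, mentoring_score(f, weights)) for name, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Equal weights and binary factor values are illustrative assumptions only.
    weights = (0.2, 0.2, 0.2, 0.2, 0.2)
    candidates = {
        "jim": PairFactors(1.0, 1.0, 1.0, 0.0, 1.0),
        "bob": PairFactors(0.0, 1.0, 0.0, 0.0, 0.0),
    }
    for name, score in rank_candidates(candidates, weights):
        print(f"{name}: {score:.2f}")
```

Keeping the weights as an explicit argument makes it easy to try different weightings of the factors without touching the scoring code.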

Recommending Mentors

[Figure sequence: a timeline (Time) showing Project Developers, a Newcomer joining at time t, a Mentor with Adequate Skills, Past Mentors, and Past Project Developers.]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Is it Possible to Recommend Mentors to Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision (0 to 100). Bar labels, in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]
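One plausible way to compute a "precision on Top 1 and Top 2" figure is sketched below. The definition used here is an assumption, since the slides do not spell it out: a recommendation counts as correct for a newcomer when at least one of the top-k ranked candidates is an actual mentor of that newcomer. The function name `precision_at_top_k` and the sample data are hypothetical.

```python
def precision_at_top_k(
    recommendations: dict[str, list[str]],
    actual_mentors: dict[str, set[str]],
    k: int,
) -> float:
    """Fraction of newcomers with at least one actual mentor among their top-k candidates.

    This is an assumed definition; the measure used in the original study may differ.
    """
    if not recommendations:
        return 0.0
    hits = 0
    for newcomer, ranked_candidates in recommendations.items():
        mentors = actual_mentors.get(newcomer, set())
        if any(candidate in mentors for candidate in ranked_candidates[:k]):
            hits += 1
    return hits / len(recommendations)


if __name__ == "__main__":
    # Made-up example data, not results from the study.
    recommendations = {
        "alice": ["jim", "bob"],
        "carol": ["bob", "dave"],
    }
    actual_mentors = {"alice": {"jim"}, "carol": {"dave"}}
    print(precision_at_top_k(recommendations, actual_mentors, k=1))
    print(precision_at_top_k(recommendations, actual_mentors, k=2))
```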

Results When Both Mails and Issues Are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; y-axis: Precision (0 to 100). Bar labels, in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76.]

YODA makes it possible to recommend mentors to project newcomers.

Use Issue, Chat, and Mail Sources to Recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" for Apache Lucene and Samba, with series MAIL, ISSUE, and CHAT and four groups labelled "Mentors"; axis: 0 to 90. Bar labels, in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60.]

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

Effort in Program Comprehension

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

A newcomer can find source code descriptions such as the following:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
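Steps 2, 3, and 5 of the approach above can be illustrated with a small sketch. Everything concrete in it is an assumption for illustration rather than the authors' implementation: paragraphs are split on blank lines, a paragraph is traced onto a method when it contains the method name followed by an opening parenthesis, and similarity is a bag-of-words cosine between the paragraph and the method's text with a hypothetical threshold of 0.1. Step 1 (downloading) and step 4 (the heuristics) are left out because the slides do not describe them.

```python
import math
import re
from collections import Counter


def extract_paragraphs(message: str) -> list[str]:
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3 (sketch): trace a paragraph onto a method if it contains `name(`."""
    return re.search(rf"\b{re.escape(method_name)}\s*\(", paragraph) is not None


def _tokens(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Step 5 (sketch): bag-of-words cosine similarity between two texts."""
    va, vb = Counter(_tokens(a)), Counter(_tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def candidate_descriptions(
    message: str, method_name: str, method_text: str, threshold: float = 0.1
) -> list[str]:
    """Keep paragraphs that mention the method and are textually similar to it."""
    return [
        p
        for p in extract_paragraphs(message)
        if mentions_method(p, method_name)
        and cosine_similarity(p, method_text) >= threshold
    ]


if __name__ == "__main__":
    message = (
        "Hi all,\n\n"
        "When I call split(destDir, segs) it creates an index with wrong segment data.\n\n"
        "Thanks"
    )
    print(candidate_descriptions(message, "split", "split index segments destDir segs"))
```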

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions.

StackOverflow

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
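The vote-based discard in step 3 of the CODES approach can be sketched as follows. The `Paragraph` type, the `trace_onto_methods` function, and the name-followed-by-parenthesis matching rule are hypothetical and chosen only for illustration; the slides state only that paragraphs of discussions with 0 votes are discarded, so discussions with any other score are kept here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Paragraph:
    """Hypothetical record: a paragraph plus the vote count of its discussion."""
    text: str
    discussion_votes: int


def trace_onto_methods(
    paragraphs: list[Paragraph], method_names: list[str]
) -> dict[str, list[str]]:
    """Map each method name to the paragraphs that mention it.

    Paragraphs from discussions with exactly 0 votes are discarded, as stated
    on the slide; how negative scores are treated is not stated there.
    """
    traced: dict[str, list[str]] = {name: [] for name in method_names}
    for paragraph in paragraphs:
        if paragraph.discussion_votes == 0:
            continue
        for name in method_names:
            # Assumed matching rule: the method name followed by "(".
            if re.search(rf"\b{re.escape(name)}\s*\(", paragraph.text):
                traced[name].append(paragraph.text)
    return traced


if __name__ == "__main__":
    paragraphs = [
        Paragraph("Use split(dir, segs) to divide the index.", 3),
        Paragraph("split() is broken for me.", 0),
        Paragraph("Nothing relevant here.", 5),
    ]
    print(trace_onto_methods(paragraphs, ["split"]))
```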

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives
• Include a better classification of discussion content using natural language parsers

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

Partition Attributes

Extract Class

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

PART I

PART II

PART III

Page 83: Supporting Newcomers in Open Source Software Development Projects

84

1) Find Past Successful Mentors

2) Suggest Mentors Having Specific Skills

YODA (Young and newcOmer Developer

Assistant)

Approach for Mentors Identification in Open Source Projects

Gerardo Canfora Massimiliano Di Penta Rocco Oliveto Sebastiano Panichella Who is Going to Mentor Newcomers in Open Source Projects

International Symposium on the Foundations of Software Engineering (SIGSOFT FSE 2012)

Source of Inspiration Arnetminer

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
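As a rough illustration of Steps 2, 3, and 5 above, the following Python sketch splits a message into paragraphs, keeps those that mention a method call, and filters them by textual similarity to the method. The slides do not say how paragraphs are delimited, how a mention is detected, or which similarity measure is used, so the blank-line split, the `name(` pattern, the cosine similarity over term counts, and the `threshold` value are all assumptions made for this example. Step 4 (heuristic-based filtering) is omitted because its heuristics are not described here.

```python
import math
import re
from collections import Counter


def extract_paragraphs(message: str) -> list[str]:
    """Step 2 (assumed rule): paragraphs are separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3 (assumed rule): the paragraph contains `method_name(`."""
    pattern = r"\b" + re.escape(method_name) + r"\s*\("
    return re.search(pattern, paragraph) is not None


def tokenize(text: str) -> Counter:
    """Lower-cased bag of alphabetic tokens, used as a term-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mine_descriptions(message: str, method_name: str, method_text: str,
                      threshold: float = 0.1) -> list[str]:
    """Return paragraphs traced onto the method that pass the similarity filter.

    `method_text` stands for whatever text represents the method
    (for example its identifiers); `threshold` is a hypothetical cut-off.
    """
    method_vector = tokenize(method_text)
    kept = []
    for paragraph in extract_paragraphs(message):
        if not mentions_method(paragraph, method_name):
            continue  # Step 3: not traced onto this method
        if cosine_similarity(tokenize(paragraph), method_vector) >= threshold:
            kept.append(paragraph)  # Step 5: similar enough to the method
    return kept
```

A call such as `mine_descriptions(email_body, "split", "split destDir segs index segments")` would return only the paragraphs that both mention `split(` and share enough vocabulary with the method text.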

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131

[Figure: Approach Precision vs. Number of Methods Covered]

We mine useful Java method descriptions from developers' discussions.

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
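The vote-based discard mentioned in Step 3 can be illustrated with a few lines of Python. This is only a sketch: the `Paragraph` record and its `votes` field are hypothetical names for data that would already have been downloaded, and no Stack Overflow API call is shown.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    """Hypothetical record for a paragraph taken from a downloaded discussion."""
    text: str
    votes: int  # vote count of the discussion the paragraph came from


def discard_unvoted(paragraphs: list[Paragraph]) -> list[Paragraph]:
    """Drop paragraphs whose discussion has exactly 0 votes, as the slide states.

    A stricter variant (votes > 0) would also drop negatively scored
    discussions; the slide does not say which behaviour is intended.
    """
    return [p for p in paragraphs if p.votes != 0]
```

Filtering at this point keeps the later heuristic-based and similarity-based steps from spending effort on discussions that received no votes.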

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Build better recommenders for software re-modularization or refactoring based on social interaction between developers.
• Survey developers to validate the social links identified by analyzing different communication channels.
• Build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks.

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors.
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers.

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

155

PART I

PART II

PART III

Source of Inspiration: Arnetminer

[Diagram, built up over several slides: "Jim is the mentor of Alice IF", with factors F1 to F5 drawn along a Time axis. Axis labels: "When Alice joins the project", "When Alice was a Student". Other annotations: ">" next to F2, "1st" next to F4]

YODA factors:

• F1: Exchanged emails
• F2: Amount of emails
• F3: Project age
• F4: Newcomer early emails
• F5: Commits

YODA: Identify Past Successful Mentors

Score computed by aggregating the factors in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$$
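The weighted-sum aggregation of the five factors can be sketched in Python as follows. The slides give neither the weight values nor how each factor is measured or normalized, so the numbers in the usage note and the helper names `mentor_score` and `rank_mentors` are purely illustrative assumptions.

```python
def mentor_score(factors, weights):
    """Weighted sum of factor values: sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_mentors(candidates, weights):
    """Rank candidate mentors by descending weighted-sum score.

    `candidates` maps a developer name to its list of factor values
    (F1..F5 in the slides); the mapping itself is a hypothetical input.
    """
    return sorted(
        candidates,
        key=lambda name: mentor_score(candidates[name], weights),
        reverse=True,
    )
```

For instance, with equal weights `[1, 1, 1, 1, 1]`, `rank_mentors({"jim": [3, 2, 1, 1, 4], "bob": [1, 0, 1, 0, 1]}, [1, 1, 1, 1, 1])` places `"jim"` first because his factor values sum to a higher score.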

93

Recommending Mentors

[Diagram, built up over slides 93 to 105: a Time axis with a Newcomer joining at time t. Labels: Project Developers, Past Project Developers, Past Mentors, Mentor with Adequate Skills]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106

Is It Possible to Recommend Mentors to Project Newcomers?

[Figure, shown on slides 106 to 108: bar chart "Mentor Recommendations Precision on Top 1 and Top 2". x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0 to 100); series: Top 1, Top 2]
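One plausible way to compute a precision on Top 1 and Top 2 is sketched below in Python. The slides do not define the measure, so this example assumes that a recommendation counts as correct when at least one actual mentor of the newcomer appears among the first k recommended developers; the function name and the input dictionaries are hypothetical.

```python
def top_k_precision(recommendations, actual_mentors, k):
    """Fraction of newcomers with an actual mentor among the top-k recommendations.

    recommendations: dict mapping newcomer -> ranked list of candidate mentors
    actual_mentors:  dict mapping newcomer -> set of actual mentors
    This definition is an assumption; the slides only name the measure.
    """
    if not recommendations:
        return 0.0
    hits = sum(
        1
        for newcomer, ranked in recommendations.items()
        if set(ranked[:k]) & actual_mentors.get(newcomer, set())
    )
    return hits / len(recommendations)
```

Under this definition the Top 2 value can never be lower than the Top 1 value for the same data, since widening the list can only add hits.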

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 85: Supporting Newcomers in Open Source Software Development Projects

Source of Inspiration Arnetminer

Jim Alice

Is the mentor of

IF

Time

When Alice joinsthe project

F1

F1 Exchanged emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2

gtF2 gt

F2 amount of emails

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III


YODA

Jim is the mentor of Alice IF (factors shown on a timeline that starts when Alice joins the project):

• F1: exchanged emails
• F2: amount of emails
• F3: project age
• F4: newcomer early emails
• F5: commits

[Timeline annotations: When Alice joins the project; When Alice was a Student]

Identify Past Successful Mentors

Score Computed Aggregating the Factors in a Weighted Sum

$\sum_{i=1}^{5} w_i \cdot f_i$
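The weighted-sum aggregation of the five factors can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `Candidate` container, the function names, the normalization of each factor to [0, 1], and the weight values (the slides do not report the weights YODA uses).

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate mentor with five factor values f1..f5 (hypothetical container)."""
    name: str
    factors: tuple  # (f1, f2, f3, f4, f5); assumed already normalized to [0, 1]


def weighted_score(factors, weights):
    """Aggregate the factors as the weighted sum of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates, weights):
    """Return the candidates sorted by descending weighted score."""
    return sorted(
        candidates,
        key=lambda c: weighted_score(c.factors, weights),
        reverse=True,
    )


# Illustrative weights only; the slides do not state the values actually used.
WEIGHTS = (0.25, 0.25, 0.125, 0.125, 0.25)

candidates = [
    Candidate("Jim", (1.0, 0.5, 1.0, 1.0, 0.5)),
    Candidate("Bob", (0.5, 0.5, 0.0, 0.0, 1.0)),
]
ranking = rank_candidates(candidates, WEIGHTS)
print([c.name for c in ranking])
```

Keeping the weights in a separate tuple makes it easy to try different weightings against the same factor values.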

93-105

Recommending Mentors

[Animated timeline figure. Labels: Time; Project Developers; Past Project Developers; Newcomer joining at time t; Mentor with Adequate Skills; Past Mentors]

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

106-108

Is It Possible to Recommend Mentors to Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Series: Top 1, Top 2. Axis: Precision, 0 to 100. Bar labels in extraction order (assignment to bars not recoverable): 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

109-110

Results When Both Mails and Issues Are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2. Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Series: Top 1, Top 2. Axis: Precision, 0 to 100. Bar labels in extraction order (assignment to bars not recoverable): 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

It Is Possible to Recommend Mentors to Project Newcomers

YODA makes it possible to recommend mentors

111-112

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors. Projects: Apache Lucene, Samba. Series: MAIL, ISSUE, CHAT. Axis: 0 to 90. Bar labels in extraction order (assignment to bars not recoverable): 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113-115

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116-124

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127-128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

A newcomer can find source code descriptions such as:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications.

International Conference on Program Comprehension (IEEE ICPC 2012)
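The five steps can be pictured as a small filtering pipeline. The Python sketch below shows only the overall shape and is not the authors' implementation: the slides do not specify the tracing rules, the heuristics, or the similarity measure, so the name-matching regular expressions, the `passes_heuristics` rule, the bag-of-words cosine similarity against the method text, and the `min_similarity` threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def mentions_class(message: str, class_name: str) -> bool:
    """Step 1 (assumed rule): a message is traced onto a class if it names it."""
    return re.search(rf"\b{re.escape(class_name)}\b", message) is not None


def extract_paragraphs(message: str) -> list[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3 (assumed rule): the method name followed by an opening parenthesis."""
    return re.search(rf"\b{re.escape(method_name)}\s*\(", paragraph) is not None


def passes_heuristics(paragraph: str, min_words: int = 8) -> bool:
    """Step 4 (hypothetical heuristic): drop very short or quoted paragraphs."""
    return len(paragraph.split()) >= min_words and not paragraph.startswith(">")


def cosine_similarity(a: str, b: str) -> float:
    """Step 5 helper: bag-of-words cosine similarity between two texts."""
    ta = Counter(re.findall(r"[a-z]+", a.lower()))
    tb = Counter(re.findall(r"[a-z]+", b.lower()))
    dot = sum(ta[t] * tb[t] for t in ta.keys() & tb.keys())
    norm = math.sqrt(sum(v * v for v in ta.values())) * math.sqrt(
        sum(v * v for v in tb.values())
    )
    return dot / norm if norm else 0.0


def mine_descriptions(messages, class_name, method_name, method_text,
                      min_similarity=0.1):
    """Run the five steps and return the paragraphs that survive every filter."""
    kept = []
    for message in messages:
        if not mentions_class(message, class_name):          # Step 1
            continue
        for paragraph in extract_paragraphs(message):        # Step 2
            if (mentions_method(paragraph, method_name)      # Step 3
                    and passes_heuristics(paragraph)         # Step 4
                    and cosine_similarity(paragraph, method_text)
                    >= min_similarity):                      # Step 5
                kept.append(paragraph)
    return kept
```

Each step is a separate predicate, so a different heuristic or similarity measure can be swapped in without touching the rest of the pipeline.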

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from:

• MAILING LIST
• ISSUE TRACKER
• Q&A SITE

[Diagram labels: Newcomer; Supporting Software Development]

131-133

Approach Precision vs Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136-137

StackOverflow

138

CODES Approach for Mining Method Descriptions

• Step 1: Downloading Stack Overflow (SO) discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
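The vote-based discard mentioned in Step 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CODES implementation: the `Discussion` container and its `body` and `votes` fields are hypothetical stand-ins for whatever the downloaded Stack Overflow data looks like, and the filter applies the rule literally (only discussions with exactly 0 votes are dropped).

```python
import re
from dataclasses import dataclass


@dataclass
class Discussion:
    """Hypothetical container for one downloaded discussion."""
    body: str
    votes: int


def drop_unvoted(discussions):
    """Discard discussions with 0 votes, as stated in Step 3."""
    return [d for d in discussions if d.votes != 0]


def paragraphs_of(discussions):
    """Split the surviving discussions into paragraphs on blank lines."""
    paragraphs = []
    for d in discussions:
        paragraphs.extend(
            p.strip() for p in re.split(r"\n\s*\n", d.body) if p.strip()
        )
    return paragraphs
```

A stricter variant could require a minimum positive vote count; the slide only states that discussions with 0 votes are discarded.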

139-147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers




Jim Alice

Is the mentor of

IF

F1

F2 gtTime

F3

F3

F3 project age

YODA

Jim Alice

Is the mentor of

IF

F1

F2 gtTimeF3

F4 - 1st

F4 newcomer early emails

When Alice was a Student

YODA

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue, Chat, and Mail sources to recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors" for Apache Lucene and Samba; axis: 0 to 90; series: MAIL, ISSUE, CHAT; bar labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find a source code description

128

Mining Summaries

Example:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
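The steps listed above can be sketched in a few lines. The code below is only an illustration of Steps 2, 3, and 5 under simplifying assumptions: paragraphs are split on blank lines, a paragraph is traced onto a method when it names the class and the method followed by a parenthesis, and the similarity filter is a plain bag-of-words cosine against the method source. All function names and the 0.1 threshold are hypothetical, and the heuristic filtering of Step 4 is left out because the slides do not describe its rules.

```python
import math
import re
from collections import Counter
from typing import List


def split_paragraphs(message: str) -> List[str]:
    """Step 2 (sketch): treat blank-line-separated chunks as paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, class_name: str, method_name: str) -> bool:
    """Step 3 (sketch): the paragraph names the class and 'method('."""
    has_call = re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph)
    has_class = re.search(r"\b" + re.escape(class_name) + r"\b", paragraph)
    return bool(has_call and has_class)


def tokens(text: str) -> Counter:
    """Lower-cased word counts, splitting camelCase identifiers first."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return Counter(re.findall(r"[a-z]+", spaced.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mine_descriptions(
    message: str,
    class_name: str,
    method_name: str,
    method_source: str,
    threshold: float = 0.1,  # arbitrary placeholder value
) -> List[str]:
    """Chain Steps 2, 3 and 5 and return the surviving paragraphs."""
    method_terms = tokens(method_source)
    kept = []
    for paragraph in split_paragraphs(message):
        if not mentions_method(paragraph, class_name, method_name):
            continue
        # Step 5 (sketch): keep paragraphs lexically close to the method.
        if cosine(tokens(paragraph), method_terms) >= threshold:
            kept.append(paragraph)
    return kept


# Hypothetical message loosely modeled on the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "The method IndexSplitter.split(File destDir, String[] segs) creates an index "
    "whose segments descriptor file holds wrong data.\n\n"
    "Thanks!"
)
source = "public void split(File destDir, String[] segs) { /* copy segments to destDir */ }"
result = mine_descriptions(message, "IndexSplitter", "split", source)
print(result)
```

Only the middle paragraph survives here: the greeting and the sign-off never mention the method, so they are dropped at the tracing step before any similarity is computed.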

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

131

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
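The vote-based discard mentioned in Step 3 might look like the following minimal sketch. The `Discussion` record, its field names, and the sample posts are hypothetical; no real StackOverflow API call is shown, and the data is assumed to have been downloaded already.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Discussion:
    """Hypothetical record for one downloaded Q&A discussion."""
    title: str
    body: str
    votes: int


def candidate_paragraphs(discussions: List[Discussion]) -> List[str]:
    """Split each discussion into paragraphs, skipping those with 0 votes."""
    paragraphs: List[str] = []
    for d in discussions:
        if d.votes == 0:
            # Filter named on the slide: drop paragraphs of zero-vote discussions.
            continue
        paragraphs.extend(p.strip() for p in d.body.split("\n\n") if p.strip())
    return paragraphs


# Hypothetical sample data.
posts = [
    Discussion(
        "How does split work?",
        "IndexSplitter.split() copies segments.\n\nIt writes a new descriptor.",
        3,
    ),
    Discussion("Unanswered", "Some unvoted text.", 0),
]
print(candidate_paragraphs(posts))
```

The check is written as `votes == 0` to mirror the slide wording literally; whether negatively voted discussions should also be dropped is not stated there.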

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves recommenders' performance

149

Future Work and Conclusion

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives

Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


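YODA: when Alice was a student

[Figure: timeline (Time axis); "Jim is the mentor of Alice IF" conditions over the factors F1, F2 > F3, and F4 - 1st, where F4 is the newcomer's early emails]

YODA: when Alice joins the project

[Figure: timeline (Time axis); "Jim is the mentor of Alice IF" conditions over the factors F1, F2 > F3, F4 - 1st, and F5, where F5 is commits]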

Identify Past Successful Mentors

Score computed by aggregating the factors (F1, F2 > F3, F4 - 1st, F5) in a weighted sum:

$$\mathrm{score} = \sum_{i=1}^{5} w_i \, f_i$$
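A weighted sum of this kind is straightforward to compute and to use for ranking candidates. The sketch below assumes each candidate mentor already has five numeric factor values in a common range; how YODA actually measures and normalizes the factors, and which weights it uses, is not given on the slide, so the equal weights, the names, and the values here are purely illustrative.

```python
from typing import Dict, List, Sequence, Tuple


def mentor_score(factors: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of factor values: the sum over i of w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(
    candidates: Dict[str, Sequence[float]],
    weights: Sequence[float],
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, mentor_score(f, weights)) for name, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical equal weights and factor values for two candidates.
weights = [0.2, 0.2, 0.2, 0.2, 0.2]
candidates = {
    "jim": [1.0, 0.5, 0.5, 1.0, 0.0],
    "bob": [0.5, 0.0, 0.5, 0.0, 0.5],
}
ranking = rank_candidates(candidates, weights)
print(ranking[0][0])  # jim
```

Keeping the weights as a separate argument makes it easy to try different weightings of the five factors without touching the scoring code.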

93

Recommending Mentors

[Figure sequence (slides 93 to 101): timeline (Time axis, time t) showing Project Developers, the Newcomer, a Mentor with Adequate Skills, Past Mentors, and Past Project Developers]

Page 90: Supporting Newcomers in Open Source Software Development Projects

When Alice joinsthe project

Jim Alice

Is the mentor of

IF

F1

F2 gtF3

F4 - 1st

F5

F5

Time

F5 Commits

YODA

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

5

1iii fw

F1

F2 gtF3

F4 - 1st

F5

93

Recommending Mentors

Time

Project Developers

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

Score Computed Aggregating the Factors in a Weighted Sum

Identify Past Successful Mentors

$\mathrm{score} = \sum_{i=1}^{5} w_i \cdot f_i$

Factor labels shown on the slide: F1, F2 > F3, F4 - 1st, F5
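A minimal sketch of how a weighted-sum score over five factors could be used to rank candidate mentors. The `Candidate` class, the factor values, and the weights are hypothetical and chosen only for illustration; the slide does not state the actual factor definitions or weights.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical mentor candidate with factor values f_1..f_5."""
    name: str
    factors: list[float]


def weighted_score(factors: list[float], weights: list[float]) -> float:
    """Aggregate the factors as sum_i w_i * f_i."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(w * f for w, f in zip(weights, factors))


def rank_candidates(candidates: list[Candidate], weights: list[float]) -> list[Candidate]:
    """Return candidates sorted by decreasing weighted score."""
    return sorted(candidates,
                  key=lambda c: weighted_score(c.factors, weights),
                  reverse=True)


# Illustrative values only (assumptions, not data from the study).
weights = [0.4, 0.3, 0.1, 0.1, 0.1]
candidates = [
    Candidate("dev_a", [1.0, 0.0, 0.0, 0.0, 0.0]),
    Candidate("dev_b", [0.5, 1.0, 1.0, 0.0, 0.0]),
]
ranking = [c.name for c in rank_candidates(candidates, weights)]
```

With equal weights the score reduces to a plain average-like sum; unequal weights let some factors dominate the ranking.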

Recommending Mentors

[Figure: timeline along a Time axis showing Project Developers and a Newcomer who joins at time t; later builds highlight a Mentor with Adequate Skills, Past Mentors, and Past Project Developers]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)

Is it Possible to Recommend Mentors to Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; axis: Precision, 0 to 100; series: Top 1, Top 2. Data labels: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]
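A small sketch of one common way to compute precision on the Top 1 and Top 2 recommendations. It assumes that a recommendation counts as correct when the newcomer's actual mentor appears among the first k ranked candidates; this definition, the function name, and the sample data are assumptions for illustration, not taken from the slides.

```python
def precision_at_k(recommendations: dict[str, list[str]],
                   actual_mentors: dict[str, str],
                   k: int) -> float:
    """Fraction of newcomers whose actual mentor is in the top-k ranked list."""
    if not recommendations:
        return 0.0
    hits = sum(
        1
        for newcomer, ranked in recommendations.items()
        if actual_mentors.get(newcomer) in ranked[:k]
    )
    return hits / len(recommendations)


# Hypothetical data: ranked mentor candidates per newcomer.
recommendations = {
    "n1": ["alice", "bob"],
    "n2": ["carol", "dave"],
    "n3": ["erin", "frank"],
    "n4": ["gina", "hank"],
}
actual_mentors = {"n1": "alice", "n2": "dave", "n3": "zed", "n4": "gina"}

top1 = precision_at_k(recommendations, actual_mentors, 1)
top2 = precision_at_k(recommendations, actual_mentors, 2)
```

Under this definition the Top 2 value can never be lower than the Top 1 value, since the top-2 list contains the top-1 candidate.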

Results When Both Mails and Issues Are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2" for Apache, FreeBSD, PostgreSQL, Python, and Samba; axis: Precision, 0 to 100; series: Top 1, Top 2. Data labels: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

It is Possible to Recommend Mentors to Project Newcomers

YODA makes it possible to recommend mentors

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: horizontal bar chart "Precision in Recommending Mentors" for Apache Lucene and Samba, in four groups labelled Mentors; series: MAIL, ISSUE, CHAT; axis: 0 to 90. Data labels: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

YODA Tool

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

Effort in Program Comprehension

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

  • the source code itself
  • source code descriptions in external artifacts

A newcomer can find source code descriptions such as:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with segments descriptor file with wrong data. Namely, wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

A Five-Step Approach for Mining Method Descriptions

  • Step 1: Downloading emails/bug reports and tracing them onto classes
  • Step 2: Extracting paragraphs
  • Step 3: Tracing paragraphs onto methods
  • Step 4: Heuristic-based filtering
  • Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
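Steps 2 to 5 can be sketched roughly as follows. This is only an illustration under stated assumptions: the blank-line paragraph split, the `name(` matching rule, the minimum-length heuristic, the cosine similarity over word counts, and both thresholds are hypothetical stand-ins, since the slides do not specify the actual heuristics or similarity measure.

```python
import math
import re
from collections import Counter


def split_paragraphs(message: str) -> list[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method: str) -> bool:
    """Step 3: trace a paragraph onto a method by looking for `name(`."""
    return re.search(r"\b" + re.escape(method) + r"\s*\(", paragraph) is not None


def tokens(text: str) -> Counter:
    """Lower-cased word counts, splitting camelCase identifiers first."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return Counter(re.findall(r"[a-z]+", spaced.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def mine_descriptions(message: str, method: str, method_source: str,
                      min_words: int = 8, min_sim: float = 0.1) -> list[str]:
    """Keep paragraphs that mention the method and pass both filters."""
    kept = []
    for para in split_paragraphs(message):
        if not mentions_method(para, method):
            continue
        if len(para.split()) < min_words:  # Step 4: assumed heuristic
            continue
        if cosine(tokens(para), tokens(method_source)) < min_sim:  # Step 5
            continue
        kept.append(para)
    return kept
```

A call such as `mine_descriptions(mail_body, "split", method_signature_or_body)` returns the paragraphs that survive all filters; Step 1 (downloading messages and tracing them onto classes) is left out of the sketch.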

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

  • Q&A SITE
  • ISSUE TRACKER
  • MAILING LIST

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions.

StackOverflow

CODES: Approach for Mining Method Descriptions

  • Step 1: Downloading StackOverflow (SO) discussions relying on its REST interface and tracing them onto classes
  • Step 2: Extracting paragraphs
  • Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
  • Step 4: Heuristic-based filtering
  • Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
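The vote-based discard in Step 3 might look like the sketch below. The `Paragraph` record and its fields are hypothetical, and no real StackOverflow API call is shown; the slide only says that paragraphs of discussions with 0 votes are discarded, so the handling of negative scores here is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    """A paragraph together with the vote count of its discussion."""
    text: str
    discussion_votes: int


def discard_unvoted(paragraphs: list[Paragraph]) -> list[Paragraph]:
    """Drop paragraphs whose discussion has exactly 0 votes.

    Paragraphs with negative vote counts are kept here, because the slide
    does not say how they are treated (assumption).
    """
    return [p for p in paragraphs if p.discussion_votes != 0]


# Hypothetical sample input.
sample = [
    Paragraph("split() writes a new segments file.", 3),
    Paragraph("No idea, sorry.", 0),
]
kept = discard_unvoted(sample)
```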

CODES Tool

httpwwwingunisannioitspanichellapagesprojectshtml

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

Future Work and Conclusion

Future work…

  • Building better recommenders for software re-modularization or refactoring based on social interaction between developers
  • Performing a survey asking developers to validate the social links identified by analyzing different communication channels
  • Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

  • Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
  • Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

Partition Attributes

Extract Class

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

PART I

PART II

PART III

Page 93: Supporting Newcomers in Open Source Software Development Projects

94

Time

Project Developers

Newcomer

t

Recommending Mentors

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

97

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

98

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer

Can find

Source code description

128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer

Can find

Source code description

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene cotrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
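
Steps 2 to 5 of the approach above can be illustrated with a rough sketch. The slides do not describe the concrete heuristics or the similarity measure, so everything here is an assumption made for illustration: paragraphs are split on blank lines, a paragraph is traced to a method when it mentions the method name, the heuristic filter is a minimum word count, and similarity is a bag-of-words cosine against the method's text. Step 1 (downloading messages and tracing them onto classes) is omitted, and all function names, thresholds, and the sample message are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def split_paragraphs(message: str) -> List[str]:
    """Step 2 (assumed): paragraphs are blocks separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3 (assumed): trace a paragraph to a method by name mention."""
    return re.search(r"\b" + re.escape(method_name) + r"\b", paragraph) is not None


def tokens(text: str) -> List[str]:
    """Lowercased alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def mine_descriptions(message: str, method_name: str, method_text: str,
                      min_words: int = 5,
                      min_similarity: float = 0.1) -> List[str]:
    """Return paragraphs that survive the assumed Step 3 to Step 5 filters."""
    kept = []
    for paragraph in split_paragraphs(message):
        if not mentions_method(paragraph, method_name):
            continue  # Step 3: not traceable to the method
        if len(tokens(paragraph)) < min_words:
            continue  # Step 4: hypothetical heuristic, drop very short text
        if cosine_similarity(paragraph, method_text) >= min_similarity:
            kept.append(paragraph)  # Step 5: similar enough to the method
    return kept


# Hypothetical message and method text, made up for this example.
message = (
    "Hi all,\n\n"
    "The method split creates an index with a segments descriptor file "
    "containing wrong data.\n\n"
    "split fails.\n\n"
    "Thanks"
)
descriptions = mine_descriptions(message, "split",
                                 "split index segments file destDir segs")
print(descriptions)
```

In this toy run only the second paragraph survives: the greeting and sign-off never mention the method, and "split fails." is dropped by the length heuristic.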

130

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

Newcomer

Supporting Software Development

131

Approach Precision vs. Number of Methods Covered

132

Approach Precision vs. Number of Methods Covered

133

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

Newcomer

Supporting Software Development

135

Help newcomers' program comprehension by extracting summaries of code elements from:

Q&A SITE

ISSUE TRACKER

MAILING LIST

Newcomer

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
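
The vote-based check in Step 3 can be sketched as a simple filter. This is an illustration only, not the CODES implementation: the `Paragraph` type and its `discussion_votes` field are hypothetical, and the sketch follows the slide literally by discarding paragraphs whose source discussion has exactly 0 votes (the slides do not say how negative scores are treated).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Paragraph:
    text: str
    discussion_votes: int  # hypothetical field: votes of the source discussion


def discard_unvoted(paragraphs: List[Paragraph]) -> List[Paragraph]:
    """Drop paragraphs that come from discussions with exactly 0 votes."""
    return [p for p in paragraphs if p.discussion_votes != 0]


# Made-up sample paragraphs.
paragraphs = [
    Paragraph("Sample paragraph from a voted discussion.", 3),
    Paragraph("Sample paragraph from an unvoted discussion.", 0),
]
kept = discard_unvoted(paragraphs)
print([p.text for p in kept])
```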

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors that better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III

Page 94: Supporting Newcomers in Open Source Software Development Projects

95

Time

Project Developers

Newcomer

t

Mentor with Adequate Skills

Recommending Mentors

96

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

97

Time

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

98

Time

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

99

Time

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

100

Time

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

101

Time

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired by the work on bug triaging by J. Anvik et al., TOSEM 2011

Recommending Mentors

106

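[Figure: bar chart, "Mentor Recommendations Precision on Top 1 and Top 2". Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Series: Top 1, Top 2. Y-axis: Precision (ticks 0 to 100). Bar labels in extraction order, decimal points restored: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

Is it Possible to Recommend Mentors to Project Newcomers?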

107

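[Figure: bar chart, "Mentor Recommendations Precision on Top 1 and Top 2". Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Series: Top 1, Top 2. Y-axis: Precision (ticks 0 to 100). Bar labels in extraction order, decimal points restored: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82.]

Is it Possible to Recommend Mentors to Project Newcomers?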

Page 96: Supporting Newcomers in Open Source Software Development Projects

97-105

Recommending Mentors

[Figure: timeline up to time t showing past project developers, past mentors, and the newcomer]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)

106-108

Is it Possible to Recommend Mentors to Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2"; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0-100); series: Top 1, Top 2. Data labels in extraction order, series assignment not recoverable: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]
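The slides report precision on the Top 1 and Top 2 recommendations but do not spell out how it is computed. The sketch below shows one plausible reading, assuming a recommendation counts as correct when at least one of the top-k suggested developers is among the newcomer's actual mentors. The function name `top_k_precision`, the dictionary layout, and the toy data are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Set


def top_k_precision(
    recommendations: Dict[str, List[str]],
    actual_mentors: Dict[str, Set[str]],
    k: int,
) -> float:
    """Share of newcomers with at least one actual mentor in their top-k list.

    Assumption: this "hit within top-k" definition is one common choice;
    the slides do not state the exact formula used.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Only newcomers with a known set of actual mentors can be evaluated.
    evaluated = [n for n in recommendations if n in actual_mentors]
    if not evaluated:
        return 0.0
    hits = sum(
        1 for n in evaluated if set(recommendations[n][:k]) & actual_mentors[n]
    )
    return hits / len(evaluated)


# Hypothetical toy data: ranked recommendations and the mentors observed later.
recommended = {
    "newcomer_a": ["dev_1", "dev_2"],
    "newcomer_b": ["dev_3", "dev_4"],
    "newcomer_c": ["dev_5", "dev_6"],
    "newcomer_d": ["dev_7", "dev_8"],
}
oracle = {
    "newcomer_a": {"dev_1"},
    "newcomer_b": {"dev_4"},
    "newcomer_c": {"dev_9"},
    "newcomer_d": {"dev_7"},
}

print(top_k_precision(recommended, oracle, k=1))  # 0.5
print(top_k_precision(recommended, oracle, k=2))  # 0.75
```

Under this definition Top 2 precision can never be lower than Top 1 precision, because the top-2 list always contains the top-1 suggestion.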

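109

Results When Both Mails and Issues Are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2"; x-axis: Apache, FreeBSD, PostgreSQL, Python, Samba; y-axis: Precision (0-100); series: Top 1, Top 2. Data labels in extraction order, series assignment not recoverable: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]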

110

It is Possible to Recommend Mentors to Project Newcomers

[Figure: the "Mentor Recommendations Precision on Top 1 and Top 2" chart from slide 109, repeated]

YODA makes it possible to recommend mentors

111-112

Use Issue, Chat, and Mail sources to recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors"; projects: Apache Lucene, Samba; axis: 0-90; series: MAIL, ISSUE, CHAT. Data labels in extraction order, series assignment not recoverable: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113-115

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116-124

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127-128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index.

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
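The five steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the slides name the steps but not their details, so the tracing rule (a class or method name mentioned in the text), the heuristic filter (a minimum word count), the similarity measure (bag-of-words cosine), and both thresholds are hypothetical stand-ins, as are all function names and the toy data.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokens(text: str) -> List[str]:
    """Lower-case word tokens, with camelCase identifiers split into parts."""
    parts: List[str] = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word))
    return [p.lower() for p in parts]


def cosine(a: str, b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two texts."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def mine_descriptions(
    messages: List[str],
    classes: Dict[str, Dict[str, str]],  # class name -> {method name: source}
    min_words: int = 8,                  # hypothetical Step 4 threshold
    min_similarity: float = 0.1,         # hypothetical Step 5 threshold
) -> List[Tuple[str, str, str]]:
    results = []
    for message in messages:
        # Step 1: trace the message onto classes whose name it mentions.
        traced = [c for c in classes if re.search(rf"\b{re.escape(c)}\b", message)]
        # Step 2: extract paragraphs (assumed to be separated by blank lines).
        paragraphs = [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]
        for cls in traced:
            for method, source in classes[cls].items():
                for paragraph in paragraphs:
                    # Step 3: trace the paragraph onto a method it appears to call.
                    if not re.search(rf"\b{re.escape(method)}\s*\(", paragraph):
                        continue
                    # Step 4: heuristic filter (assumed rule: drop short paragraphs).
                    if len(paragraph.split()) < min_words:
                        continue
                    # Step 5: keep paragraphs textually similar to the method.
                    if cosine(paragraph, source) < min_similarity:
                        continue
                    results.append((cls, method, paragraph))
    return results


# Hypothetical toy input, loosely modelled on the IndexSplitter example.
classes = {
    "IndexSplitter": {
        "split": "public void split(File destDir, String[] segs) "
                 "{ /* copy the segments into destDir */ }",
    }
}
messages = [
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) the index "
    "written into destDir has a segments file with wrong data.\n\n"
    "Thanks"
]

for cls, method, paragraph in mine_descriptions(messages, classes):
    print(f"CLASS {cls} METHOD {method}: {paragraph}")
```

Only the middle paragraph survives: the greeting and sign-off never mention the method, so they are dropped at Step 3 before any filtering is applied.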

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131-133

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

134-135

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

136-137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. Best Tool Award at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
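The vote-based discard in Step 3 can be illustrated with a minimal sketch. The `Discussion` record and its fields are hypothetical and do not mirror the StackOverflow API. The sketch also assumes that discussions with a negative score are discarded along with those at 0 votes, which the slide does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Discussion:
    """Hypothetical record for a downloaded discussion."""
    title: str
    votes: int
    paragraphs: List[str]


def paragraphs_with_votes(discussions: List[Discussion]) -> List[str]:
    """Keep paragraphs only from discussions that received votes.

    Assumption: "0 votes" is read as "no positive score", so negatively
    voted discussions are dropped as well.
    """
    kept: List[str] = []
    for discussion in discussions:
        if discussion.votes > 0:
            kept.extend(discussion.paragraphs)
    return kept


# Hypothetical toy data.
discussions = [
    Discussion("How does split work?", 3, ["split(...) copies the segments."]),
    Discussion("Unanswered question", 0, ["Not sure what split(...) does."]),
]

print(paragraphs_with_votes(discussions))
```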

139-147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining Mails and Issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

Building recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155-161

PART I

PART II

PART III




99

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

100

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

101

Timet

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Past Project Developers

Recommending Mentors

102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work and Conclusion

150

Future work…

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

151

Team Based Refactoring: Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring: Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III



100

Recommending Mentors

[Figure: timeline (Time, t) labelled "Past Project Developers"]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)

102

Recommending Mentors

[Figure: timeline (Time, t) labelled "Past Mentors" and "Newcomer"]

106

Is it Possible to Recommend Mentors to Project Newcomers?

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2". Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Axis: Precision, 0 to 100. Series: Top 1, Top 2. Data labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]
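To make the "Top 1" and "Top 2" precision figures concrete, here is a small Python sketch of one common way to score a ranked recommender. The slides do not give the exact definition used in the thesis, so this is an assumption: a recommendation counts as correct when at least one of the first k suggested mentors is an actual mentor of the newcomer. The function name and the data layout are hypothetical.

```python
def precision_at_k(recommendations, actual_mentors, k):
    """Fraction of newcomers with at least one true mentor in the top k.

    recommendations: dict mapping newcomer -> ranked list of candidate mentors
    actual_mentors:  dict mapping newcomer -> set of true mentors
    Newcomers without a known mentor are skipped (an assumption).
    """
    hits = 0
    evaluated = 0
    for newcomer, ranked in recommendations.items():
        truth = actual_mentors.get(newcomer)
        if not truth:
            continue
        evaluated += 1
        if any(candidate in truth for candidate in ranked[:k]):
            hits += 1
    return hits / evaluated if evaluated else 0.0


if __name__ == "__main__":
    # Invented toy data, not taken from the study.
    recommendations = {
        "n1": ["alice", "bob"],
        "n2": ["carol", "dave"],
        "n3": ["erin", "frank"],
    }
    actual_mentors = {"n1": {"alice"}, "n2": {"dave"}, "n3": {"zed"}}
    print(precision_at_k(recommendations, actual_mentors, 1))
    print(precision_at_k(recommendations, actual_mentors, 2))
```

Under this definition the Top 2 score can never be lower than the Top 1 score for the same project, so a chart where it is lower would point to a different definition.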

109

Results When Both Mails and Issues Are Used

[Figure: bar chart "Mentor Recommendations: Precision on Top 1 and Top 2". Projects: Apache, FreeBSD, PostgreSQL, Python, Samba. Axis: Precision, 0 to 100. Series: Top 1, Top 2. Data labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors to Project Newcomers

YODA makes it possible to recommend mentors

111

Use Issue, Chat and Mail Sources to Recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors". Projects: Apache Lucene, Samba. Series: MAIL, ISSUE, CHAT. Axis: 0 to 90. Data labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool


125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

Newcomer: can find source code description

128

Mining Summaries

Example of a source code description found in a developer message:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split
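As an illustration of how a message such as the IndexSplitter one might be linked to a class and a method, the sketch below looks for `Class.method(` mentions with a regular expression. This is only an assumed, simplified matching rule, not the tracing procedure of the thesis. The `known_methods` index and the `IndexWriter`/`commit` entry are hypothetical test data.

```python
import re


def trace_paragraph(paragraph, known_methods):
    """Return (class, method) pairs explicitly mentioned in the paragraph.

    known_methods: dict mapping a class name to the set of its method names.
    A mention is assumed to look like "ClassName.methodName(".
    """
    found = []
    for class_name, methods in known_methods.items():
        for method in sorted(methods):
            pattern = (
                r"\b" + re.escape(class_name) + r"\." + re.escape(method) + r"\s*\("
            )
            if re.search(pattern, paragraph):
                found.append((class_name, method))
    return found


if __name__ == "__main__":
    message = (
        "When call the method IndexSplitter.split(File destDir, String[] segs) "
        "from the Lucene contrib directory it creates an index with segments "
        "descriptor file with wrong data."
    )
    index = {"IndexSplitter": {"split"}, "IndexWriter": {"commit"}}
    print(trace_paragraph(message, index))
```

A real tracer would also need to handle mentions that omit the class name or the parentheses. This sketch ignores both cases.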

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
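The similarity-based filtering of Step 5 can be sketched as follows. The slides do not say which similarity measure or threshold is used, so everything here is an assumption for illustration: a paragraph is compared with the text of the method it was traced to, using cosine similarity over plain word counts, and kept when the score reaches a hypothetical threshold of 0.3.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lower-cased alphabetic tokens; a deliberately simple assumption.
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_filter(paragraphs, method_text, threshold=0.3):
    """Keep paragraphs whose similarity to the method text meets the threshold."""
    return [p for p in paragraphs if cosine_similarity(p, method_text) >= threshold]


if __name__ == "__main__":
    method_text = "split index segments destDir"
    paragraphs = ["split the index into segments", "thanks for your help"]
    print(similarity_filter(paragraphs, method_text))
```

Raising the threshold trades coverage for precision: fewer methods receive a description, but the descriptions that remain share more vocabulary with the code.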





102

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

103

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

104

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

106

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

107

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

108

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

085

03

1

064

094

081

024

1

077082

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Is it Possible to Recommend Mentors To Project Newcomers

109

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

Results When are Used Both Mails and Issues

110

Apache FreeBSD PostgreSQL Python Samba0

10

20

30

40

50

60

70

80

90

100

08

067

09 091 092

072

045

089084

076

Mentor Recommendations Precision on Top 1 and Top 2

Top 1Top 2

Prec

ision

YODA Make it Possibleto Recommend Mentors

It is Possible to Recommend Mentors To Project Newcomers

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

Example of a source code description found in a developer communication:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with segments descriptor file with wrong data. Namely, wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
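The slides name the five steps but not their internals, so the following is only a sketch of how Steps 3 and 5 might look. It assumes that a paragraph is traced to a method when it mentions the method name followed by an opening parenthesis, and that similarity is the cosine between term-frequency vectors of the paragraph and the method text. The function names, the threshold value, and both heuristics are assumptions for illustration, not the approach's actual rules.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens; camelCase identifiers are split into parts."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Assumed Step 3 rule: the method name appears followed by '('."""
    pattern = r"\b" + re.escape(method_name) + r"\s*\("
    return re.search(pattern, paragraph) is not None


def candidate_descriptions(paragraphs: List[str], method_name: str,
                           method_text: str, threshold: float = 0.1) -> List[str]:
    """Assumed Steps 3 and 5: keep paragraphs that mention the method and
    are similar enough to its text, most similar first."""
    scored = [(cosine_similarity(p, method_text), p)
              for p in paragraphs if mentions_method(p, method_name)]
    kept = [(s, p) for s, p in scored if s >= threshold]
    return [p for s, p in sorted(kept, key=lambda sp: sp[0], reverse=True)]
```

In practice the threshold would need tuning against manually validated descriptions, since it trades precision against the number of methods covered.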

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions.


136

StackOverflow


138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading Stack Overflow (SO) discussions, relying on its REST interface, and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. Best Tool Award at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
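The vote-based discard mentioned in Step 3 can be illustrated with a small sketch. It assumes that discussions have already been downloaded into dictionaries with a numeric `score` field and a plain-text `body` field. Both field names are assumptions for this example, and the function is hypothetical rather than part of CODES.

```python
import re
from typing import Dict, List


def paragraphs_from_voted_posts(posts: List[Dict], min_score: int = 1) -> List[str]:
    """Drop posts whose score is below min_score, then split the
    remaining bodies into paragraphs on blank lines."""
    paragraphs: List[str] = []
    for post in posts:
        if post.get("score", 0) < min_score:
            continue  # assumed reading of "discard discussions with 0 votes"
        for chunk in re.split(r"\n\s*\n", post.get("body", "")):
            chunk = chunk.strip()
            if chunk:
                paragraphs.append(chunk)
    return paragraphs


# Toy posts, invented for illustration only.
posts = [
    {"score": 0, "body": "Unvoted text that is discarded."},
    {"score": 3, "body": "First paragraph.\n\nSecond paragraph."},
]
voted_paragraphs = paragraphs_from_voted_posts(posts)
```

With the default `min_score` of 1, posts with negative scores are dropped as well, which goes slightly beyond the zero-vote rule stated on the slide.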

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%
2) CODES identifies relevant descriptions with a precision higher than 79%
3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

155

PART I

PART II

PART III



103

Recommending Mentors

[Figure: timeline (axis: Time, marked at time t) relating Past Mentors to a Newcomer.]

Inspired by the work on bug triaging by J. Anvik et al. (TOSEM 2011)


Page 104: Supporting Newcomers in Open Source Software Development Projects

105

Time

Past Mentors

Newcomer

t

Inspired to theWork on Bug Triaging by J Anvik et alTOSEM 2011

Recommending Mentors

Is it Possible to Recommend Mentors To Project Newcomers?

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2". X-axis: Apache, FreeBSD, PostgreSQL, Python, Samba. Y-axis: Precision (0 to 100). Series: Top 1, Top 2. Bar labels in extraction order: 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]

Results When Both Mails and Issues Are Used

[Figure: bar chart "Mentor Recommendations Precision on Top 1 and Top 2". X-axis: Apache, FreeBSD, PostgreSQL, Python, Samba. Y-axis: Precision (0 to 100). Series: Top 1, Top 2. Bar labels in extraction order: 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

It is Possible to Recommend Mentors To Project Newcomers

YODA makes it possible to recommend mentors

Use Issue, Chat and Mail sources to recommend Mentors

[Figure: bar chart "Precision in Recommending Mentors". Projects: Apache Lucene, Samba. Series: MAIL, ISSUE, CHAT. Axis: 0 to 90. Bar labels in extraction order: 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

YODA Tool

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

Effort in Program Comprehension

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

Example (CLASS: IndexSplitter, METHOD: split):

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora: Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
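The five steps above can be pictured as a small filtering pipeline. The sketch below is only illustrative and is not the authors' implementation: the function names, the keyword-based heuristic, the word-overlap (Jaccard) similarity and the 0.2 threshold are all assumptions, chosen to keep the example self-contained. Step 1 (downloading messages and tracing them onto classes) is assumed to have happened already.

```python
import re


def extract_paragraphs(message: str) -> list[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3: trace a paragraph onto a method by looking for 'name(' in it."""
    pattern = r"\b" + re.escape(method_name) + r"\s*\("
    return re.search(pattern, paragraph) is not None


def passes_heuristics(paragraph: str) -> bool:
    """Step 4 (hypothetical heuristic): keep paragraphs with a descriptive verb."""
    keywords = {"returns", "creates", "computes", "throws", "overrides"}
    words = set(re.findall(r"[a-z]+", paragraph.lower()))
    return bool(words & keywords)


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, used here as a stand-in for Step 5."""
    wa = set(re.findall(r"[a-z]+", a.lower()))
    wb = set(re.findall(r"[a-z]+", b.lower()))
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def mine_descriptions(message, method_name, method_source, threshold=0.2):
    """Run Steps 2 to 5 on one message already traced onto the method's class."""
    return [
        p
        for p in extract_paragraphs(message)
        if mentions_method(p, method_name)
        and passes_heuristics(p)
        and jaccard(p, method_source) >= threshold
    ]


# Made-up message and method source, for illustration only.
message = (
    "Thanks for the patch.\n\n"
    "The method split(File destDir, String[] segs) creates an index "
    "with the given segments in destDir.\n\n"
    "I will try split() tomorrow."
)
source = "public void split(File destDir, String[] segs) { /* copy segs to destDir index */ }"
result = mine_descriptions(message, "split", source)
print(result)
```

Only the second paragraph survives all filters: the first never mentions the method, and the third mentions it but contains none of the assumed descriptive verbs.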

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions.

StackOverflow

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora: CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
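The vote-based discard in Step 3 can be sketched as a simple pre-filter. This is a minimal illustration, not code from CODES: the dictionary keys `score` and `body`, the helper name and the sample posts are assumptions made for the example.

```python
import re


def paragraphs_from_voted_posts(posts):
    """Yield paragraphs only from posts whose vote count is not zero."""
    for post in posts:
        if post["score"] == 0:  # discard discussions with 0 votes
            continue
        for paragraph in re.split(r"\n\s*\n", post["body"]):
            if paragraph.strip():
                yield paragraph.strip()


# Made-up posts, for illustration only.
posts = [
    {"score": 0, "body": "Maybe split() rewrites the segments file?"},
    {"score": 5, "body": "split() copies the chosen segments.\n\nIt writes a new index."},
]
kept = list(paragraphs_from_voted_posts(posts))
print(kept)
```

The surviving paragraphs would then go through the heuristic-based and similarity-based filters of Steps 4 and 5.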

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%
2) CODES identifies relevant descriptions with a precision higher than 79%
3) Combining mails and issues improves the recommender's performance

Future Work and Conclusion

Future work...

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns to navigate software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora: Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

Partition Attributes

Extract Class

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

PART I

PART II

PART III

Page 107: Supporting Newcomers in Open Source Software Development Projects

108

Is it Possible to Recommend Mentors To Project Newcomers?

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2. X-axis: Apache, FreeBSD, PostgreSQL, Python, Samba. Y-axis: Precision. Series: Top 1, Top 2. Bar labels in extraction order (series/project assignment not recoverable): 0.85, 0.3, 1, 0.64, 0.94, 0.81, 0.24, 1, 0.77, 0.82]
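As a rough illustration of how a "precision on Top 1 and Top 2" figure can be computed, the sketch below uses the standard precision-at-k definition (the fraction of the top-k recommended mentors that are actual mentors, averaged over newcomers). This definition, the function names and the data layout are assumptions made for illustration; the slides do not spell out the exact formula used.

```python
def precision_at_k(recommended, actual, k):
    """Fraction of the top-k recommended mentors that are actual mentors.

    recommended: ranked list of candidate mentor names (best first).
    actual:      set of mentors the newcomer really had (ground truth).
    """
    top = recommended[:k]
    if not top:
        return 0.0
    return sum(1 for mentor in top if mentor in actual) / len(top)


def mean_precision_at_k(cases, k):
    """Average precision-at-k over (recommended, actual) pairs, one per newcomer."""
    scores = [precision_at_k(recommended, actual, k) for recommended, actual in cases]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: two newcomers, two ranked recommendations each.
cases = [
    (["alice", "bob"], {"alice"}),
    (["carol", "dave"], {"dave"}),
]
top1 = mean_precision_at_k(cases, 1)
top2 = mean_precision_at_k(cases, 2)
```

Under this definition the Top 2 value can be lower than the Top 1 value, because every second-ranked recommendation that is wrong counts against precision.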

109

Results When Both Mails and Issues Are Used

[Bar chart: Mentor Recommendations Precision on Top 1 and Top 2. X-axis: Apache, FreeBSD, PostgreSQL, Python, Samba. Y-axis: Precision. Series: Top 1, Top 2. Bar labels in extraction order (series/project assignment not recoverable): 0.8, 0.67, 0.9, 0.91, 0.92, 0.72, 0.45, 0.89, 0.84, 0.76]

110

It is Possible to Recommend Mentors To Project Newcomers

[Bar chart repeated from the previous slide: Mentor Recommendations Precision on Top 1 and Top 2, using both mails and issues]

YODA makes it possible to recommend mentors

111

Use Issue, Chat and Mail sources to recommend Mentors

[Bar chart: Precision in Recommending Mentors. Projects: Apache Lucene, Samba. Series: MAIL, ISSUE, CHAT. Axis ticks: 0 to 90. Bar labels in extraction order (series/project assignment not recoverable): 20, 20, 40, 20, 80, 60, 80, 60, 80, 80, 60, 60]

113

YODA Architecture

http://www.ing.unisannio.it/spanichella/pages/projects.html

116

YODA Tool

125

PART III - B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information for understanding source code.

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

Mining Summaries

"When calling the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with a segments descriptor file with wrong data. Namely, wrong is the number representing the name of the segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
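The five steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the blank-line paragraph splitting, the "mentions the word method" heuristic, the bag-of-words cosine similarity and the 0.2 threshold are all hypothetical choices made to keep the example small.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens, splitting camelCase identifiers first."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def cosine(a, b):
    """Cosine similarity between two token lists (bag of words)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def mine_descriptions(messages, class_name, methods, threshold=0.2):
    """Return {method_name: [paragraph, ...]} for one class.

    messages: raw message strings (emails or bug reports).
    methods:  {method_name: source_text} for the class.
    """
    found = {name: [] for name in methods}
    # Step 1: trace messages onto the class (assumed: class name is mentioned).
    traced = [m for m in messages if class_name in m]
    for message in traced:
        # Step 2: extract paragraphs (assumed: blocks separated by blank lines).
        for paragraph in re.split(r"\n\s*\n", message):
            paragraph = " ".join(paragraph.split())
            for name, source in methods.items():
                # Step 3: trace the paragraph onto a method (name followed by "(").
                if not re.search(r"\b" + re.escape(name) + r"\s*\(", paragraph):
                    continue
                # Step 4: heuristic filter (assumed rule: the word "method" appears).
                if "method" not in paragraph.lower():
                    continue
                # Step 5: similarity filter against the method's own vocabulary.
                if cosine(tokens(paragraph), tokens(source)) >= threshold:
                    found[name].append(paragraph)
    return found
```

Running the filters from cheapest to most expensive means the similarity computation in Step 5 only touches paragraphs that already passed the textual checks.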

130

Supporting Software Development

Helping newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site

• Issue tracker

• Mailing list

131

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
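The vote-based discard in Step 3 can be sketched as below. This is only an illustration on in-memory data: the dictionary keys `score` and `body`, the function name and the blank-line paragraph splitting are assumptions, and no call to the StackOverflow REST interface is shown.

```python
import re


def paragraphs_for_method(posts, method_name):
    """Paragraphs that mention method_name, taken only from voted posts.

    posts: list of dicts with a numeric "score" and a text "body"
           (hypothetical layout for already-downloaded discussions).
    """
    pattern = re.compile(r"\b" + re.escape(method_name) + r"\s*\(")
    kept = []
    for post in posts:
        # Discard discussions with 0 votes before tracing anything.
        if post["score"] == 0:
            continue
        # Split the body into paragraphs and keep those naming the method.
        for paragraph in re.split(r"\n\s*\n", post["body"]):
            if pattern.search(paragraph):
                kept.append(" ".join(paragraph.split()))
    return kept


# Hypothetical posts: the first is dropped because it has no votes.
posts = [
    {"score": 0, "body": "Use split(dir, segs) here."},
    {"score": 3, "body": "Intro.\n\nCalling split(dir, segs) writes\nthe segments file."},
]
descriptions = paragraphs_for_method(posts, "split")
```

Votes act here as a cheap community signal of quality, so unvoted text never reaches the heuristic and similarity filters of Steps 4 and 5.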

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers

• Performing a survey asking developers to validate the social links identified by analyzing different communication channels

• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

• Improve CODES by raising precision and coverage as high as possible and reducing the percentage of false positives

• Include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interactions between developers

155

PART I

PART II

PART III

Page 110: Supporting Newcomers in Open Source Software Development Projects

111

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

112

Use Issue Chat and Mail sources torecommend Mentors

Mentors

Mentors

Mentors

Mentors

Apac

he L

ucen

eSa

mba

0 10 20 30 40 50 60 70 80 90

20

20

40

20

80

60

80

60

80

80

60

60

MAIL ISSUE CHATPrecision in Recommending Mentors

113

YODA Architecture

httpwwwingunisannioitspanichellapagesprojectshtml

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127-128

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help in understanding source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

A newcomer can find a source code description such as:

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
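Steps 2 to 5 above can be sketched as a small filtering pipeline. This is only an illustration under stated assumptions, not the authors' implementation: the slides do not spell out the tracing rule, the heuristics, or the similarity measure, so the rule "paragraph names the class and contains `method(`", the parameter-mention heuristic, the cosine similarity, and both thresholds are hypothetical choices.

```python
import math
import re
from collections import Counter


def split_identifier(name):
    """Split a camelCase identifier into lowercase terms."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", name)
    return [p.lower() for p in parts]


def terms(text):
    """Tokenize free text, splitting camelCase words into their parts."""
    result = []
    for word in re.findall(r"[A-Za-z]+", text):
        result.extend(split_identifier(word))
    return result


def extract_paragraphs(message):
    """Step 2: split a message body into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def traces_to_method(paragraph, class_name, method_name):
    """Step 3 (assumed rule): paragraph names the class and contains 'method('."""
    call = re.search(r"\b" + re.escape(method_name) + r"\s*\(", paragraph)
    return class_name in paragraph and call is not None


def param_ratio(paragraph, params):
    """Step 4 (hypothetical heuristic): fraction of parameter names mentioned."""
    if not params:
        return 1.0
    return sum(1 for p in params if p in paragraph) / len(params)


def cosine(a, b):
    """Step 5 (assumed measure): cosine similarity between two bags of terms."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mine_descriptions(message, class_name, method_name, params,
                      min_param_ratio=0.5, min_similarity=0.1):
    """Run steps 2 to 5 on one message already traced to class_name (step 1)."""
    method_terms = split_identifier(class_name) + split_identifier(method_name)
    for p in params:
        method_terms.extend(split_identifier(p))
    kept = []
    for paragraph in extract_paragraphs(message):
        if not traces_to_method(paragraph, class_name, method_name):
            continue
        if param_ratio(paragraph, params) < min_param_ratio:
            continue
        if cosine(terms(paragraph), method_terms) < min_similarity:
            continue
        kept.append(paragraph)
    return kept
```

Calling `mine_descriptions(message, "IndexSplitter", "split", ["destDir", "segs"])` on a message returns only the paragraphs that pass all three filters; the thresholds are placeholders that would need tuning on real data.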

130

Help newcomers' program comprehension by extracting summaries of code elements from:

• Mailing list
• Issue tracker
• Q&A site

Supporting Software Development

131-133

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions.

136-137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions, relying on its REST interface, and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
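The vote filter that CODES adds in Step 3 can be illustrated with a short sketch. It assumes the discussions have already been downloaded into dictionaries with hypothetical `id`, `score` and `body` keys; it follows the slide literally by discarding posts with exactly 0 votes, since the slide does not say how negatively voted posts are handled.

```python
import re


def paragraphs_with_votes(posts):
    """Return (post_id, paragraph) pairs, skipping posts that have 0 votes."""
    kept = []
    for post in posts:
        if post["score"] == 0:  # slide's rule: discard discussions with 0 votes
            continue
        for paragraph in re.split(r"\n\s*\n", post["body"]):
            paragraph = paragraph.strip()
            if paragraph:
                kept.append((post["id"], paragraph))
    return kept


# Hypothetical, already-downloaded posts.
posts = [
    {"id": 1, "score": 3, "body": "Use split() on the index.\n\nIt returns segments."},
    {"id": 2, "score": 0, "body": "No idea, sorry."},
]
for post_id, paragraph in paragraphs_with_votes(posts):
    print(post_id, paragraph)
```

The surviving paragraphs would then go through the same heuristic-based and similarity-based filtering as in Steps 4 and 5.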

139-147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

150

Future work…

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

New Recommenders

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155-161

PART I

PART II

PART III

Page 113: Supporting Newcomers in Open Source Software Development Projects

114

YODA Architecture

115

YODA Architecture

116

YODA Tool

117

YODA Tool

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. "Recommending Refactorings based on Team Co-Maintenance Patterns". The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


  • Slide 1
  • Newcomer Learning Path…
  • Newcomer Learning Path… (2)
  • Newcomer Learning Path… (3)
  • Newcomer Learning Path… (4)
  • Newcomer Learning Path… (5)
  • Newcomer Learning Path… (6)
  • Newcomer Learning Path… (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future work…
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

115

YODA Architecture

YODA Tool (slides 116-124, screenshots)

125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code

In such situations developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. "Mining source code descriptions from developer communications". International Conference on Program Comprehension (IEEE ICPC 2012)
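
Steps 3 and 5 can be illustrated with a small sketch. The slides do not give the actual tracing rule, heuristics, or similarity measure, so everything below is an assumption: tracing is approximated by a whole-word match on the method name, and similarity by cosine similarity over raw term counts. The function names and the 0.1 threshold are hypothetical. Step 4 is omitted because its heuristics are not described here.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split text into lowercase word tokens, breaking camelCase identifiers."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def trace_to_method(paragraph, method_name):
    """Step 3 (assumed rule): the paragraph mentions the method name as a whole word."""
    pattern = r"\b" + re.escape(method_name) + r"\b"
    return re.search(pattern, paragraph) is not None


def similar_enough(paragraph, method_source, threshold=0.1):
    """Step 5 (assumed rule): paragraph and method text share enough vocabulary."""
    return cosine(tokens(paragraph), tokens(method_source)) >= threshold
```

A paragraph would be kept as a candidate description of a method only if `trace_to_method` and `similar_enough` both hold. Splitting camelCase identifiers lets words such as "index" and "splitter" in a discussion match an identifier like `IndexSplitter` in the code.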

130

Helping newcomers' program comprehension by extracting summaries of code elements from:

• Q&A SITE
• ISSUE TRACKER
• MAILING LIST

Newcomer

Supporting Software Development

131

Approach Precision vs. Number of Methods Covered (slides 131-133, plots)

We mine useful Java method descriptions from developers' discussions

136

StackOverflow

137

Page 117: Supporting Newcomers in Open Source Software Development Projects

118

YODA Tool

119

YODA Tool

120

YODA Tool

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information to help understand source code.

In such situations, developers need to infer knowledge from:

• the source code itself
• source code descriptions in external artifacts

Newcomer: can find source code descriptions

128

"When call the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index) it creates an index with segments descriptor file with wrong data. Namely wrong is the number representing the name of segment that would be created next in this index."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
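The five steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the cue-word list, the whole-word matching rule, the term-frequency cosine similarity, and the `threshold` value are all assumptions, since the slides name the steps but do not specify the heuristics or the similarity measure. All function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical cue words for Step 4; the slides do not list the actual heuristics.
DESCRIPTIVE_CUES = {"returns", "creates", "calls", "invokes", "overrides", "throws"}


def tokenize(text: str) -> list[str]:
    """Lower-cased word tokens, splitting camelCase identifiers into parts."""
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", text)]


def split_paragraphs(message: str) -> list[str]:
    """Step 2: split a message into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", message) if p.strip()]


def mentions_method(paragraph: str, method_name: str) -> bool:
    """Step 3: trace a paragraph onto a method if its name occurs as a whole word."""
    return re.search(rf"\b{re.escape(method_name)}\b", paragraph) is not None


def passes_heuristics(paragraph: str) -> bool:
    """Step 4 (assumed heuristic): keep paragraphs with at least one cue word."""
    return bool(set(tokenize(paragraph)) & DESCRIPTIVE_CUES)


def cosine_similarity(a: str, b: str) -> float:
    """Step 5 (assumed measure): cosine similarity of term-frequency vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def mine_descriptions(message, class_name, method_name, method_text, threshold=0.1):
    """Return the paragraphs of `message` that survive Steps 1-5 for one method."""
    if class_name not in message:  # Step 1: message must be traceable onto the class
        return []
    candidates = [p for p in split_paragraphs(message) if mentions_method(p, method_name)]
    candidates = [p for p in candidates if passes_heuristics(p)]
    return [p for p in candidates if cosine_similarity(p, method_text) >= threshold]


message = (
    "When calling IndexSplitter.split(File destDir, String[] segs) it creates an index "
    "with a segments descriptor file with wrong data.\n\n"
    "Thanks for the report, I will look at it tomorrow."
)
method_text = "public void split(File destDir, String[] segs) throws IOException"
print(mine_descriptions(message, "IndexSplitter", "split", method_text))
```

The filters are applied from cheapest to most expensive, so the similarity computation only runs on paragraphs that already mention the method and pass the keyword check.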

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

131-133

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions.

136-137

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions through its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
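The vote-based discard in Step 3 might look like the sketch below. The `Post` record and the `min_votes` parameter are hypothetical; the slide only says that paragraphs from discussions with 0 votes are discarded, so treating negatively voted posts the same way (the default `min_votes=1`) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical record for one downloaded discussion post."""
    body: str
    votes: int


def paragraphs_with_votes(posts: list[Post], min_votes: int = 1) -> list[str]:
    """Keep paragraphs only from posts that reach `min_votes` votes."""
    kept = []
    for post in posts:
        if post.votes < min_votes:
            continue  # discard every paragraph of an unvoted post
        kept.extend(p.strip() for p in post.body.split("\n\n") if p.strip())
    return kept


posts = [
    Post("split() creates a new index.\n\nIt copies the given segments.", 3),
    Post("Not sure what this method does.", 0),
]
print(paragraphs_with_votes(posts))
```

Filtering at the post level before splitting into paragraphs keeps the remaining steps unchanged from the mailing-list and issue-tracker pipeline.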

139-147

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance
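For reference, the precision figures above can be read in the standard information-retrieval sense. This reading is an assumption, since the slides do not define the measure; the set names are illustrative.

```latex
\[
\text{precision} = \frac{|\,\text{relevant} \cap \text{recommended}\,|}{|\,\text{recommended}\,|}
\]
```

Under this definition, a precision of 0.79 would mean that 79 out of every 100 recommended items were judged relevant.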

149

Future Work and Conclusion

Future work…

New Recommenders

• Build better recommenders for software re-modularization or refactoring based on social interaction between developers.
• Perform a survey asking developers to validate the social links identified by analyzing different communication channels.
• Build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks.

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors.
• Improve CODES by raising precision and coverage as high as possible and reducing the percentage of false positives. Include a better classification of discussion content using natural language parsers.

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

155-161

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 120: Supporting Newcomers in Open Source Software Development Projects

121

YODA Tool

122

YODA Tool

123

YODA Tool

124

YODA Tool

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five Step-Approach for Mining Method Descriptions

bull Step 1 Downloading emailsbugs reports and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods

bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

Sebastiano Panichella Jairo Aponte Massimiliano Di Penta Andrian Marcus Gerardo Canfora Mining source code descriptions from developer communications

International Conference on Program Comprehension (IEEE ICPC 2012)

130

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer

Supporting Software Development

QampA SITE

ISSUE TRACKER

MAILING LIST

131

Approach Precision vs Number of Method Covered

132

Approach Precision vs Number of Method Covered

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool

139httpwwwingunisannioitspanichellapagesprojectshtml

CODES Tool

140

CODES Tool

141

CODES Tool

142

CODES Tool

143

CODES Tool

144

CODES Tool

145

CODES Tool

146

CODES Tool

147

148

PART III

Summary

Recommenders

1) YODA make it possible to recommend mentors with a precision higher than 67

3) Combining Mails and Issues improve recommendersrsquo performance

2) CODES identifies relevant descriptions with a precision higher than 79

149

Future Work andConclusion

Future workhellip

Building better recommenders for software re-modularization or refactoring based on social interaction between developers

Performing a survey asking to developers to validate of the social links identified by analyzing different communication channels

We will aim at building recommenders to help newcomer in the choice of appropriate patterns to navigate software documentation during maintenance tasks

New Recommenders

Improve ExistingRecommenders

Improve the mentor recommender (YODA) by considering factorsable to better capture the technical skills of mentors

Improve CODES increasing the precision and coverage as high as possiblereducing the percentage of false positives Include a better classification of discussion

content using of natural language parsers

150

151

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

G Bavota Sebastiano Panichella N Tsantalis M Di Penta ROliveto G CanforaRecommending Refactorings based on Team Co-Maintenance Patterns

The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based RefactoringInformation derived from teams to identify refactoring opportunities

Such previous work motivate our conjecture that it is possible to suggest re-modularization (re-factoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III


  • Slide 1
  • Newcomer Learning Path…
  • Newcomer Learning Path… (2)
  • Newcomer Learning Path… (3)
  • Newcomer Learning Path… (4)
  • Newcomer Learning Path… (5)
  • Newcomer Learning Path… (6)
  • Newcomer Learning Path… (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developers' Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future work…
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161

122

YODA Tool


125

PART III – B)

Mining Source Code Descriptions from Developer Communications to Improve Newcomers' Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributors/developers are a useful source of information for understanding source code

In such situations, developers need to infer knowledge from:

• the source code itself

• source code descriptions in external artifacts

Newcomer

Can find source code descriptions

128

Mining Summaries


"When calling the method IndexSplitter.split(File destDir, String[] segs) from the Lucene contrib directory (contrib/misc/src/java/org/apache/lucene/index), it creates an index with a segments descriptor file containing wrong data. Namely, the number representing the name of the segment that would be created next in this index is wrong."

CLASS: IndexSplitter, METHOD: split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012)
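Steps 2 and 3 above can be illustrated with a small sketch. It is not the authors' implementation: it assumes paragraphs are separated by blank lines and that a paragraph is traced onto a method when it mentions the method name followed by an opening parenthesis, or qualified by its class name. The function names `extract_paragraphs` and `trace_to_methods` are hypothetical.

```python
import re


def extract_paragraphs(message: str) -> list[str]:
    """Step 2 (sketch): split a message into paragraphs on blank lines."""
    parts = re.split(r"\n\s*\n", message)
    # Normalize internal whitespace and drop empty chunks.
    return [" ".join(p.split()) for p in parts if p.strip()]


def trace_to_methods(paragraph: str, class_name: str,
                     method_names: list[str]) -> list[str]:
    """Step 3 (sketch): return the methods of class_name mentioned in the paragraph.

    Assumption: a mention is either "name(" (optionally prefixed by
    "ClassName.") or the qualified form "ClassName.name".
    """
    hits = []
    for name in method_names:
        call = rf"\b(?:{re.escape(class_name)}\.)?{re.escape(name)}\s*\("
        qualified = rf"\b{re.escape(class_name)}\.{re.escape(name)}\b"
        if re.search(call, paragraph) or re.search(qualified, paragraph):
            hits.append(name)
    return hits


message = (
    "When calling IndexSplitter.split(File destDir, String[] segs) "
    "the segments file contains wrong data.\n\n"
    "Thanks, the splitter works now."
)
paragraphs = extract_paragraphs(message)
traced = [trace_to_methods(p, "IndexSplitter", ["split"]) for p in paragraphs]
```

In this sketch only the first paragraph is linked to `split`; the second mentions "splitter" but not the method itself, so it would not survive to the filtering steps.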

130

Helping newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site

• Issue tracker

• Mailing list

Newcomer

Supporting Software Development

131

Approach Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions


136

StackOverflow

137

StackOverflow

138

CODES Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions through its REST interface and tracing them onto classes

• Step 2: Extracting paragraphs

• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)

• Step 4: Heuristic-based filtering

• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)
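The vote-based discard in Step 3 and the similarity-based filtering in Step 5 could look roughly like the sketch below. The slides do not state which similarity measure or thresholds are used, so the cosine similarity over term counts, the `min_votes` and `min_similarity` values, and all function names here are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    """Split text into lowercase terms, breaking camelCase identifiers apart."""
    out = []
    for word in re.findall(r"[A-Za-z]+", text):
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        out.extend(p.lower() for p in parts)
    return out


def cosine(a: list[str], b: list[str]) -> float:
    """Cosine similarity between two bags of terms."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def filter_paragraphs(paragraphs, method_source,
                      min_votes=1, min_similarity=0.1):
    """Keep paragraphs from voted discussions that resemble the method text.

    paragraphs: iterable of (text, votes) pairs; thresholds are assumed values.
    """
    method_terms = tokens(method_source)
    kept = []
    for text, votes in paragraphs:
        if votes < min_votes:
            continue  # discard paragraphs of discussions with 0 votes
        if cosine(tokens(text), method_terms) >= min_similarity:
            kept.append(text)
    return kept


method_source = "public void split(File destDir, String[] segs) { }"
candidates = [
    ("split copies the given segments into destDir", 3),
    ("split copies the given segments into destDir", 0),
    ("thanks a lot", 5),
]
kept = filter_paragraphs(candidates, method_source)
```

Splitting camelCase identifiers lets a paragraph that writes `destDir` share terms with the method signature; without it, natural-language text and code would rarely overlap.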

Page 124: Supporting Newcomers in Open Source Software Development Projects

125

PART III ndash B)

Mining Source Code Descriptions from Developer Communications to Improve

Newcomers Program Comprehension

126

Effort in Program Comprehension

127

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

128

Mining Summaries

We argue that messages exchanged among contributorsdevelopers are a useful source of information to help understanding source code

In such situations developers need to infer knowledge from

the source code itself source code descriptions in external artifacts

Newcomer

Can findSource code description

When call the method IndexSplittersplit(File destDir String[] segs) from the Lucene cotrib directory(contribmiscsrcjavaorgapacheluceneindex) it creates an index with segments descriptor file with wrong data Namely wrong is the number representing the name of segment that would be created next in this index

CLASS IndexSplitter METHOD split

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
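Steps 2 and 3 above can be illustrated with a minimal Python sketch. The slides do not give the exact rules, so both rules used here are assumptions made for illustration: paragraphs are taken to be blocks separated by blank lines, and a paragraph is traced onto a method when it contains the method name followed by an opening parenthesis. The function names `extract_paragraphs` and `trace_to_methods` are hypothetical and are not part of the authors' tool.

```python
import re


def extract_paragraphs(message):
    """Step 2 (assumed rule): paragraphs are blocks separated by blank lines."""
    blocks = re.split(r"\n\s*\n", message)
    # Collapse internal whitespace so each paragraph is a single line of text.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def trace_to_methods(paragraphs, method_names):
    """Step 3 (assumed rule): trace a paragraph onto a method when the
    paragraph contains the method name followed by an opening parenthesis."""
    traced = {}
    for name in method_names:
        pattern = re.compile(r"\b" + re.escape(name) + r"\s*\(")
        hits = [p for p in paragraphs if pattern.search(p)]
        if hits:
            traced[name] = hits
    return traced


# Hypothetical message, loosely modelled on the IndexSplitter example.
message = (
    "Hi all,\n\n"
    "When calling IndexSplitter.split(File destDir, String[] segs) it\n"
    "creates an index with a segments descriptor file with wrong data.\n\n"
    "Thanks"
)
traced = trace_to_methods(extract_paragraphs(message), ["split", "merge"])
```

Here only `split` receives a candidate paragraph; `merge` is never mentioned and therefore has no entry. The heuristic and similarity filters of Steps 4 and 5 would then prune these candidates.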

130

Help newcomers' program comprehension by extracting summaries of code elements from:

• Q&A site
• Issue tracker
• Mailing list

Supporting Software Development

131

Approach: Precision vs. Number of Methods Covered


We mine useful Java method descriptions from developers' discussions.

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading StackOverflow (SO) discussions through its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. Best Tool Award at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
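The vote-based discard in Step 3 and the similarity-based filtering in Step 5 can be sketched as follows. This is an illustration under stated assumptions, not the CODES implementation: similarity is assumed to be cosine similarity over term frequencies between a paragraph and the method's text, the threshold `min_similarity=0.2` is a hypothetical value, and negative vote counts are treated like zero votes. The names `tokens`, `cosine`, and `filter_paragraphs` are introduced here for illustration only.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercased alphabetic tokens; camelCase identifiers are split first."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", spaced.lower())


def cosine(a, b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    ta, tb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ta[t] * tb[t] for t in ta)
    norm = (math.sqrt(sum(v * v for v in ta.values()))
            * math.sqrt(sum(v * v for v in tb.values())))
    return dot / norm if norm else 0.0


def filter_paragraphs(candidates, method_text, min_similarity=0.2):
    """candidates is a list of (paragraph, votes) pairs.

    Paragraphs from discussions without positive votes are dropped first
    (assumption: negative scores are discarded as well), then paragraphs
    below the hypothetical similarity threshold are dropped.
    """
    voted = [(p, v) for p, v in candidates if v > 0]
    return [p for p, _ in voted if cosine(p, method_text) >= min_similarity]


# Hypothetical data for illustration.
method_text = ("void split(File destDir, String[] segs) "
               "splits index segments into destDir")
description = "The split method writes the selected index segments into destDir"
candidates = [
    (description, 3),                          # voted and similar: kept
    (description, 0),                          # zero votes: discarded
    ("Thanks a lot for the quick reply", 5),   # voted but unrelated: discarded
]
kept = filter_paragraphs(candidates, method_text)
```

Raising the threshold trades coverage (number of methods that keep at least one description) for precision, which is the trade-off the precision-versus-coverage slides refer to.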

CODES Tool

139

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III

Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers.

155

PART I

PART II

PART III


Page 128: Supporting Newcomers in Open Source Software Development Projects

129

A Five-Step Approach for Mining Method Descriptions

• Step 1: Downloading emails/bug reports and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Sebastiano Panichella, Jairo Aponte, Massimiliano Di Penta, Andrian Marcus, Gerardo Canfora. Mining source code descriptions from developer communications. International Conference on Program Comprehension (IEEE ICPC 2012).
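The five steps can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the slide does not state the concrete tracing rules, heuristics, or similarity measure, so the name-mention matching, the keyword list, the cosine similarity over term frequencies, the 0.2 threshold, and the function names (`tokens`, `cosine`, `mine_descriptions`) are all assumptions.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphabetic tokens; camelCase identifiers are split into words."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z]+", spaced)]


def cosine(a, b):
    """Cosine similarity between two token lists (term-frequency vectors)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mine_descriptions(messages, class_name, methods,
                      keywords=("return", "returns", "throws", "overrides"),
                      threshold=0.2):
    """methods maps a method name to its source text.

    Returns a dict mapping each method name to the paragraphs kept for it.
    The keyword list and threshold are hypothetical values for illustration.
    """
    found = {name: [] for name in methods}
    for message in messages:
        # Step 1: trace the message onto the class (assumed: class name is mentioned).
        if class_name not in message:
            continue
        # Step 2: extract paragraphs (assumed: blocks separated by blank lines).
        for paragraph in re.split(r"\n\s*\n", message):
            paragraph = paragraph.strip()
            words = tokens(paragraph)
            for name, source in methods.items():
                # Step 3: trace the paragraph onto a method (assumed: name is mentioned).
                if not re.search(r"\b" + re.escape(name) + r"\b", paragraph):
                    continue
                # Step 4: heuristic filter (assumed: a descriptive keyword must appear).
                if not any(k in words for k in keywords):
                    continue
                # Step 5: similarity filter against the method's source text.
                if cosine(words, tokens(source)) >= threshold:
                    found[name].append(paragraph)
    return found
```

Keeping each step as a separate check makes it easy to swap in a different heuristic or similarity measure without touching the rest of the pipeline.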

130

Supporting Software Development

Help newcomers' program comprehension by extracting summaries of code elements from:

• Mailing list
• Issue tracker
• Q&A site

131

Approach: Precision vs. Number of Methods Covered

We mine useful Java method descriptions from developers' discussions.

136

StackOverflow

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. CODES: mining source code descriptions from developers discussions. BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
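The vote-based discard mentioned in Step 3 could look something like the sketch below. The `Paragraph` type and its `post_votes` field are hypothetical names introduced for illustration; the slide only says that paragraphs from discussions with 0 votes are discarded, so the handling of negative scores here is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    text: str
    post_votes: int  # score of the originating post (hypothetical field name)


def discard_unvoted(paragraphs):
    """Drop paragraphs whose originating post has exactly 0 votes.

    Assumption: posts with a negative score are kept, because the slide
    only mentions discarding content with 0 votes.
    """
    return [p for p in paragraphs if p.post_votes != 0]
```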

139

CODES Tool

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance
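For reference, precision in the usual information-retrieval sense is the fraction of recommended items that are actually correct. The sketch below assumes that standard definition; the slides do not spell out the exact evaluation protocol, and the function name and inputs are illustrative.

```python
def precision(recommended, relevant):
    """Fraction of recommended items that also appear in the relevant set."""
    recommended = set(recommended)
    if not recommended:
        return 0.0  # convention chosen here for an empty recommendation list
    return len(recommended & set(relevant)) / len(recommended)
```

For example, if three mentors are recommended and two of them are judged correct, the precision is 2/3.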

149

Future Work and Conclusion

Future work…

New Recommenders

• Building better recommenders for software re-modularization or refactoring based on social interaction between developers
• Performing a survey asking developers to validate the social links identified by analyzing different communication channels
• Building recommenders to help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors able to better capture the technical skills of mentors
• Improve CODES by increasing precision and coverage as much as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

This previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interactions between developers.

155

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)
  • Navigation Patterns Followed By Developers Before Reaching Sour
  • Most Frequent Navigation Patterns Before Reaching Source Code
  • Most Frequent Navigation Patterns Before Reaching Source Code (2)
  • Most Frequent Navigation Patterns Before Reaching Source Code (3)
  • Transition Graph between Kinds of Software Artifacts
  • Slide 67
  • Slide 68
  • Slide 69
  • Slide 70
  • Slide 71
  • Slide 72
  • Slide 73
  • Slide 74
  • PART II Summary
  • PART III
  • Slide 77
  • Slide 78
  • Slide 79
  • Slide 80
  • Motivation
  • Motivation (2)
  • Characteristics of a Good Mentor
  • Slide 84
  • Source of Inspiration Arnetminer
  • Source of Inspiration Arnetminer (2)
  • Slide 87
  • Slide 88
  • Slide 89
  • Slide 90
  • Slide 91
  • Slide 92
  • Recommending Mentors
  • Recommending Mentors (2)
  • Recommending Mentors (3)
  • Recommending Mentors (4)
  • Recommending Mentors (5)
  • Recommending Mentors (6)
  • Recommending Mentors (7)
  • Recommending Mentors (8)
  • Recommending Mentors (9)
  • Recommending Mentors (10)
  • Recommending Mentors (11)
  • Recommending Mentors (12)
  • Recommending Mentors (13)
  • Is it Possible to Recommend Mentors To Project Newcomers
  • Is it Possible to Recommend Mentors To Project Newcomers (2)
  • Is it Possible to Recommend Mentors To Project Newcomers (3)
  • Results When are Used Both Mails and Issues
  • It is Possible to Recommend Mentors To Project Newcomers
  • Slide 111
  • Slide 112
  • Slide 113
  • Slide 114
  • Slide 115
  • YODA Tool
  • YODA Tool (2)
  • YODA Tool (3)
  • YODA Tool (4)
  • YODA Tool (5)
  • YODA Tool (6)
  • YODA Tool (7)
  • YODA Tool (8)
  • YODA Tool (9)
  • Slide 125
  • Slide 126
  • Slide 127
  • Slide 128
  • A Five Step-Approach for Mining Method Descriptions
  • Slide 130
  • Slide 131
  • Slide 132
  • Slide 133
  • Slide 134
  • Slide 135
  • StackOverflow
  • StackOverflow (2)
  • Slide 138
  • Slide 139
  • Slide 140
  • Slide 141
  • Slide 142
  • Slide 143
  • Slide 144
  • Slide 145
  • Slide 146
  • Slide 147
  • PART III Summary
  • Future Work and Conclusion
  • Future workhellip
  • Slide 151
  • Slide 152
  • Slide 153
  • Slide 154
  • Slide 155
  • Slide 156
  • Slide 157
  • Slide 158
  • Slide 159
  • Slide 160
  • Slide 161
Page 132: Supporting Newcomers in Open Source Software Development Projects

133

Approach Precision vs Number of Method Covered

We mine useful java methods description from developers discussions

134

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

135

Help Newcomer Program Comprehension with extraction of summaries of code elements from

Newcomer QampA SITE

ISSUE TRACKER

MAILING LIST

Supporting Software Development

136

StackOverflow

137

StackOverflow

138

bull Step 1 Downloading SO discussions relying on its REST interface and tracing them onto classes

bull Step 2 Extracting paragraphs

bull Step 3 Tracing paragraphs onto methods ( Discards Paragraphs of discussions with 0 Votes)bull Step 4 Heuristic based Filtering bull Step 5 Similarity based Filtering

CODESApproach for Mining Method Descriptions

Carmine Vassallo Sebastiano Panichella Massimiliano Di Penta Gerardo Canfora CODES mining source code descriptions from developers discussions BEST TOOL AWARD at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014)

CODES Tool (slides 139 to 147)

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance
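For reference, precision in the sense used for recommenders is conventionally the fraction of recommended items that are actually correct. A minimal sketch, using the hypothetical helper name `precision` and made-up example values rather than data from the thesis:

```python
def precision(recommended: list[str], relevant: set[str]) -> float:
    """Fraction of recommended items that are relevant (0.0 if nothing is recommended)."""
    if not recommended:
        return 0.0
    hits = sum(1 for item in recommended if item in relevant)
    return hits / len(recommended)


# Illustrative values only: three recommended mentors, two of them correct.
example = precision(["alice", "bob", "carol"], {"alice", "carol"})
```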

149

Future Work and Conclusion

Future work…

• Survey developers to validate the social links identified by analyzing different communication channels

New Recommenders

• Build better recommenders for software re-modularization or refactoring based on social interaction between developers
• Build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors that better capture the technical skills of mentors
• Improve CODES by raising precision and coverage as high as possible and reducing the percentage of false positives, including a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III

  • Slide 1
  • Newcomer Learning Pathhellip
  • Newcomer Learning Pathhellip (2)
  • Newcomer Learning Pathhellip (3)
  • Newcomer Learning Pathhellip (4)
  • Newcomer Learning Pathhellip (5)
  • Newcomer Learning Pathhellip (6)
  • Newcomer Learning Pathhellip (7)
  • Slide 9
  • Slide 10
  • Slide 11
  • Slide 12
  • Slide 13
  • Thesis Structure
  • Thesis Structure (2)
  • Thesis Structure (3)
  • Thesis Structure (4)
  • PART I
  • Slide 19
  • Slide 21
  • How Developersrsquo Collaborations Networks Identified from Differe
  • Example Hibernate OSS Project
  • Slide 24
  • Slide 25
  • Slide 26
  • Slide 27
  • Slide 28
  • Slide 29
  • Slide 30
  • Slide 31
  • Slide 32
  • Similarity Measure of Topics Extracted from Different Communica
  • Similarity Measure of Topics Extracted from Different Communica (2)
  • Slide 35
  • Slide 36
  • Slide 37
  • Slide 38
  • Slide 39
  • Slide 40
  • Slide 41
  • Slide 42
  • Slide 43
  • Slide 44
  • Case Study
  • Slide 46
  • Slide 47
  • Slide 48
  • Slide 49
  • Slide 50
  • Slide 51
  • PART I Summary
  • PART II
  • Slide 54
  • Slide 55
  • Slide 56
  • Experiment A Context
  • Slide 58
  • How Much Time did Participants Spend on Different Kinds of Arti
  • How Much Time did Participants Spend on Different Kinds of Arti (2)
  • How Much Time did Participants Spend on Different Kinds of Arti (3)

138

CODES: Approach for Mining Method Descriptions

• Step 1: Downloading SO discussions relying on its REST interface and tracing them onto classes
• Step 2: Extracting paragraphs
• Step 3: Tracing paragraphs onto methods (discards paragraphs of discussions with 0 votes)
• Step 4: Heuristic-based filtering
• Step 5: Similarity-based filtering

Carmine Vassallo, Sebastiano Panichella, Massimiliano Di Penta, Gerardo Canfora. "CODES: mining source code descriptions from developers discussions." Best Tool Award at the 22nd International Conference on Program Comprehension (IEEE ICPC 2014).
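The five steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the CODES implementation: the discussions are assumed to be already downloaded (no REST call is made), and the `Discussion` type, the `mine_descriptions` function, the two placeholder heuristics in Step 4, and the Jaccard measure with its threshold in Step 5 are all hypothetical choices made for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Discussion:
    text: str   # full body of one discussion (assumed already downloaded)
    votes: int  # score of the discussion


def terms(text):
    """Lowercased word set; camelCase identifiers are split into their parts."""
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z]*", text)}


def mine_descriptions(discussions, class_name, signatures,
                      min_words=5, min_similarity=0.2):
    """Return {method name: [candidate description paragraphs]}.

    `signatures` maps a method name to its signature text (hypothetical input).
    """
    found = {name: [] for name in signatures}
    for d in discussions:
        # Step 1 (tracing part): keep only discussions that mention the class.
        if class_name not in d.text:
            continue
        # Rule stated in Step 3: drop discussions with 0 votes.
        if d.votes == 0:
            continue
        # Step 2: paragraphs are assumed to be blank-line separated chunks.
        paragraphs = [p.strip() for p in re.split(r"\n\s*\n", d.text) if p.strip()]
        for p in paragraphs:
            for name, signature in signatures.items():
                # Step 3: trace the paragraph onto a method by name mention.
                if name not in p:
                    continue
                # Step 4: placeholder heuristics (too short, or a question).
                if len(p.split()) < min_words or p.endswith("?"):
                    continue
                # Step 5: Jaccard similarity between paragraph and signature terms
                # (an assumed measure; the slides do not name the one used).
                a, b = terms(p), terms(signature)
                if len(a & b) / len(a | b) >= min_similarity:
                    found[name].append(p)
    return found
```

Called with a class name such as `"SessionFactory"` and a signature map such as `{"openSession": "Session openSession(Connection connection)"}`, the function returns, per method, the paragraphs that survive all filters. The thresholds are arbitrary and would need tuning on real data.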

CODES Tool (slides 139-147)

http://www.ing.unisannio.it/spanichella/pages/projects.html

148

PART III Summary

Recommenders

1) YODA makes it possible to recommend mentors with a precision higher than 67%

2) CODES identifies relevant descriptions with a precision higher than 79%

3) Combining mails and issues improves the recommenders' performance
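As a reminder of what the precision figures above measure, here is a minimal sketch assuming the usual information-retrieval definition (the share of recommended items that are actually relevant). The function name and the sample names are hypothetical, and the slides do not show how the evaluation was implemented.

```python
def precision(recommended, relevant):
    """Fraction of recommended items that are relevant (0.0 if nothing is recommended)."""
    recommended = set(recommended)
    if not recommended:
        return 0.0
    return len(recommended & set(relevant)) / len(recommended)


# Hypothetical example: three mentors recommended, two of them judged correct.
score = precision(["alice", "bob", "carol"], ["alice", "carol", "dave"])
```

Here `score` is 2/3: two of the three recommendations appear in the relevant set.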

149

Future Work and Conclusion

Future work…

New Recommenders

• Build better recommenders for software re-modularization or refactoring based on social interaction between developers
• Survey developers to validate the social links identified by analyzing different communication channels
• Build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

• Improve the mentor recommender (YODA) by considering factors that better capture the technical skills of mentors
• Improve CODES by raising precision and coverage as high as possible and reducing the percentage of false positives; include a better classification of discussion content using natural language parsers

150

151

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. "Recommending Refactorings based on Team Co-Maintenance Patterns." The 29th International Conference on Automated Software Engineering (ASE 2014).

152

Partition Attributes

153

Extract Class

154

Team-Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interaction between developers

155

PART I

PART II

PART III







Page 148: Supporting Newcomers in Open Source Software Development Projects

149

Future Work and Conclusion

Future work…

New Recommenders

Building better recommenders for software re-modularization or refactoring based on social interactions between developers

Performing a survey asking developers to validate the social links identified by analyzing different communication channels

We aim to build recommenders that help newcomers choose appropriate patterns for navigating software documentation during maintenance tasks

Improve Existing Recommenders

Improve the mentor recommender (YODA) by considering factors that better capture the technical skills of mentors

Improve CODES by increasing precision and coverage as much as possible while reducing the percentage of false positives

Include a better classification of discussion content using natural language parsers

150

151

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

G. Bavota, Sebastiano Panichella, N. Tsantalis, M. Di Penta, R. Oliveto, G. Canfora. Recommending Refactorings based on Team Co-Maintenance Patterns. The 29th International Conference on Automated Software Engineering (ASE 2014)

152

Partition Attributes

153

Extract Class

154

Team Based Refactoring

Information derived from teams to identify refactoring opportunities

Such previous work motivates our conjecture that it is possible to suggest re-modularization (refactoring) actions relying on data about social interactions between developers

155

PART I

PART II

PART III

156

PART I

PART II

PART III

157

PART I

PART II

PART III

158

PART I

PART II

PART III

159

PART I

PART II

PART III

160

PART I

PART II

PART III

161

PART I

PART II

PART III
