Research Article
Understanding Contributor to Developer Turnover Patterns in OSS Projects: A Case Study of Apache Projects

Aftab Iqbal

INSIGHT, NUI, Galway, Ireland

Correspondence should be addressed to Aftab Iqbal; aftab.iqbal@deri.org

Received 31 August 2013; Accepted 17 November 2013; Published 19 January 2014

Academic Editors: Y. Dittrich and Y. K. Malaiya

Hindawi Publishing Corporation, ISRN Software Engineering, Volume 2014, Article ID 535724, 10 pages, http://dx.doi.org/10.1155/2014/535724

Copyright © 2014 Aftab Iqbal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

OSS projects are dynamic in nature. Developers contribute to a project for a certain period of time and later leave the project or join other projects of high interest. Hence, the OSS community always welcomes members who can attain the role of a developer in a project. In this paper, we investigate contributions made by members who have attained the role of a developer. In particular, we study the contributions made by the members in terms of bugs reported, comments on bugs, source-code patch submissions, and their social relation with other members of an OSS community. Further, we study the significance of nondevelopers' contributions and investigate if and to what extent they play a role in the long-term survival of an OSS project. Moreover, we investigate the ratio of contributions made by a member before and after attaining the role of a developer. We have outlined 4 research questions in this regard and further discuss our findings based on the research questions by taking into account data from software repositories of 4 different Apache projects.

1. Introduction and Motivation

Open source software (OSS) is a good example of global software development. It has gained a lot of attention from the public and the software engineering community over the past decade. The success of an OSS project is highly dependent on the infrastructure provided by the community to the developers and users in order to collaborate with each other [1]. It is important to understand how the OSS project and the community surrounding it evolve over time. During the project and community evolution, the roles of the members change significantly, depending on how much the member wants to get involved in the community. Unlike a project member in a software company, whose role is determined by a project manager and remains unchanged for a long period of time until the member is promoted or leaves, the role in an OSS project is not preassigned and is assumed by a member as he/she interacts with other members. An active and determined member usually becomes a “core member” through the following path: a newcomer starts as a “reader,” reading messages on the mailing lists, going through the wiki pages and other documentation, and so forth, in order to understand how the system works. Later, he starts to discover and report bugs, which does not require any technical knowledge, and becomes a “bug reporter.” After gaining a good understanding of the system and community, he may start fixing small and easy bugs which he identifies himself or which are reported by other members of the system, hence playing the role of either a “bug fixer,” a “peripheral developer,” or an “active developer.” Up to this stage, his bug fixes are usually accepted through patches submitted on the mailing lists or bug tracking system. Finally, after some important contributions are accepted by the core developers, the member may obtain the right of committing source code directly to the source control repository, hence becoming a “core member” of the project. This process is also called the “joining script” [2], also referred to as the “immigration process” [3]. The general layered structure of OSS communities as discussed above is further depicted in Figure 1, in which a role closer to the center has a larger radius of influence on the system.

The figure depicts an ideal model of role change in the OSS community. However, not all members want to be or become a “core member.” Some remain “passive users” and some stop somewhere in the middle. The key point is that OSS makes it possible for an aspiring and determined developer to be part of the “core members” group of developers through continuous contributions. On the other hand, the sustainability of an OSS project is related to the growth of the developer community. The community surrounding an OSS project must regenerate itself through the contributions of its members and the continuous emergence of new “core members”; otherwise the project is going to stop or fail. An example is the GIMP project (http://www.gimp.org) [5], which started as an academic project. When the creators left the university and decided to work on something else, the project stopped for more than a year until someone else decided to take over control and resume working on the project. Therefore, attracting or integrating new members is an important aspect of keeping the system and the community evolving over time.

Given these precedents, the research goal of the study presented in this paper is to understand the pattern of contributions made by members who eventually attained the role of a developer in an OSS project, that is, joined the “core members” group of developers. We are interested in investigating the key factors which led members towards attaining the role of a developer. We studied the immigration process in OSS projects, as done in the past by others, but using a quantitative approach based on extensive data mining. The contribution of this paper is manifold: we study the contributions made by the members in terms of bugs reported, comments on different bugs, attachments or source-code patch submissions to fix certain bugs, and social relation with other members on the mailing list in a particular OSS community/project. Further, we analyze the contributions made by members before and after attaining the role of a developer. Moreover, we compute the ratio of average contributions made by a developer (before attaining the role of a developer) and compare it with the average contributions made by other members of the project.

The rest of the paper is structured as follows: the related work comparable to our approach is discussed in Section 2. Research questions are outlined in Section 3. In Section 4, the methodology we used to extract information from different software repositories is described. Section 5 presents the results based on the research questions, and finally, in Section 6, we conclude our work.

2. Related Work

The process of joining an OSS project has been studied by many researchers in the past. In this line, the best known model which describes the organizational structure of an OSS project is the “onion model” [10] (cf. Figure 1), a visual analogy which depicts how the members of an OSS project are positioned within a community. The onion-like structure represents only a static picture of the project, lacking the time dimension which is required to study the role transformation (i.e., promotion) from being a passive user to a core member of the project. Ye et al. complemented this shortcoming with a more theoretical identification and description of roles [5]. According to this model, a core member is supposed to go through all the roles, starting as a passive user, until he/she attains the role of a core member. In this regard, Jensen and Scacchi also studied and modeled the process of role migration in OSS projects [11], focusing on end users who eventually become core members. They identified different paths for the joining process and concluded that the organizational structure of the studied OSS projects is highly dynamic in nature.

Figure 1: General structure of an OSS community based on the onion model described in [4] (layers, from the center outward: project leader, core members, active developers, peripheral developers, bug fixers, bug reporters, readers, passive users).

Von Krogh et al. studied the joining and specialization process of the FreeNet project [2]. Based on the data gathered from publicly available documents, mailing list archives, and the source control repository, they discovered that offering bug fixes is much more common among newcomers who eventually become core members of the project. They also found that a certain period of time, ranging from a couple of weeks to several months, was required before a newcomer could contribute to a technical discussion. There also exist a few research studies which have reported and quantified the onion-like structure of a community for many OSS projects. For example, Mockus et al. [12] studied the Apache httpd server and Mozilla web browser projects, and Dinh-Trong and Bieman [13] studied the FreeBSD project. According to their findings, the “core members” group is composed of a small number of members. Surrounding the “core members” group is a large group of contributors (i.e., active developers, peripheral developers, etc.) who submit bug reports, offer bug fixes, and participate heavily in discussions on the mailing lists.

In an ethnographic study, Ducheneaut studied the Python project in order to investigate the contribution of the members during their role transition from being a newcomer towards attaining the role of a core member, by taking into account data from mailing lists and the source control repository [14]. He found that prior technical commitment and good social standing in the community were strong factors in joining the core members group of developers having write access to the source control repository. Bird et al. [3] used the mailing lists and source control repository to investigate the time required for members to be invited into the “core members” group of an OSS project. They applied hazard rate analysis, or survival analysis [15], which models time-dependent phenomena such as employment duration. They used survival analysis to understand which factors influence the duration and occurrence of such events and to what degree. They modeled the duration between activities by considering the time from the first appearance of a member on the mailing list to the first commit on the source control repository. One of their findings was that prior patch submission had a strong effect on becoming part of the “core members” group of a project. Herraiz et al. [16] studied the GNOME project and found two different patterns of joining the project: (1) volunteers/contributors who follow the “onion model” and (2) firm/organization-sponsored developers who do not. They found that hired developers gain knowledge quickly enough to start writing code sooner than the volunteers.

Although these research studies were carried out in detail on different OSS projects, they considered data only from mailing lists and source control repositories. However, we also take into account bug repositories and quantify the contributions made by members in terms of the following: bugs reported, comments on bugs, social relation with other members based on comments, social relation with other members based on email exchanges on the mailing list, and patch submissions on bug repositories. In addition to that, there is no published work known to us which studies the contributions made by a member before and after attaining the role of a developer in an OSS project. Therefore, we have quantified and analyzed the average rate of contributions made by a member before and after attaining the role of a developer, which distinguishes this work from other related work done so far in this area.

3. Research Questions

As mentioned earlier, the success of an OSS project lies in its long-term survival, which is potentially due to the existence of a community surrounding the project. We are particularly interested in identifying the role of a community in the long-term survival of an OSS project as well as the key factors which promote a nondeveloper (in this paper, we use the term “nondeveloper” to refer to all those members who do not have write access to the source control repository) to the role of a developer (in this paper, we use the term “developer” to refer to all those members who have write access to the source control repository). Further, we are interested in knowing whether the potential developers (in this paper, we use the term “potential developer” to refer to all those members who started as a passive user and later attained the role of a developer) follow the onion model or whether there is a sudden integration of developers into the “core members” group of an OSS project. In order to address these key points, we have outlined a few research questions in the following, which will be addressed using data from publicly available software repositories of a few selected Apache projects.

(1) RQ-1: What is the ratio of contributions made by the developers and nondevelopers to the project over the period of time?
Previous studies [12, 17] on various OSS projects have shown that most of the source-code development is carried out by the developers of those projects. We will investigate what the contributions of nondevelopers are if the source-code development is mostly done by the developers of those projects. In particular, we will investigate the contributions of nondevelopers in terms of reporting bugs, commenting on bugs, and exchanging emails. Further, we are interested in investigating the role of nondevelopers in the long-term success and maturity of an OSS project.

(2) RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?
Attaining a higher role comes with more responsibilities and commitments to the project. We will investigate if a potential developer, after attaining the role of a developer, contributes more (apart from source-code modification or bug fixing) than he/she did as a nondeveloper. Does the contribution pattern change with the change in role of a potential developer? To be more precise, does his/her contribution to the project in terms of bug reporting and interaction with the community increase or decrease? We hypothesize that, after attaining the role of a developer, he/she will participate actively in technical discussions on the bug tracking systems or on the mailing lists and report bugs effectively.

(3) RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?
We will investigate if the average contributions made by a potential developer are more than the average contributions made by nondevelopers who were also active during his/her time period. It has been reported in a previous study [3] that demonstration of technical commitment and social status with other members positively influences attaining the role of a developer. We will investigate if a potential developer was more active (i.e., technically skilled and of higher social status) than nondevelopers before attaining the role of a developer.

(4) RQ-4: Does a potential developer follow the onion model in order to attain the role of a developer?
We will investigate if a potential developer follows the onion model in order to attain the role of a developer, that is, to join the “core members” group of the project. Not every member who is contributing to an OSS project eventually becomes a developer. It depends on the level of involvement of a member in an OSS project and also on the need to promote a nondeveloper to the role of a developer. There is no static or standard timeline for a member to join the “core members” group of a project. The time period required to attain the role of a developer varies from project to project and also from member to member. Members often start contributing to a project by participating in the mailing list conversations to get themselves familiar with the project before contributing source-code patches to the project. We will study the appearance of a potential developer on different software repositories by comparing the time-stamp value of their first activity on these software repositories in order to validate if he/she actually followed the onion model.

Table 1: Apache projects data range.

Apache projects       Date range
Apache Ant [6]        2000–2010
Apache Lucene [7]     2001–2010
Apache Maven [8]      2003–2010
Apache Solr [9]       2006–2010
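The timestamp comparison described for RQ-4 can be illustrated with a short sketch. It is not the script used in the study: the repository labels, the assumed layer ordering in `ONION_ORDER`, and the function name `follows_onion_model` are hypothetical choices made for this example.

```python
from datetime import date

# Assumed ordering of first appearances under the onion model:
# mailing list (reader) -> bug tracker (bug reporter)
# -> patch submission (bug fixer) -> commit (developer).
ONION_ORDER = ["mailing_list", "bug_tracker", "patch", "commit"]


def follows_onion_model(first_activity):
    """Return True if the first-activity dates are non-decreasing along
    ONION_ORDER. A repository with no recorded activity is treated as a
    skipped layer, so the member is not counted as following the model."""
    dates = [first_activity.get(repo) for repo in ONION_ORDER]
    if any(d is None for d in dates):
        return False
    return all(earlier <= later for earlier, later in zip(dates, dates[1:]))


# Hypothetical members: one moves through every layer in order,
# the other commits before ever appearing on the bug tracker.
gradual = {
    "mailing_list": date(2004, 1, 10),
    "bug_tracker": date(2004, 3, 2),
    "patch": date(2004, 6, 15),
    "commit": date(2005, 1, 20),
}
sudden = {
    "mailing_list": date(2006, 5, 1),
    "bug_tracker": date(2006, 9, 1),
    "patch": date(2006, 9, 3),
    "commit": date(2006, 5, 2),
}
print(follows_onion_model(gradual), follows_onion_model(sudden))
```

Treating a missing layer as a violation is one possible reading; a more lenient variant could ignore repositories on which the member never appeared.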

4. Data Extraction Process

In this section, we describe our data extraction methodology and the Apache projects selected for evaluation. We extracted data from 4 different Apache projects, as shown in Table 1. The range of data selected for each project is different because of the difference in the starting date of each project. The reason for choosing these Apache projects is that the repositories of these projects (i.e., mailing list archives, bugs, subversion logs, etc.) are on the Web and available to be downloaded.

Most Apache projects have at least 3 different mailing lists (user, dev, and commits), but some have more than 3 mailing lists (e.g., announcements, notifications, etc.). For our study, we downloaded only the dev mailing list archives of each Apache project under consideration. The reason is that software developers communicate with each other more often on the dev mailing list than on any other mailing list. We developed our own scripts, which were used to extract information from mailing list archives in a similar manner to previous research [12, 18]. For example, each email was processed to extract information such as sender name, email address, subject, date, message-id, and reference. The reference field contains message-id(s) if the email is a reply to previous thread(s). We used the reference field information to build a social network correspondence and computed social network measures [19] of all the members of a project.
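As a rough illustration of the email processing described here, the following sketch parses raw messages with Python's standard `email` package and links each reply to the senders of the messages it references. It is only an assumption-laden sketch, not the authors' scripts: the function names `parse_email` and `reply_graph`, the sample messages, and the choice of a weighted directed edge list are all hypothetical.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import parseaddr


def parse_email(raw):
    """Extract the fields named in the text from one raw RFC 822 message."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg.get("From", ""))
    return {
        "name": name,
        "address": address.lower(),
        "subject": msg.get("Subject", ""),
        "date": msg.get("Date", ""),
        "message_id": (msg.get("Message-ID") or "").strip(),
        # References holds the message-ids of earlier mails in the thread.
        "references": (msg.get("References") or "").split(),
    }


def reply_graph(records):
    """Count directed edges (replier -> original sender) via References."""
    sender_of = {r["message_id"]: r["address"] for r in records}
    edges = defaultdict(int)
    for r in records:
        for ref in r["references"]:
            target = sender_of.get(ref)
            if target and target != r["address"]:
                edges[(r["address"], target)] += 1
    return dict(edges)


# Two hypothetical messages: Bob replies to Alice.
RAW_1 = (
    "From: Alice <alice@example.org>\n"
    "Subject: Build fails\n"
    "Date: Mon, 01 Jan 2007 10:00:00 +0000\n"
    "Message-ID: <m1@example.org>\n"
    "\n"
    "The build fails on my machine.\n"
)
RAW_2 = (
    "From: Bob <bob@example.org>\n"
    "Subject: Re: Build fails\n"
    "Date: Mon, 01 Jan 2007 11:00:00 +0000\n"
    "Message-ID: <m2@example.org>\n"
    "References: <m1@example.org>\n"
    "\n"
    "Which JDK are you using?\n"
)
records = [parse_email(RAW_1), parse_email(RAW_2)]
graph = reply_graph(records)
print(graph)
```

The resulting edge list could then be handed to a graph library to compute social network measures; which measures the study used is described in [19] and is not reproduced here.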

We retrieved all the bugs (related to the Apache projects we considered for our study) which are publicly available through the Bugzilla and JIRA Web interface (https://issues.apache.org) and extracted the required information using our custom-written scripts. For further details on the information extracted from each email and bug, we refer the reader to [20]. We computed the social relation correspondence among members on the bug tracking system based on the bug comments exchanged among themselves. The findings of Bird et al. [21] indicated the detection and acceptance of source-code patches through the mailing list, but we discovered that source-code patches were always attached to the respective bugs on the bug tracking system rather than sent to the mailing list. Prior research has indicated the importance of offering bug fixes and their acceptance as an influential factor in gaining developer status [14]; therefore, we have also analyzed how many source-code patches were submitted by the members on bug repositories.

Table 2: Dataset overview.

                  Attachments   Bugs   Commits   Emails
Apache Ant        1345          5480   6025      84737
Apache Lucene     2865          3116   5790      59616
Apache Maven      1169          3902   8815      87611
Apache Solr       2146          2528   4288      25173

In order to get information from the source control repository, we wrote our own script (see [22] for details) and extracted the necessary information (i.e., log number, date of commit, author id, and files committed). We only considered those subversion logs where a source-code file (i.e., “*.java”, because the Apache projects under consideration are Java-based) was committed. These subversion logs were further analyzed by our script in order to identify whether a commit fixes any bug, by looking for patterns such as “PRxxx,” “MNG-xxx,” “SOLR-xxx,” and “LUCENE-xxx” and patch acceptance acknowledgements such as “patch provided by xxx” and “patch submitted by xxx.” On the identification of such patterns, the bugs were queried to retrieve the source-code patches associated with those bugs. This helps to identify source-code patches that are accepted by the “core members” group of the project. Further, it allows us to identify members who possess the strong technical skills required for attaining the role of a developer in the project. Table 2 gives an overview of the raw data we extracted from the different software repositories of the selected Apache projects based on the methodology described.
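The pattern matching on commit messages might look roughly like the following sketch. The regular expressions are assumptions derived from the example patterns quoted in the text (the optional colon and whitespace after “PR” in particular are a guess), and `scan_commit_message` is a hypothetical helper name rather than part of the script referenced in [22].

```python
import re

# Bug references: a Bugzilla-style "PR" number or a JIRA key for the
# projects named in the text. Optional ":" and spaces after PR are assumed.
BUG_ID = re.compile(
    r"\b(?:PR:?\s*(\d+)|((?:MNG|SOLR|LUCENE)-\d+))", re.IGNORECASE
)
# Patch acknowledgements such as "patch provided by <name>".
PATCH_ACK = re.compile(
    r"patch\s+(?:provided|submitted)\s+by\s+([^\n.,;]+)", re.IGNORECASE
)


def scan_commit_message(message):
    """Return (bug ids mentioned, acknowledged patch contributor or None)."""
    bug_ids = []
    for pr_number, jira_key in BUG_ID.findall(message):
        bug_ids.append(f"PR {pr_number}" if pr_number else jira_key.upper())
    match = PATCH_ACK.search(message)
    contributor = match.group(1).strip() if match else None
    return bug_ids, contributor


# Hypothetical commit messages.
jira_result = scan_commit_message(
    "LUCENE-1234: fix NPE in IndexWriter. Patch provided by Jane Doe"
)
bugzilla_result = scan_commit_message("PR: 4567 handle empty fileset")
print(jira_result, bugzilla_result)
```

Each extracted bug id could then be used to look up the attachments recorded for that bug on the tracker.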

The values for Apache Ant show that 1345 source-code patches were found for a total of 5480 bugs reported on the bug tracking system. Further, 6025 subversion logs in which source-code files (i.e., *.java) were committed were extracted from the source control repository, and 84737 emails were extracted from the Apache Ant mailing list archives between 2000 and 2010.

5. Empirical Analysis

Before we address each of the research questions in detail, we present a high-level overview of the development activity of each Apache project under consideration over the period of time in Figure 2. This gives an insight into how many contributions were made each year to a project and the peak development years of a project. For each Apache project under consideration, we computed the number of contributions with respect to the number of people who made those contributions. For example, we computed the number of distinct bugs reported each year along with the number of distinct reporters who submitted those bug reports (cf. Figure 2). This makes it easier to answer simple questions such as how many bugs were reported and how many members were involved in the bug reporting process during the 2nd year of a project.

Figure 2: Development activity of Apache projects over the period of time (four panels: Apache Ant, Apache Lucene, Apache Maven, and Apache Solr; x-axis: year of the project; y-axis: counts on a logarithmic scale from 1 to 100000; series: bugs, reporters, comments, commentors, attachments, attachers, emails, people).

Preliminaries: let $\mathcal{C}$ be the set of all members (i.e., developers, nondevelopers, etc.) who worked on the project:

\[ \mathcal{C} = \{c_1, c_2, \ldots, c_n\}. \tag{1} \]

Let $Y$ be the set of years of the project under consideration:

\[ Y = \{y_1, y_2, \ldots, y_n\}, \tag{2} \]

where $y_1$ is considered to be the first year of the project, $y_2$ is considered to be the second year of the project, and so on. Let $C$ be the set of members (i.e., developers, nondevelopers, etc.) who were active in a time period $y$:

\[ C = \{c_1, c_2, \ldots, c_n\}, \quad C \subseteq \mathcal{C}, \tag{3} \]

and let $\mathrm{Immig}$ be the set of immigrants (i.e., potential developers) who started as contributors and later became developers of the project. We classified only those members as immigrants/potential developers who had an activity in the project (i.e., number of bugs reported, number of bugs commented, number of patches submitted, or number of emails sent) at least 4 months prior to their first commit on the source control repository:

\[ \mathrm{Immig} = \{\mathrm{Immig}_1, \mathrm{Immig}_2, \ldots, \mathrm{Immig}_n\}, \quad \mathrm{Immig} \subseteq D_y \subseteq \mathcal{C}. \tag{4} \]
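The 4-month rule for classifying immigrants can be sketched as follows. The text does not state how a month was measured, so the 120-day threshold in `MIN_LEAD` is an assumption, and the names `is_immigrant` and `immigrants` as well as the sample members are hypothetical.

```python
from datetime import date, timedelta

# "At least 4 months" approximated as 120 days (an assumption; the exact
# day count used in the study is not given).
MIN_LEAD = timedelta(days=120)


def is_immigrant(first_activity, first_commit):
    """True if the member's first recorded activity (bug report, comment,
    patch, or email) precedes the first commit by at least MIN_LEAD."""
    if first_activity is None or first_commit is None:
        return False
    return first_commit - first_activity >= MIN_LEAD


def immigrants(first_activity, first_commit):
    """Select the developers who qualify as immigrants."""
    return {
        member
        for member, committed in first_commit.items()
        if is_immigrant(first_activity.get(member), committed)
    }


# Hypothetical members: dana was active for months before committing,
# eli appeared only one month before the first commit.
first_activity = {"dana": date(2005, 1, 10), "eli": date(2005, 8, 1)}
first_commit = {"dana": date(2005, 9, 1), "eli": date(2005, 9, 1)}
print(immigrants(first_activity, first_commit))
```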

Let $D_y$ be the set of developers who have made commits before and during time period $y$, such that $D_y \subseteq \mathcal{C}$. Let the total number of bugs reported, bug comments made, and emails sent by a set of members $C$ in a time period $y$ be represented as follows:

\[
\begin{gathered}
\mathrm{Contribution}_{\mathrm{bugs}}(C, y), \\
\mathrm{Contribution}_{\mathrm{comments}}(C, y), \\
\mathrm{Contribution}_{\mathrm{emails}}(C, y),
\end{gathered}
\tag{5}
\]

whereas the number of bugs reported, bug comments made, and emails sent by the developers in a given period of time $y$ is represented as

\[
\begin{gathered}
\mathrm{Contribution}_{\mathrm{bugs}}(D_y, y), \\
\mathrm{Contribution}_{\mathrm{comments}}(D_y, y), \\
\mathrm{Contribution}_{\mathrm{emails}}(D_y, y),
\end{gathered}
\tag{6}
\]

respectively. Let $d$ be a single developer and let $\mathrm{commitDate}(d)$ return the first commit date of a developer. The yearly average contribution of a member before and after attaining the role of a developer is represented as

\[
\begin{gathered}
\mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d)), \\
\mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d)),
\end{gathered}
\tag{7}
\]

and the total number of bugs reported, bug comments made, and emails sent by an immigrant before becoming a developer is represented as

\[
\begin{gathered}
\mathrm{Contribution}_{\mathrm{bugs}}(\mathrm{Immig}, \mathrm{commitDate}(\mathrm{Immig})), \\
\mathrm{Contribution}_{\mathrm{comments}}(\mathrm{Immig}, \mathrm{commitDate}(\mathrm{Immig})), \\
\mathrm{Contribution}_{\mathrm{emails}}(\mathrm{Immig}, \mathrm{commitDate}(\mathrm{Immig})).
\end{gathered}
\tag{8}
\]
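One way to read the yearly averages in (7) is as an event rate over the time a member was observed before and after the first commit. The sketch below follows that reading; the interval boundaries (first recorded activity up to the commit date, and commit date up to the end of the observation window), the 365.25-day year, and the names `yearly_rate` and `before_after` are all assumptions, not the paper's definition.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed year length for rate normalization


def yearly_rate(events, start, end):
    """Events per year within the half-open interval [start, end)."""
    span_years = (end - start).days / DAYS_PER_YEAR
    if span_years <= 0:
        return 0.0
    count = sum(1 for event in events if start <= event < end)
    return count / span_years


def before_after(events, commit_date, observation_end):
    """Yearly contribution rate before and after the first commit."""
    first_seen = min(events) if events else commit_date
    before = yearly_rate(events, first_seen, commit_date)
    after = yearly_rate(events, commit_date, observation_end)
    return before, after


# Hypothetical bug-report dates for one member who first committed
# on 2006-01-01; data are observed until 2008-01-01.
events = [
    date(2004, 1, 1), date(2004, 6, 1), date(2005, 3, 1), date(2005, 11, 1),
    date(2006, 2, 1), date(2006, 5, 1), date(2006, 9, 1),
    date(2007, 1, 15), date(2007, 6, 1), date(2007, 12, 1),
]
before, after = before_after(events, date(2006, 1, 1), date(2008, 1, 1))
print(round(before, 1), round(after, 1))
```

The same function can be applied separately to bug reports, bug comments, and emails to obtain one before/after pair per contribution type.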

RQ-1: What is the ratio of contributions made by the developers and nondevelopers to the project over the period of time?

In order to compute the contributions, we need to distinguish between the developers and nondevelopers of the project. As each subversion log has a time stamp associated with it, we queried all subversion logs from the start date of the project until the last commit date of the year under consideration. Based on this, we get a list of the IDs of all developers who have contributed to the source control repository up to that particular year. For each developer ID, we computed the contributions (i.e., bugs reported, comments on bugs, emails, etc.) made to the project on a yearly basis and added up the contributions made by all the developers for each year. Similarly, we computed the contributions made by nondevelopers on a yearly basis and added up all their contributions for each year. Later, we plotted the contributions made by the developers and nondevelopers for each year in the form of a chart, which is shown in Figure 3. Figure 3 shows the comparison of contributions made by the developers and nondevelopers of each Apache project under consideration. Further, we computed the average rate of contributions made by the developers and nondevelopers, as well as the average number of developers and nondevelopers who made those contributions per year, which is shown in Table 3.

Table 3: Average rate of contributions made by developers and nondevelopers.

                      Contributions           Participants
Variable              Dev        Non-dev      Dev      Non-dev
Apache Ant
  bugs reported       29.70      509.30       6.30     377.40
  bug comments        681.50     416.01       11.30    248.90
  emails              4773.91    5628.00      14.54    284.91
Apache Lucene
  bugs reported       141.11     156.77       8.66     91.33
  bug comments        507.11     241.33       10.77    95.33
  emails              3547.00    5745.66      13.66    173.22
Apache Maven
  bugs reported       204.87     268.25       11.00    165.00
  bug comments        569.28     334.28       13.85    195.14
  emails              4328.37    9505.75      13.37    260.87
Apache Solr
  bugs reported       201.10     265.60       10.20    116.00
  bug comments        819.03     432.80       11.00    166.60
  emails              1808.25    6047.75      9.00     112.50

For example, the number of bugs reported by the nondevelopers in a given period of time $y$ is computed as follows:

\[ \mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y), \tag{9} \]

and the average number of bugs reported by the developers and nondevelopers is computed as follows:

\[
\frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(D_y, y)}{|y|}, \qquad
\frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y)}{|y|}.
\tag{10}
\]
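The yearly split and the averages in (9) and (10) can be sketched as follows. The sketch assumes year granularity (a member counts as a developer for the whole year of the first commit), reads the denominator as the number of years considered, and uses hypothetical names (`yearly_split`, `yearly_average`) and made-up data.

```python
from collections import Counter


def yearly_split(events, first_commit_year, years):
    """Count contributions per year, split into developers (members whose
    first commit is in or before that year) and nondevelopers."""
    dev, non_dev = Counter(), Counter()
    for member, year in events:
        if year not in years:
            continue
        commit_year = first_commit_year.get(member)
        if commit_year is not None and commit_year <= year:
            dev[year] += 1
        else:
            non_dev[year] += 1
    return dev, non_dev


def yearly_average(counts, years):
    """Sum of the yearly counts divided by the number of years."""
    return sum(counts[y] for y in years) / len(years)


# Hypothetical bug reports as (reporter, year); alice gains commit
# access in 2002, bob and carol never do.
years = [2001, 2002]
first_commit_year = {"alice": 2002}
bug_reports = [
    ("alice", 2001), ("bob", 2001),
    ("alice", 2002), ("carol", 2002), ("bob", 2002),
]
dev, non_dev = yearly_split(bug_reports, first_commit_year, years)
print(yearly_average(dev, years), yearly_average(non_dev, years))
```

Note that alice's 2001 report is counted as a nondeveloper contribution, because the developer set for a year only contains members who had committed by then.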

Figure 3: Contributions made by developers and nondevelopers over the period of time (four panels: Apache Ant, Apache Lucene, Apache Maven, and Apache Solr; x-axis: year of the project; y-axis: counts on a logarithmic scale; series: developer and nondeveloper bugs reported, bug comments, and emails).

Let us assume that the set of nondevelopers who were active in a certain period of time is given by $\mathrm{nonDev}(\mathcal{C} \setminus D_y, y)$; the average participation of developers and nondevelopers is then computed as follows:

\[
\frac{\sum_{y \in Y} D_y}{|y|}, \qquad
\frac{\sum_{y \in Y} \mathrm{nonDev}(\mathcal{C} \setminus D_y, y)}{|y|}.
\tag{11}
\]

The results in Table 3 show that nondevelopers are highly involved (i.e., contributing more than the developers) in reporting bugs and participating in discussions on the mailing list. One potential reason for this is the existence of a huge community surrounding these Apache projects. Discussing or commenting on a bug report requires technical knowledge about the project, which is why developers appear to be more active in commenting on the bug reports than nondevelopers. Nevertheless, it is clear from Table 3 (see also Figure 3) that nondevelopers play a significant role in the projects under consideration, and hence their involvement is one of the major factors in the long-term survival, success, and maturity of these projects over the period of time.

The high ratio of nondevelopers' involvement in the project (cf. Figure 3 and Table 3) allows the core members to select or vote for the potential developers to be invited to the “core members” group of the project.

RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?

We are only interested in those developers who did not start contributing directly to the project but instead followed the onion model (cf. Figure 1). In order to select those developers, we retrieved all developers from the subversion logs. For each developer, we then compared his/her first commit date on the project with his/her first appearance on any of the project repositories (i.e., first bug reporting date, bug comment date, attachment date, or email date) in order to compute the number of days or months before he/she started to contribute as a developer. Although there is no fixed or standard timeline for attaining the role of a developer in the project, we considered only those developers who had an activity (bug report, bug comment, attachment, or email) on the project at least 4 months prior to their first commit on the source control repository of the project.
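The selection rule described above can be sketched as a simple date filter. The snippet below is an illustrative assumption rather than the study's code: the dictionaries, the function name, and the approximation of "4 months" as 120 days are all hypothetical choices.

```python
from datetime import date

# Assumption: "4 months" is approximated as 120 days for this sketch.
MIN_LEAD_DAYS = 4 * 30

def select_potential_developers(first_commit, first_activity,
                                min_lead_days=MIN_LEAD_DAYS):
    """Keep developers whose first non-commit activity precedes their
    first commit by at least `min_lead_days`.

    first_commit:   {developer id: date of first commit}
    first_activity: {developer id: date of first bug report, comment,
                     attachment, or email}
    Returns {developer id: lead time in days}.
    """
    selected = {}
    for dev, commit_date in first_commit.items():
        activity_date = first_activity.get(dev)
        if activity_date is None:
            continue  # no activity before committing: joined directly
        lead = (commit_date - activity_date).days
        if lead >= min_lead_days:
            selected[dev] = lead
    return selected

# Hypothetical toy data.
first_commit = {
    "dev_a": date(2005, 6, 1),
    "dev_b": date(2005, 6, 1),
    "dev_c": date(2006, 1, 10),
}
first_activity = {
    "dev_a": date(2004, 6, 1),   # one year of prior activity
    "dev_b": date(2005, 5, 1),   # only one month of prior activity
}

potential = select_potential_developers(first_commit, first_activity)
print(potential)
```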

For each of the selected developers, we queried the contributions made to the project before and after his/her first commit date. As the time period required to attain the role of a developer differs from developer to developer, we computed the average yearly rate of contributions made by a developer before and after attaining the role of a developer. We do not show each individual's contribution to the project due to privacy issues; instead, we summarize the aggregated results of each project in Table 4. All the variables (except n) used in our study represent the contribution of potential developers on a yearly basis. For each Apache project, n represents the number of potential developers who have attained the role of a developer. The average yearly rate of contributions by a potential developer before and after attaining the role of a developer is calculated as follows:

$$\frac{\sum_{d \in \mathit{Immig}} \mathrm{Contribution}_{\mathrm{before}}(d,\; \mathrm{commitDate}(d))}{|d|}, \qquad \frac{\sum_{d \in \mathit{Immig}} \mathrm{Contribution}_{\mathrm{after}}(d,\; \mathrm{commitDate}(d))}{|d|} \tag{12}$$
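As a rough illustration of (12), the sketch below splits each member's activity at the first commit date, normalizes both halves to a yearly rate, and averages over the members. The function names, the 365-day year, and the toy dates are assumptions for this example only.

```python
from datetime import date

def yearly_rate(count, start, end):
    """Contributions per year over the interval from start to end."""
    days = (end - start).days
    return count / (days / 365.0) if days > 0 else 0.0

def before_after_rates(activity_dates, first_seen, commit_date, last_seen):
    """Split one member's activity at the first commit date and return
    (yearly rate before, yearly rate after)."""
    before = sum(1 for d in activity_dates if d < commit_date)
    after = len(activity_dates) - before
    return (yearly_rate(before, first_seen, commit_date),
            yearly_rate(after, commit_date, last_seen))

# Hypothetical toy data: two members, each observed for one year before
# and one year after the first commit.
first_seen, commit, last_seen = date(2005, 1, 1), date(2006, 1, 1), date(2007, 1, 1)
members = {
    "m1": [date(2005, 3, 1)] * 3 + [date(2006, 3, 1)] * 6,
    "m2": [date(2005, 5, 1)] * 5 + [date(2006, 5, 1)] * 2,
}

rates = {m: before_after_rates(dates, first_seen, commit, last_seen)
         for m, dates in members.items()}
mean_before = sum(r[0] for r in rates.values()) / len(rates)
mean_after = sum(r[1] for r in rates.values()) / len(rates)
print(rates, mean_before, mean_after)
```

Normalizing to a yearly rate is what makes members with different waiting times comparable, which is the reason given in the text for reporting yearly averages.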

Based on Table 4, we find that the bug reporting pattern does not change much before and after attaining the role of a developer in the Apache Maven and Apache Solr projects. However, in Apache Ant it decreased tremendously after attaining the role of a developer. As shown in Figure 3, only a few bugs are reported by the developers in contrast to nondevelopers in the Apache Ant project, which is also reflected by the value of the bugs reported variable for the Apache Ant project. After joining the “core members” group, members participate more often in technical discussions on the bug tracking system, which is reflected by the value of the bug comments variable. However, an increase in the participation in technical discussions did not increase the social relation of the developers on the bug tracking system (i.e., bug social relation) in the case of the Apache Ant and Apache Maven projects. One reason could be that, after attaining the role of a developer, they focused only on certain modules of a project and hence were involved only in discussions on bugs relevant to those modules with other developers of the project. There is also a tremendous increase in the number of emails sent by the members after attaining the role of a developer, which eventually increases the value of the email social relation variable.

Table 4: Yearly average contribution ratio of a potential developer before and after attaining the role of a developer.

Variable                   Mean               St. Dev.
                           Before    After    Before    After
Apache Ant (n = 13)
  bugs reported            1723      235      1798      365
  bug comments             4144      3902     3766      3483
  bug social relation      4465      3157     5141      2334
  emails                   23049     28034    259       22452
  email social relation    305       2031     3035      1355
Apache Lucene (n = 22)
  bugs reported            1412      2373     912       3274
  bug comments             2484      7321     1818      9185
  bug social relation      3357      3467     2375      3002
  emails                   130       39871    12487     50802
  email social relation    1959      2169     1653      1532
Apache Maven (n = 21)
  bugs reported            2840      2782     5327      7949
  bug comments             1642      2786     1207      5868
  bug social relation      2716      2408     2143      3085
  emails                   15841     18555    21827     17489
  email social relation    1833      2334     1831      1573
Apache Solr (n = 13)
  bugs reported            2477      2375     3038      1750
  bug comments             4472      8477     4135      8625
  bug social relation      6804      8046     5218      5239
  emails                   15601     31382    22757     33319
  email social relation    1472      3847     785       3304

Based on the Apache projects under consideration, we found that members, after attaining the role of a developer, tend to participate actively in technical discussions either on the mailing list or on the bug tracking system, which also increases their social relation networks, except in the case of the Apache Ant project. The bug reporting behavior of these members varies across the studied Apache projects, and hence it is difficult to say whether they report more bugs after attaining the role of a developer.

RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?

For each potential developer, we took as the first time-stamp value the date on which he/she first appeared on the project and as the second time-stamp value the date on which he/she made the first commit to the source control repository of the project. We extracted the contributions (i.e., bugs reported, comments, emails, etc.) made by a potential developer between those time-stamp values. Using the same time-stamp values, we computed the contributions made by other members who were also active during that specific time period. We then divided the contributions of a potential developer by the average contributions of all other members in order to determine the average rate of contributions made by a potential developer compared to other members of the project. We do not show each individual's contribution rate due to privacy issues; instead, we summarize the aggregated results of each project in Table 5. For example, the average rate of bugs reported by an immigrant compared to other members who were active during the same time period is calculated as follows:

Table 5: Average contribution rate of a potential developer compared to other members of the project before attaining the role of a developer.

Variable                   Mean     St. Dev.
Apache Ant (n = 13)
  bugs reported            1115     1325
  bug comments             1043     835
  bug social relation      842      665
  emails                   1989     1625
  email social relation    1097     842
Apache Lucene (n = 22)
  bugs reported            453      309
  bug comments             408      289
  bug social relation      318      173
  emails                   391      327
  email social relation    623      442
Apache Maven (n = 21)
  bugs reported            405      368
  bug comments             236      201
  bug social relation      244      198
  emails                   405      576
  email social relation    431      334
Apache Solr (n = 13)
  bugs reported            456      577
  bug comments             44       472
  bug social relation      252      215
  emails                   105      107
  email social relation    271      195

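$$\sum_{\mathit{immig} \in \mathit{Immig}} \frac{\mathrm{Contribution}_{\mathrm{bugs}}(\mathit{immig},\; \mathrm{commitDate}(\mathit{immig}))}{\sum_{c \in C} \mathrm{Contribution}_{\mathrm{bugs}}(c,\; \mathrm{commitDate}(\mathit{immig})) \,/\, |c|} \tag{13}$$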
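The ratio inside (13) can be illustrated with a few lines of Python: one potential developer's count is divided by the mean count of the other members active in the same window, and the per-member ratios are then summarized by their mean and standard deviation, as reported in Table 5. The function name and all numbers below are invented for illustration.

```python
from statistics import mean, stdev

def relative_rate(candidate_count, other_counts):
    """Ratio of one potential developer's contribution count to the mean
    count of the other members active in the same time window.
    Returns None when there is nothing to compare against."""
    if not other_counts:
        return None
    mean_others = sum(other_counts) / len(other_counts)
    if mean_others == 0:
        return None
    return candidate_count / mean_others

# Hypothetical toy data: (candidate's bug reports, bug reports of each
# other member active between first appearance and first commit).
windows = [
    (12, [2, 4, 3]),   # others average 3.0
    (4, [1, 3]),       # others average 2.0
]

ratios = [relative_rate(own, others) for own, others in windows]
summary_mean = mean(ratios)
summary_stdev = stdev(ratios)
print(ratios, summary_mean, summary_stdev)
```

A ratio above 1 means the potential developer contributed more than the average of the members active at the same time.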

The results in Table 5 can be understood as follows: the average rate of reporting bugs by a potential developer of the Apache Lucene project is 453 times the average rate of reporting bugs by all other members who were active during that time period. Although the average rate of contributions made by potential developers varies across the projects under consideration, each variable value shows that the contributions made by potential developers exceed the average contributions of all other members. Hence, we can say that they were the most active contributors (i.e., technically skilled and of higher social status) before attaining the developer status in the project.

Table 6: Appearance of a potential developer on different software repositories prior to attaining the role of a developer.

Apache project    Patch submission    Bugs reported    Emails
                  (no. of days)       (no. of days)    (no. of days)
Apache Ant        544.15              553.84           706.53
Apache Lucene     457.41              526.32           706.22
Apache Maven      385.00              396.31           709.71
Apache Solr       269.30              237.92           452.46

RQ-4: Does a potential developer follow the onion model in order to attain the role of a developer?

For each potential developer, we computed the time, in days, between his/her first commit date and his/her first activity on each of the different software repositories. Table 6 presents the appearance of a potential developer, in terms of the average number of days, on different software repositories prior to attaining the role of a developer. The results indicate that, on average, potential developers started from the mailing list (cf. Table 6), because the email activity is the oldest for all Apache projects under consideration, followed by bug reporting/commenting; the latest activity before attaining the role of a developer was source-code patch submission (i.e., bug fixing). The results shown in Table 6 closely match the onion model (see Figure 1), where a member starts as a reader, then reports bugs, and later fixes bugs before attaining the role of a developer.

Let ActivityDate(immig, ml) return the number of days between the first commit date of an immigrant (i.e., potential developer) on the source control repository and his/her first activity date on the mailing list of a project. The average number of days for an immigrant to appear on a mailing list prior to his/her first commit date is calculated as follows:

$$\frac{\sum_{\mathit{immig} \in \mathit{Immig}} \mathrm{ActivityDate}(\mathit{immig},\; \mathit{ml})}{|\mathit{Immig}|} \tag{14}$$
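The average lead time in (14), and the comparison across repositories that underlies the onion-model check, can be sketched as follows. This is a hedged illustration: the repository labels, member ids, and dates are hypothetical, and only two repositories are shown.

```python
from datetime import date

def average_lead_days(first_commit, first_activity):
    """Average number of days between each member's first activity on one
    repository and that member's first commit."""
    leads = [(first_commit[m] - d).days
             for m, d in first_activity.items() if m in first_commit]
    return sum(leads) / len(leads)

# Hypothetical toy data: first commit date per member, and first activity
# date per member on each repository.
first_commit = {"m1": date(2006, 1, 1), "m2": date(2006, 1, 1)}
first_activity = {
    "mailing list": {"m1": date(2005, 1, 1), "m2": date(2005, 12, 2)},
    "bug tracker": {"m1": date(2005, 12, 2), "m2": date(2005, 12, 22)},
}

averages = {repo: average_lead_days(first_commit, acts)
            for repo, acts in first_activity.items()}
# Repositories ordered from earliest to latest average first appearance.
order = sorted(averages, key=averages.get, reverse=True)
print(averages, order)
```

A larger average means an earlier first appearance, so an ordering that starts with the mailing list and ends with patch submission would be consistent with the onion model.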

The results (Table 6) also show that it took almost 2 years for a potential developer of the Apache Ant, Apache Lucene, and Apache Maven projects to attain the role of a developer. However, we cannot say that this is the standard time, as the time varies considerably from project to project, as can be seen in the results of the Apache Solr project compared to the other Apache projects under consideration.

6. Conclusion

In this paper, we have investigated in detail the patterns of contributions made by those members who have attained the role of a developer in the project. First, we investigated the role played by nondevelopers in the long-term survival of an OSS project and observed that nondevelopers, who do not have write-access to the source control repository, participate actively in reporting bugs and in email discussions, thus contributing to the maturity of an OSS project. Our investigation of the contributions of potential developers before and after attaining the role of a developer showed that, after attaining a higher position in the community, developers tend to contribute more efficiently than nondevelopers of the project by actively participating in technical discussions along with fixing bugs. Moreover, we observed that the members who attained the role of a developer made more contributions than the average number of contributions made by other members of the project who were active during the same time period. This suggests that one of the important factors in attaining the role of a developer is the demonstration of technical skills and commitment to the project in an efficient manner.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] B. Shibuya and T. Tamai, “Understanding the process of participating in open source communities,” in Proceedings of the ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development (FLOSS ’09), pp. 1–6, IEEE Computer Society, Washington, DC, USA, May 2009.

[2] G. Von Krogh, S. Spaeth, and K. R. Lakhani, “Community, joining, and specialization in open source software innovation: a case study,” Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.

[3] C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu, “Open borders? Immigration in open source projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[4] M. Antikainen, T. Aaltonen, and J. Vaisanen, “The role of trust in OSS communities - case Linux Kernel community,” IFIP International Federation for Information Processing, vol. 234, pp. 223–228, 2007.

[5] Y. Ye, K. Nakakoji, Y. Yamamoto, and K. Kishida, “The co-evolution of systems and communities in Free and Open Source Software Development,” in Free/Open Source Software Development, pp. 59–82, Idea Group, Hershey, Pa, USA, 2004.

[6] http://ant.apache.org/.

[7] http://lucene.apache.org/.

[8] http://maven.apache.org/.

[9] http://lucene.apache.org/solr/.

[10] K. Crowston and J. Howison, “The social structure of free and open source software development,” in Proceedings of the International Conference on Information Systems, Seattle, Wash, USA, 2003.

[11] C. Jensen and W. Scacchi, “Modelling recruitment and role migration process in oosd projects,” in Proceedings of the 6th International Workshop on Software Process Simulation and Modeling, St. Louis, Mo, USA, 2005.

[12] A. Mockus, R. T. Fielding, and J. D. Herbsleb, “Two case studies of open source software development: apache and mozilla,” ACM Transactions on Software Engineering and Methodology, vol. 11, no. 3, pp. 309–346, 2002.

[13] T. Dinh-Trong and J. M. Bieman, “Open source software development: a case study of freeBSD,” in Proceedings of the 10th International Symposium on Software Metrics (METRICS ’04), pp. 96–105, Washington, DC, USA, September 2004.

[14] N. Ducheneaut, “Socialization in an open source software community: a socio-technical analysis,” Computer Supported Cooperative Work, vol. 14, no. 4, pp. 323–368, 2005.

[15] D. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman and Hall, 1984.

[16] I. Herraiz, G. Robles, J. J. Amor, T. Romera, and J. M. G. Barahona, “The processes of joining in global distributed software projects,” in Proceedings of the International Workshop on Global Software Development for the Practitioner (GSD ’06), pp. 27–33, 2006.

[17] W. Scacchi, J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani, “Understanding free/open source software development processes,” Software Process Improvement and Practice, vol. 11, no. 2, pp. 95–105, 2006.

[18] M. Fischer, M. Pinzger, and H. Gall, “Populating a release history database from version control and bug tracking systems,” in Proceedings of the International Conference on Software Maintenance (ICSM ’03), pp. 23–32, IEEE Computer Society, Washington, DC, USA, September 2003.

[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.

[20] A. Iqbal and M. Hausenblas, “Interlinking developer identities within and across open source projects: the linked data approach,” ISRN Software Engineering, vol. 2013, Article ID 584731, 12 pages, 2013.

[21] C. Bird, A. Gourley, and P. Devanbu, “Detecting patch submission and acceptance in OSS projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[22] A. Iqbal, O. Ureche, M. Hausenblas, and G. Tummarello, “LD2SD: linked data driven software development,” in Proceedings of the 21st International Conference on Software Engineering and Knowledge Engineering (SEKE ’09), pp. 240–245, Boston, Mass, USA, July 2009.


some stop somewhere in the middle. The key point is that OSS makes it possible for an aspiring and determined developer to be part of the “core members” group of developers through continuous contributions. On the other hand, the sustainability of an OSS project is related to the growth of the developer community. The community surrounding an OSS project must regenerate itself through the contributions of its members and the continuous emergence of new “core members”; otherwise, the project is going to stop or fail. An example is the GIMP project (http://www.gimp.org/) [5], which started as an academic project. When the creators left the university and decided to work on something else, the project stopped for more than a year until someone else decided to take over control and resume working on the project. Therefore, attracting or integrating new members is an important aspect of keeping the system and the community evolving over time.

Given these precedents, the research goal of the study presented in this paper is to understand the pattern of contributions made by members who eventually attained the role of a developer in an OSS project, that is, joined the “core members” group of developers. We are interested in investigating the key factors which led members towards attaining the role of a developer. We studied the immigration process in OSS projects, as done in the past by others, but using a quantitative approach based on extensive data mining. The contribution of this paper is manifold: we study the contributions made by the members in terms of bugs reported, comments on different bugs, attachments or source-code patch submissions to fix certain bugs, and social relation with other members on the mailing list in a particular OSS community/project. Further, we analyze the contributions made by members before and after attaining the role of a developer. Moreover, we compute the ratio of average contributions made by a developer (before attaining the role of a developer) and compare it with the average contributions made by other members of the project.

The rest of the paper is structured as follows: the related work comparable to our approach is discussed in Section 2. Research questions are outlined in Section 3. In Section 4, the methodology we used to extract information from different software repositories is described. Section 5 presents the results based on the research questions, and finally, in Section 6, we conclude our work.

2. Related Work

The process of joining an OSS project has been studied by many researchers in the past. In this line, the best known model which describes the organizational structure of an OSS project is the “onion model” [10] (cf. Figure 1), a visual analogy which depicts how the members of an OSS project are positioned within a community. The onion-like structure represents only a static picture of the project, lacking the time dimension which is required to study the role transformation (i.e., promotion) from being a passive user to a core member of the project. Ye et al. complemented this shortcoming with a more theoretical identification and description of roles [5]. According to this model, a core member is supposed to go through all the roles, starting as a passive user, until he/she attains the role of a core member. In this regard, Jensen and Scacchi also studied and modeled the process of role migration in OSS projects [11], focusing on end users who eventually become core members. They identified different paths for the joining process and concluded that the organizational structure of the studied OSS projects is highly dynamic in nature.

[Figure 1: General structure of an OSS community based on the onion model described in [4]. Layers, from the center outward: project leader, core members, active developers, peripheral developers, bug fixers, bug reporters, readers, passive users.]

Von Krogh et al. studied the joining and specialization process of the FreeNet project [2]. Based on the data gathered from publicly available documents, mailing list archives, and the source control repository, they discovered that offering bug fixes is much more common among newcomers who eventually become core members of the project. They also found that a certain period of time, ranging from a couple of weeks to several months, was required before a newcomer could contribute to a technical discussion. There also exist a few research studies which have reported and quantified the onion-like structure of a community for many OSS projects. For example, Mockus et al. [12] studied the Apache httpd server and Mozilla web browser projects, and Dinh-Trong and Bieman [13] studied the FreeBSD project. According to their findings, the “core members” group is composed of a small number of members. Surrounding the “core members” group is a large group of contributors (i.e., active developers, peripheral developers, etc.) who submit bug reports, offer bug fixes, and participate heavily in discussions on the mailing lists.

In an ethnographic study, Ducheneaut studied the Python project in order to investigate the contribution of the members during their role transition from being a newcomer towards attaining the role of a core member, by taking into account data from mailing lists and the source control repository [14]. He found that prior technical commitment and good social standing in the community were strong factors in joining the core members group of developers having write-access to the source control repository. Bird et al. [3] used the mailing lists and source control repository to investigate the time required for members to be invited into the “core members” group of an OSS project. They applied hazard rate analysis, or survival analysis [15], which is used to model time-dependent phenomena such as employment duration, in order to understand which factors influence the duration and occurrence of such events and to what degree. They modeled the duration between activities by considering the time from the first appearance of a member on the mailing list to the first commit on the source control repository. One of their findings was that prior patch submission had a strong effect on becoming part of the “core members” group of a project. Herraiz et al. [16] studied the GNOME project and found two different patterns of joining the project: (1) volunteers/contributors who follow the “onion model” and (2) firm/organization sponsored developers who do not. They found that hired developers gain enough knowledge to start writing code more quickly than the volunteers.

Although these research studies were carried out in detail on different OSS projects, they considered data only from mailing lists and source control repositories. In contrast, we also take into account bug repositories and quantify the contributions made by members in terms of the following: bugs reported, comments on bugs, social relation with other members based on comments, social relation with other members based on email exchanges on the mailing list, and patch submissions on bug repositories. In addition, there is no published work known to us which studies the contributions made by a member before and after attaining the role of a developer in an OSS project. Therefore, we have quantified and analyzed the average rate of contributions made by a member before and after attaining the role of a developer, which distinguishes this work from other related work done so far in this area.

3. Research Questions

As mentioned earlier, the success of an OSS project lies in its long-term survival, which is potentially due to the existence of a community surrounding the project. We are particularly interested in identifying the role of a community in the long-term survival of an OSS project, as well as the key factors which promote a nondeveloper (in this paper, we use the term “nondeveloper” to refer to all those members who do not have write-access to the source control repository) to the role of a developer (in this paper, we use the term “developer” to refer to all those members who have write-access to the source control repository). Further, we are interested in knowing whether the potential developers (in this paper, we use the term “potential developer” to refer to all those members who started as a passive user and later attained the role of a developer) follow the onion model or whether there is a sudden integration of developers into the “core members” group of an OSS project. In order to address these key points, we outline a few research questions in the following, which will be addressed using data from publicly available software repositories of a few selected Apache projects.

(1) RQ-1: What is the ratio of contributions made by the developers and nondevelopers to the project over the period of time?
Previous studies [12, 17] on various OSS projects have shown that most of the source-code development is carried out by the developers of those projects. We will investigate what the contributions of nondevelopers are if the source-code development is mostly done by the developers of those projects. In particular, we will investigate the contributions of nondevelopers in terms of reporting bugs, commenting on bugs, and exchanging emails. Further, we are interested in investigating the role of nondevelopers in the long-term success and maturity of an OSS project.

(2) RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?
Attaining a higher role comes with more responsibilities and commitments to the project. We will investigate whether a potential developer, after attaining the role of a developer, contributes more (apart from source-code modification or bug fixing) than he/she did as a nondeveloper. Does the contribution pattern change with the change in role of a potential developer? To be more precise, does his/her contribution to the project in terms of bug reporting and interaction with the community increase or decrease? We hypothesize that, after attaining the role of a developer, he/she will participate actively in technical discussions on the bug tracking systems or on the mailing lists and report bugs effectively.

(3) RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?
We will investigate whether the average contributions made by a potential developer are greater than the average contributions made by nondevelopers who were also active during the same time period. Previous studies [3] have reported that a demonstration of technical commitment and social status with other members positively influences attaining the role of a developer. We will investigate whether a potential developer was more active (i.e., technically skilled and of higher social status) than nondevelopers before attaining the role of a developer.

(4) RQ-4: Does a potential developer follow the onion model in order to attain the role of a developer?
We will investigate whether a potential developer follows the onion model in order to attain the role of a developer, that is, to join the “core members” group of the project. Not every member who contributes to an OSS project eventually becomes a developer. It depends on the level of involvement of a member in an OSS project and also on the need to promote a nondeveloper to the role of a developer. There is no static or standard timeline for a member to join the “core members” group of a project. The time period required to attain the role of a developer varies from project to project and also from member to member. Members often start contributing to a project by participating in the mailing list conversations to familiarize themselves with the project before contributing source-code patches to the project. We will study the appearance of a potential developer on different software repositories by comparing the time-stamp values of his/her first activity on these software repositories in order to validate whether he/she actually followed the onion model.

Table 1: Apache projects data range.

Apache project       Date range
Apache Ant [6]       2000–2010
Apache Lucene [7]    2001–2010
Apache Maven [8]     2003–2010
Apache Solr [9]      2006–2010

4. Data Extraction Process

In this section, we describe our data extraction methodology and the Apache projects selected for evaluation. We extracted data from 4 different Apache projects, as shown in Table 1. The range of data selected for each project is different because of the difference in the starting date of each project. We chose these Apache projects because their repositories (i.e., mailing list archives, bugs, subversion logs, etc.) are on the Web and available for download.

Most Apache projects have at least 3 different mailing lists: user, dev, and commits, but some have more than 3 mailing lists (e.g., announcements, notifications, etc.). For our study, we downloaded only the dev mailing list archives of each Apache project under consideration, because software developers communicate with each other more often on the dev mailing list than on any other mailing list. We developed our own scripts to extract information from mailing list archives in a manner similar to previous research [12, 18]. For example, each email was processed to extract the sender name, email address, subject, date, message-id, and reference. The reference field contains message-id(s) if the email is a reply to previous thread(s). We used the reference field information to build a social network correspondence and computed social network measures [19] for all the members of a project.
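The email processing described above can be sketched as follows. This is a minimal illustration, not the scripts used in the study: the sample messages, the function name `build_reply_network`, and the choice of out-degree as the social network measure are all assumptions made for the example.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample messages standing in for a dev mailing list archive.
RAW_EMAILS = [
    "From: Alice <alice@example.org>\nMessage-ID: <m1@example.org>\n"
    "Subject: Build fails\n\nBody",
    "From: Bob <bob@example.org>\nMessage-ID: <m2@example.org>\n"
    "References: <m1@example.org>\nSubject: Re: Build fails\n\nBody",
    "From: Carol <carol@example.org>\nMessage-ID: <m3@example.org>\n"
    "References: <m1@example.org> <m2@example.org>\nSubject: Re: Build fails\n\nBody",
]


def build_reply_network(raw_emails):
    """Return {sender: set of addresses whose messages the sender replied to}."""
    messages = [message_from_string(raw) for raw in raw_emails]
    # Map each message-id to the address of the member who sent it.
    sender_of = {m["Message-ID"].strip(): parseaddr(m["From"])[1] for m in messages}
    network = defaultdict(set)
    for m in messages:
        sender = parseaddr(m["From"])[1]
        # The References header lists the message-ids of the thread being answered.
        for ref in (m["References"] or "").split():
            target = sender_of.get(ref)
            if target and target != sender:
                network[sender].add(target)
    return dict(network)


network = build_reply_network(RAW_EMAILS)
# One simple social network measure: the number of distinct people replied to.
degree = {person: len(peers) for person, peers in network.items()}
```

Here a member's social relation is approximated by the number of distinct members whose messages he/she answered; other measures from [19] could be computed over the same graph.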

We retrieved all the bugs (related to the Apache projects we considered for our study) which are publicly available through the Bugzilla and JIRA Web interface (https://issues.apache.org/) and extracted the required information using our custom-written scripts. For further details on the information extracted from each email and bug, we refer the readers to [20]. We computed the social relation correspondence among members on the bug tracking system based on the bug

Table 2: Dataset overview.

                 Attachments   Bugs   Commits   Emails
Apache Ant              1345   5480      6025    84737
Apache Lucene           2865   3116      5790    59616
Apache Maven            1169   3902      8815    87611
Apache Solr             2146   2528      4288    25173

comments exchanged among themselves. The findings of Bird et al. [21] indicated the detection and acceptance of source-code patches through the mailing list, but we discovered that source-code patches were always attached to the respective bugs on the bug tracking system rather than sent to the mailing list. Prior research has indicated the importance of offering bug fixes, and of their acceptance, as an influential factor in gaining developer status [14]; therefore, we have also analyzed how many source-code patches were submitted by the members on bug repositories.

In order to get information from the source control repository, we wrote our own script (see [22] for details) and extracted the necessary information (i.e., log number, date of commit, author id, and files committed). We only considered those subversion logs where a source-code file (i.e., “*.java”, because the Apache projects under consideration are Java-based) was committed. These subversion logs were further analyzed by our script to identify whether a commit fixes a bug, by looking for patterns such as “PRxxx”, “MNG-xxx”, “SOLR-xxx”, and “LUCENE-xxx”, and for patch acceptance acknowledgements such as “patch provided by xxx” and “patch submitted by xxx”. On the identification of such patterns, the bugs were queried to retrieve the source-code patches associated with those bugs. This helps to identify source-code patches that were accepted by the “core members” group of the project. Further, it allows us to identify members who possess the strong technical skills required for attaining the role of a developer in the project. Table 2 gives an overview of the raw data we extracted from the different software repositories of the selected Apache projects based on the methodology described.
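The pattern matching on commit logs can be illustrated with a short sketch. The regular expressions below are assumptions: the text only names the patterns loosely, so the exact separators accepted after “PR” and the characters that end a contributor name are guesses made for the example, and `scan_commit_message` is a hypothetical helper.

```python
import re

# Assumed forms: "MNG-123", "SOLR-123", "LUCENE-123", or "PR" followed by a number.
BUG_ID = re.compile(r"\b(?:(MNG|SOLR|LUCENE)-(\d+)|PR[:#\s]*(\d+))", re.IGNORECASE)
# Assumed form of a patch acknowledgement; the name ends at punctuation or a newline.
PATCH_ACK = re.compile(r"patch (?:provided|submitted) by\s+([^\n.;,]+)", re.IGNORECASE)


def scan_commit_message(message):
    """Return (bug ids, acknowledged patch contributors) found in a commit log."""
    bug_ids = []
    for match in BUG_ID.finditer(message):
        if match.group(1):
            bug_ids.append(f"{match.group(1).upper()}-{match.group(2)}")
        else:
            bug_ids.append(f"PR-{match.group(3)}")
    contributors = [m.group(1).strip() for m in PATCH_ACK.finditer(message)]
    return bug_ids, contributors


ids, names = scan_commit_message(
    "SOLR-123: fix NPE in handler. Patch provided by Jane Doe."
)
```

The bug ids found this way would then be used to look up the attachments on the corresponding bug reports.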

The values for Apache Ant show that 1345 source-code patches were found for a total of 5480 bugs reported on the bug tracking system. Further, 6025 subversion logs in which source-code files (i.e., *.java) were committed were extracted from the source control repository, and 84737 emails were extracted from the Apache Ant mailing list archives between 2000 and 2010.

5. Empirical Analysis

Before we address each of the research questions in detail, we present in Figure 2 a high-level overview of the development activity of each Apache project under consideration over the period of time. This gives an insight into how many contributions were made to a project each year and into the peak development years of a project. For each Apache project under consideration, we computed the number of contributions with respect to the number of people who made those contributions. For example, we computed the number

[Figure 2 panels: Apache Ant, Apache Lucene, Apache Maven, Apache Solr. x-axis: year of the project (1st year onward); y-axis: counts, with ticks from 1 to 100000 in powers of ten. Series: bugs, reporters, comments, commentors, attachments, attachers, emails, people.]

Figure 2: Development activity of Apache projects over the period of time.

of distinct bugs reported each year along with the number of distinct reporters who submitted those bug reports (cf. Figure 2). This makes it easier to answer simple questions such as how many bugs were reported and how many members were involved in the bug reporting process during the 2nd year of a project.

Preliminaries. Let $\mathcal{C}$ be the set of all members (i.e., developers, nondevelopers, etc.) who worked on the project:

$$\mathcal{C} = \{c_1, c_2, \ldots, c_n\}. \tag{1}$$

Let $Y$ be the set of years of a project under consideration:

$$Y = \{y_1, y_2, \ldots, y_n\}, \tag{2}$$

where $y_1$ is considered to be the first year of the project, $y_2$ is considered to be the second year of the project, and so on. Let $C$ be the set of members (i.e., developers, nondevelopers, etc.) who were active in a time period $y$:

$$C = \{c_1, c_2, \ldots, c_n\}, \quad C \subseteq \mathcal{C}, \tag{3}$$

and let Immig be the set of immigrants (i.e., potential developers) who started as contributors and later became developers of the project. We classified as immigrants/potential developers only those members who had an activity in the project (i.e., number of bugs reported, number of bugs commented, number of patches submitted, or number of emails sent) at least 4 months prior to their first commit on the source control repository:

$$\mathrm{Immig} = \{\mathrm{Immig}_1, \mathrm{Immig}_2, \ldots, \mathrm{Immig}_n\}, \quad \mathrm{Immig} \subseteq D_y \subseteq \mathcal{C}. \tag{4}$$
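The selection rule for immigrants can be written down in a few lines. The sketch below is only an illustration under stated assumptions: “4 months” is approximated as 120 days, and the member names and dates are made up.

```python
from datetime import date, timedelta

# Assumption: "at least 4 months" is approximated as 120 days.
MIN_LEAD = timedelta(days=120)


def is_immigrant(first_activity, first_commit, min_lead=MIN_LEAD):
    """True if the member was active at least `min_lead` before the first commit."""
    return first_commit - first_activity >= min_lead


# Hypothetical developers: (first activity on any repository, first commit).
members = {
    "dev_a": (date(2004, 1, 10), date(2005, 3, 1)),  # active long before committing
    "dev_b": (date(2006, 5, 1), date(2006, 6, 15)),  # committed almost immediately
}
immig = {name for name, (seen, commit) in members.items() if is_immigrant(seen, commit)}
```

Developers such as `dev_b`, who commit shortly after their first appearance, are excluded from the set of immigrants.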

Let $D_y$ be the set of developers who have made commits before and during time period $y$, such that $D_y \subseteq \mathcal{C}$. Let the total number of bugs reported, bugs commented on, and emails sent by a set of members in a time period $y$ be represented as follows:

$$\begin{gathered}
\mathrm{Contribution}_{\mathrm{bugs}}(C, y) \\
\mathrm{Contribution}_{\mathrm{comments}}(C, y) \\
\mathrm{Contribution}_{\mathrm{emails}}(C, y)
\end{gathered} \tag{5}$$

whereas the number of bugs reported, bugs commented on, and emails sent by the developers in a given period of time $y$ is represented as

$$\begin{gathered}
\mathrm{Contribution}_{\mathrm{bugs}}(D_y, y) \\
\mathrm{Contribution}_{\mathrm{comments}}(D_y, y) \\
\mathrm{Contribution}_{\mathrm{emails}}(D_y, y)
\end{gathered} \tag{6}$$

respectively. Let $d$ be a single developer and let $\mathrm{commitDate}(d)$ return the first commit date of a developer. The yearly average contribution of a member before and after attaining the role of a developer is represented as

$$\begin{gathered}
\mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d)) \\
\mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d))
\end{gathered} \tag{7}$$

and the total number of bugs reported, bugs commented on, and emails sent by an immigrant before becoming a developer is represented as

$$\begin{gathered}
\mathrm{Contribution}_{\mathrm{bugs}}(\mathrm{Immig}_i, \mathrm{commitDate}(\mathrm{Immig}_i)) \\
\mathrm{Contribution}_{\mathrm{comments}}(\mathrm{Immig}_i, \mathrm{commitDate}(\mathrm{Immig}_i)) \\
\mathrm{Contribution}_{\mathrm{emails}}(\mathrm{Immig}_i, \mathrm{commitDate}(\mathrm{Immig}_i))
\end{gathered} \tag{8}$$

RQ-1: What is the ratio of contributions made by the developers and nondevelopers to the project over the period of time?

In order to compute the contributions, we need to distinguish between the developers and nondevelopers of the project. As each subversion log has a time stamp associated with it, we queried all subversion logs from the start date of the project until the last commit date of the year under consideration. Based on this, we get a list of the IDs of all developers who have contributed to the source control repository

Table 3: Average rate of contributions made by developers and nondevelopers.

                        Contributions           Participants
Variable                Dev        Non dev      Dev       Non dev
Apache Ant
  bugs reported         2970       50930        630       37740
  bug comments          68150      41601        1130      24890
  emails                477391     562800       1454      28491
Apache Lucene
  bugs reported         14111      15677        866       9133
  bug comments          50711      24133        1077      9533
  emails                354700     574566       1366      17322
Apache Maven
  bugs reported         20487      26825        1100      16500
  bug comments          56928      33428        1385      19514
  emails                432837     950575       1337      26087
Apache Solr
  bugs reported         20110      26560        1020      11600
  bug comments          81903      43280        1100      16660
  emails                180825     604775       900       11250

till that particular year. For each developer ID, we computed the contributions (i.e., bugs reported, comments on bugs, emails, etc.) made to the project on a yearly basis and added up the contributions made by all the developers for each year. Similarly, we computed the contributions made by nondevelopers on a yearly basis and added up all their contributions for each year. We then plotted the contributions made by the developers and nondevelopers of each Apache project under consideration for each year, as shown in Figure 3. Further, we computed the average rate of contributions made by the developers and nondevelopers, as well as the average number of developers and nondevelopers who made those contributions per year, as shown in Table 3. For example, the number of bugs reported by the nondevelopers in a given period of time $y$ is computed as follows:

$$\mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y), \tag{9}$$

and the average number of bugs reported by the developers and nondevelopers is computed as follows:

$$\frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(D_y, y)}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y)}{|y|}. \tag{10}$$
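As a rough illustration of (9) and (10), the sketch below splits yearly bug reports between developers and nondevelopers and averages them over the years. The data and names are hypothetical, and the divisor is taken to be the number of years considered, which is our reading of the denominator in (10).

```python
from collections import defaultdict

# Hypothetical bug reports as (project year, reporter) pairs.
bug_reports = [
    (1, "ann"), (1, "bob"), (1, "bob"),
    (2, "ann"), (2, "cat"), (2, "cat"), (2, "dan"),
]
# Project year of each member's first commit; members never listed never committed.
first_commit_year = {"ann": 1, "cat": 3}


def yearly_split(events, first_commit_year):
    """Count events per year made by developers (D_y) and by all other members."""
    dev, nondev = defaultdict(int), defaultdict(int)
    for year, member in events:
        # A member belongs to D_y if the first commit happened in or before `year`.
        is_dev = first_commit_year.get(member, float("inf")) <= year
        (dev if is_dev else nondev)[year] += 1
    return dev, nondev


def yearly_average(counts, years):
    """Average the yearly counts over the years under consideration."""
    return sum(counts[y] for y in years) / len(years)


years = [1, 2]
dev, nondev = yearly_split(bug_reports, first_commit_year)
avg_dev = yearly_average(dev, years)
avg_nondev = yearly_average(nondev, years)
```

Note that `cat` is counted as a nondeveloper in year 2 because her first commit only happens in year 3, which mirrors the definition of $D_y$.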

Let us assume that the number of nondevelopers who were active in a certain period of time is calculated by $\mathrm{nonDev}(C \setminus D_y, y)$;

[Figure 3 panels: Apache Ant, Apache Lucene, Apache Maven, Apache Solr. x-axis: year of the project (1st year onward); y-axis: counts, with ticks from 1 to 100000 in powers of ten. Series: developer (bugs reported), nondeveloper (bugs reported), developer (bug comments), nondeveloper (bug comments), developer (emails), nondeveloper (emails).]

Figure 3: Contributions made by developers and nondevelopers over the period of time.

the average participation ratio of developers and nondevelopers is computed as follows:

$$\frac{\sum_{y \in Y} |D_y|}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{nonDev}(C \setminus D_y, y)}{|y|}. \tag{11}$$

The results in Table 3 show that nondevelopers are highly involved (i.e., contributing more than the developers) in reporting bugs and participating in discussions on the mailing list. One potential reason for this is the existence of a huge community surrounding these Apache projects. Discussing or commenting on a bug report requires technical knowledge about the project, which is why developers appear to be more active than nondevelopers in commenting on the bug reports. Even so, Table 3 (see also Figure 3) indicates that nondevelopers play a significant role in the projects under consideration, and hence their involvement is one of the major factors in the long-term survival, success, and maturity of these projects over the period of time.

The high ratio of nondeveloper involvement in the project (cf. Figure 3 and Table 3) allows the core members to select or vote for the potential developers to be invited to the “core members” group of the project.

RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?

We are only interested in those developers who did not start contributing directly to the project but instead followed the onion model (cf. Figure 1). In order to select those developers, we retrieved all developers from the subversion logs. Then, for each developer, we compared his first commit date on the project to his first appearance on any of the project repositories (i.e., first bug reporting date, bug comment date, attachment, or email date) in order to compute the number of days or months before he started to contribute as a developer. Although there is no fixed or standard timeline for attaining the role of a developer in the project, we considered only those developers who had an activity (bug report, bug comment, attachment, or email) on the project at least 4 months prior to their first commit on the source control repository of the project.

For each of those selected developers, we queried the contributions made to the project before and after the first commit date of each developer. As the time period of attaining the role of a developer is different for each developer, we computed the average yearly rate of contributions made by a developer before and after attaining the role of a developer. We do not show each individual’s contribution to the project due to privacy issues; hence, we have summarized the aggregated results of each project in Table 4. All the variables (except $n$) used in our study represent the contribution of potential developers on a yearly basis. For each Apache project, $n$ represents the number of potential developers who have attained the role of a developer. The average yearly rate of contributions by a potential developer before and after attaining the role of a developer is calculated as follows:

$$\frac{\sum_{d \in \mathrm{Immig}} \mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d))}{|d|}, \qquad \frac{\sum_{d \in \mathrm{Immig}} \mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d))}{|d|}. \tag{12}$$
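One possible reading of (12) is sketched below: for each potential developer, the events before and after the first commit are normalized by the length of the respective window in years, and the resulting rates are averaged over all potential developers. The window boundaries (first and last observed activity), the 365-day year, and the sample data are assumptions for illustration only.

```python
from datetime import date
from statistics import mean


def yearly_rate(event_dates, start, end):
    """Number of events in [start, end) per year (assumption: 365-day years)."""
    years = (end - start).days / 365.0
    count = sum(1 for d in event_dates if start <= d < end)
    return count / years if years > 0 else 0.0


def before_after(event_dates, first_activity, first_commit, last_activity):
    """Yearly contribution rates before and after the first commit."""
    return (
        yearly_rate(event_dates, first_activity, first_commit),
        yearly_rate(event_dates, first_commit, last_activity),
    )


# Hypothetical potential developers: events, first activity, first commit, last activity.
developers = [
    (
        [date(2005, 2, 1), date(2005, 6, 1), date(2005, 12, 31)]
        + [date(2006, month, 1) for month in range(1, 9)],
        date(2005, 1, 1), date(2006, 1, 1), date(2008, 1, 1),
    ),
    (
        [date(2005, 3, 1), date(2006, 2, 1), date(2006, 9, 1)],
        date(2005, 1, 1), date(2006, 1, 1), date(2007, 1, 1),
    ),
]
rates = [before_after(*dev) for dev in developers]
mean_before = mean(before for before, _ in rates)
mean_after = mean(after for _, after in rates)
```

Normalizing by window length is what makes developers with very different waiting times comparable, which is the motivation given above for using yearly rates.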

Based on Table 4, we find that the bug reporting pattern does not change much before and after attaining the role of a developer in the Apache Maven and Apache Solr projects. However, in Apache Ant it decreased considerably after attaining the role of a developer. As shown in Figure 3, only a few bugs are reported by the developers in contrast to nondevelopers in the Apache Ant project, which is also reflected by the value of the bugs reported variable for the Apache Ant project. After joining the “core members” group, members participate more often in technical discussions on the bug tracking system, which is reflected by

Table 4: Yearly average contribution ratio of a potential developer before and after attaining the role of a developer.

                          Mean                 St. dev.
Variable                  Before    After      Before    After
Apache Ant (n = 13)
  bugs reported           1723      235        1798      365
  bug comments            4144      3902       3766      3483
  bug social relation     4465      3157       5141      2334
  emails                  23049     28034      259       22452
  email social relation   305       2031       3035      1355
Apache Lucene (n = 22)
  bugs reported           1412      2373       912       3274
  bug comments            2484      7321       1818      9185
  bug social relation     3357      3467       2375      3002
  emails                  130       39871      12487     50802
  email social relation   1959      2169       1653      1532
Apache Maven (n = 21)
  bugs reported           2840      2782       5327      7949
  bug comments            1642      2786       1207      5868
  bug social relation     2716      2408       2143      3085
  emails                  15841     18555      21827     17489
  email social relation   1833      2334       1831      1573
Apache Solr (n = 13)
  bugs reported           2477      2375       3038      1750
  bug comments            4472      8477       4135      8625
  bug social relation     6804      8046       5218      5239
  emails                  15601     31382      22757     33319
  email social relation   1472      3847       785       3304

the value of the bug comments variable. However, an increase in the participation in technical discussions did not increase the social relation of the developers on the bug tracking system (i.e., bug social relation) in the case of the Apache Ant and Apache Maven projects. One reason could be that, after attaining the role of a developer, they focused only on certain modules of a project and were hence involved, with other developers of the project, in discussions on bugs relevant to those modules. There is also a considerable increase in the number of emails sent by the members after attaining the role of a developer, which in turn increases the value of the email social relation variable.

Based on the Apache projects under consideration, we found that members, after attaining the role of a developer, tend to participate actively in technical discussions either on the mailing list or on the bug tracking system, which also increases their social relation networks, except in the case of the Apache Ant project. The bug reporting behavior of these members varies across the studied Apache projects, and hence it is difficult to say whether they report more bugs after attaining the role of a developer.

RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?

For each potential developer, we took as the first time-stamp value the time of his first appearance on the project and as the second time-stamp value the time of his first commit


Table 5: Average contribution rate of a potential developer compared to other members of the project before attaining the role of a developer.

Variable                  Mean      St. dev.
Apache Ant (n = 13)
  bugs reported           1115      1325
  bug comments            1043      835
  bug social relation     842       665
  emails                  1989      1625
  email social relation   1097      842
Apache Lucene (n = 22)
  bugs reported           453       309
  bug comments            408       289
  bug social relation     318       173
  emails                  391       327
  email social relation   623       442
Apache Maven (n = 21)
  bugs reported           405       368
  bug comments            236       201
  bug social relation     244       198
  emails                  405       576
  email social relation   431       334
Apache Solr (n = 13)
  bugs reported           456       577
  bug comments            44        472
  bug social relation     252       215
  emails                  105       107
  email social relation   271       195

to the source control repository of the project. We extracted the contributions (i.e., bugs reported, comments, emails, etc.) made by a potential developer between those time-stamp values. Using the same time-stamp values, we computed the contributions made by other members who were also active during that specific time period. We then divided the contributions of a potential developer by the average contributions of all other members in order to determine the average rate of contributions made by a potential developer compared to other members of the project. We do not show each individual’s contribution rate due to privacy issues; hence, we have summarized the aggregated results of each project in Table 5. For example, the average rate of bugs reported by an immigrant compared to other members who were active during the same time period is calculated as follows:

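$$\sum_{\mathrm{Immig}_i \in \mathrm{Immig}} \frac{\mathrm{Contribution}_{\mathrm{bugs}}(\mathrm{Immig}_i, \mathrm{commitDate}(\mathrm{Immig}_i))}{\sum_{c \in C} \mathrm{Contribution}_{\mathrm{bugs}}(c, \mathrm{commitDate}(\mathrm{Immig}_i)) / |c|}. \tag{13}$$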
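The ratio behind (13) can be illustrated as follows. This is a sketch under assumptions: the counts are invented, `relative_rate` is a hypothetical helper, and the final averaging over potential developers reflects the “Mean” column of Table 5 rather than a step spelled out in the formula.

```python
from statistics import mean


def relative_rate(own_count, others_counts):
    """Contributions of one potential developer divided by the mean contributions
    of the other members active in the same period (None if undefined)."""
    if not others_counts:
        return None
    mean_others = sum(others_counts) / len(others_counts)
    return own_count / mean_others if mean_others else None


# Hypothetical bug counts: (potential developer, other members active in the same window).
observations = [
    (9, [1, 2, 3]),
    (4, [2, 2]),
]
rates = [relative_rate(own, others) for own, others in observations]
average_rate = mean(r for r in rates if r is not None)
```

A value above 1 means that the potential developer contributed more than the average member who was active during the same period.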

The results in Table 5 can be understood as follows: the average rate of reporting bugs by a potential developer of the Apache Lucene project is 453 times the average rate of reporting bugs by all other members who were active during that time period. Although the average rate of contributions

Table 6: Appearance of a potential developer on different software repositories prior to attaining the role of a developer.

Apache projects    Patch submission    Bugs reported     Emails
                   (no. of days)       (no. of days)     (no. of days)
Apache Ant         54415               55384             70653
Apache Lucene      45741               52632             70622
Apache Maven       38500               39631             70971
Apache Solr        26930               23792             45246

made by potential developers varies across the projects under consideration, each variable value indicates that the contributions made by potential developers are higher than the average contributions of all other members. Hence, we can say that they were among the most active contributors (i.e., technically skilled and of higher social status) before attaining the developer status in the project.

RQ-4: Does a potential developer follow the onion model in order to attain the role of a developer?

For each potential developer, we computed the time difference, in days, between his/her first commit date and his/her first activity on the different software repositories. Table 6 presents the appearance of a potential developer, in terms of the average number of days, on different software repositories prior to attaining the role of a developer. The result shows that, on average, the potential developers started from the mailing list (cf. Table 6), because the email activity is the oldest for all Apache projects under consideration, followed by bug reporting/commenting; the latest activity before attaining the role of a developer was source-code patch submission (i.e., bug fixing). The results shown in Table 6 closely match the onion model (see Figure 1), where a member starts as a reader, then reports bugs, and later fixes bugs before attaining the role of a developer.

Let $\mathrm{ActivityDate}(\mathrm{Immig}_i, ml)$ return the number of days between the first commit date of an immigrant (i.e., potential developer) on the source control repository and his/her first activity date on the mailing list of a project. The average number of days for an immigrant to appear on a mailing list prior to his/her first commit date is calculated as follows:

$$\frac{\sum_{\mathrm{Immig}_i \in \mathrm{Immig}} \mathrm{ActivityDate}(\mathrm{Immig}_i, ml)}{|\mathrm{Immig}|}. \tag{14}$$
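The averaging in (14) can be repeated for each repository to compare the order in which potential developers appear on them. The following sketch uses invented dates and hypothetical repository labels; it only illustrates the computation, not the study's data.

```python
from datetime import date
from statistics import mean

# Hypothetical first commit dates of two potential developers.
first_commit = {"dev_a": date(2006, 1, 1), "dev_b": date(2007, 1, 1)}
# Hypothetical dates of their first activity on each repository.
first_activity = {
    "emails": {"dev_a": date(2004, 1, 1), "dev_b": date(2005, 1, 1)},
    "bugs": {"dev_a": date(2005, 1, 1), "dev_b": date(2006, 1, 1)},
    "patches": {"dev_a": date(2005, 7, 1), "dev_b": date(2006, 7, 1)},
}


def average_lead_days(first_seen, first_commit):
    """Average number of days between first activity and first commit."""
    return mean((first_commit[m] - seen).days for m, seen in first_seen.items())


leads = {repo: average_lead_days(seen, first_commit) for repo, seen in first_activity.items()}
# Repositories ordered from earliest to latest average first appearance.
order = sorted(leads, key=leads.get, reverse=True)
```

An ordering of emails, then bugs, then patches corresponds to the progression that the onion model predicts.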

The results (Table 6) also show that it took almost 2 years for a potential developer of the Apache Ant, Apache Lucene, and Apache Maven projects to attain the role of a developer. However, we cannot say that this is the standard time, as the time varies dramatically from project to project, as can be seen in the results of the Apache Solr project compared to the other Apache projects under consideration.

6. Conclusion

In this paper, we have investigated in detail the patterns of contributions made by those members who have attained the role of a developer in the project. First, we investigated the role played by nondevelopers in the long-term survival of an OSS project and observed that nondevelopers, who do not have write-access to the source control repository, participate actively in reporting bugs and in email discussions, thus contributing to the maturity of an OSS project. Our investigation of the contributions of potential developers before and after attaining the role of a developer showed that, after attaining a higher position in the community, developers tend to contribute more efficiently than nondevelopers of the project by actively participating in technical discussions along with fixing bugs. Moreover, we observed that the members who attained the role of a developer made more contributions than the average number of contributions made by other members of the project who were active during the same time period. This suggests that one of the important factors in attaining the role of a developer is the demonstration of technical skills and commitment to the project in an efficient manner.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] B. Shibuya and T. Tamai, “Understanding the process of participating in open source communities,” in Proceedings of the ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development (FLOSS ’09), pp. 1–6, IEEE Computer Society, Washington, DC, USA, May 2009.

[2] G. Von Krogh, S. Spaeth, and K. R. Lakhani, “Community, joining, and specialization in open source software innovation: a case study,” Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.

[3] C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu, “Open borders? Immigration in open source projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[4] M. Antikainen, T. Aaltonen, and J. Vaisanen, “The role of trust in OSS communities - case Linux Kernel community,” IFIP International Federation for Information Processing, vol. 234, pp. 223–228, 2007.

[5] Y. Ye, K. Nakakoji, Y. Yamamoto, and K. Kishida, “The co-evolution of systems and communities in Free and Open Source Software Development,” in Free/Open Source Software Development, pp. 59–82, Idea Group, Hershey, Pa, USA, 2004.

[6] http://ant.apache.org/.

[7] http://lucene.apache.org/.

[8] http://maven.apache.org/.

[9] http://lucene.apache.org/solr/.

[10] K. Crowston and J. Howison, “The social structure of free and open source software development,” in Proceedings of the International Conference on Information Systems, Seattle, Wash, USA, 2003.

[11] C. Jensen and W. Scacchi, “Modelling recruitment and role migration process in oosd projects,” in Proceedings of the 6th International Workshop on Software Process Simulation and Modeling, St. Louis, Mo, USA, 2005.

[12] A. Mockus, R. T. Fielding, and J. D. Herbsleb, “Two case studies of open source software development: apache and mozilla,” ACM Transactions on Software Engineering and Methodology, vol. 11, no. 3, pp. 309–346, 2002.

[13] T. Dinh-Trong and J. M. Bieman, “Open source software development: a case study of freeBSD,” in Proceedings of the 10th International Symposium on Software Metrics (METRICS ’04), pp. 96–105, Washington, DC, USA, September 2004.

[14] N. Ducheneaut, “Socialization in an open source software community: a socio-technical analysis,” Computer Supported Cooperative Work, vol. 14, no. 4, pp. 323–368, 2005.

[15] D. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman and Hall, 1984.

[16] I. Herraiz, G. Robles, J. J. Amor, T. Romera, and J. M. G. Barahona, “The processes of joining in global distributed software projects,” in Proceedings of the International Workshop on Global Software Development for the Practitioner (GSD ’06), pp. 27–33, 2006.

[17] W. Scacchi, J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani, “Understanding free/open source software development processes,” Software Process Improvement and Practice, vol. 11, no. 2, pp. 95–105, 2006.

[18] M. Fischer, M. Pinzger, and H. Gall, “Populating a release history database from version control and bug tracking systems,” in Proceedings of the International Conference on Software Maintenance (ICSM ’03), pp. 23–32, IEEE Computer Society, Washington, DC, USA, September 2003.

[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.

[20] A. Iqbal and M. Hausenblas, “Interlinking developer identities within and across open source projects: the linked data approach,” ISRN Software Engineering, vol. 2013, Article ID 584731, 12 pages, 2013.

[21] C. Bird, A. Gourley, and P. Devanbu, “Detecting patch submission and acceptance in OSS projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[22] A. Iqbal, O. Ureche, M. Hausenblas, and G. Tummarello, “LD2SD: linked data driven software development,” in Proceedings of the 21st International Conference on Software Engineering and Knowledge Engineering (SEKE ’09), pp. 240–245, Boston, Mass, USA, July 2009.


social standing in the community were strong factors in joining the core members group of developers having write-access to the source control repository. Bird et al. [3] used the mailing lists and source control repository to investigate the time required for members to be invited into the “core members” group of an OSS project. They applied hazard rate analysis, or survival analysis [15], which is used to model time-dependent phenomena such as employment duration, in order to understand which factors influence the duration and occurrence of such events and to what degree. They modeled the duration between activities by considering the time from the first appearance of a member on the mailing list to the first commit on the source control repository. One of their findings was that prior patch submission had a strong effect on becoming part of the “core members” group of a project. Herraiz et al. [16] studied the GNOME project and found two different patterns of joining the project: (1) volunteers/contributors who follow the “onion model” and (2) firm/organization sponsored developers who do not. They found that hired developers gain knowledge and start writing code more quickly than the volunteers.

Although these research studies were carried out in detail on different OSS projects, they considered data only from mailing lists and source control repositories. In contrast, we also take into account bug repositories and quantify the contributions made by members in terms of the following: bugs reported, comments on bugs, social relation with other members based on comments, social relation with other members based on email exchanges on the mailing list, and patch submissions on bug repositories. In addition, there is no published work known to us which studies the contributions made by a member before and after attaining the role of a developer in an OSS project. Therefore, we have quantified and analyzed the average rate of contributions made by a member before and after attaining the role of a developer, which distinguishes this work from the related work done so far in this area.

3. Research Questions

As mentioned earlier, the success of an OSS project lies in its long-term survival, which is potentially due to the existence of a community surrounding the project. We are particularly interested in identifying the role of a community in the long-term survival of an OSS project as well as the key factors which promote a nondeveloper (in this paper, we use the term “nondeveloper” to refer to all those members who do not have write-access to the source control repository) to the role of a developer (in this paper, we use the term “developer” to refer to all those members who have write-access to the source control repository). Further, we are interested to know if the potential developers (in this paper, we use the term “potential developer” to refer to all those members who started as a passive user and later attained the role of a developer) follow the onion model or if there is a sudden integration of developers into the “core members” group of an OSS project. In order to address these key points, we have outlined a few research questions in the following, which will be addressed using data from publicly available software repositories of a few selected Apache projects.

(1) RQ-1: What is the ratio of contributions made by the developers and nondevelopers to the project over the period of time? Previous studies [12, 17] on various OSS projects have shown that most of the source-code development is carried out by the developers of those projects. We will investigate what the contributions of nondevelopers are if the source-code development is mostly done by the developers of those projects. In particular, we will investigate the contributions of nondevelopers in terms of reporting bugs, commenting on bugs, and exchanging emails. Further, we are interested in investigating the role of nondevelopers in the long-term success and maturity of an OSS project.

(2) RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer? Attaining a higher role comes with more responsibilities and commitments to the project. We will investigate if a potential developer, after attaining the role of a developer, contributes more (excluding source-code modification and bug fixing) than he/she did as a nondeveloper. Does the contribution pattern change with the change in role of a potential developer? To be more precise, does his/her contribution to the project in terms of bug reporting and interaction with the community increase or decrease? We hypothesize that, after attaining the role of a developer, he/she will participate actively in technical discussions on the bug tracking systems or on the mailing lists and report bugs effectively.

(3) RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer? We will investigate if the average contributions made by a potential developer are more than the average contributions made by nondevelopers who were also active during his/her time period. Previous work [3] has shown that demonstration of technical commitment and social status with other members positively influences attaining the role of a developer. We will investigate if a potential developer was more active (i.e., technically skilled and of higher social status) than nondevelopers before attaining the role of a developer.

(4) RQ-4: Does a potential developer follow the onion model in order to attain the role of a developer? We will investigate if a potential developer follows the onion model in order to attain the role of a developer, that is, joining the “core members” group of the project. Not every member who is contributing to an OSS project eventually becomes a developer. It depends on the level of involvement of a member in an OSS project and also on the need to promote a nondeveloper to the role of a developer. There is no static or standard timeline for a member to join the “core members” group of a project. The time period required to attain the role of a developer varies from project to project and also from member to member. Members often start contributing to a project by participating in the mailing list conversations to familiarize themselves with the project before contributing source-code patches to the project. We will study the appearance of a potential developer on different software repositories by comparing the time-stamp value of their first activity on these software repositories in order to validate if he/she actually followed the onion model.

Table 1: Apache projects data range.

| Apache projects | Date range |
| Apache Ant [6] | 2000–2010 |
| Apache Lucene [7] | 2001–2010 |
| Apache Maven [8] | 2003–2010 |
| Apache Solr [9] | 2006–2010 |

4. Data Extraction Process

In this section, we describe our data extraction methodology and the Apache projects selected for evaluation. We extracted data from 4 different Apache projects, as shown in Table 1. The range of data selected for each project is different because of the difference in the starting date of each project. We chose these Apache projects because their repositories (i.e., mailing list archives, bugs, subversion logs, etc.) are on the Web and available for download.

Most Apache projects have at least 3 different mailing lists (user, dev, and commits), but some have more than 3 mailing lists (e.g., announcements, notifications, etc.). For our study, we downloaded only the dev mailing list archives of each Apache project under consideration. The reason is that software developers communicate with each other more often on the dev mailing list than on any other mailing list. We developed our own scripts to extract information from mailing list archives in a manner similar to previous research [12, 18]. For example, each email was processed to extract information such as sender name, email address, subject, date, message-id, and reference. The reference field contains message-id(s) if the email is a reply to previous thread(s). We used the reference field information to build a social network correspondence and computed social network measures [19] for all the members of a project.
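The header extraction and reply network described above can be sketched as follows. This is a minimal illustration and not the scripts used in the study: the function names are hypothetical, the reply network is kept as a plain adjacency map, and degree is used only as one example of a social network measure (the text does not say which measures were computed).

```python
from collections import defaultdict
from email import message_from_string
from email.utils import parseaddr


def parse_email(raw):
    """Extract the header fields listed in the text from one raw email."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg.get("From", ""))
    return {
        "sender_name": name,
        "email_address": address.lower(),
        "subject": msg.get("Subject", ""),
        "date": msg.get("Date", ""),
        "message_id": (msg.get("Message-ID") or "").strip(),
        # References holds the message-ids of earlier emails in the thread.
        "references": (msg.get("References") or "").split(),
    }


def build_reply_network(emails):
    """Link each sender to the senders of the emails he/she replied to."""
    sender_of = {e["message_id"]: e["email_address"] for e in emails}
    neighbours = defaultdict(set)
    for e in emails:
        for ref in e["references"]:
            other = sender_of.get(ref)
            if other and other != e["email_address"]:
                neighbours[e["email_address"]].add(other)
                neighbours[other].add(e["email_address"])
    return neighbours


def degree(neighbours):
    """Number of distinct correspondents per member (an assumed measure)."""
    return {member: len(others) for member, others in neighbours.items()}
```

Messages whose referenced message-id is not in the archive are simply skipped, so the resulting network only contains links that can be resolved from the downloaded data.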

We retrieved all the bugs (related to the Apache projects we considered for our study) which are publicly available through the Bugzilla and JIRA Web interface (https://issues.apache.org/) and extracted the required information using our custom-written scripts. For further details on the information extracted from each email and bug, we refer the readers to [20]. We computed the social relation correspondence among members on the bug tracking system based on the bug comments exchanged among themselves. The findings of Bird et al. [21] indicated the detection and acceptance of source-code patches through the mailing list, but we discovered that source-code patches were always attached to the respective bugs on the bug tracking system rather than sent to the mailing list. Prior research has indicated the importance of offering bug fixes and their acceptance as an influential factor in gaining developer status [14]; therefore, we have also analyzed how many source-code patches were submitted by the members on bug repositories.

Table 2: Dataset overview.

| Project | Attachments | Bugs | Commits | Emails |
| Apache Ant | 1345 | 5480 | 6025 | 84737 |
| Apache Lucene | 2865 | 3116 | 5790 | 59616 |
| Apache Maven | 1169 | 3902 | 8815 | 87611 |
| Apache Solr | 2146 | 2528 | 4288 | 25173 |

In order to get information from the source control repository, we wrote our own script (see [22] for details) and extracted the necessary information (i.e., log number, date of commit, author id, and files committed). We only considered those subversion logs where a source-code file (i.e., “*.java”, because the Apache projects under consideration are Java-based) was committed. These subversion logs were further analyzed by our script in order to identify whether a commit fixes any bug by looking for patterns such as “PRxxx”, “MNG-xxx”, “SOLR-xxx”, and “LUCENE-xxx” and patch acceptance acknowledgements such as “patch provided by xxx” and “patch submitted by xxx”. On the identification of such patterns, the bugs were queried to retrieve the source-code patches associated with those bugs. This helps to identify source-code patches that are accepted by the “core members” group of the project. Further, it allows identifying members who possess the strong technical skills required for attaining the role of a developer in the project. Table 2 gives an overview of the raw data we extracted from the different software repositories of the selected Apache projects based on the methodology described.
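The log analysis described above could look roughly like the sketch below. The regular expressions are illustrative assumptions: the text names the JIRA-style keys and the acknowledgement phrases, but the exact spelling of the Bugzilla “PR” form (taken here as “PR 1234” or “PR: 1234”) and the function names are hypothetical.

```python
import re

# JIRA-style keys named in the text, plus an assumed Bugzilla "PR" spelling.
ISSUE_KEY = re.compile(r"\b(?:MNG|SOLR|LUCENE)-\d+\b|\bPR:?\s*\d+\b")

# Acknowledgement phrases named in the text; the contributor name is taken
# to run until the next comma, period, semicolon, or line break (assumption).
PATCH_ACK = re.compile(
    r"patch\s+(?:provided|submitted)\s+by\s+([^,.;\n]+)", re.IGNORECASE
)


def touches_java(paths):
    """True if at least one committed file is a Java source file."""
    return any(path.endswith(".java") for path in paths)


def analyse_commit(message, paths):
    """Return issue keys and acknowledged patch authors for one log entry.

    Commits that do not touch any *.java file are ignored (None).
    """
    if not touches_java(paths):
        return None
    return {
        "issues": ISSUE_KEY.findall(message),
        "patch_authors": [name.strip() for name in PATCH_ACK.findall(message)],
    }
```

The extracted issue keys would then be used to query the bug tracker for the attachments of those bugs, which is outside the scope of this sketch.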

The values for Apache Ant show that there were 1345 source-code patches found for a total of 5480 bugs reported on the bug tracking system. Further, 6025 subversion logs were extracted from the source control repository where source-code files (i.e., *.java) were committed, and 84737 emails were extracted from the Apache Ant mailing list archives between 2000 and 2010.

5. Empirical Analysis

Before we address each of the research questions in detail, we present a high-level overview of the development activity of each Apache project under consideration over the period of time in Figure 2. This gives an insight into how many contributions were made each year to a project and the peak development years of a project. For each Apache project under consideration, we computed the number of contributions with respect to the number of people who made those contributions. For example, we computed the number of distinct bugs reported each year along with the number of distinct reporters who submitted those bug reports (cf. Figure 2). This makes it easier to answer simple questions such as how many bugs were reported and how many members were involved in the bug reporting process during the 2nd year of a project.

Figure 2: Development activity of Apache projects over the period of time. (Panels: Apache Ant, Apache Lucene, Apache Maven, Apache Solr. Horizontal axis: year of the project, from the 1st year onwards. Vertical axis: counts, logarithmic scale from 1 to 100000. Series: Bugs, Reporters, Comments, Commentors, Attachments, Attachers, Emails, People.)
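The per-year counting behind such an overview can be sketched as follows. This is only an illustration: the function name and the tuple layout are hypothetical, and mapping project years onto calendar years (year 1 being the calendar year in which the data range starts) is an assumption.

```python
from collections import defaultdict
from datetime import date


def yearly_bug_activity(bug_reports, project_start_year):
    """Count distinct bugs and distinct reporters per project year.

    bug_reports: iterable of (bug_id, reporter, reported_on) tuples.
    Returns {project_year: (number_of_bugs, number_of_reporters)},
    where project year 1 is the calendar year `project_start_year`.
    """
    bugs = defaultdict(set)
    reporters = defaultdict(set)
    for bug_id, reporter, reported_on in bug_reports:
        project_year = reported_on.year - project_start_year + 1
        bugs[project_year].add(bug_id)
        reporters[project_year].add(reporter)
    return {y: (len(bugs[y]), len(reporters[y])) for y in sorted(bugs)}
```

The same pattern applies to comments and commentors, attachments and attachers, and emails and people.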

Preliminaries. Let $\mathcal{C}$ be the set of all members (i.e., developers, nondevelopers, etc.) who worked on the project:

$$\mathcal{C} = \{c_1, c_2, \ldots, c_n\}. \tag{1}$$

Let $Y$ be the set of years of a project under consideration:

$$Y = \{y_1, y_2, \ldots, y_m\}, \tag{2}$$

where $y_1$ is considered to be the first year of the project, $y_2$ is considered to be the second year of the project, and so on. Let $C$ be the set of members (i.e., developers, nondevelopers, etc.) who were active in a time period $y$:

$$C = \{c_1, c_2, \ldots, c_k\}, \quad C \subseteq \mathcal{C}, \tag{3}$$

and let $\mathrm{Immig}$ be the immigrants (i.e., potential developers) who started as contributors and later became developers of the project. We classified only those members as immigrants/potential developers who had an activity in the project (i.e., number of bugs reported, number of bugs commented, number of patches submitted, or number of emails sent) at least 4 months prior to their first commit on the source control repository:

$$\mathrm{Immig} = \{\mathrm{Immig}_1, \mathrm{Immig}_2, \ldots, \mathrm{Immig}_p\}, \quad \mathrm{Immig} \subseteq D_y \subseteq \mathcal{C}. \tag{4}$$

Let $D_y$ be the set of developers who have made commits before and during time period $y$, such that $D_y \subseteq \mathcal{C}$. Let the total number of bugs reported, bug comments made, and emails sent by a set of members in a time period $y$ be represented as follows:

$$\begin{aligned}
&\mathrm{Contribution}_{\mathrm{bugs}}(C, y), \\
&\mathrm{Contribution}_{\mathrm{comments}}(C, y), \\
&\mathrm{Contribution}_{\mathrm{emails}}(C, y),
\end{aligned} \tag{5}$$

whereas the number of bugs reported, bug comments made, and emails sent by the developers in a given period of time $y$ is represented as

$$\begin{aligned}
&\mathrm{Contribution}_{\mathrm{bugs}}(D_y, y), \\
&\mathrm{Contribution}_{\mathrm{comments}}(D_y, y), \\
&\mathrm{Contribution}_{\mathrm{emails}}(D_y, y),
\end{aligned} \tag{6}$$

respectively. Let $d$ be a single developer and let $\mathrm{commitDate}(d)$ return the first commit date of a developer. The yearly average contribution of a member before and after attaining the role of a developer is represented as

$$\begin{aligned}
&\mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d)), \\
&\mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d)),
\end{aligned} \tag{7}$$

and the total number of bugs reported, bug comments made, and emails sent by an immigrant before becoming a developer is represented as

$$\begin{aligned}
&\mathrm{Contribution}_{\mathrm{bugs}}(\mathrm{Immig}_i, \mathrm{commitDate}(\mathrm{Immig}_i)), \\
&\mathrm{Contribution}_{\mathrm{comments}}(\mathrm{Immig}_i, \mathrm{commitDate}(\mathrm{Immig}_i)), \\
&\mathrm{Contribution}_{\mathrm{emails}}(\mathrm{Immig}_i, \mathrm{commitDate}(\mathrm{Immig}_i)).
\end{aligned} \tag{8}$$

RQ-1: What is the ratio of contributions made by the developers and nondevelopers to the project over the period of time?

In order to compute the contributions, we need to distinguish between the developers and nondevelopers of the project. As each subversion log has a time stamp associated with it, we queried all subversion logs from the start date of the project until the last commit date of the year under consideration. Based on this, we get a list of the IDs of all developers who have contributed to the source control repository up to that particular year. For each developer ID, we computed the contributions (i.e., bugs reported, comments on bugs, emails, etc.) made to the project on a yearly basis and added up the contributions made by all the developers for each year. Similarly, we computed the contributions made by nondevelopers on a yearly basis and added up all their contributions for each year. We then plotted the contributions made by the developers and nondevelopers of each Apache project under consideration for each year, as shown in Figure 3. Further, we computed the average rate of contributions made by the developers and nondevelopers as well as the average number of developers and nondevelopers who made those contributions per year, which is shown in Table 3.

Table 3: Average rate of contributions made by developers and nondevelopers.

| Project | Variable | Contributions (Dev) | Contributions (Non dev) | Participants (Dev) | Participants (Non dev) |
| Apache Ant | bugs reported | 2970 | 50930 | 630 | 37740 |
| Apache Ant | bug comments | 68150 | 41601 | 1130 | 24890 |
| Apache Ant | emails | 477391 | 562800 | 1454 | 28491 |
| Apache Lucene | bugs reported | 14111 | 15677 | 866 | 9133 |
| Apache Lucene | bug comments | 50711 | 24133 | 1077 | 9533 |
| Apache Lucene | emails | 354700 | 574566 | 1366 | 17322 |
| Apache Maven | bugs reported | 20487 | 26825 | 1100 | 16500 |
| Apache Maven | bug comments | 56928 | 33428 | 1385 | 19514 |
| Apache Maven | emails | 432837 | 950575 | 1337 | 26087 |
| Apache Solr | bugs reported | 20110 | 26560 | 1020 | 11600 |
| Apache Solr | bug comments | 81903 | 43280 | 1100 | 16660 |
| Apache Solr | emails | 180825 | 604775 | 900 | 11250 |

For example, the number of bugs reported by the nondevelopers in a given period of time $y$ is computed as follows:

$$\mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y), \tag{9}$$

and the average number of bugs reported by the developers and nondevelopers is computed as follows:

$$\frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(D_y, y)}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y)}{|y|}, \tag{10}$$

where $|y|$ denotes the number of years considered.
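The yearly split and the averages in (9) and (10) can be illustrated with the short sketch below. It is an assumption-laden example and not the study's script: all names are hypothetical, years are plain integers, and identities are assumed to be already matched across the source control repository and the bug tracker.

```python
def developers_until(commits, year):
    """D_y: ids of everyone with a commit in `year` or earlier."""
    return {author for author, commit_year in commits if commit_year <= year}


def yearly_split(commits, bug_reports, years):
    """Per year, count bugs reported by developers and by nondevelopers.

    commits: iterable of (author, year); bug_reports: iterable of
    (reporter, year). Returns {year: (developer_bugs, nondeveloper_bugs)}.
    """
    split = {}
    for y in years:
        d_y = developers_until(commits, y)
        in_year = [reporter for reporter, year in bug_reports if year == y]
        dev = sum(1 for reporter in in_year if reporter in d_y)
        split[y] = (dev, len(in_year) - dev)
    return split


def yearly_average(split):
    """Average the yearly counts over the number of years, as in (10)."""
    n_years = len(split)
    dev_total = sum(dev for dev, _ in split.values())
    non_dev_total = sum(non_dev for _, non_dev in split.values())
    return dev_total / n_years, non_dev_total / n_years
```

Note that a member counts as a nondeveloper in every year before the year of his/her first commit, so the same person can contribute to both columns over the lifetime of the project.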

Let the number of nondevelopers who were active in a certain period of time $y$ be given by $\mathrm{nonDev}(C \setminus D_y, y)$; the average participation of developers and nondevelopers is then computed as follows:

$$\frac{\sum_{y \in Y} |D_y|}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{nonDev}(C \setminus D_y, y)}{|y|}. \tag{11}$$

Figure 3: Contributions made by developers and nondevelopers over the period of time. (Panels: Apache Ant, Apache Lucene, Apache Maven, Apache Solr. Horizontal axis: year of the project, from the 1st year onwards. Vertical axis: counts, logarithmic scale. Series: Developer (bugs reported), Nondeveloper (bugs reported), Developer (bug comments), Nondeveloper (bug comments), Developer (emails), Nondeveloper (emails).)

The results in Table 3 show that nondevelopers are highly involved (i.e., contributing more than the developers) in reporting bugs and participating in discussions on the mailing list. One potential reason for this is the existence of a huge community surrounding these Apache projects. Discussing or commenting on a bug report requires technical knowledge about the project, which is why developers appear to be more active in commenting on the bug reports than nondevelopers. Nevertheless, Table 3 (see also Figure 3) shows that nondevelopers play a significant role in the projects under consideration, and hence their involvement is one of the major factors in the long-term survival, success, and maturity of these projects over the period of time.

The high ratio of nondeveloper involvement in the project (cf. Figure 3 and Table 3) allows the core members to select or vote for the potential developers to be invited to the “core members” group of the project.

RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?

We are only interested in those developers who did not start contributing directly to the project but instead followed the onion model (cf. Figure 1). In order to select those developers, we retrieved all developers from the subversion logs. Then, for each developer, we compared his first commit date on the project to his first appearance on any of the project repositories (i.e., first bug report, bug comment, attachment, or email date) in order to compute the number of days or months before he started to contribute as a developer. Although there is no fixed or standard timeline for attaining the role of a developer in the project, we considered only those developers who had an activity (bug report, bug comment, attachment, or email) on the project at least 4 months prior to their first commit on the source control repository of the project.
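The selection rule above can be expressed compactly, as in the following sketch. The names are hypothetical, and approximating “4 months” by 120 days is an assumption, since the text does not state how months were counted.

```python
from datetime import date, timedelta

# Assumption: "at least 4 months" is approximated as at least 120 days.
FOUR_MONTHS = timedelta(days=120)


def select_immigrants(first_commit, first_activity):
    """Return members whose first activity precedes their first commit
    by at least four months.

    first_commit: {member: date of first commit}
    first_activity: {member: date of earliest bug report, bug comment,
                     attachment, or email}
    """
    return {
        member
        for member, commit_date in first_commit.items()
        if member in first_activity
        and commit_date - first_activity[member] >= FOUR_MONTHS
    }
```

Developers with no recorded activity before their first commit, or with too short a lead time, are excluded, which matches the intent of keeping only those who did not start contributing code directly.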

For each of those selected developers, we queried the contributions made to the project before and after the first commit date of each developer. As the time period of attaining the role of a developer is different for each developer, we computed the average yearly rate of contributions made by a developer before and after attaining the role of a developer. We do not show each individual’s contribution to the project due to privacy issues; hence, we have summarized the aggregated results of each project, as shown in Table 4. All the variables (except n) used in our study represent the contribution of potential developers on a yearly basis. For each Apache project, n represents the number of potential developers who have attained the role of a developer. The average yearly rate of contributions by a potential developer before and after attaining the role of a developer is calculated as follows:

$$\frac{\sum_{d \in \mathrm{Immig}} \mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d))}{|d|}, \qquad \frac{\sum_{d \in \mathrm{Immig}} \mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d))}{|d|}. \tag{12}$$
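A minimal sketch of the before/after computation in (12) is given below. The observation windows are assumptions: “before” is taken to run from the member's first activity to the first commit, and “after” from the first commit to the last observed date; the text does not spell out these boundaries, and all names are hypothetical.

```python
from datetime import date

DAYS_PER_YEAR = 365.25


def yearly_rate(event_dates, start, end):
    """Number of events in [start, end) divided by the window length in years."""
    span_years = (end - start).days / DAYS_PER_YEAR
    if span_years <= 0:
        return 0.0
    count = sum(1 for d in event_dates if start <= d < end)
    return count / span_years


def before_after(event_dates, first_activity, commit_date, last_observed):
    """Yearly contribution rate before and after the first commit."""
    return (
        yearly_rate(event_dates, first_activity, commit_date),
        yearly_rate(event_dates, commit_date, last_observed),
    )


def mean_rates(per_member):
    """Average the (before, after) pairs over all potential developers."""
    n = len(per_member)
    return (
        sum(before for before, _ in per_member) / n,
        sum(after for _, after in per_member) / n,
    )
```

Normalising by the window length is what makes members with very different waiting times comparable, which is the reason the text gives for using yearly rates.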

Based on Table 4, we find that the bug reporting pattern does not change much before and after attaining the role of a developer in the Apache Maven and Apache Solr projects. However, in Apache Ant, it decreased sharply after attaining the role of a developer. As shown in Figure 3, only a few bugs are reported by the developers in contrast to nondevelopers in the Apache Ant project, which is also reflected by the value of the bugs reported variable for the Apache Ant project. Members, after joining the “core members” group, participate more often in technical discussions on the bug tracking system, which is reflected by the value of the bug comments variable. However, an increase in the participation in technical discussions did not increase the social relation of the developers on the bug tracking system (i.e., bug social relation) in the case of the Apache Ant and Apache Maven projects. One reason could be that, after attaining the role of a developer, they focused only on certain modules of a project and were hence involved in discussions on bugs relevant to those modules with other developers of the project. There is also a marked increase in the number of emails sent by the members after attaining the role of a developer, which eventually increases the value of the email social relation variable.

Table 4: Yearly average contribution ratio of a potential developer before and after attaining the role of a developer.

| Project | Variable | Mean (Before) | Mean (After) | St. Dev. (Before) | St. Dev. (After) |
| Apache Ant (n = 13) | bugs reported | 1723 | 235 | 1798 | 365 |
| Apache Ant (n = 13) | bug comments | 4144 | 3902 | 3766 | 3483 |
| Apache Ant (n = 13) | bug social relation | 4465 | 3157 | 5141 | 2334 |
| Apache Ant (n = 13) | emails | 23049 | 28034 | 259 | 22452 |
| Apache Ant (n = 13) | email social relation | 305 | 2031 | 3035 | 1355 |
| Apache Lucene (n = 22) | bugs reported | 1412 | 2373 | 912 | 3274 |
| Apache Lucene (n = 22) | bug comments | 2484 | 7321 | 1818 | 9185 |
| Apache Lucene (n = 22) | bug social relation | 3357 | 3467 | 2375 | 3002 |
| Apache Lucene (n = 22) | emails | 130 | 39871 | 12487 | 50802 |
| Apache Lucene (n = 22) | email social relation | 1959 | 2169 | 1653 | 1532 |
| Apache Maven (n = 21) | bugs reported | 2840 | 2782 | 5327 | 7949 |
| Apache Maven (n = 21) | bug comments | 1642 | 2786 | 1207 | 5868 |
| Apache Maven (n = 21) | bug social relation | 2716 | 2408 | 2143 | 3085 |
| Apache Maven (n = 21) | emails | 15841 | 18555 | 21827 | 17489 |
| Apache Maven (n = 21) | email social relation | 1833 | 2334 | 1831 | 1573 |
| Apache Solr (n = 13) | bugs reported | 2477 | 2375 | 3038 | 1750 |
| Apache Solr (n = 13) | bug comments | 4472 | 8477 | 4135 | 8625 |
| Apache Solr (n = 13) | bug social relation | 6804 | 8046 | 5218 | 5239 |
| Apache Solr (n = 13) | emails | 15601 | 31382 | 22757 | 33319 |
| Apache Solr (n = 13) | email social relation | 1472 | 3847 | 785 | 3304 |

Based on the Apache projects under consideration, we found that members, after attaining the role of a developer, tend to participate actively in technical discussions either on the mailing list or on the bug tracking system, which also increases their social relation networks, except in the case of the Apache Ant project. The bug reporting behavior of these members varies across the studied Apache projects; hence, it is difficult to say whether they report more bugs after attaining the role of a developer.

RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?

For each potential developer, we took the first time-stamp value, when he first appears on the project, and the second time-stamp value, when he actually made the first commit to the source control repository of the project. We extracted the contributions (i.e., bugs reported, comments, emails, etc.) made by a potential developer between those time-stamp values. Using the same time-stamp values, we computed the contributions made by other members who were also active during that specific time period. We then divided the contributions of a potential developer by the average contributions of all other members in order to determine the average rate of contributions made by a potential developer compared to other members of the project. We do not show each individual’s contribution rate due to privacy issues; hence, we have summarized the aggregated results of each project, as shown in Table 5.

Table 5: Average contribution rate of a potential developer compared to other members of the project before attaining the role of a developer.

| Project | Variable | Mean | St. dev. |
| Apache Ant (n = 13) | bugs reported | 1115 | 1325 |
| Apache Ant (n = 13) | bug comments | 1043 | 835 |
| Apache Ant (n = 13) | bug social relation | 842 | 665 |
| Apache Ant (n = 13) | emails | 1989 | 1625 |
| Apache Ant (n = 13) | email social relation | 1097 | 842 |
| Apache Lucene (n = 22) | bugs reported | 453 | 309 |
| Apache Lucene (n = 22) | bug comments | 408 | 289 |
| Apache Lucene (n = 22) | bug social relation | 318 | 173 |
| Apache Lucene (n = 22) | emails | 391 | 327 |
| Apache Lucene (n = 22) | email social relation | 623 | 442 |
| Apache Maven (n = 21) | bugs reported | 405 | 368 |
| Apache Maven (n = 21) | bug comments | 236 | 201 |
| Apache Maven (n = 21) | bug social relation | 244 | 198 |
| Apache Maven (n = 21) | emails | 405 | 576 |
| Apache Maven (n = 21) | email social relation | 431 | 334 |
| Apache Solr (n = 13) | bugs reported | 456 | 577 |
| Apache Solr (n = 13) | bug comments | 44 | 472 |
| Apache Solr (n = 13) | bug social relation | 252 | 215 |
| Apache Solr (n = 13) | emails | 105 | 107 |
| Apache Solr (n = 13) | email social relation | 271 | 195 |

For example, the average rate of bugs reported by an immigrant compared to other members who were active during the same time period is calculated as follows:

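$$\sum_{\mathrm{Immig}_i \in \mathrm{Immig}} \frac{\mathrm{Contribution}_{\mathrm{bugs}}(\mathrm{Immig}_i, \mathrm{commitDate}(\mathrm{Immig}_i))}{\sum_{c \in C} \mathrm{Contribution}_{\mathrm{bugs}}(c, \mathrm{commitDate}(\mathrm{Immig}_i)) / |c|}. \tag{13}$$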
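For a single potential developer, the ratio inside (13) amounts to dividing his/her own count by the mean count of the other members active in the same window. The sketch below illustrates this under stated assumptions: the function name is hypothetical, the counts are assumed to be already restricted to the window between first appearance and first commit, and a window with no other active members (or with no contributions from them) is reported as undefined.

```python
def relative_rate(counts, immigrant):
    """Contribution of `immigrant` relative to the mean of other members.

    counts: {member: number of contributions in the immigrant's window
             between first appearance and first commit}.
    Returns None when the mean of the other members is undefined or zero.
    """
    others = [n for member, n in counts.items() if member != immigrant]
    if not others:
        return None
    mean_others = sum(others) / len(others)
    if mean_others == 0:
        return None
    return counts.get(immigrant, 0) / mean_others
```

A value above 1 means the potential developer contributed more than the average other member in the same period; the per-project means and standard deviations are then taken over these individual ratios.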

The results in Table 5 can be understood as follows: the average rate of reporting bugs by a potential developer of the Apache Lucene project is 453 times the average rate of reporting bugs by all other members who were active during that time period. Although the average rate of contributions made by potential developers varies across the projects under consideration, each variable value shows that the contributions made by potential developers are higher than the average contributions of all other members. Hence, we can say that they were the most active contributors (i.e., technically skilled and of higher social status) before attaining developer status in the project.

Table 6: Appearance of a potential developer on different software repositories prior to attaining the role of a developer.

| Apache projects | Patch submission (no. of days) | Bugs reported (no. of days) | Emails (no. of days) |
| Apache Ant | 54415 | 55384 | 70653 |
| Apache Lucene | 45741 | 52632 | 70622 |
| Apache Maven | 38500 | 39631 | 70971 |
| Apache Solr | 26930 | 23792 | 45246 |

RQ-4: Does a potential developer follow the onion model in order to attain the role of a developer?

For each potential developer, we computed the time between his/her first commit date and his/her first activity on the different software repositories in terms of days. Table 6 presents the appearance of a potential developer in terms of the average number of days on different software repositories prior to attaining the role of a developer. The results show that all the potential developers started from the mailing list (cf. Table 6), because the email activity is the oldest for all Apache projects under consideration, followed by bug reporting/commenting; the latest activity before attaining the role of a developer was source-code patch submission (i.e., bug fixing). The results shown in Table 6 closely match the onion model (see Figure 1), where a member starts as a reader, followed by reporting bugs and later fixing bugs, before attaining the role of a developer.

Let $\mathrm{ActivityDate}(\mathrm{Immig}_i, \mathrm{ml})$ return the number of days between the first commit date of an immigrant (i.e., potential developer) on the source control repository and his first activity date on the mailing list of a project. The average number of days for an immigrant to appear on a mailing list prior to his/her first commit date is calculated as follows:

$$\frac{\sum_{\mathrm{Immig}_i \in \mathrm{Immig}} \mathrm{ActivityDate}(\mathrm{Immig}_i, \mathrm{ml})}{|\mathrm{Immig}|}. \tag{14}$$
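The averaging in (14), together with a per-member check of the onion-model ordering, can be sketched as follows. The record layout and all names are hypothetical, and the ordering test (email first, then bug report, then patch, then first commit) is one possible reading of the onion model for illustration only.

```python
from datetime import date


def days_before_commit(first_commit, first_activity):
    """Days between the first activity on a repository and the first commit."""
    return (first_commit - first_activity).days


def mean_days(immigrants, repository):
    """Average lead time in days for one repository, as in (14).

    immigrants: list of dicts with a "first_commit" date and, per
    repository key (e.g. "email", "bug", "patch"), the first activity date.
    Members without activity on that repository are skipped.
    """
    values = [
        days_before_commit(member["first_commit"], member[repository])
        for member in immigrants
        if repository in member
    ]
    return sum(values) / len(values) if values else None


def follows_onion_model(member):
    """True if email, bug report, patch, and commit appear in that order."""
    return (
        member["email"] <= member["bug"] <= member["patch"] <= member["first_commit"]
    )
```

Comparing `mean_days` across the three repository keys reproduces the kind of ordering reported in Table 6, while `follows_onion_model` makes the same check for an individual member rather than for the project average.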

The results (Table 6) also show that it took almost 2 years for a potential developer of the Apache Ant, Apache Lucene, and Apache Maven projects to attain the role of a developer. However, we cannot say that this is a standard time, as the time varies considerably from project to project, as can be seen in the results of the Apache Solr project compared to the other Apache projects under consideration.

6. Conclusion

In this paper, we have investigated in detail the patterns of contributions made by those members who have attained the role of a developer in the project. First, we investigated the significant role played by nondevelopers in the long-term survival of an OSS project and observed that nondevelopers, who do not have write-access to the source control repository, participate actively in reporting bugs and email discussions, thus contributing to the maturity of an OSS project. Our investigation of the contributions of potential developers before and after attaining the role of a developer showed that, after attaining a higher position in the community, developers tend to contribute more efficiently than nondevelopers of the project by actively participating in technical discussions along with fixing bugs. Moreover, we observed that the members who attained the role of a developer had more contributions than the average number of contributions made by other members of the project who were active during the same time period. This indicates that one of the important factors in attaining the role of a developer is the demonstration of technical skills and commitment to the project in an efficient manner.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] B. Shibuya and T. Tamai, “Understanding the process of participating in open source communities,” in Proceedings of the ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development (FLOSS ’09), pp. 1–6, IEEE Computer Society, Washington, DC, USA, May 2009.

[2] G. Von Krogh, S. Spaeth, and K. R. Lakhani, “Community, joining, and specialization in open source software innovation: a case study,” Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.

[3] C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu, “Open borders? Immigration in open source projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[4] M. Antikainen, T. Aaltonen, and J. Vaisanen, “The role of trust in OSS communities - case Linux Kernel community,” IFIP International Federation for Information Processing, vol. 234, pp. 223–228, 2007.

[5] Y. Ye, K. Nakakoji, Y. Yamamoto, and K. Kishida, “The co-evolution of systems and communities in Free and Open Source Software Development,” in Free/Open Source Software Development, pp. 59–82, Idea Group, Hershey, Pa, USA, 2004.

[6] http://ant.apache.org

[7] http://lucene.apache.org

[8] http://maven.apache.org

[9] http://lucene.apache.org/solr

[10] K. Crowston and J. Howison, “The social structure of free and open source software development,” in Proceedings of the International Conference on Information Systems, Seattle, Wash, USA, 2003.

[11] C. Jensen and W. Scacchi, “Modelling recruitment and role migration process in oosd projects,” in Proceedings of the 6th International Workshop on Software Process Simulation and Modeling, St. Louis, Mo, USA, 2005.

[12] A. Mockus, R. T. Fielding, and J. D. Herbsleb, “Two case studies of open source software development: apache and mozilla,” ACM Transactions on Software Engineering and Methodology, vol. 11, no. 3, pp. 309–346, 2002.

[13] T. Dinh-Trong and J. M. Bieman, “Open source software development: a case study of freeBSD,” in Proceedings of the 10th International Symposium on Software Metrics (METRICS ’04), pp. 96–105, Washington, DC, USA, September 2004.

[14] N. Ducheneaut, “Socialization in an open source software community: a socio-technical analysis,” Computer Supported Cooperative Work, vol. 14, no. 4, pp. 323–368, 2005.

[15] D. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman and Hall, 1984.

[16] I. Herraiz, G. Robles, J. J. Amor, T. Romera, and J. M. G. Barahona, “The processes of joining in global distributed software projects,” in Proceedings of the International Workshop on Global Software Development for the Practitioner (GSD ’06), pp. 27–33, 2006.

[17] W. Scacchi, J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani, “Understanding free/open source software development processes,” Software Process Improvement and Practice, vol. 11, no. 2, pp. 95–105, 2006.

[18] M. Fischer, M. Pinzger, and H. Gall, “Populating a release history database from version control and bug tracking systems,” in Proceedings of the International Conference on Software Maintenance (ICSM ’03), pp. 23–32, IEEE Computer Society, Washington, DC, USA, September 2003.

[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.

[20] A. Iqbal and M. Hausenblas, “Interlinking developer identities within and across open source projects: the linked data approach,” ISRN Software Engineering, vol. 2013, Article ID 584731, 12 pages, 2013.

[21] C. Bird, A. Gourley, and P. Devanbu, “Detecting patch submission and acceptance in OSS projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[22] A. Iqbal, O. Ureche, M. Hausenblas, and G. Tummarello, “LD2SD: linked data driven software development,” in Proceedings of the 21st International Conference on Software Engineering and Knowledge Engineering (SEKE ’09), pp. 240–245, Boston, Mass, USA, July 2009.



4 ISRN Software Engineering

Table 1: Apache projects data range.

Apache project       Date range
Apache Ant [6]       2000–2010
Apache Lucene [7]    2001–2010
Apache Maven [8]     2003–2010
Apache Solr [9]      2006–2010

a nondeveloper to the role of a developer. There is no static or standard timeline for a member to join the “core members” group of a project. The time period required to attain the role of a developer varies from project to project and also from member to member. Members often start contributing to a project by participating in the mailing list conversations to familiarize themselves with the project before contributing source-code patches. We will study the appearance of a potential developer on different software repositories by comparing the time-stamp values of their first activity on these repositories in order to validate whether he/she actually followed the onion model.

4. Data Extraction Process

In this section, we describe our data extraction methodology and the Apache projects selected for evaluation. We extracted data from 4 different Apache projects, as shown in Table 1. The range of data selected for each project is different because of the difference in the starting date of each project. We chose these Apache projects because their repositories (i.e., mailing list archives, bugs, subversion logs, etc.) are on the Web and available to be downloaded.

Most Apache projects have at least 3 different mailing lists: user, dev, and commits, but some have more than 3 mailing lists (e.g., announcements, notifications, etc.). For our study, we downloaded only the dev mailing list archives of each Apache project under consideration. The reason is that software developers communicate with each other more often on the dev mailing list than on any other mailing list. We developed our own scripts to extract information from the mailing list archives in a manner similar to previous research [12, 18]. For example, each email was processed to extract information such as sender name, email address, subject, date, message-id, and reference. The reference field contains message-id(s) if the email is a reply to previous thread(s). We used the reference field information to build a social network correspondence and computed social network measures [19] for all the members of a project.
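The mailing list processing described above can be illustrated with a minimal sketch. It is not the script used in the study: the function names are hypothetical, the header names (From, Subject, Date, Message-ID, References) are the standard email headers assumed to correspond to the fields listed in the text, and the degree count stands in for the unspecified social network measures of [19].

```python
from collections import defaultdict
from email import message_from_string
from email.utils import parseaddr


def parse_message(raw):
    """Extract the header fields named in the text from one raw email."""
    msg = message_from_string(raw)
    name, address = parseaddr(msg.get("From", ""))
    return {
        "name": name,
        "address": address.lower(),
        "subject": msg.get("Subject", ""),
        "date": msg.get("Date", ""),
        "message_id": (msg.get("Message-ID") or "").strip(),
        # References holds whitespace-separated message-ids of the thread.
        "references": (msg.get("References") or "").split(),
    }


def reply_network(messages):
    """Count replies from each sender to the senders of referenced emails."""
    sender_of = {m["message_id"]: m["address"] for m in messages}
    edges = defaultdict(int)
    for m in messages:
        for ref in m["references"]:
            target = sender_of.get(ref)
            if target and target != m["address"]:
                edges[(m["address"], target)] += 1
    return dict(edges)


def degree(edges):
    """Number of distinct correspondents per member (assumed measure)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {member: len(peers) for member, peers in neighbours.items()}
```

Replies to messages that are missing from the archive are simply skipped in this sketch, which is one possible design choice among several.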

We retrieved all the bugs (related to the Apache projects we considered for our study) which are publicly available through the Bugzilla and JIRA Web interface (https://issues.apache.org) and extracted the required information using our custom-written scripts. For further details on the information extracted from each email and bug, we refer the reader to [20]. We computed the social relation correspondence among members on the bug tracking system based on the bug comments exchanged among themselves. Bird et al. [21] detected the submission and acceptance of source-code patches through the mailing list, but we discovered that source-code patches were always attached to the respective bugs on the bug tracking system rather than sent to the mailing list. Prior research has indicated the importance of offering bug fixes and having them accepted as an influential factor in gaining developer status [14]; therefore, we also analyzed how many source-code patches were submitted by the members on the bug repositories.

Table 2: Dataset overview.

Project          Attachments   Bugs   Commits   Emails
Apache Ant       1345          5480   6025      84737
Apache Lucene    2865          3116   5790      59616
Apache Maven     1169          3902   8815      87611
Apache Solr      2146          2528   4288      25173

In order to get information from the source control repository, we wrote our own script (see [22] for details) and extracted the necessary information (i.e., log number, date of commit, author id, and files committed). We only considered those subversion logs in which a source-code file (i.e., “*.java”, because the Apache projects under consideration are Java-based) was committed. These subversion logs were further analyzed by our script in order to identify whether a commit fixes a bug by looking for patterns such as “PRxxx”, “MNG-xxx”, “SOLR-xxx”, and “LUCENE-xxx”, and for patch acceptance acknowledgements such as “patch provided by xxx” and “patch submitted by xxx”. On identification of such patterns, the bugs were queried to retrieve the source-code patches associated with them. This helps to identify source-code patches that were accepted by the “core members” group of the project. Further, it allows us to identify members who possess the strong technical skills required for attaining the role of a developer in the project. Table 2 gives an overview of the raw data we extracted from the different software repositories of the selected Apache projects based on the methodology described.

The values for Apache Ant show that 1345 source-code patches were found for a total of 5480 bugs reported on the bug tracking system. Further, 6025 subversion logs in which source-code files (i.e., *.java) were committed were extracted from the source control repository, and 84737 emails were extracted from the Apache Ant mailing list archives between 2000 and 2010.
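The commit-log analysis described above can be sketched as follows. The regular expressions are hypothetical: they are modelled on the example patterns quoted in the text, and the exact expressions used in the study are not given. In particular, the optional colon and whitespace after “PR” are an assumption.

```python
import re

# Hypothetical issue-key patterns modelled on the examples in the text
# ("MNG-xxx", "SOLR-xxx", "LUCENE-xxx", "PRxxx").
ISSUE_KEY = re.compile(r"\b(?:MNG|SOLR|LUCENE)-\d+\b|\bPR:?\s*\d+\b")

# Patch acknowledgements such as "patch provided by xxx".
PATCH_ACK = re.compile(
    r"patch\s+(?:provided|submitted)\s+by\s+(.+)", re.IGNORECASE
)


def touches_java(paths):
    """True when at least one committed file is a Java source file."""
    return any(path.endswith(".java") for path in paths)


def analyse_commit(message, paths):
    """Return referenced issues and the acknowledged patch author, if any.

    Commits that touch no *.java file are ignored, as in the text.
    """
    if not touches_java(paths):
        return None
    ack = PATCH_ACK.search(message)
    return {
        "issues": ISSUE_KEY.findall(message),
        "patch_author": ack.group(1).strip() if ack else None,
    }
```

The issue keys returned by such a function would then be used to look up the attachments of the corresponding bugs.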

5. Empirical Analysis

Before we address each of the research questions in detail, we present a high-level overview of the development activity of each Apache project under consideration over time in Figure 2. This gives an insight into how many contributions were made each year to a project and which were its peak development years. For each Apache project under consideration, we computed the number of contributions with respect to the number of people who made those contributions. For example, we computed the number of distinct bugs reported each year along with the number of distinct reporters who submitted those bug reports (cf. Figure 2). This makes it easier to answer simple questions such as how many bugs were reported and how many members were involved in the bug reporting process during the 2nd year of a project.

Figure 2: Development activity of Apache projects over the period of time. [Panels: Apache Ant, Apache Lucene, Apache Maven, Apache Solr. x-axis: year of the project (1st year, 2nd year, and so on); y-axis: counts, with ticks from 1 to 100000 in powers of ten; series: bugs, reporters, comments, commentors, attachments, attachers, emails, people.]

Preliminaries. Let $\mathcal{C}$ be the set of all members (i.e., developers, nondevelopers, etc.) who worked on the project:

\[ \mathcal{C} = \{c_1, c_2, \ldots, c_n\}. \tag{1} \]

Let $Y$ be the set of years of the project under consideration:

\[ Y = \{y_1, y_2, \ldots, y_n\}, \tag{2} \]

where $y_1$ is considered to be the first year of the project, $y_2$ is considered to be the second year of the project, and so on. Let $C$ be the set of members (i.e., developers, nondevelopers, etc.) who were active in a time period $y$:

\[ C = \{c_1, c_2, \ldots, c_n\}, \quad C \subseteq \mathcal{C}, \tag{3} \]

and let Immig be the set of immigrants (i.e., potential developers) who started as contributors and later became developers of the project. We classified as immigrants/potential developers only those members who had an activity in the project (i.e., bugs reported, bugs commented on, patches submitted, or emails sent) at least 4 months prior to their first commit on the source control repository:

\[ \mathit{Immig} = \{\mathit{Immig}_1, \mathit{Immig}_2, \ldots, \mathit{Immig}_n\}, \quad \mathit{Immig} \subseteq D_y \subseteq \mathcal{C}. \tag{4} \]

Let $D_y$ be the set of developers who have made commits before and during time period $y$, such that $D_y \subseteq \mathcal{C}$. Let the total number of bugs reported, bug comments made, and emails sent by a set of members in a time period $y$ be represented as follows:

\[ \mathrm{Contribution}_{\mathrm{bugs}}(C, y), \quad \mathrm{Contribution}_{\mathrm{comments}}(C, y), \quad \mathrm{Contribution}_{\mathrm{emails}}(C, y), \tag{5} \]

whereas the number of bugs reported, bug comments made, and emails sent by the developers in a given period of time $y$ is represented as

\[ \mathrm{Contribution}_{\mathrm{bugs}}(D_y, y), \quad \mathrm{Contribution}_{\mathrm{comments}}(D_y, y), \quad \mathrm{Contribution}_{\mathrm{emails}}(D_y, y), \tag{6} \]

respectively. Let $d$ be a single developer and let $\mathrm{commitDate}(d)$ return the first commit date of a developer. The yearly average contribution of a member before and after attaining the role of a developer is represented as

\[ \mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d)), \quad \mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d)), \tag{7} \]

and the total number of bugs reported, bug comments made, and emails sent by an immigrant before becoming a developer is represented as

\[ \mathrm{Contribution}_{\mathrm{bugs}}(\mathit{Immig}, \mathrm{commitDate}(\mathit{Immig})), \quad \mathrm{Contribution}_{\mathrm{comments}}(\mathit{Immig}, \mathrm{commitDate}(\mathit{Immig})), \quad \mathrm{Contribution}_{\mathrm{emails}}(\mathit{Immig}, \mathrm{commitDate}(\mathit{Immig})). \tag{8} \]
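The immigrant criterion stated in the preliminaries (some activity at least 4 months before the first commit) can be sketched as below. The function names are hypothetical, and counting whole calendar months is an assumption, since the text does not say how the 4 months were measured.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return months


def is_immigrant(first_activity, first_commit, min_months=4):
    """True when the first activity precedes the first commit by the threshold."""
    if first_activity is None or first_commit is None:
        return False
    return months_between(first_activity, first_commit) >= min_months


def immigrants(first_activity, first_commit):
    """Select immigrants from two {member: date} mappings."""
    return {
        member
        for member, commit in first_commit.items()
        if is_immigrant(first_activity.get(member), commit)
    }
```

Members whose first commit has no earlier recorded activity are treated as direct committers and excluded.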

RQ-1: What is the ratio of contributions made by the developers and nondevelopers to the project over the period of time?

In order to compute the contributions, we need to distinguish between the developers and nondevelopers of the project. As each subversion log has a time stamp associated with it, we queried all subversion logs from the start date of the project until the last commit date of the year under consideration. Based on this, we obtain a list of the IDs of all developers who have contributed to the source control repository up to that particular year. For each developer ID, we computed the contributions (i.e., bugs reported, comments on bugs, emails, etc.) made to the project on a yearly basis and added up the contributions made by all the developers for each year. Similarly, we computed the contributions made by nondevelopers on a yearly basis and added up all their contributions for each year. We then plotted the contributions made by the developers and nondevelopers of each Apache project under consideration for each year, as shown in Figure 3. Further, we computed the average rate of contributions made by the developers and nondevelopers, as well as the average number of developers and nondevelopers who made those contributions per year, as shown in Table 3.

Table 3: Average rate of contributions made by developers and nondevelopers.

                     Contributions           Participants
Variable             Dev        Non-dev      Dev      Non-dev
Apache Ant
  bugs reported      29.70      509.30       6.30     377.40
  bug comments       681.50     416.01       11.30    248.90
  emails             4773.91    5628.00      14.54    284.91
Apache Lucene
  bugs reported      141.11     156.77       8.66     91.33
  bug comments       507.11     241.33       10.77    95.33
  emails             3547.00    5745.66      13.66    173.22
Apache Maven
  bugs reported      204.87     268.25       11.00    165.00
  bug comments       569.28     334.28       13.85    195.14
  emails             4328.37    9505.75      13.37    260.87
Apache Solr
  bugs reported      201.10     265.60       10.20    116.00
  bug comments       819.03     432.80       11.00    166.60
  emails             1808.25    6047.75      9.00     112.50

For example, the number of bugs reported by the nondevelopers in a given period of time $y$ is computed as follows:

\[ \mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y), \tag{9} \]

and the average number of bugs reported by the developers and nondevelopers is computed as follows:

\[ \frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(D_y, y)}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y)}{|y|}. \tag{10} \]

Figure 3: Contributions made by developers and nondevelopers over the period of time. [Panels: Apache Ant, Apache Lucene, Apache Maven, Apache Solr. x-axis: year of the project (1st year, 2nd year, and so on); y-axis: counts, with ticks from 1 to 100000 in powers of ten; series: developer and nondeveloper bugs reported, bug comments, and emails.]

Let us assume that the number of nondevelopers who were active in a certain period of time is calculated by $\mathrm{nonDev}(C \setminus D_y, y)$; the average participation ratio of developers and nondevelopers is then computed as follows:

\[ \frac{\sum_{y \in Y} D_y}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{nonDev}(C \setminus D_y, y)}{|y|}. \tag{11} \]
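Equations (9) to (11) can be read as the following computation. This is a sketch under stated assumptions rather than the study's script: the record layouts (`commits` as (author, year) pairs and `activities` as (member, year, kind) triples) and the function names are hypothetical, member identities are assumed to be already unified across repositories, and the averages divide by the number of years considered.

```python
def developers_until(commits, year):
    """D_y: members with at least one commit in or before `year`."""
    return {author for author, commit_year in commits if commit_year <= year}


def yearly_split(commits, activities, kind, years):
    """Per year: (contributions, participants) for each of the two groups."""
    rows = {}
    for y in years:
        d_y = developers_until(commits, y)
        counts = {"dev": 0, "nondev": 0}
        people = {"dev": set(), "nondev": set()}
        for member, act_year, act_kind in activities:
            if act_year != y or act_kind != kind:
                continue
            # Members outside D_y play the role of C \ D_y for this year.
            group = "dev" if member in d_y else "nondev"
            counts[group] += 1
            people[group].add(member)
        rows[y] = {g: (counts[g], len(people[g])) for g in counts}
    return rows


def yearly_average(rows, group):
    """Average contributions and participants per year for one group."""
    n_years = len(rows)
    total = sum(rows[y][group][0] for y in rows)
    members = sum(rows[y][group][1] for y in rows)
    return total / n_years, members / n_years
```

Because D_y is cumulative, a member counts as a nondeveloper in the years before the first commit and as a developer from the year of the first commit onwards.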

The results in Table 3 show that nondevelopers are highly involved (i.e., contributing more than the developers) in reporting bugs and participating in discussions on the mailing list. One potential reason for this is the existence of a huge community surrounding these Apache projects. Discussing or commenting on a bug report requires technical knowledge about the project, which is why developers appear to be more active than nondevelopers in commenting on bug reports. Nevertheless, Table 3 (see also Figure 3) indicates that nondevelopers play a significant role in the projects under consideration and are hence one of the major factors in the long-term survival, success, and maturity of these projects over time.

The high ratio of nondeveloper involvement in the project (cf. Figure 3 and Table 3) allows the core members to select or vote for the potential developers to be invited to the “core members” group of the project.

RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?

We are only interested in those developers who did not start by committing directly to the project but instead followed the onion model (cf. Figure 1). In order to select those developers, we retrieved all developers from the subversion logs. Then, for each developer, we compared his first commit date on the project to his first appearance on any of the project repositories (i.e., first bug report date, bug comment date, attachment date, or email date) in order to compute the number of days or months before he started to contribute as a developer. Although there is no fixed or standard timeline for attaining the role of a developer in the project, we considered only those developers who had an activity (bug report, bug comment, attachment, or email) on the project at least 4 months prior to their first commit on the source control repository of the project.

For each of the selected developers, we queried the contributions made to the project before and after his first commit date. As the time taken to attain the role of a developer is different for each developer, we computed the average yearly rate of contributions made by a developer before and after attaining the role. We do not show each individual’s contributions to the project due to privacy issues; instead, we summarize the aggregated results for each project in Table 4. All the variables (except $n$) used in our study represent the contribution of potential developers on a yearly basis. For each Apache project, $n$ represents the number of potential developers who have attained the role of a developer. The average yearly rate of contributions by a potential developer before and after attaining the role of a developer is calculated as follows:

\[ \frac{\sum_{d \in \mathit{Immig}} \mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d))}{|d|}, \qquad \frac{\sum_{d \in \mathit{Immig}} \mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d))}{|d|}. \tag{12} \]
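A minimal sketch of the before/after computation behind (12) follows. The names are hypothetical, and two details are assumptions because the text does not specify them: a year is taken as 365 days, and the "after" window ends at the last date on which the member is observed.

```python
from datetime import date

DAYS_PER_YEAR = 365.0  # assumed year length for normalisation


def yearly_rate(event_dates, start, end):
    """Events in [start, end) divided by the interval length in years."""
    span_years = (end - start).days / DAYS_PER_YEAR
    if span_years <= 0:
        return 0.0
    count = sum(1 for d in event_dates if start <= d < end)
    return count / span_years


def before_after(event_dates, first_seen, first_commit, last_seen):
    """Yearly contribution rates before and after the first commit."""
    before = yearly_rate(event_dates, first_seen, first_commit)
    after = yearly_rate(event_dates, first_commit, last_seen)
    return before, after


def mean(values):
    """Arithmetic mean over all potential developers of a project."""
    values = list(values)
    return sum(values) / len(values)
```

Applying `before_after` to each potential developer and averaging the two components separately yields one "before" and one "after" value per variable and project.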

Table 4: Yearly average contribution ratio of a potential developer before and after attaining the role of a developer.

                           Mean               St. dev.
Variable                   Before    After    Before    After
Apache Ant (n = 13)
  bugs reported            1723      235      1798      365
  bug comments             4144      3902     3766      3483
  bug social relation      4465      3157     5141      2334
  emails                   23049     28034    259       22452
  email social relation    305       2031     3035      1355
Apache Lucene (n = 22)
  bugs reported            1412      2373     912       3274
  bug comments             2484      7321     1818      9185
  bug social relation      3357      3467     2375      3002
  emails                   130       39871    12487     50802
  email social relation    1959      2169     1653      1532
Apache Maven (n = 21)
  bugs reported            2840      2782     5327      7949
  bug comments             1642      2786     1207      5868
  bug social relation      2716      2408     2143      3085
  emails                   15841     18555    21827     17489
  email social relation    1833      2334     1831      1573
Apache Solr (n = 13)
  bugs reported            2477      2375     3038      1750
  bug comments             4472      8477     4135      8625
  bug social relation      6804      8046     5218      5239
  emails                   15601     31382    22757     33319
  email social relation    1472      3847     785       3304

Based on Table 4, we find that the bug-reporting pattern does not change much before and after attaining the role of a developer in the Apache Maven and Apache Solr projects. However, in Apache Ant it decreased sharply after attaining the role of a developer. As shown in Figure 3, only a few bugs are reported by the developers in contrast to nondevelopers in the Apache Ant project, which is also reflected by the value of the bugs reported variable for Apache Ant. After joining the “core members” group, members participate more often in technical discussions on the bug tracking system, which is reflected by the value of the bug comments variable. However, an increase in participation in technical discussions did not increase the social relation of the developers on the bug tracking system (i.e., bug social relation) in the case of the Apache Ant and Apache Maven projects. One reason could be that, after attaining the role of a developer, they focused only on certain modules of a project and hence were involved in discussions on bugs relevant to those modules with other developers of the project. There is also a marked increase in the number of emails sent by the members after attaining the role of a developer, which eventually increases the value of the email social relation variable.

Based on the Apache projects under consideration, we found that members, after attaining the role of a developer, tend to participate actively in technical discussions either on the mailing list or on the bug tracking system, which also increases their social relation networks, except in the case of the Apache Ant project. The bug-reporting behavior of these members varies across the studied Apache projects, and hence it is difficult to say whether they report more bugs after attaining the role of a developer.

RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?

For each potential developer, we took the time-stamp value of his first appearance on the project and the time-stamp value of his first commit to the source control repository of the project. We extracted the contributions (i.e., bugs reported, comments, emails, etc.) made by the potential developer between those time-stamp values. Using the same time-stamp values, we computed the contributions made by the other members who were also active during that specific time period. We then divided the contributions of the potential developer by the average contributions of all other members in order to determine the average rate of contributions made by a potential developer compared to other members of the project. We do not show each individual’s contribution rate due to privacy issues; instead, we summarize the aggregated results for each project in Table 5.

Table 5: Average contribution rate of a potential developer compared to other members of the project before attaining the role of a developer.

Variable                   Mean    St. dev.
Apache Ant (n = 13)
  bugs reported            1115    1325
  bug comments             1043    835
  bug social relation      842     665
  emails                   1989    1625
  email social relation    1097    842
Apache Lucene (n = 22)
  bugs reported            453     309
  bug comments             408     289
  bug social relation      318     173
  emails                   391     327
  email social relation    623     442
Apache Maven (n = 21)
  bugs reported            405     368
  bug comments             236     201
  bug social relation      244     198
  emails                   405     576
  email social relation    431     334
Apache Solr (n = 13)
  bugs reported            456     577
  bug comments             44      472
  bug social relation      252     215
  emails                   105     107
  email social relation    271     195

For example, the average rate of bugs reported by an immigrant compared to other members who were active during the same period is calculated as follows:

\[ \sum_{\mathit{Immig} \in \mathit{Immig}} \frac{\mathrm{Contribution}_{\mathrm{bugs}}(\mathit{Immig}, \mathrm{commitDate}(\mathit{Immig}))}{\sum_{c \in C} \mathrm{Contribution}_{\mathrm{bugs}}(c, \mathrm{commitDate}(\mathit{Immig})) / |c|}. \tag{13} \]
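The ratio in (13) can be illustrated for a single variable as follows. The names are hypothetical. Two choices are assumptions: the immigrant is excluded from the baseline (following the wording "all other members"), and the sample standard deviation is used for the spread, since the text does not say which estimator was applied.

```python
from statistics import mean, stdev


def relative_rate(counts, immigrant):
    """Immigrant's count divided by the mean count of the other members.

    `counts` maps every member active in the immigrant's pre-commit
    window to the number of contributions made in that window.
    """
    others = [n for member, n in counts.items() if member != immigrant]
    if not others:
        return None
    baseline = sum(others) / len(others)
    if baseline == 0:
        return None
    return counts.get(immigrant, 0) / baseline


def summarise(rates):
    """Mean and sample standard deviation over a project's immigrants."""
    rates = [r for r in rates if r is not None]
    return mean(rates), stdev(rates)
```

A value above 1 means that the potential developer contributed more than the average of the other members active in the same window.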

The results in Table 5 can be understood as follows: the average rate of reporting bugs by a potential developer of the Apache Lucene project is 453 times the average rate of reporting bugs by all other members who were active during that time period. Although the average rate of contributions made by potential developers varies across the projects under consideration, each variable value shows that the contributions made by potential developers exceed the average contributions of all other members. Hence, we can say that they were among the most active contributors (i.e., technically skilled and of higher social status) before attaining the developer status in the project.

RQ-4: Does a potential developer follow the onion model in order to attain the role of a developer?

For each potential developer, we computed the number of days between his/her first commit date and his/her first activity on each of the different software repositories. Table 6 presents the average number of days by which a potential developer appeared on the different software repositories prior to attaining the role of a developer. The results show that, on average, potential developers started on the mailing list (cf. Table 6), because the email activity is the oldest for all Apache projects under consideration. In Apache Ant, Apache Lucene, and Apache Maven, this is followed by bug reporting/commenting, and the latest activity before attaining the role of a developer is source-code patch submission (i.e., bug fixing); in Apache Solr, the first patch submission precedes the first bug report on average. Overall, the results shown in Table 6 closely match the onion model (see Figure 1), where a member starts as a reader, followed by reporting bugs and later fixing bugs before attaining the role of a developer.

Table 6: Appearance of a potential developer on different software repositories prior to attaining the role of a developer.

Apache project    Patch submission    Bugs reported    Emails
                  (no. of days)       (no. of days)    (no. of days)
Apache Ant        544.15              553.84           706.53
Apache Lucene     457.41              526.32           706.22
Apache Maven      385.00              396.31           709.71
Apache Solr       269.30              237.92           452.46

Let $\mathrm{ActivityDate}(\mathit{Immig}, ml)$ return the number of days between the first commit date of an immigrant (i.e., potential developer) on the source control repository and his first activity date on the mailing list of a project. The average number of days by which an immigrant appears on the mailing list prior to his/her first commit date is calculated as follows:

\[ \frac{\sum_{\mathit{Immig} \in \mathit{Immig}} \mathrm{ActivityDate}(\mathit{Immig}, ml)}{|\mathit{Immig}|}. \tag{14} \]
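The average in (14), together with the ordering check used to compare against the onion model, can be sketched as follows. The record layout (one dictionary per immigrant with a "commit" date and optional "email", "bug", and "patch" first-activity dates) and the function names are hypothetical.

```python
from datetime import date


def days_before_commit(first_activity, first_commit):
    """ActivityDate: days from the first activity to the first commit."""
    return (first_commit - first_activity).days


def average_lead(immigrants, repository):
    """Mean lead time in days over immigrants seen on `repository`."""
    leads = [
        days_before_commit(i[repository], i["commit"])
        for i in immigrants
        if i.get(repository) is not None
    ]
    return sum(leads) / len(leads) if leads else None


def follows_onion_order(averages):
    """True when email precedes bug reporting, which precedes patches."""
    return averages["email"] >= averages["bug"] >= averages["patch"]
```

A larger average means an earlier first appearance, so the onion-model ordering corresponds to emails having the largest value and patches the smallest.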

The results (Table 6) also show that it took almost 2 years for a potential developer of the Apache Ant, Apache Lucene, and Apache Maven projects to attain the role of a developer. However, we cannot say that this is a standard time, as it varies considerably from project to project, as can be seen in the results for the Apache Solr project compared to the other Apache projects under consideration.

6. Conclusion

In this paper, we have investigated in detail the patterns of contributions made by those members who have attained the role of a developer in a project. First, we investigated the role played by nondevelopers in the long-term survival of an OSS project and observed that nondevelopers, who do not have write access to the source control repository, participate actively in reporting bugs and in email discussions, thus contributing to the maturity of an OSS project. Our investigation of the contributions of potential developers before and after attaining the role of a developer showed that, after attaining a higher position in the community, developers tend to contribute more efficiently than nondevelopers of the project by actively participating in technical discussions along with fixing bugs. Moreover, we observed that the members who attained the role of a developer had made more contributions than the average number of contributions made by other members of the project who were active during the same time period. This suggests that one of the important factors in attaining the role of a developer is the demonstration of technical skills and commitment to the project in an efficient manner.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] B. Shibuya and T. Tamai, “Understanding the process of participating in open source communities,” in Proceedings of the ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development (FLOSS ’09), pp. 1–6, IEEE Computer Society, Washington, DC, USA, May 2009.

[2] G. Von Krogh, S. Spaeth, and K. R. Lakhani, “Community, joining, and specialization in open source software innovation: a case study,” Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.

[3] C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu, “Open borders? Immigration in open source projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[4] M. Antikainen, T. Aaltonen, and J. Vaisanen, “The role of trust in OSS communities - case Linux Kernel community,” IFIP International Federation for Information Processing, vol. 234, pp. 223–228, 2007.

[5] Y. Ye, K. Nakakoji, Y. Yamamoto, and K. Kishida, “The co-evolution of systems and communities in Free and Open Source Software Development,” in Free/Open Source Software Development, pp. 59–82, Idea Group, Hershey, Pa, USA, 2004.

[6] http://ant.apache.org

[7] http://lucene.apache.org

[8] http://maven.apache.org

[9] http://lucene.apache.org/solr

[10] K. Crowston and J. Howison, “The social structure of free and open source software development,” in Proceedings of the International Conference on Information Systems, Seattle, Wash, USA, 2003.

[11] C. Jensen and W. Scacchi, “Modelling recruitment and role migration process in oosd projects,” in Proceedings of the 6th International Workshop on Software Process Simulation and Modeling, St. Louis, Mo, USA, 2005.

[12] A. Mockus, R. T. Fielding, and J. D. Herbsleb, “Two case studies of open source software development: apache and mozilla,” ACM Transactions on Software Engineering and Methodology, vol. 11, no. 3, pp. 309–346, 2002.

[13] T. Dinh-Trong and J. M. Bieman, “Open source software development: a case study of freeBSD,” in Proceedings of the 10th International Symposium on Software Metrics (METRICS ’04), pp. 96–105, Washington, DC, USA, September 2004.

[14] N. Ducheneaut, “Socialization in an open source software community: a socio-technical analysis,” Computer Supported Cooperative Work, vol. 14, no. 4, pp. 323–368, 2005.

[15] D. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman and Hall, 1984.

[16] I. Herraiz, G. Robles, J. J. Amor, T. Romera, and J. M. G. Barahona, “The processes of joining in global distributed software projects,” in Proceedings of the International Workshop on Global Software Development for the Practitioner (GSD ’06), pp. 27–33, 2006.

[17] W. Scacchi, J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani, “Understanding free/open source software development processes,” Software Process Improvement and Practice, vol. 11, no. 2, pp. 95–105, 2006.

[18] M. Fischer, M. Pinzger, and H. Gall, “Populating a release history database from version control and bug tracking systems,” in Proceedings of the International Conference on Software Maintenance (ICSM ’03), pp. 23–32, IEEE Computer Society, Washington, DC, USA, September 2003.

[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.

[20] A. Iqbal and M. Hausenblas, “Interlinking developer identities within and across open source projects: the linked data approach,” ISRN Software Engineering, vol. 2013, Article ID 584731, 12 pages, 2013.

[21] C. Bird, A. Gourley, and P. Devanbu, “Detecting patch submission and acceptance in OSS projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[22] A. Iqbal, O. Ureche, M. Hausenblas, and G. Tummarello, “LD2SD: linked data driven software development,” in Proceedings of the 21st International Conference on Software Engineering and Knowledge Engineering (SEKE ’09), pp. 240–245, Boston, Mass, USA, July 2009.


ISRN Software Engineering 5

Figure 2: Development activity of Apache projects over the period of time. [Four panels: Apache Ant, Apache Lucene, Apache Maven, and Apache Solr. x-axis: year of the project; y-axis: counts (logarithmic scale, 1 to 100000). Series: bugs, reporters, comments, commentors, attachments, attachers, emails, people.]

of distinct bugs reported each year, along with the number of distinct reporters who submitted those bug reports (cf. Figure 2). This makes it easy to answer simple questions, such as how many bugs were reported and how many members were involved in the bug reporting process during the 2nd year of a project.

Preliminaries. Let $\mathcal{C}$ be the set of all members (i.e., developers, nondevelopers, etc.) who worked on the project:

$$\mathcal{C} = \{c_1, c_2, \ldots, c_n\}. \tag{1}$$

Let $Y$ be the set of years of the project under consideration:

$$Y = \{y_1, y_2, \ldots, y_n\}, \tag{2}$$

where $y_1$ is considered to be the first year of the project, $y_2$ the second year of the project, and so on. Let $C$ be the set of members (i.e., developers, nondevelopers, etc.) who were active in a time period $y$:

$$C = \{c_1, c_2, \ldots, c_n\}, \quad C \subseteq \mathcal{C}, \tag{3}$$

and let $\mathit{Immig}$ be the set of immigrants (i.e., potential developers) who started as contributors and later became developers of the project. We classified as immigrants/potential developers only those members who had some activity in the project (i.e., bugs reported, bugs commented on, patches submitted, or emails sent) at least 4 months prior to their first commit on the source control repository:

$$\mathit{Immig} = \{\mathit{Immig}_1, \mathit{Immig}_2, \ldots, \mathit{Immig}_n\}, \quad \mathit{Immig} \subseteq D_y \subseteq \mathcal{C}. \tag{4}$$
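The classification rule behind (4) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the tooling used in the study: the function name `find_immigrants`, the dictionary inputs, and the approximation of "4 months" as 120 days are all hypothetical choices made here.

```python
from datetime import date, timedelta

# Assumption: "at least 4 months" is approximated as 120 days.
MIN_LEAD = timedelta(days=120)


def find_immigrants(first_commit, first_activity, min_lead=MIN_LEAD):
    """Return the members whose earliest non-commit activity precedes
    their first commit by at least `min_lead`.

    first_commit:   dict member -> date of first commit (developers only)
    first_activity: dict member -> date of earliest bug report, bug comment,
                    patch submission, or email
    """
    immigrants = set()
    for member, commit_date in first_commit.items():
        activity_date = first_activity.get(member)
        if activity_date is None:
            continue  # no recorded activity outside the source repository
        if commit_date - activity_date >= min_lead:
            immigrants.add(member)
    return immigrants


# Hypothetical example data.
first_commit = {"alice": date(2006, 9, 1), "bob": date(2006, 3, 1)}
first_activity = {
    "alice": date(2006, 1, 10),  # active about 8 months before committing
    "bob": date(2006, 2, 20),    # active only 9 days before committing
    "carol": date(2005, 5, 5),   # never committed, so not a developer
}
immigrants = find_immigrants(first_commit, first_activity)
print(immigrants)  # {'alice'}
```

Only members that appear in `first_commit` can be classified, which mirrors the constraint that every immigrant is also a developer.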

Let $D_y$ be the set of developers who have made commits before and during time period $y$, such that $D_y \subseteq \mathcal{C}$. Let the total number of bugs reported, bug comments made, and emails sent by a set of members in a time period $y$ be represented as follows:

$$\begin{gathered} \mathrm{Contribution}_{\mathrm{bugs}}(C, y), \\ \mathrm{Contribution}_{\mathrm{comments}}(C, y), \\ \mathrm{Contribution}_{\mathrm{emails}}(C, y), \end{gathered} \tag{5}$$

whereas the number of bugs reported, bug comments made, and emails sent by the developers in a given period of time $y$ is represented as

$$\begin{gathered} \mathrm{Contribution}_{\mathrm{bugs}}(D_y, y), \\ \mathrm{Contribution}_{\mathrm{comments}}(D_y, y), \\ \mathrm{Contribution}_{\mathrm{emails}}(D_y, y), \end{gathered} \tag{6}$$

respectively. Let $d$ be a single developer and let $\mathrm{commitDate}(d)$ return the first commit date of that developer. The yearly average contribution of a member before and after attaining the role of a developer is represented as

$$\begin{gathered} \mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d)), \\ \mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d)), \end{gathered} \tag{7}$$

and the total number of bugs reported, bug comments made, and emails sent by an immigrant before becoming a developer is represented as

$$\begin{gathered} \mathrm{Contribution}_{\mathrm{bugs}}(\mathit{Immig}, \mathrm{commitDate}(\mathit{Immig})), \\ \mathrm{Contribution}_{\mathrm{comments}}(\mathit{Immig}, \mathrm{commitDate}(\mathit{Immig})), \\ \mathrm{Contribution}_{\mathrm{emails}}(\mathit{Immig}, \mathrm{commitDate}(\mathit{Immig})). \end{gathered} \tag{8}$$

RQ-1. What is the ratio of contributions made by the developers and nondevelopers to the project over the period of time?

In order to compute the contributions, we need to distinguish between the developers and nondevelopers of the project. As each Subversion log entry has a time stamp associated with it, we queried all Subversion logs from the start date of the project till the last commit date of the year under consideration. Based on this, we get a list of the IDs of all developers who have contributed to the source control repository till that particular year. For each developer ID, we computed the contributions (i.e., bugs reported, comments on bugs, emails, etc.) made to the project on a yearly basis and added up the contributions made by all the developers for each year. Similarly, we computed the contributions made by nondevelopers on a yearly basis and added up all their contributions for each year. We then plotted the contributions made by the developers and nondevelopers of each Apache project under consideration for each year (Figure 3). Further, we computed the average rate of contributions made by the developers and nondevelopers, as well as the average number of developers and nondevelopers who made those contributions per year (Table 3). For example, the number of bugs reported by the nondevelopers in a given period of time $y$ is computed as follows:

$$\mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y), \tag{9}$$

and the average number of bugs reported by the developers and nondevelopers is computed as follows:

$$\frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(D_y, y)}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y, y)}{|y|}. \tag{10}$$

Here $|y|$ denotes the number of years considered.

Table 3: Average rate of contributions made by developers and nondevelopers.

| Project | Variable | Contributions (dev) | Contributions (non dev) | Participants (dev) | Participants (non dev) |
|---|---|---|---|---|---|
| Apache Ant | bugs reported | 2970 | 50930 | 630 | 37740 |
| Apache Ant | bug comments | 68150 | 41601 | 1130 | 24890 |
| Apache Ant | emails | 477391 | 562800 | 1454 | 28491 |
| Apache Lucene | bugs reported | 14111 | 15677 | 866 | 9133 |
| Apache Lucene | bug comments | 50711 | 24133 | 1077 | 9533 |
| Apache Lucene | emails | 354700 | 574566 | 1366 | 17322 |
| Apache Maven | bugs reported | 20487 | 26825 | 1100 | 16500 |
| Apache Maven | bug comments | 56928 | 33428 | 1385 | 19514 |
| Apache Maven | emails | 432837 | 950575 | 1337 | 26087 |
| Apache Solr | bugs reported | 20110 | 26560 | 1020 | 11600 |
| Apache Solr | bug comments | 81903 | 43280 | 1100 | 16660 |
| Apache Solr | emails | 180825 | 604775 | 900 | 11250 |
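The yearly split between developers and nondevelopers in (9) and (10) can be illustrated with a short Python sketch. The data layout (tuples of author and year for commits; tuples of member, kind, and year for contributions) and the function names are assumptions introduced here for illustration; the study itself worked from Subversion logs, the bug tracker, and the mailing list archives.

```python
def developers_up_to(commits, year):
    """D_y: every author with at least one commit in or before `year`."""
    return {author for author, commit_year in commits if commit_year <= year}


def yearly_averages(events, commits, kind, years):
    """Average yearly number of `kind` events made by developers and by
    nondevelopers, where developer status is re-evaluated for each year."""
    dev_total = 0
    nondev_total = 0
    for year in years:
        d_y = developers_up_to(commits, year)
        for member, event_kind, event_year in events:
            if event_kind != kind or event_year != year:
                continue
            if member in d_y:
                dev_total += 1
            else:
                nondev_total += 1
    return dev_total / len(years), nondev_total / len(years)


# Hypothetical data: bob commits for the first time in year 2.
commits = [("alice", 1), ("bob", 2)]
events = [
    ("alice", "bugs", 1), ("bob", "bugs", 1), ("carol", "bugs", 1),
    ("bob", "bugs", 2), ("carol", "bugs", 2), ("carol", "bugs", 2),
    ("carol", "emails", 2),  # ignored when kind == "bugs"
]
dev_avg, nondev_avg = yearly_averages(events, commits, "bugs", [1, 2])
print(dev_avg, nondev_avg)  # 1.0 2.0
```

Note that bob's bug report in year 1 counts as a nondeveloper contribution, while his report in year 2 counts as a developer contribution, because the developer set grows with each year.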

Figure 3: Contributions made by developers and nondevelopers over the period of time. [Four panels: Apache Ant, Apache Lucene, Apache Maven, and Apache Solr. x-axis: year of the project; y-axis: counts (logarithmic scale). Series: bugs reported, bug comments, and emails, each shown separately for developers and nondevelopers.]

Let the number of nondevelopers who were active in a certain period of time be given by $\mathrm{nonDev}(C \setminus D_y, y)$; the average participation of developers and nondevelopers is then computed as follows:

$$\frac{\sum_{y \in Y} |D_y|}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{nonDev}(C \setminus D_y, y)}{|y|}. \tag{11}$$
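A sketch of the participation averages in (11), again in Python and again under assumptions: the function name `average_participants` and the input tuples are hypothetical, and developers are counted here only if they made at least one contribution of the given type in that year, which is one possible reading of the "Participants" columns of Table 3.

```python
def average_participants(events, commits, years):
    """Average number of distinct active developers and nondevelopers per year.

    events:  iterable of (member, year) for one contribution type
    commits: iterable of (author, year of commit)
    """
    dev_counts = []
    nondev_counts = []
    for year in years:
        d_y = {author for author, commit_year in commits if commit_year <= year}
        active = {member for member, event_year in events if event_year == year}
        dev_counts.append(len(active & d_y))     # active developers
        nondev_counts.append(len(active - d_y))  # active nondevelopers
    return sum(dev_counts) / len(years), sum(nondev_counts) / len(years)


# Hypothetical data; carol appears twice in year 1 but is counted once.
commits = [("alice", 1), ("bob", 2)]
events = [
    ("alice", 1), ("carol", 1), ("carol", 1), ("dave", 1),
    ("bob", 2), ("carol", 2),
]
avg_dev, avg_nondev = average_participants(events, commits, [1, 2])
print(avg_dev, avg_nondev)  # 1.0 1.5
```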

The results in Table 3 show that nondevelopers are highly involved (i.e., contributing more than the developers) in reporting bugs and participating in discussions on the mailing list. One potential reason for this is the existence of a huge community surrounding these Apache projects. Discussing or commenting on a bug report requires technical knowledge about the project, which is why developers appear to be more active than nondevelopers in commenting on the bug reports. Nevertheless, Table 3 (see also Figure 3) shows that nondevelopers play a significant role in the projects under consideration, and hence their involvement is one of the major factors in the long-term survival, success, and maturity of these projects over the period of time.

The high ratio of nondeveloper involvement in the project (cf. Figure 3 and Table 3) allows the core members to select or vote for the potential developers to be invited to the “core members” group of the project.

RQ-2. What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?

We are only interested in those developers who did not start contributing directly to the project but instead followed the onion model (cf. Figure 1). In order to select those developers, we retrieved all developers from the Subversion logs. Then, for each developer, we compared his first commit date on the project to his first appearance on any of the project repositories (i.e., first bug report, bug comment, attachment, or email date) in order to compute the number of days or months before he started to contribute as a developer. Although there is no fixed or standard timeline for attaining the role of a developer in the project, we considered only those developers who had an activity (bug report, bug comment, attachment, or email) on the project at least 4 months prior to their first commit on the source control repository of the project.

For each of those selected developers, we queried the contributions made to the project before and after the developer's first commit date. As the time period for attaining the role of a developer is different for each developer, we computed the average yearly rate of contributions made by a developer before and after attaining the role of a developer. We do not show each individual's contribution to the project due to privacy issues; hence, we have summarized the aggregated results of each project in Table 4. All the variables (except $n$) used in our study represent the contribution of potential developers on a yearly basis. For each Apache project, $n$ represents the number of potential developers who have attained the role of a developer. The average yearly rate of contributions by a potential developer before and after attaining the role of a developer is calculated as follows:

$$\frac{\sum_{d \in \mathit{Immig}} \mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d))}{|d|}, \qquad \frac{\sum_{d \in \mathit{Immig}} \mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d))}{|d|}. \tag{12}$$
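The before/after averages in (12) can be sketched as follows. This is an illustration under assumptions rather than the study's implementation: the "before" window is taken to start at the member's first recorded activity, the "after" window to end at a hypothetical end of the observation period (`end_date`), and a year is simplified to 365 days.

```python
from datetime import date
from statistics import mean

DAYS_PER_YEAR = 365.0  # simplifying assumption; leap years are ignored


def yearly_rate(count, start, end):
    """Number of events per year over the interval from `start` to `end`."""
    days = (end - start).days
    return count * DAYS_PER_YEAR / days if days > 0 else 0.0


def before_after(event_dates, commit_date, end_date):
    """Yearly contribution rate before and after the first commit."""
    before = [d for d in event_dates if d < commit_date]
    after = [d for d in event_dates if commit_date <= d < end_date]
    rate_before = yearly_rate(len(before), min(before), commit_date) if before else 0.0
    rate_after = yearly_rate(len(after), commit_date, end_date)
    return rate_before, rate_after


# Hypothetical bug-report dates for two immigrants.
end_date = date(2008, 1, 1)
members = {
    "alice": (date(2006, 1, 1), [
        date(2005, 1, 1), date(2005, 6, 1), date(2005, 9, 1), date(2005, 12, 1),
        date(2006, 3, 1), date(2007, 3, 1),
    ]),
    "bob": (date(2007, 1, 1), [
        date(2006, 1, 1), date(2006, 7, 1),
        date(2007, 2, 1), date(2007, 5, 1), date(2007, 8, 1),
    ]),
}
rates = {name: before_after(dates, commit, end_date)
         for name, (commit, dates) in members.items()}
mean_before = mean(r[0] for r in rates.values())
mean_after = mean(r[1] for r in rates.values())
print(rates["alice"], rates["bob"])  # (4.0, 1.0) (2.0, 3.0)
print(mean_before, mean_after)       # 3.0 2.0
```

Normalizing by the length of each window is what makes members with different waiting times comparable, which is the stated reason for using yearly rates.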

Based on Table 4, we find that the bug reporting pattern does not change much before and after attaining the role of a developer in the Apache Maven and Apache Solr projects. However, in Apache Ant it decreased tremendously after attaining the role of a developer. As shown in Figure 3, only a few bugs are reported by the developers, in contrast to nondevelopers, in the Apache Ant project, which is also reflected by the value of the bugs reported variable for that project. After joining the “core members” group, members participate more often in technical discussions on the bug tracking system, which is reflected by the value of the bug comments variable. However, an increase in participation in technical discussions did not increase the social relations of the developers on the bug tracking system (i.e., bug social relation) in the case of the Apache Ant and Apache Maven projects. One reason could be that, after attaining the role of a developer, they focused only on certain modules of a project and hence were involved in discussions on bugs relevant to those modules with other developers of the project. There is also a tremendous increase in the number of emails sent by the members after attaining the role of a developer, which eventually increases the value of the email social relation variable.

Table 4: Yearly average contribution ratio of a potential developer before and after attaining the role of a developer.

| Project | Variable | Mean (before) | Mean (after) | St. dev. (before) | St. dev. (after) |
|---|---|---|---|---|---|
| Apache Ant (n = 13) | bugs reported | 1723 | 235 | 1798 | 365 |
| Apache Ant (n = 13) | bug comments | 4144 | 3902 | 3766 | 3483 |
| Apache Ant (n = 13) | bug social relation | 4465 | 3157 | 5141 | 2334 |
| Apache Ant (n = 13) | emails | 23049 | 28034 | 259 | 22452 |
| Apache Ant (n = 13) | email social relation | 305 | 2031 | 3035 | 1355 |
| Apache Lucene (n = 22) | bugs reported | 1412 | 2373 | 912 | 3274 |
| Apache Lucene (n = 22) | bug comments | 2484 | 7321 | 1818 | 9185 |
| Apache Lucene (n = 22) | bug social relation | 3357 | 3467 | 2375 | 3002 |
| Apache Lucene (n = 22) | emails | 130 | 39871 | 12487 | 50802 |
| Apache Lucene (n = 22) | email social relation | 1959 | 2169 | 1653 | 1532 |
| Apache Maven (n = 21) | bugs reported | 2840 | 2782 | 5327 | 7949 |
| Apache Maven (n = 21) | bug comments | 1642 | 2786 | 1207 | 5868 |
| Apache Maven (n = 21) | bug social relation | 2716 | 2408 | 2143 | 3085 |
| Apache Maven (n = 21) | emails | 15841 | 18555 | 21827 | 17489 |
| Apache Maven (n = 21) | email social relation | 1833 | 2334 | 1831 | 1573 |
| Apache Solr (n = 13) | bugs reported | 2477 | 2375 | 3038 | 1750 |
| Apache Solr (n = 13) | bug comments | 4472 | 8477 | 4135 | 8625 |
| Apache Solr (n = 13) | bug social relation | 6804 | 8046 | 5218 | 5239 |
| Apache Solr (n = 13) | emails | 15601 | 31382 | 22757 | 33319 |
| Apache Solr (n = 13) | email social relation | 1472 | 3847 | 785 | 3304 |

Based on the Apache projects under consideration, we found that members, after attaining the role of a developer, tend to participate actively in technical discussions either on the mailing list or on the bug tracking system, which also extends their social relation networks, except in the case of the Apache Ant project. The bug reporting behavior of these members varies across the studied Apache projects, and hence it is difficult to say whether they report more bugs after attaining the role of a developer.

RQ-3. What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?

For each potential developer, we took the first time stamp, when he first appears on the project, and the second time stamp, when he actually made his first commit to the source control repository of the project. We extracted the contributions (i.e., bugs reported, comments, emails, etc.) made by a potential developer between those time stamps. Using the same time stamps, we computed the contributions made by other members who were also active during that specific time period. We then divided the contributions of a potential developer by the average contributions of all other members in order to determine the average rate of contributions made by a potential developer compared to other members of the project. We do not show each individual's contribution rate due to privacy issues; hence, we have summarized the aggregated results of each project in Table 5. For example, the average rate of bugs reported by an immigrant compared to other members who were active during the same time period is calculated as follows:

$$\sum_{\mathit{Immig}_i \in \mathit{Immig}} \frac{\mathrm{Contribution}_{\mathrm{bugs}}(\mathit{Immig}_i, \mathrm{commitDate}(\mathit{Immig}_i))}{\sum_{c \in C} \mathrm{Contribution}_{\mathrm{bugs}}(c, \mathrm{commitDate}(\mathit{Immig}_i)) \,/\, |c|}. \tag{13}$$

Table 5: Average contribution rate of a potential developer compared to other members of the project before attaining the role of a developer.

| Project | Variable | Mean | St. dev. |
|---|---|---|---|
| Apache Ant (n = 13) | bugs reported | 1115 | 1325 |
| Apache Ant (n = 13) | bug comments | 1043 | 835 |
| Apache Ant (n = 13) | bug social relation | 842 | 665 |
| Apache Ant (n = 13) | emails | 1989 | 1625 |
| Apache Ant (n = 13) | email social relation | 1097 | 842 |
| Apache Lucene (n = 22) | bugs reported | 453 | 309 |
| Apache Lucene (n = 22) | bug comments | 408 | 289 |
| Apache Lucene (n = 22) | bug social relation | 318 | 173 |
| Apache Lucene (n = 22) | emails | 391 | 327 |
| Apache Lucene (n = 22) | email social relation | 623 | 442 |
| Apache Maven (n = 21) | bugs reported | 405 | 368 |
| Apache Maven (n = 21) | bug comments | 236 | 201 |
| Apache Maven (n = 21) | bug social relation | 244 | 198 |
| Apache Maven (n = 21) | emails | 405 | 576 |
| Apache Maven (n = 21) | email social relation | 431 | 334 |
| Apache Solr (n = 13) | bugs reported | 456 | 577 |
| Apache Solr (n = 13) | bug comments | 44 | 472 |
| Apache Solr (n = 13) | bug social relation | 252 | 215 |
| Apache Solr (n = 13) | emails | 105 | 107 |
| Apache Solr (n = 13) | email social relation | 271 | 195 |
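The per-immigrant ratio inside the sum in (13) can be sketched in Python as below. The function name `relative_rate`, the use of integer day numbers as time stamps, and the half-open window from first appearance to first commit are assumptions made for this illustration.

```python
from collections import Counter


def relative_rate(events, immigrant, start, end):
    """Contributions of `immigrant` in the window [start, end) divided by the
    average contributions of all other members active in the same window.

    events: iterable of (member, day) for one contribution type
    Returns None when no other member was active in the window.
    """
    counts = Counter(member for member, day in events if start <= day < end)
    own = counts.pop(immigrant, 0)
    if not counts:
        return None
    others_average = sum(counts.values()) / len(counts)
    return own / others_average


# Hypothetical data: alice first appears on day 0 and first commits on day 100.
events = (
    [("alice", d) for d in (1, 10, 20, 30, 40, 50)]  # 6 events in the window
    + [("bob", d) for d in (5, 60)]                    # 2 events in the window
    + [("carol", d) for d in (2, 3, 70, 90, 150)]      # 4 in the window, 1 after
)
ratio = relative_rate(events, "alice", start=0, end=100)
print(ratio)  # 2.0
```

A ratio above 1 means the potential developer contributed more than the average of the other members active in the same window; the per-project means and standard deviations in Table 5 would then be taken over these per-immigrant ratios.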

The results in Table 5 can be understood as follows: the average rate of reporting bugs by a potential developer of the Apache Lucene project is 453 times the average rate of reporting bugs by all other members who were active during that time period. Although the average rate of contributions made by potential developers varies across the projects under consideration, each variable value shows that the contributions made by potential developers exceed the average contributions of all other members. Hence, we can say that they were the most active contributors (i.e., technically skilled and of higher social status) before attaining the developer status in the project.

RQ-4. Does a potential developer follow the onion model in order to attain the role of a developer?

For each potential developer, we computed the number of days between his/her first activity on each of the different software repositories and his/her first commit date. Table 6 presents the appearance of a potential developer, in terms of the average number of days, on different software repositories prior to attaining the role of a developer. The results show that, on average, potential developers started from the mailing list (cf. Table 6), because the email activity is the oldest for all Apache projects under consideration, followed by bug reporting/commenting; the latest activity before attaining the role of a developer was source-code patch submission (i.e., bug fixing). The results shown in Table 6 closely match the onion model (see Figure 1), where a member starts as a reader, followed by reporting bugs and later fixing bugs, before attaining the role of a developer.

Table 6: Appearance of a potential developer on different software repositories prior to attaining the role of a developer.

| Apache project | Patch submission (no. of days) | Bugs reported (no. of days) | Emails (no. of days) |
|---|---|---|---|
| Apache Ant | 544.15 | 553.84 | 706.53 |
| Apache Lucene | 457.41 | 526.32 | 706.22 |
| Apache Maven | 385.00 | 396.31 | 709.71 |
| Apache Solr | 269.30 | 237.92 | 452.46 |

Let $\mathrm{ActivityDate}(\mathit{Immig}_i, \mathrm{ml})$ return the number of days between the first activity date of an immigrant (i.e., potential developer) on the mailing list of a project and his/her first commit date on the source control repository. The average number of days for an immigrant to appear on a mailing list prior to his/her first commit date is calculated as follows:

$$\frac{\sum_{\mathit{Immig}_i \in \mathit{Immig}} \mathrm{ActivityDate}(\mathit{Immig}_i, \mathrm{ml})}{|\mathit{Immig}|}. \tag{14}$$
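The averaging in (14), applied to each repository in turn, can be sketched as follows. The record layout and the names `lead_days` and `average_lead_days` are hypothetical; the dates are constructed with `timedelta` so that the lead times are easy to read off.

```python
from datetime import date, timedelta
from statistics import mean


def lead_days(first_commit, first_activity):
    """Days between the first activity on a repository and the first commit."""
    return (first_commit - first_activity).days


def average_lead_days(immigrants, repository):
    """Average lead time, in days, over all immigrants for one repository."""
    return mean(
        lead_days(record["first_commit"], record[repository])
        for record in immigrants
    )


def make_record(commit, mail_lead, bug_lead, patch_lead):
    """Build a hypothetical immigrant record from lead times given in days."""
    return {
        "first_commit": commit,
        "mailing_list": commit - timedelta(days=mail_lead),
        "bug_tracker": commit - timedelta(days=bug_lead),
        "patches": commit - timedelta(days=patch_lead),
    }


immigrants = [
    make_record(date(2007, 1, 1), mail_lead=700, bug_lead=500, patch_lead=300),
    make_record(date(2008, 6, 1), mail_lead=500, bug_lead=300, patch_lead=100),
]
averages = {
    repo: average_lead_days(immigrants, repo)
    for repo in ("mailing_list", "bug_tracker", "patches")
}
print(averages)
```

In this made-up example the mailing list has the longest average lead time, followed by the bug tracker and then patch submissions, which is the ordering the onion model predicts.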

The results (Table 6) also show that it took almost 2 years for a potential developer of the Apache Ant, Apache Lucene, and Apache Maven projects to attain the role of a developer. However, we cannot say that this is the standard time, as it varies dramatically from project to project, as can be seen in the results for the Apache Solr project compared to the other Apache projects under consideration.

6. Conclusion

In this paper, we have investigated in detail the patterns of contributions made by those members who have attained the role of a developer in the project. First, we investigated the significant role played by nondevelopers in the long-term survival of an OSS project and observed that nondevelopers, who do not have write access to the source control repository, participate actively in reporting bugs and in email discussions, thus contributing to the maturity of an OSS project. Our investigation of the contributions of potential developers before and after attaining the role of a developer showed that, after attaining a higher position in the community, developers tend to contribute more efficiently than nondevelopers of the project by actively participating in technical discussions along with fixing bugs. Moreover, we observed that the members who attained the role of a developer had made more contributions than the average number of contributions made by other members of the project who were active during the same time period. This suggests that one of the important factors in attaining the role of a developer is the demonstration of technical skills and commitment to the project in an efficient manner.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] B. Shibuya and T. Tamai, “Understanding the process of participating in open source communities,” in Proceedings of the ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development (FLOSS '09), pp. 1–6, IEEE Computer Society, Washington, DC, USA, May 2009.

[2] G. Von Krogh, S. Spaeth, and K. R. Lakhani, “Community, joining, and specialization in open source software innovation: a case study,” Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.

[3] C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu, “Open borders? Immigration in open source projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR '07), Washington, DC, USA, May 2007.

[4] M. Antikainen, T. Aaltonen, and J. Vaisanen, “The role of trust in OSS communities: case Linux Kernel community,” IFIP International Federation for Information Processing, vol. 234, pp. 223–228, 2007.

[5] Y. Ye, K. Nakakoji, Y. Yamamoto, and K. Kishida, “The co-evolution of systems and communities in Free and Open Source Software Development,” in Free/Open Source Software Development, pp. 59–82, Idea Group, Hershey, Pa, USA, 2004.

[6] http://ant.apache.org/

[7] http://lucene.apache.org/

[8] http://maven.apache.org/

[9] http://lucene.apache.org/solr/

[10] K. Crowston and J. Howison, “The social structure of free and open source software development,” in Proceedings of the International Conference on Information Systems, Seattle, Wash, USA, 2003.

[11] C. Jensen and W. Scacchi, “Modelling recruitment and role migration process in oosd projects,” in Proceedings of the 6th International Workshop on Software Process Simulation and Modeling, St. Louis, Mo, USA, 2005.

[12] A. Mockus, R. T. Fielding, and J. D. Herbsleb, “Two case studies of open source software development: Apache and Mozilla,” ACM Transactions on Software Engineering and Methodology, vol. 11, no. 3, pp. 309–346, 2002.

[13] T. Dinh-Trong and J. M. Bieman, “Open source software development: a case study of FreeBSD,” in Proceedings of the 10th International Symposium on Software Metrics (METRICS '04), pp. 96–105, Washington, DC, USA, September 2004.

[14] N. Ducheneaut, “Socialization in an open source software community: a socio-technical analysis,” Computer Supported Cooperative Work, vol. 14, no. 4, pp. 323–368, 2005.

[15] D. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman and Hall, 1984.

[16] I. Herraiz, G. Robles, J. J. Amor, T. Romera, and J. M. G. Barahona, “The processes of joining in global distributed software projects,” in Proceedings of the International Workshop on Global Software Development for the Practitioner (GSD '06), pp. 27–33, 2006.

[17] W. Scacchi, J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani, “Understanding free/open source software development processes,” Software Process Improvement and Practice, vol. 11, no. 2, pp. 95–105, 2006.

[18] M. Fischer, M. Pinzger, and H. Gall, “Populating a release history database from version control and bug tracking systems,” in Proceedings of the International Conference on Software Maintenance (ICSM '03), pp. 23–32, IEEE Computer Society, Washington, DC, USA, September 2003.

[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.

[20] A. Iqbal and M. Hausenblas, “Interlinking developer identities within and across open source projects: the linked data approach,” ISRN Software Engineering, vol. 2013, Article ID 584731, 12 pages, 2013.

[21] C. Bird, A. Gourley, and P. Devanbu, “Detecting patch submission and acceptance in OSS projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR '07), Washington, DC, USA, May 2007.

[22] A. Iqbal, O. Ureche, M. Hausenblas, and G. Tummarello, “LD2SD: linked data driven software development,” in Proceedings of the 21st International Conference on Software Engineering and Knowledge Engineering (SEKE '09), pp. 240–245, Boston, Mass, USA, July 2009.


6 ISRN Software Engineering

of the project We classified only those members as immi-grantspotential developers who had an activity in the project(ie number of bugs reported number of bugs commentednumber of patches submitted or number of emails sent)at least 4 months prior to their first commit on the sourcecontrol repository

Immig = Immig1 Immig

2 Immig

119899

Immig sube 119863119910sube C

(4)

Let 119863119910be the set of developers who have made commits

before and during time period y such that 119863119910sube C Let the

total number of bugs reported and commented and emailssent by a set of members in a time period y be represented asfollows

Contributionbugs (119862 119910)

Contributioncomments (119862 119910)

Contributionemails (119862 119910)

(5)

whereas the number of bugs reported and commented andemails sent by the developers in a given period of time y isrepresented as

Contributionbugs (119863119910 119910)

Contributioncomments (119863119910 119910)

Contributionemails (119863119910 119910)

(6)

respectively Let d be a single developer and letcommitDate(d) return the first commit date of a developerThe yearly average contribution of a member before and afterattaining the role of a developer is represented as

Contributionbefore (119889 commitDate (119889))

Contributionafter (119889 commitDate (119889)) (7)

and the total number of bugs reported and commented andemails sent by an immigrant before becoming a developer isrepresented as

Contributionbugs (Immig commitDate (Immig))

Contributioncomments (Immig commitDate (Immig))

Contributionemails (Immig commitDate (Immig))

(8)

RQ-1 What is the ratio of contributions made by thedevelopers and nondevelopers to the project over the period oftime

In order to compute the contributions we need todistinguish between the developers and nondevelopers of theproject As each subversion log has a time stamp associatedto it we queried all subversion logs from the start dateof the project till the last commit date of the year underconsideration Based on this we get a list of all developersIDs who have contributed to the source control repository

Table 3 Average rate of contributions made by developers and nondevelopers

Variable Contributions ParticipantsDev Non dev Dev Non dev

Apache Antbugs reported 2970 50930 630 37740bug comments 68150 41601 1130 24890emails 477391 562800 1454 28491

Apache Lucenebugs reported 14111 15677 866 9133bug comments 50711 24133 1077 9533emails 354700 574566 1366 17322

Apache Mavenbugs reported 20487 26825 1100 16500bug comments 56928 33428 1385 19514emails 432837 950575 1337 26087

Apache Solrbugs reported 20110 26560 1020 11600bug comments 81903 43280 1100 16660emails 180825 604775 900 11250

till that particular year For each developer ID we computedthe contributions (ie bugs reported comments on bugsemails etc) made to the project on yearly basis and add upthe contributions made by all the developers for each yearSimilarly we computed the contributions made by nonde-velopers on yearly basis and add up all their contributionsfor each year Later we plotted the contributions made bythe developers and nondevelopers for each year in the formof a chart which is shown in Figure 3 Figure 3 shows thecomparison of contributions made by the developers andnondevelopers of each Apache project under considerationFurther we computed the average rate of contributions madeby the developers and nondevelopers as well as the averagenumber of developers and nondevelopers who made thosecontributions per year which is shown in Table 3 Forexample the number of bugs reported by the nondevelopersin a given period of time y is computed as follows

$$\mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y,\, y) \tag{9}$$

and the average number of bugs reported by the developers and nondevelopers is computed as follows:

$$\frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(D_y,\, y)}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{Contribution}_{\mathrm{bugs}}(C \setminus D_y,\, y)}{|y|} \tag{10}$$
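As an illustration only, the yearly split behind (9) and (10) can be sketched in Python. The record layout (pairs of member ID and project year) and the function names are hypothetical and are not taken from the tooling used in this study; the average divides by the number of years considered, which is one reading of the denominator in (10).

```python
def developers_until(commits, year):
    """IDs with at least one commit up to and including `year` (the set D_y)."""
    return {author for author, commit_year in commits if commit_year <= year}


def yearly_bug_counts(commits, bug_reports, years):
    """Return {year: (bugs reported by developers, bugs reported by nondevelopers)}."""
    counts = {}
    for year in years:
        devs = developers_until(commits, year)
        reporters = [member for member, report_year in bug_reports if report_year == year]
        dev_count = sum(1 for member in reporters if member in devs)
        counts[year] = (dev_count, len(reporters) - dev_count)
    return counts


def average_counts(counts):
    """Average the yearly counts over the number of years (assumed denominator)."""
    n_years = len(counts)
    dev_avg = sum(dev for dev, _ in counts.values()) / n_years
    nondev_avg = sum(nondev for _, nondev in counts.values()) / n_years
    return dev_avg, nondev_avg
```

Note that a member who commits for the first time in year 2 is counted as a nondeveloper for the bugs he reported in year 1, which mirrors the use of the cumulative developer list per year described above.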

Let $\mathrm{nonDev}(C \setminus D_y,\, y)$ denote the nondevelopers who were active in a given period of time $y$.

Figure 3: Contributions made by developers and nondevelopers over the period of time.
[Four panels: Apache Ant, Apache Lucene, Apache Maven, Apache Solr. x-axis: Year (1st year onward); y-axis: Counts (ticks from 1 to 100000). Series per panel: Developer and Nondeveloper (bugs reported), Developer and Nondeveloper (bug comments), Developer and Nondeveloper (emails).]

The average participation ratio of developers and nondevelopers is computed as follows:

$$\frac{\sum_{y \in Y} D_y}{|y|}, \qquad \frac{\sum_{y \in Y} \mathrm{nonDev}(C \setminus D_y,\, y)}{|y|} \tag{11}$$
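A minimal sketch of the participation average in (11), under the assumption that a participant is a member with at least one recorded activity in the given year. The input layout and the function name are hypothetical.

```python
def average_participants(commits, activity, years):
    """Average yearly number of distinct active developers and nondevelopers.

    commits:  iterable of (member, year) pairs from the source control log
    activity: iterable of (member, year) pairs for one contribution type
    """
    dev_total = 0
    nondev_total = 0
    for year in years:
        # Developers are all members with a commit up to and including this year.
        devs = {member for member, commit_year in commits if commit_year <= year}
        active = {member for member, activity_year in activity if activity_year == year}
        dev_total += len(active & devs)
        nondev_total += len(active - devs)
    return dev_total / len(years), nondev_total / len(years)
```

Using sets means that a member who is active several times in one year is counted once for that year.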

The results in Table 3 show that nondevelopers are highly involved (i.e., contributing more than the developers) in reporting bugs and participating in discussions on the mailing list. One potential reason for this is the existence of a huge community surrounding these Apache projects. Discussing or commenting on a bug report requires technical knowledge about the project, which is why developers appear to be more active in commenting on the bug reports than nondevelopers. Nevertheless, it is evident from Table 3 (see also Figure 3) that nondevelopers play a significant role in the projects under consideration, and hence their involvement is one of the major factors in the long-term survival, success, and maturity of these projects over the period of time.

The high ratio of nondeveloper involvement in the project (cf. Figure 3 and Table 3) allows the core members to select or vote for the potential developers to be invited to the “core members” group of the project.

RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?

We are only interested in those developers who did not start contributing directly to the project but instead followed the onion model (cf. Figure 1). In order to select those developers, we retrieved all developers from subversion logs. Later, for each developer, we compared his first commit date on the project to his first appearance on any of the project repositories (i.e., first bug reporting date, bug comment date, attachment, or email date) in order to compute the number of days or months before he started to contribute as a developer. Although there is no fixed or standard timeline for attaining the role of a developer in the project, we considered only those developers who had an activity (bug report, bug comment, attachment, or email) on the project at least 4 months prior to their first commit on the source control repository of the project.
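The selection step described above could be sketched as follows. This is a hedged illustration: the dictionaries of first commit dates and first activity dates are hypothetical inputs, and "4 months" is approximated as 120 days, which is an assumption rather than the exact threshold used in the study.

```python
from datetime import date

MIN_LEAD_DAYS = 4 * 30  # "at least 4 months", approximated as 120 days (assumption)


def select_potential_developers(first_commit, first_activity, min_lead_days=MIN_LEAD_DAYS):
    """Keep developers whose first activity precedes their first commit by the threshold.

    first_commit:   {developer: date of first commit}
    first_activity: {developer: earliest bug report, comment, attachment, or email date}
    Returns {developer: lead time in days}.
    """
    selected = {}
    for developer, commit_date in first_commit.items():
        activity_date = first_activity.get(developer)
        if activity_date is None:
            continue  # no recorded activity before committing: not an onion-model path
        lead_days = (commit_date - activity_date).days
        if lead_days >= min_lead_days:
            selected[developer] = lead_days
    return selected
```

Developers who start committing directly, or whose earlier activity is too recent, are dropped, so only members who plausibly moved from the outer layers of the onion model inward remain.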

For each of those selected developers, we queried the contributions made to the project before and after the first commit date of each developer. As the time period of attaining the role of a developer is different for each developer, we computed the average yearly rate of contributions made by a developer before and after attaining the role of a developer. We do not show each individual’s contribution to the project due to privacy issues; hence, we have summarized the aggregated results of each project, as shown in Table 4. All the variables (except $n$) used in our study represent the contribution of potential developers on a yearly basis. For each Apache project, $n$ represents the number of potential developers who have attained the role of a developer. The average yearly rate of contributions by a potential developer before and after attaining the role of a developer is calculated as follows:

$$\frac{\sum_{d \in \mathrm{Immig}} \mathrm{Contribution}_{\mathrm{before}}(d,\, \mathrm{commitDate}(d))}{|d|}, \qquad \frac{\sum_{d \in \mathrm{Immig}} \mathrm{Contribution}_{\mathrm{after}}(d,\, \mathrm{commitDate}(d))}{|d|} \tag{12}$$
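One way to read (12) is as a mean of per-member yearly rates. The sketch below follows that reading; the member records, the field names, and the 365-day year are assumptions made for illustration.

```python
from datetime import date

DAYS_PER_YEAR = 365  # simplification: leap days are ignored (assumption)


def yearly_rate(event_dates, start, end):
    """Number of events in [start, end) divided by the window length in years."""
    window_years = (end - start).days / DAYS_PER_YEAR
    if window_years <= 0:
        return 0.0
    return sum(1 for d in event_dates if start <= d < end) / window_years


def mean_before_after(members):
    """Average the per-member yearly rates before and after the first commit.

    Each member is a dict with the hypothetical keys
    "first_seen", "first_commit", "last_seen" (dates) and "events" (list of dates).
    """
    before = [yearly_rate(m["events"], m["first_seen"], m["first_commit"]) for m in members]
    after = [yearly_rate(m["events"], m["first_commit"], m["last_seen"]) for m in members]
    return sum(before) / len(members), sum(after) / len(members)
```

Normalizing each window by its own length keeps members with a short path to developer status comparable to members who took several years.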

Table 4: Yearly average contribution ratio of a potential developer before and after attaining the role of a developer.

                         Mean               St. dev.
Variable                 Before   After     Before   After
Apache Ant (n = 13)
  bugs reported          1723     235       1798     365
  bug comments           4144     3902      3766     3483
  bug social relation    4465     3157      5141     2334
  emails                 23049    28034     259      22452
  email social relation  305      2031      3035     1355
Apache Lucene (n = 22)
  bugs reported          1412     2373      912      3274
  bug comments           2484     7321      1818     9185
  bug social relation    3357     3467      2375     3002
  emails                 130      39871     12487    50802
  email social relation  1959     2169      1653     1532
Apache Maven (n = 21)
  bugs reported          2840     2782      5327     7949
  bug comments           1642     2786      1207     5868
  bug social relation    2716     2408      2143     3085
  emails                 15841    18555     21827    17489
  email social relation  1833     2334      1831     1573
Apache Solr (n = 13)
  bugs reported          2477     2375      3038     1750
  bug comments           4472     8477      4135     8625
  bug social relation    6804     8046      5218     5239
  emails                 15601    31382     22757    33319
  email social relation  1472     3847      785      3304

Based on Table 4, we find that the bug reporting pattern does not change much before and after attaining the role of a developer in the Apache Maven and Apache Solr projects. However, in Apache Ant, it decreased tremendously after attaining the role of a developer. As shown in Figure 3, there are only few bugs reported by the developers in contrast to nondevelopers in the Apache Ant project, which is also reflected by the value of the bugs reported variable for the Apache Ant project. Members, after joining the “core members” group, participate more often in technical discussions on the bug tracking system, which is reflected by the value of the bug comments variable. However, an increase in the participation in technical discussions did not increase the social relation of the developers on the bug tracking system (i.e., bug social relation) in the case of the Apache Ant and Apache Maven projects. One reason could be that, after attaining the role of a developer, they focused only on certain modules of a project and hence were involved in discussions on bugs relevant to those modules with other developers of the project. There is also a tremendous increase in the number of emails sent by the members after attaining the role of a developer, which eventually increases the value of the email social relation variable.

Based on the Apache projects under consideration, we found that members, after attaining the role of a developer, tend to participate actively in technical discussions either on the mailing list or on the bug tracking system, which also increases their social relation networks, except in the case of the Apache Ant project. The bug reporting behavior of these members varies across the studied Apache projects; hence, it is difficult to say whether they report more bugs after attaining the role of a developer.

RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?

For each potential developer, we took the first time-stamp value, where he first appears on the project, and the second time-stamp value, when he actually made the first commit to the source control repository of the project. We extracted the contributions (i.e., bugs reported, comments, emails, etc.) made by a potential developer between those time-stamp values. Using the same time-stamp values, we computed the contributions made by other members who were also active during that specific time period. Later, we divided the contributions of a potential developer by the average contributions of all other members in order to determine the average rate of contributions made by a potential developer compared to other members of the project. We do not show each individual’s contribution rate due to privacy issues; hence, we have summarized the aggregated results of each project, which are shown in Table 5.

Table 5: Average contribution rate of a potential developer compared to other members of the project before attaining the role of a developer.

Variable                 Mean     St. dev.
Apache Ant (n = 13)
  bugs reported          1115     1325
  bug comments           1043     835
  bug social relation    842      665
  emails                 1989     1625
  email social relation  1097     842
Apache Lucene (n = 22)
  bugs reported          453      309
  bug comments           408      289
  bug social relation    318      173
  emails                 391      327
  email social relation  623      442
Apache Maven (n = 21)
  bugs reported          405      368
  bug comments           236      201
  bug social relation    244      198
  emails                 405      576
  email social relation  431      334
Apache Solr (n = 13)
  bugs reported          456      577
  bug comments           44       472
  bug social relation    252      215
  emails                 105      107
  email social relation  271      195

For example, the average rate of bugs reported by an immigrant compared to other members who were active during the same time period is calculated as follows:

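$$\sum_{\mathrm{immig} \in \mathrm{Immig}} \frac{\mathrm{Contribution}_{\mathrm{bugs}}(\mathrm{immig},\, \mathrm{commitDate}(\mathrm{immig}))}{\sum_{c \in C} \mathrm{Contribution}_{\mathrm{bugs}}(c,\, \mathrm{commitDate}(\mathrm{immig})) \,/\, |c|} \tag{13}$$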
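The ratio in (13) can be illustrated with a small Python sketch. It assumes that, for each immigrant, the contribution counts of all members active in that immigrant's window are already available as a dictionary; the names and the final averaging over immigrants (matching the "Mean" column of Table 5) are assumptions.

```python
def relative_rate(counts, immigrant):
    """Immigrant's count divided by the mean count of the other active members.

    counts: {member: number of contributions in the immigrant's window}
    Returns None when there are no other members or their mean is zero.
    """
    others = [n for member, n in counts.items() if member != immigrant]
    if not others:
        return None
    mean_others = sum(others) / len(others)
    if mean_others == 0:
        return None
    return counts[immigrant] / mean_others


def mean_relative_rate(windows):
    """Average the relative rates over all immigrants.

    windows: {immigrant: counts dictionary for that immigrant's window}
    """
    rates = [relative_rate(counts, immigrant) for immigrant, counts in windows.items()]
    rates = [rate for rate in rates if rate is not None]
    return sum(rates) / len(rates)
```

A value above 1 means that the potential developer contributed more than the average member who was active in the same window.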

The results in Table 5 can be understood as follows: the average rate of reporting bugs by a potential developer of the Apache Lucene project is 453 times the average rate of reporting bugs by all other members who were active during that time period. Although the average rate of contributions made by potential developers varies across the projects under consideration, each variable value shows that the contributions made by potential developers exceed the average contributions of all other members. Hence, we can say that they were the most active contributors (i.e., technically skilled and of higher social status) before attaining the developer status in the project.

Table 6: Appearance of a potential developer on different software repositories prior to attaining the role of a developer.

Apache project    Patch submission   Bugs reported    Emails
                  (no. of days)      (no. of days)    (no. of days)
Apache Ant        544.15             553.84           706.53
Apache Lucene     457.41             526.32           706.22
Apache Maven      385.00             396.31           709.71
Apache Solr       269.30             237.92           452.46

RQ-4: Does a potential developer follow the onion model in order to attain the role of a developer?

For each potential developer, we computed the number of days between his/her first commit date and his/her first activity on each of the different software repositories. Table 6 presents the appearance of a potential developer, in terms of average number of days, on different software repositories prior to attaining the role of a developer. The results show that all the potential developers started from the mailing list (cf. Table 6), because the email activity is the oldest for all Apache projects under consideration, followed by bug reporting/commenting; the latest activity before attaining the role of a developer was source-code patch submission (i.e., bug fixing). The results shown in Table 6 closely match the onion model (see Figure 1), where a member starts as a reader, followed by reporting bugs and later fixing bugs before attaining the role of a developer.

Let $\mathrm{ActivityDate}(\mathrm{immig},\, \mathrm{ml})$ return the number of days between the first commit date of an immigrant (i.e., potential developer) on the source control repository and his first activity date on the mailing list of a project. The average number of days for an immigrant to appear on a mailing list prior to his/her first commit date is calculated as follows:

$$\frac{\sum_{\mathrm{immig} \in \mathrm{Immig}} \mathrm{ActivityDate}(\mathrm{immig},\, \mathrm{ml})}{|\mathrm{Immig}|} \tag{14}$$
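A minimal sketch of (14), extended to several repositories so that their order can be compared with the onion model. The input dictionaries and function names are hypothetical, and each repository is assumed to contain at least one immigrant with a known first commit date.

```python
from datetime import date


def average_lead_days(first_commit, first_activity):
    """Mean number of days between first activity on one repository and first commit."""
    leads = [
        (first_commit[member] - seen).days
        for member, seen in first_activity.items()
        if member in first_commit
    ]
    return sum(leads) / len(leads)


def repositories_by_age(first_commit, activity_by_repo):
    """Order repositories from the oldest to the most recent average first activity."""
    averages = {
        repo: average_lead_days(first_commit, seen)
        for repo, seen in activity_by_repo.items()
    }
    ordered = sorted(averages, key=averages.get, reverse=True)
    return ordered, averages
```

If the onion model holds for a project, the ordering returned for that project would be expected to start with the mailing list and end with patch submission.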

The results (Table 6) also show that it took almost 2 years for a potential developer of the Apache Ant, Apache Lucene, and Apache Maven projects to attain the role of a developer. However, we cannot say that this is the standard time, as the time varies dramatically from project to project, as can be seen in the results of the Apache Solr project compared to the other Apache projects under consideration.

6. Conclusion

In this paper, we have investigated in detail the patterns of contributions made by those members who have attained the role of a developer in the project. First, we investigated the significant role played by nondevelopers in the long-term survival of an OSS project and observed that nondevelopers, who do not have write access to the source control repository, participate actively in reporting bugs and email discussions, thus contributing to the maturity of an OSS project. Our investigation, based on the contribution of potential developers before and after attaining the role of a developer, showed that, after attaining a higher position in the community, developers tend to contribute more efficiently than nondevelopers of the project by actively participating in technical discussions along with fixing bugs. Moreover, we observed that the members who attained the role of a developer had more contributions than the average number of contributions made by other members of the project who were active during the same time period. This makes it obvious that one of the important factors in attaining the role of a developer is the demonstration of technical skills and commitment to the project in an efficient manner.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] B. Shibuya and T. Tamai, “Understanding the process of participating in open source communities,” in Proceedings of the ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development (FLOSS ’09), pp. 1–6, IEEE Computer Society, Washington, DC, USA, May 2009.

[2] G. Von Krogh, S. Spaeth, and K. R. Lakhani, “Community, joining, and specialization in open source software innovation: a case study,” Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.

[3] C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu, “Open borders? Immigration in open source projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[4] M. Antikainen, T. Aaltonen, and J. Vaisanen, “The role of trust in OSS communities - case Linux Kernel community,” IFIP International Federation for Information Processing, vol. 234, pp. 223–228, 2007.

[5] Y. Ye, K. Nakakoji, Y. Yamamoto, and K. Kishida, “The co-evolution of systems and communities in Free and Open Source Software Development,” in Free/Open Source Software Development, pp. 59–82, Idea Group, Hershey, Pa, USA, 2004.

[6] http://ant.apache.org

[7] http://lucene.apache.org

[8] http://maven.apache.org

[9] http://lucene.apache.org/solr

[10] K. Crowston and J. Howison, “The social structure of free and open source software development,” in Proceedings of the International Conference on Information Systems, Seattle, Wash, USA, 2003.

[11] C. Jensen and W. Scacchi, “Modelling recruitment and role migration process in oosd projects,” in Proceedings of the 6th International Workshop on Software Process Simulation and Modeling, St. Louis, Mo, USA, 2005.

[12] A. Mockus, R. T. Fielding, and J. D. Herbsleb, “Two case studies of open source software development: apache and mozilla,” ACM Transactions on Software Engineering and Methodology, vol. 11, no. 3, pp. 309–346, 2002.

[13] T. Dinh-Trong and J. M. Bieman, “Open source software development: a case study of freeBSD,” in Proceedings of the 10th International Symposium on Software Metrics (METRICS ’04), pp. 96–105, Washington, DC, USA, September 2004.

[14] N. Ducheneaut, “Socialization in an open source software community: a socio-technical analysis,” Computer Supported Cooperative Work, vol. 14, no. 4, pp. 323–368, 2005.

[15] D. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman and Hall, 1984.

[16] I. Herraiz, G. Robles, J. J. Amor, T. Romera, and J. M. G. Barahona, “The processes of joining in global distributed software projects,” in Proceedings of the International Workshop on Global Software Development for the Practitioner (GSD ’06), pp. 27–33, 2006.

[17] W. Scacchi, J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani, “Understanding free/open source software development processes,” Software Process Improvement and Practice, vol. 11, no. 2, pp. 95–105, 2006.

[18] M. Fischer, M. Pinzger, and H. Gall, “Populating a release history database from version control and bug tracking systems,” in Proceedings of the International Conference on Software Maintenance (ICSM ’03), pp. 23–32, IEEE Computer Society, Washington, DC, USA, September 2003.

[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.

[20] A. Iqbal and M. Hausenblas, “Interlinking developer identities within and across open source projects: the linked data approach,” ISRN Software Engineering, vol. 2013, Article ID 584731, 12 pages, 2013.

[21] C. Bird, A. Gourley, and P. Devanbu, “Detecting patch submission and acceptance in OSS projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[22] A. Iqbal, O. Ureche, M. Hausenblas, and G. Tummarello, “LD2SD: linked data driven software development,” in Proceedings of the 21st International Conference on Software Engineering and Knowledge Engineering (SEKE ’09), pp. 240–245, Boston, Mass, USA, July 2009.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article Understanding Contributor to Developer Turnover ...downloads.hindawi.com/archive/2014/535724.pdf · Research Article Understanding Contributor to Developer Turnover

ISRN Software Engineering 7

Developer (bugs reported) Nondeveloper (bugs reported)

Nondeveloper (bug comments)Developer (bug comments)

Developer (emails)Nondeveloper (emails)

Developer (bugs reported) Nondeveloper (bugs reported)

Nondeveloper (bug comments)Developer (bug comments)

Developer (emails)Nondeveloper (emails)

Developer (bugs reported) Nondeveloper (bugs reported)

Nondeveloper (bug comments)Developer (bug comments)

Developer (emails)Nondeveloper (emails)

Developer (bugs reported) Nondeveloper (bugs reported)

Nondeveloper (bug comments)Developer (bug comments)

Developer (emails)Nondeveloper (emails)

10000

1000

100

10

Cou

nts

100000

10000

1000

100

10

Cou

nts

Year

1st

year

2nd

yea

r

3rd

yea

r

4th

yea

r

5th

yea

r

6th

yea

r

7th

yea

r

8th

yea

r

9th

yea

rYear

1st

year

2nd

yea

r

3rd

yea

r

4th

yea

r

5th

yea

r

6th

yea

r

7th

yea

r

8th

yea

r

Year

1st

year

2nd

yea

r

3rd

yea

r

4th

yea

r

5th

yea

r

Year

1st

year

2nd

yea

r

3rd

yea

r

4th

yea

r

5th

yea

r

6th

yea

r

7th

yea

r

8th

yea

r

9th

yea

r

10

th y

ear

Apache ant Apache lucene

Apache maven Apache solr

1

100000

10000

1000

100

10

Cou

nts

1

100000

10000

1000

100

10

Cou

nts

1

Figure 3 Contributions made by developers and nondevelopers over the period of time

the average participation ratio of developers and nondevel-opers is computed as follows

sum119910isinY

119863119910

10038161003816100381610038161199101003816100381610038161003816 sum

119910isinY

nonDev (C119863119910 119910)

10038161003816100381610038161199101003816100381610038161003816

(11)

The results in Table 3 show that nondevelopers arehighly involved (ie contributing more than the developers)

in reporting bugs and participating in discussions on themailing list One potential reason for this is the existence ofa huge community surrounding theseApache projects Giventhat discussingcommenting on a bug report requires tech-nical knowledge about the project which is why developersappear to be more active in commenting on the bug reportsthan nondevelopers it is quite obvious from Table 3 (alsosee Figure 3) that nondevelopers play a significant role in

8 ISRN Software Engineering

the projects under consideration and hence it is one of themajor factors in the long-term survival success and maturityof these projects over the period of time

The high ratio of nondevelopers involvement in theproject (cf Figure 3 and Table 3) allows the core membersto select or vote for the potential developers to be invited tothe ldquocore membersrdquo group of the project

RQ-2 What is the ratio of contributions made by apotential developer before and after attaining the role of adeveloper

We are only interested in those developers who did notstart contributing directly to the project but instead followsthe onion model (cf Figure 1) In order to select thosedevelopers we retrieved all developers from subversion logsLater for each developer we compared his first commit dateon the project to his first appearance on any of the projectrepositories (ie first bug reporting date bug comment dateattachment or email date) in order to compute the number ofdays or months before he started to contribute as a developerAlthough there is no fixed or standard timeline for attainingthe role of a developer in the project we considered only thosedevelopers who had an activity (bug report bug commentattachment or email) on the project at least 4 months priorto their first commit on the source control repository of theproject

For each of those selected developers we queried thecontributions made to the project before and after the firstcommit date of each developer As the time period of attainingthe role of a developer is different for each developer wecomputed the average yearly rate of contributions made bya developer before and after attaining the role of a developerWe do not show each individualrsquos contribution to the projectdue to the privacy issues and hence we have summarizedthe aggregated results of each project as shown in Table 4All the variables (except 119899) used in our study representthe contribution of potential developers on yearly basis Foreach Apache project 119899 represents the number of potentialdevelopers who have attained the role of a developer Theaverage yearly rate of contributions by a potential developerbefore and after attaining the role of a developer is calculatedas follows

sum119889isinImmig

Contributionbefore (119889 commitDate (119889))|119889|

sum119889isinImmig

Contributionafter (119889 commitDate (119889))|119889|

(12)

Based on Table 4 we find that the bugs reporting patterndoes not change much before and after attaining the roleof a developer in Apache Maven and Apache Solr projectsHowever in Apache Ant it decreased tremendously afterattaining the role of a developer As shown in Figure 3there are only few bugs reported by the developers incontrast to nondevelopers in the Apache Ant project whichis also reflected by the value of bugs reported variablefor the Apache Ant project Members after joining theldquocore membersrdquo group participate more often in technicaldiscussions on the bug tracking system which is reflected by

Table 4 Yearly average contribution ratio of a potential developerbefore and after attaining the role of a developer

Variable Mean St DevBefore After Before AfterApache Ant (n = 13)

bugs reported 1723 235 1798 365bug comments 4144 3902 3766 3483bug social relation 4465 3157 5141 2334emails 23049 28034 259 22452email social relation 305 2031 3035 1355

Apache Lucene (n = 22)bugs reported 1412 2373 912 3274bug comments 2484 7321 1818 9185bug social relation 3357 3467 2375 3002emails 130 39871 12487 50802email social relation 1959 2169 1653 1532

Apache Maven (n = 21)bugs reported 2840 2782 5327 7949bug comments 1642 2786 1207 5868bug social relation 2716 2408 2143 3085emails 15841 18555 21827 17489email social relation 1833 2334 1831 1573

Apache Solr (n = 13)bugs reported 2477 2375 3038 1750bug comments 4472 8477 4135 8625bug social relation 6804 8046 5218 5239emails 15601 31382 22757 33319email social relation 1472 3847 785 3304

the value of bug comment variable However an increase inthe participation in technical discussions did not increase thesocial relation of the developers on the bug tracking system(ie bug social relation) in the case of Apache Antand Apache Maven project One reason could be that afterattaining the role of a developer they focused only on certainmodules of a project and hence involved in discussionson bugs relevant to those modules with other developersof the project There is also a tremendous increase in thenumber of emails sent by the members after attaining therole of a developer which eventually increases the value ofemail social relation variable

Based on the Apache projects under consideration wefound that members after attaining the role of a developertend to participate actively in technical discussions either onthe mailing list or bug tracking system which also increasestheir social relation networks except the case of Apache AntprojectThe bugs reporting behavior of these members variesin our studied Apache projects and hence it is difficult to sayif they reportmore bugs after attaining the role of a developer

RQ-3 What is the average rate of contributions made by apotential developer comparing to other members of the projectbefore attaining the role of a developer

For each potential developer we took the first time-stampvalue where he first appears on the project and the secondtime-stamp value when he actually made the first commit

ISRN Software Engineering 9

Table 5 Average contribution rate of a potential developer compar-ing to other members of the project before attaining the role of adeveloper

Variable Mean St devApache Ant (n = 13)

bugs reported 1115 1325bug comments 1043 835bug social relation 842 665emails 1989 1625email social relation 1097 842

Apache Lucene (n = 22)bugs reported 453 309bug comments 408 289bug social relation 318 173emails 391 327email social relation 623 442

Apache Maven (n = 21)bugs reported 405 368bug comments 236 201bug social relation 244 198emails 405 576email social relation 431 334

Apache Solr (n = 13)bugs reported 456 577bug comments 44 472bug social relation 252 215emails 105 107email social relation 271 195

to the source control repository of the project We extractedthe contributions (ie bugs reported comments emails etc)made by a potential developer between those time-stampvalues Using the same time-stamp values we computedthe contributions made by other members who were alsoactive during that specific time period Later we dividedthe contributions of a potential developer by the averagecontributions of all other members in order to determine theaverage rate of contributions made by a potential developercomparing to other members of the project We do not showeach individualrsquos contribution rate due to the privacy issuesand hence we have summarized the aggregated results of eachproject which is shown in Table 5 For example the averagerate of bugs reported by an immigrant comparing to othermembers who were active during the same time-stamp iscalculated as follows

sumImmigisinImmig

Contributionbugs (Immig commitDate (Immig))sum119888isinC Contributionbugs (119888 commitDate (Immig)) |119888|

(13)

The results in Table 5 can be understand as follows theaverage rate of reporting bugs by a potential developer ofApache Lucene project is 453 times the average rate ofreporting bugs by all other members who were active duringthat time period Although the average rate of contributions

Table 6 Appearance of a potential developer on different softwarerepositories prior to attaining the role of a developer

Apache projectsPatch

submission(no of days)

Bugsreported

(no of days)

Emails(no of days)

Apache Ant 54415 55384 70653Apache Lucene 45741 52632 70622Apache Maven 38500 39631 70971Apache Solr 26930 23792 45246

made by potential developers varies in all the projects underconsideration it is quite obvious from each variable valuethat the contributions made by potential developers are morethan the average contributions of all other members Hencewe can say that they were the most active contributors (ietechnically skilled and higher social status) before attainingthe developer status in the project

RQ-4 Does a potential developer follow onion model inorder to attain the role of a developer

For each potential developer we computed the time-stamp value between hisher first commit date to hisherfirst activity on the different software repositories in terms ofdays Table 6 presents the appearance of a potential developerin terms of average number of days on different softwarerepositories prior to attaining the role of a developer Theresult shows that all the potential developers started fromthe mailing list (cf Table 6) because the email activity is theoldest for all Apache projects under consideration followedby the bugs reportingcommenting and the latest activitybefore attaining the role of a developer was the source-codepatch submissions (ie bugs fixing) The results shown inTable 6 closely match to the onionmodel (see Figure 1) wherea member starts as a reader followed by reporting bugs andlater fixing bugs before attaining the role of a developer

Let ActivityDate(ImmigmL) return the number ofdays between the first commit date of an immigrant (iepotential developer) on the source control repository to hisfirst activity date on the mailing list of a project The averagenumber of days for an immigrant to appear on a mailing listprior to hisher first commit date is calculated as follows

$$
\frac{\sum_{\mathit{Immig} \in \mathrm{Immig}} \mathrm{ActivityDate}(\mathit{Immig}, \mathit{ml})}{|\mathrm{Immig}|} \tag{14}
$$
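The averaging in (14) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries, the repository labels (`mailing_list`, `bug_tracker`, `patches`), and the contributor names are hypothetical, and the paper does not describe its actual tooling. The sketch computes the mean lead time per repository and then orders the repositories from earliest to latest first appearance, which is the comparison used to check the onion model.

```python
from datetime import date
from statistics import mean


def mean_lead_days(first_commit, first_activity):
    """Average number of days between first activity and first commit.

    Both arguments map a contributor id to a date. Contributors with no
    recorded activity on the repository are skipped (an assumption).
    """
    leads = [
        (first_commit[dev] - first_activity[dev]).days
        for dev in first_commit
        if dev in first_activity
    ]
    return mean(leads) if leads else None


def appearance_order(first_commit, activity_by_repo):
    """Repositories sorted from oldest to most recent average first activity."""
    averages = {
        repo: mean_lead_days(first_commit, acts)
        for repo, acts in activity_by_repo.items()
    }
    known = [repo for repo in averages if averages[repo] is not None]
    # A larger lead time means the contributor appeared there earlier.
    return sorted(known, key=lambda repo: averages[repo], reverse=True)


# Hypothetical data for two immigrants.
first_commit = {"alice": date(2010, 1, 1), "bob": date(2010, 1, 1)}
activity_by_repo = {
    "mailing_list": {"alice": date(2008, 1, 1), "bob": date(2009, 1, 1)},
    "bug_tracker": {"alice": date(2009, 1, 1), "bob": date(2009, 7, 1)},
    "patches": {"alice": date(2009, 10, 1), "bob": date(2009, 12, 1)},
}

ml_average = mean_lead_days(first_commit, activity_by_repo["mailing_list"])
order = appearance_order(first_commit, activity_by_repo)
```

With this toy input the mailing list has the largest average lead time, followed by the bug tracker and then patch submissions, which is the ordering the onion model predicts.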

The results (Table 6) also show that it took almost 2 years for a potential developer of the Apache Ant, Apache Lucene, and Apache Maven projects to attain the role of a developer. However, we cannot say that this is the standard time, as it varies dramatically from project to project, as can be seen in the results for the Apache Solr project compared to the other Apache projects under consideration.

6. Conclusion

In this paper, we have investigated in detail the patterns of contributions made by those members who have attained the role of a developer in the project. First, we investigated the significant role played by nondevelopers in the long-term survival of an OSS project and observed that nondevelopers, who do not have write access to the source control repository, participate actively in reporting bugs and email discussions, thus contributing to the maturity of an OSS project. Our investigation of the contributions of potential developers before and after attaining the role of a developer showed that, after attaining a higher position in the community, developers tend to contribute more efficiently than nondevelopers of the project by actively participating in technical discussions along with fixing bugs. Moreover, we observed that the members who attained the role of a developer had more contributions than the average number of contributions made by other members of the project who were active during the same time period. This suggests that one of the important factors in attaining the role of a developer is the demonstration of technical skills and commitment to the project in an efficient manner.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] B. Shibuya and T. Tamai, "Understanding the process of participating in open source communities," in Proceedings of the ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development (FLOSS '09), pp. 1–6, IEEE Computer Society, Washington, DC, USA, May 2009.

[2] G. Von Krogh, S. Spaeth, and K. R. Lakhani, "Community, joining, and specialization in open source software innovation: a case study," Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.

[3] C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu, "Open borders? Immigration in open source projects," in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR '07), Washington, DC, USA, May 2007.

[4] M. Antikainen, T. Aaltonen, and J. Vaisanen, "The role of trust in OSS communities - case Linux Kernel community," IFIP International Federation for Information Processing, vol. 234, pp. 223–228, 2007.

[5] Y. Ye, K. Nakakoji, Y. Yamamoto, and K. Kishida, "The co-evolution of systems and communities in Free and Open Source Software Development," in Free/Open Source Software Development, pp. 59–82, Idea Group, Hershey, Pa, USA, 2004.

[6] http://ant.apache.org

[7] http://lucene.apache.org

[8] http://maven.apache.org

[9] http://lucene.apache.org/solr

[10] K. Crowston and J. Howison, "The social structure of free and open source software development," in Proceedings of the International Conference on Information Systems, Seattle, Wash, USA, 2003.

[11] C. Jensen and W. Scacchi, "Modelling recruitment and role migration process in oosd projects," in Proceedings of the 6th International Workshop on Software Process Simulation and Modeling, St. Louis, Mo, USA, 2005.

[12] A. Mockus, R. T. Fielding, and J. D. Herbsleb, "Two case studies of open source software development: apache and mozilla," ACM Transactions on Software Engineering and Methodology, vol. 11, no. 3, pp. 309–346, 2002.

[13] T. Dinh-Trong and J. M. Bieman, "Open source software development: a case study of freeBSD," in Proceedings of the 10th International Symposium on Software Metrics (METRICS '04), pp. 96–105, Washington, DC, USA, September 2004.

[14] N. Ducheneaut, "Socialization in an open source software community: a socio-technical analysis," Computer Supported Cooperative Work, vol. 14, no. 4, pp. 323–368, 2005.

[15] D. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman and Hall, 1984.

[16] I. Herraiz, G. Robles, J. J. Amor, T. Romera, and J. M. G. Barahona, "The processes of joining in global distributed software projects," in Proceedings of the International Workshop on Global Software Development for the Practitioner (GSD '06), pp. 27–33, 2006.

[17] W. Scacchi, J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani, "Understanding free/open source software development processes," Software Process Improvement and Practice, vol. 11, no. 2, pp. 95–105, 2006.

[18] M. Fischer, M. Pinzger, and H. Gall, "Populating a release history database from version control and bug tracking systems," in Proceedings of the International Conference on Software Maintenance (ICSM '03), pp. 23–32, IEEE Computer Society, Washington, DC, USA, September 2003.

[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.

[20] A. Iqbal and M. Hausenblas, "Interlinking developer identities within and across open source projects: the linked data approach," ISRN Software Engineering, vol. 2013, Article ID 584731, 12 pages, 2013.

[21] C. Bird, A. Gourley, and P. Devanbu, "Detecting patch submission and acceptance in OSS projects," in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR '07), Washington, DC, USA, May 2007.

[22] A. Iqbal, O. Ureche, M. Hausenblas, and G. Tummarello, "LD2SD: linked data driven software development," in Proceedings of the 21st International Conference on Software Engineering and Knowledge Engineering (SEKE '09), pp. 240–245, Boston, Mass, USA, July 2009.


the projects under consideration, and hence it is one of the major factors in the long-term survival, success, and maturity of these projects over time.

The high ratio of nondeveloper involvement in the project (cf. Figure 3 and Table 3) allows the core members to select or vote for the potential developers to be invited to the "core members" group of the project.

RQ-2: What is the ratio of contributions made by a potential developer before and after attaining the role of a developer?

We are only interested in those developers who did not start contributing directly to the project but instead followed the onion model (cf. Figure 1). In order to select those developers, we retrieved all developers from the Subversion logs. Then, for each developer, we compared his first commit date on the project to his first appearance on any of the project repositories (i.e., first bug report, bug comment, attachment, or email date) in order to compute the number of days or months before he started to contribute as a developer. Although there is no fixed or standard timeline for attaining the role of a developer in the project, we considered only those developers who had an activity (bug report, bug comment, attachment, or email) on the project at least 4 months prior to their first commit on the source control repository of the project.
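The selection rule described above can be sketched as follows. This is a minimal illustration, not the tooling used in the study: the input dictionaries and contributor names are hypothetical, and "4 months" is approximated as 120 days, which is an assumption of this sketch.

```python
from datetime import date

# Assumption: "at least 4 months" is approximated as 120 days.
MIN_LEAD_DAYS = 120


def select_potential_developers(first_commit, activity_dates, min_lead_days=MIN_LEAD_DAYS):
    """Return {developer: lead_days} for developers whose earliest non-commit
    activity (bug report, bug comment, attachment, or email) precedes their
    first commit by at least min_lead_days."""
    selected = {}
    for dev, commit_date in first_commit.items():
        dates = activity_dates.get(dev, [])
        if not dates:
            # No prior activity at all: the developer committed directly.
            continue
        lead = (commit_date - min(dates)).days
        if lead >= min_lead_days:
            selected[dev] = lead
    return selected


# Hypothetical data: first commit dates taken from version control logs,
# activity dates from the bug tracker and mailing list.
first_commit = {
    "alice": date(2008, 6, 1),
    "bob": date(2008, 6, 1),
    "carol": date(2009, 1, 15),
}
activity_dates = {
    "alice": [date(2007, 1, 10), date(2007, 3, 2)],
    "bob": [date(2008, 5, 20)],
}

potential = select_potential_developers(first_commit, activity_dates)
```

In this toy input only `alice` qualifies: `bob` appeared 12 days before his first commit, and `carol` has no recorded activity before committing.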

For each of those selected developers, we queried the contributions made to the project before and after the first commit date of each developer. As the time period of attaining the role of a developer is different for each developer, we computed the average yearly rate of contributions made by a developer before and after attaining the role of a developer. We do not show each individual's contribution to the project due to privacy issues, and hence we have summarized the aggregated results of each project, as shown in Table 4. All the variables (except n) used in our study represent the contribution of potential developers on a yearly basis. For each Apache project, n represents the number of potential developers who have attained the role of a developer. The average yearly rate of contributions by a potential developer before and after attaining the role of a developer is calculated as follows:

$$
\frac{\sum_{d \in \mathrm{Immig}} \mathrm{Contribution}_{\mathrm{before}}(d, \mathrm{commitDate}(d))}{|d|},
\qquad
\frac{\sum_{d \in \mathrm{Immig}} \mathrm{Contribution}_{\mathrm{after}}(d, \mathrm{commitDate}(d))}{|d|} \tag{12}
$$
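A small Python sketch may make the before/after computation in (12) more concrete. It rests on several assumptions that are not stated in the paper: the denominator is read as the number of potential developers, a year is approximated as 365 days, the "before" window starts at the developer's first recorded activity, and the "after" window ends at a chosen observation date. All data and names below are hypothetical.

```python
from datetime import date

DAYS_PER_YEAR = 365  # assumption: a year is approximated as 365 days


def yearly_rate(event_dates, start, end):
    """Number of events in [start, end) scaled to a per-year rate."""
    span_days = (end - start).days
    if span_days <= 0:
        return 0.0
    count = sum(1 for d in event_dates if start <= d < end)
    return count * DAYS_PER_YEAR / span_days


def mean_rates_before_after(developers, observation_end):
    """Mean yearly contribution rate before and after the first commit.

    Each developer is a dict with a 'first_commit' date and a non-empty
    list of 'events' dates (e.g. bug reports) for one variable.
    """
    if not developers:
        raise ValueError("at least one developer is required")
    before, after = [], []
    for dev in developers:
        commit = dev["first_commit"]
        events = dev["events"]
        first_seen = min(events)
        before.append(yearly_rate(events, first_seen, commit))
        after.append(yearly_rate(events, commit, observation_end))
    n = len(developers)
    return sum(before) / n, sum(after) / n


# Hypothetical data for two potential developers and one variable.
developers = [
    {
        "first_commit": date(2011, 1, 1),
        "events": [date(2010, 1, 1), date(2010, 3, 1), date(2010, 6, 1),
                   date(2010, 9, 1), date(2011, 2, 1), date(2011, 8, 1)],
    },
    {
        "first_commit": date(2011, 1, 1),
        "events": [date(2010, 1, 1), date(2010, 7, 1)]
                  + [date(2011, month, 15) for month in range(1, 7)],
    },
]

mean_before, mean_after = mean_rates_before_after(developers, date(2012, 1, 1))
```

Normalizing to a yearly rate is what allows developers with different waiting times before their first commit to be averaged together.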

Table 4: Yearly average contribution ratio of a potential developer before and after attaining the role of a developer.

| Variable | Mean (before) | Mean (after) | St. dev. (before) | St. dev. (after) |
|---|---|---|---|---|
| Apache Ant (n = 13) | | | | |
| bugs reported | 1723 | 235 | 1798 | 365 |
| bug comments | 4144 | 3902 | 3766 | 3483 |
| bug social relation | 4465 | 3157 | 5141 | 2334 |
| emails | 23049 | 28034 | 259 | 22452 |
| email social relation | 305 | 2031 | 3035 | 1355 |
| Apache Lucene (n = 22) | | | | |
| bugs reported | 1412 | 2373 | 912 | 3274 |
| bug comments | 2484 | 7321 | 1818 | 9185 |
| bug social relation | 3357 | 3467 | 2375 | 3002 |
| emails | 130 | 39871 | 12487 | 50802 |
| email social relation | 1959 | 2169 | 1653 | 1532 |
| Apache Maven (n = 21) | | | | |
| bugs reported | 2840 | 2782 | 5327 | 7949 |
| bug comments | 1642 | 2786 | 1207 | 5868 |
| bug social relation | 2716 | 2408 | 2143 | 3085 |
| emails | 15841 | 18555 | 21827 | 17489 |
| email social relation | 1833 | 2334 | 1831 | 1573 |
| Apache Solr (n = 13) | | | | |
| bugs reported | 2477 | 2375 | 3038 | 1750 |
| bug comments | 4472 | 8477 | 4135 | 8625 |
| bug social relation | 6804 | 8046 | 5218 | 5239 |
| emails | 15601 | 31382 | 22757 | 33319 |
| email social relation | 1472 | 3847 | 785 | 3304 |

Based on Table 4, we find that the bug reporting pattern does not change much before and after attaining the role of a developer in the Apache Maven and Apache Solr projects. However, in Apache Ant it decreased tremendously after attaining the role of a developer. As shown in Figure 3, only a few bugs are reported by developers in contrast to nondevelopers in the Apache Ant project, which is also reflected by the value of the bugs reported variable for the Apache Ant project. Members, after joining the "core members" group, participate more often in technical discussions on the bug tracking system, which is reflected by the value of the bug comments variable. However, an increase in participation in technical discussions did not increase the social relation of the developers on the bug tracking system (i.e., bug social relation) in the case of the Apache Ant and Apache Maven projects. One reason could be that, after attaining the role of a developer, they focused only on certain modules of a project and hence were involved in discussions on bugs relevant to those modules with other developers of the project. There is also a tremendous increase in the number of emails sent by the members after attaining the role of a developer, which eventually increases the value of the email social relation variable.

Based on the Apache projects under consideration, we found that members, after attaining the role of a developer, tend to participate actively in technical discussions, either on the mailing list or on the bug tracking system, which also increases their social relation networks, except in the case of the Apache Ant project. The bug reporting behavior of these members varies across the studied Apache projects, and hence it is difficult to say whether they report more bugs after attaining the role of a developer.

RQ-3: What is the average rate of contributions made by a potential developer compared to other members of the project before attaining the role of a developer?

For each potential developer, we took the first time-stamp value, when he first appears on the project, and the second time-stamp value, when he actually made the first commit to the source control repository of the project. We extracted the contributions (i.e., bugs reported, comments, emails, etc.) made by a potential developer between those time-stamp values. Using the same time-stamp values, we computed the contributions made by other members who were also active during that specific time period. We then divided the contributions of a potential developer by the average contributions of all other members in order to determine the average rate of contributions made by a potential developer compared to other members of the project. We do not show each individual's contribution rate due to privacy issues, and hence we have summarized the aggregated results of each project in Table 5. For example, the average rate of bugs reported by an immigrant compared to other members who were active during the same time period is calculated as follows:

$$
\sum_{\mathit{Immig} \in \mathrm{Immig}} \frac{\mathrm{Contribution}_{\mathrm{bugs}}(\mathit{Immig}, \mathrm{commitDate}(\mathit{Immig}))}{\sum_{c \in C} \mathrm{Contribution}_{\mathrm{bugs}}(c, \mathrm{commitDate}(\mathit{Immig})) \,/\, |c|} \tag{13}
$$

Table 5: Average contribution rate of a potential developer compared to other members of the project before attaining the role of a developer.

| Variable | Mean | St. dev. |
|---|---|---|
| Apache Ant (n = 13) | | |
| bugs reported | 1115 | 1325 |
| bug comments | 1043 | 835 |
| bug social relation | 842 | 665 |
| emails | 1989 | 1625 |
| email social relation | 1097 | 842 |
| Apache Lucene (n = 22) | | |
| bugs reported | 453 | 309 |
| bug comments | 408 | 289 |
| bug social relation | 318 | 173 |
| emails | 391 | 327 |
| email social relation | 623 | 442 |
| Apache Maven (n = 21) | | |
| bugs reported | 405 | 368 |
| bug comments | 236 | 201 |
| bug social relation | 244 | 198 |
| emails | 405 | 576 |
| email social relation | 431 | 334 |
| Apache Solr (n = 13) | | |
| bugs reported | 456 | 577 |
| bug comments | 44 | 472 |
| bug social relation | 252 | 215 |
| emails | 105 | 107 |
| email social relation | 271 | 195 |
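The ratio in (13) can be illustrated with a short Python sketch. The counts below are hypothetical, and the final averaging over immigrants is an assumption of this sketch, made because the surrounding text speaks of an average rate while the extracted formula shows only the sum. Cases with no other active members, or with a zero average among them, are skipped here, which is also an assumption.

```python
def relative_rate(immigrant_count, other_counts):
    """Immigrant's contribution count divided by the mean count of the
    other members active in the same period; None if undefined."""
    if not other_counts:
        return None
    mean_others = sum(other_counts) / len(other_counts)
    if mean_others == 0:
        return None
    return immigrant_count / mean_others


def mean_relative_rate(samples):
    """Average the per-immigrant ratios, ignoring undefined ones.

    `samples` is a list of (immigrant_count, other_counts) pairs, one per
    immigrant, each measured between first appearance and first commit.
    """
    rates = [relative_rate(count, others) for count, others in samples]
    rates = [r for r in rates if r is not None]
    return sum(rates) / len(rates) if rates else None


# Hypothetical bug-report counts for two immigrants and their peers.
samples = [
    (12, [2, 4, 6]),  # peers average 4 reports, ratio 3.0
    (6, [1, 1, 4]),   # peers average 2 reports, ratio 3.0
]

average_rate = mean_relative_rate(samples)
```

A value above 1 means the potential developer contributed more than the average active member during the same window.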

The results in Table 5 can be understood as follows: the average rate of reporting bugs by a potential developer of the Apache Lucene project is 453 times the average rate of reporting bugs by all other members who were active during that time period. Although the average rate of contributions made by potential developers varies across the projects under consideration, each variable shows that the contributions made by potential developers exceed the average contributions of all other members. Hence, we can say that they were the most active contributors (i.e., technically skilled and of higher social status) before attaining the developer status in the project.


Page 9: Research Article Understanding Contributor to Developer Turnover ...downloads.hindawi.com/archive/2014/535724.pdf · Research Article Understanding Contributor to Developer Turnover

ISRN Software Engineering 9

Table 5 Average contribution rate of a potential developer compar-ing to other members of the project before attaining the role of adeveloper

Variable Mean St devApache Ant (n = 13)

bugs reported 1115 1325bug comments 1043 835bug social relation 842 665emails 1989 1625email social relation 1097 842

Apache Lucene (n = 22)bugs reported 453 309bug comments 408 289bug social relation 318 173emails 391 327email social relation 623 442

Apache Maven (n = 21)bugs reported 405 368bug comments 236 201bug social relation 244 198emails 405 576email social relation 431 334

Apache Solr (n = 13)bugs reported 456 577bug comments 44 472bug social relation 252 215emails 105 107email social relation 271 195

to the source control repository of the project We extractedthe contributions (ie bugs reported comments emails etc)made by a potential developer between those time-stampvalues Using the same time-stamp values we computedthe contributions made by other members who were alsoactive during that specific time period Later we dividedthe contributions of a potential developer by the averagecontributions of all other members in order to determine theaverage rate of contributions made by a potential developercomparing to other members of the project We do not showeach individualrsquos contribution rate due to the privacy issuesand hence we have summarized the aggregated results of eachproject which is shown in Table 5 For example the averagerate of bugs reported by an immigrant comparing to othermembers who were active during the same time-stamp iscalculated as follows

sumImmigisinImmig

Contributionbugs (Immig commitDate (Immig))sum119888isinC Contributionbugs (119888 commitDate (Immig)) |119888|

(13)

The results in Table 5 can be understand as follows theaverage rate of reporting bugs by a potential developer ofApache Lucene project is 453 times the average rate ofreporting bugs by all other members who were active duringthat time period Although the average rate of contributions

Table 6 Appearance of a potential developer on different softwarerepositories prior to attaining the role of a developer

Apache projectsPatch

submission(no of days)

Bugsreported

(no of days)

Emails(no of days)

Apache Ant 54415 55384 70653Apache Lucene 45741 52632 70622Apache Maven 38500 39631 70971Apache Solr 26930 23792 45246

made by potential developers varies in all the projects underconsideration it is quite obvious from each variable valuethat the contributions made by potential developers are morethan the average contributions of all other members Hencewe can say that they were the most active contributors (ietechnically skilled and higher social status) before attainingthe developer status in the project

RQ-4 Does a potential developer follow onion model inorder to attain the role of a developer

For each potential developer we computed the time-stamp value between hisher first commit date to hisherfirst activity on the different software repositories in terms ofdays Table 6 presents the appearance of a potential developerin terms of average number of days on different softwarerepositories prior to attaining the role of a developer Theresult shows that all the potential developers started fromthe mailing list (cf Table 6) because the email activity is theoldest for all Apache projects under consideration followedby the bugs reportingcommenting and the latest activitybefore attaining the role of a developer was the source-codepatch submissions (ie bugs fixing) The results shown inTable 6 closely match to the onionmodel (see Figure 1) wherea member starts as a reader followed by reporting bugs andlater fixing bugs before attaining the role of a developer

Let ActivityDate(ImmigmL) return the number ofdays between the first commit date of an immigrant (iepotential developer) on the source control repository to hisfirst activity date on the mailing list of a project The averagenumber of days for an immigrant to appear on a mailing listprior to hisher first commit date is calculated as follows

sumImmigisinImmig

ActivityDate (Immigml)1003816100381610038161003816Immig1003816100381610038161003816

(14)

The results (Table 6) also show that it took almost 2years for a potential developer ofApache Ant Apache Luceneand Apache Maven projects to attain the role of a developerHowever we cannot say that it is the standard time as the timevaries dramatically from project to project as it can be seen inthe results of Apache Solr project comparing to other Apacheprojects under consideration

6 Conclusion

In this paper we have investigated in detail the patterns ofcontributions made by those members who have attained

10 ISRN Software Engineering

the role of a developer in the project First we investi-gated the significant role played by nondevelopers in thelong-term survival of an OSS project and observed thatnondevelopers who do not have write-access to the sourcecontrol repository participate actively in reporting bugs andemail discussions thus contributing to the maturity of anOSS project Our investigation based on the contribution ofpotential developers before and after attaining the role of adeveloper showed that after attaining a higher position in thecommunity developers tend to contribute more efficientlythan nondevelopers of the project by actively participatingin technical discussions along with fixing bugs Moreoverwe observed that the members who attained the role of adeveloper had more contributions in contrast to the averagenumber of contributions made by other members of theproject who were active during hisher time period Thismakes it obvious that one of the important factors in order toattain the role of a developer is the demonstration of technicalskills and commitment to the project in an efficient manner

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] B. Shibuya and T. Tamai, “Understanding the process of participating in open source communities,” in Proceedings of the ICSE Workshop on Emerging Trends in Free/Libre/Open Source Software Research and Development (FLOSS ’09), pp. 1–6, IEEE Computer Society, Washington, DC, USA, May 2009.

[2] G. Von Krogh, S. Spaeth, and K. R. Lakhani, “Community, joining, and specialization in open source software innovation: a case study,” Research Policy, vol. 32, no. 7, pp. 1217–1241, 2003.

[3] C. Bird, A. Gourley, P. Devanbu, A. Swaminathan, and G. Hsu, “Open borders? Immigration in open source projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[4] M. Antikainen, T. Aaltonen, and J. Vaisanen, “The role of trust in OSS communities: case Linux Kernel community,” IFIP International Federation for Information Processing, vol. 234, pp. 223–228, 2007.

[5] Y. Ye, K. Nakakoji, Y. Yamamoto, and K. Kishida, “The co-evolution of systems and communities in Free and Open Source Software Development,” in Free/Open Source Software Development, pp. 59–82, Idea Group, Hershey, Pa, USA, 2004.

[6] http://ant.apache.org/

[7] http://lucene.apache.org/

[8] http://maven.apache.org/

[9] http://lucene.apache.org/solr/

[10] K. Crowston and J. Howison, “The social structure of free and open source software development,” in Proceedings of the International Conference on Information Systems, Seattle, Wash, USA, 2003.

[11] C. Jensen and W. Scacchi, “Modelling recruitment and role migration process in oosd projects,” in Proceedings of the 6th International Workshop on Software Process Simulation and Modeling, St. Louis, Mo, USA, 2005.

[12] A. Mockus, R. T. Fielding, and J. D. Herbsleb, “Two case studies of open source software development: Apache and Mozilla,” ACM Transactions on Software Engineering and Methodology, vol. 11, no. 3, pp. 309–346, 2002.

[13] T. Dinh-Trong and J. M. Bieman, “Open source software development: a case study of FreeBSD,” in Proceedings of the 10th International Symposium on Software Metrics (METRICS ’04), pp. 96–105, Washington, DC, USA, September 2004.

[14] N. Ducheneaut, “Socialization in an open source software community: a socio-technical analysis,” Computer Supported Cooperative Work, vol. 14, no. 4, pp. 323–368, 2005.

[15] D. Cox and D. Oakes, Analysis of Survival Data, Monographs on Statistics and Applied Probability, Chapman and Hall, 1984.

[16] I. Herraiz, G. Robles, J. J. Amor, T. Romera, and J. M. G. Barahona, “The processes of joining in global distributed software projects,” in Proceedings of the International Workshop on Global Software Development for the Practitioner (GSD ’06), pp. 27–33, 2006.

[17] W. Scacchi, J. Feller, B. Fitzgerald, S. Hissam, and K. Lakhani, “Understanding free/open source software development processes,” Software Process Improvement and Practice, vol. 11, no. 2, pp. 95–105, 2006.

[18] M. Fischer, M. Pinzger, and H. Gall, “Populating a release history database from version control and bug tracking systems,” in Proceedings of the International Conference on Software Maintenance (ICSM ’03), pp. 23–32, IEEE Computer Society, Washington, DC, USA, September 2003.

[19] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications, Cambridge University Press, 1994.

[20] A. Iqbal and M. Hausenblas, “Interlinking developer identities within and across open source projects: the linked data approach,” ISRN Software Engineering, vol. 2013, Article ID 584731, 12 pages, 2013.

[21] C. Bird, A. Gourley, and P. Devanbu, “Detecting patch submission and acceptance in OSS projects,” in Proceedings of the 4th International Workshop on Mining Software Repositories (MSR ’07), Washington, DC, USA, May 2007.

[22] A. Iqbal, O. Ureche, M. Hausenblas, and G. Tummarello, “LD2SD: linked data driven software development,” in Proceedings of the 21st International Conference on Software Engineering and Knowledge Engineering (SEKE ’09), pp. 240–245, Boston, Mass, USA, July 2009.
