Can Development Work Describe Itself? Walid Maalej, Technische Universität München; Hans-Jörg Happel, FZI Research Center Karlsruhe. MSR 2010, Cape Town, South Africa, May 2010


DESCRIPTION

Work descriptions are informal notes taken by developers to summarize work achieved in a particular session. Existing studies indicate that maintaining them is a distracting task, which costs a developer more than 30 min. a day. The goal of this research is to analyze the purposes of work descriptions and to find out whether automated tools can assist developers in creating them efficiently. For this, we mine a large dataset of heterogeneous work descriptions from open source and commercial projects. We analyze the semantics of these documents and identify common information entities and granularity levels. Information on performed actions, concerned artifacts, references, and new work shows the work management purpose of work descriptions. Information on problems, rationale, and experience shows their knowledge sharing purpose. We discuss how work description information, in particular information used for work management, can be generated by observing developers' interactions. Our findings have many implications for next generation software engineering tools. Paper: Walid Maalej and Hans-Jörg Happel, Can Development Work Describe Itself? In Proceedings of the 7th IEEE Conference on Mining Software Repositories, IEEE CS, 2010.

Citation preview

Page 1: Can Development Work Describe Itself?

Can Development Work Describe Itself?

Walid Maalej, Technische Universität München

Hans-Jörg Happel, FZI Research Center Karlsruhe

MSR 2010, Cape Town, South Africa, May 2010

Page 2: Can Development Work Describe Itself?

Executive Summary

Grounded Theory on Work Descriptions

1. Informal notes that describe developers' work contain well-defined semantics, granularity levels, and information patterns

2. To a large extent, work descriptions can be generated by observing the work context of developers and their interactions

© W. Maalej, May 2010

Analyzing Work Description – MSR 2010

Page 3: Can Development Work Describe Itself?

Outline

1. Motivation

2. Research Setting

3. Research Results

4. Work Description Automation

Page 4: Can Development Work Describe Itself?

What Are Work Descriptions?

A work description is an informal text written by a knowledge worker to summarize achievements and other notable issues of a particular work session

Artifacts including work descriptions:

  • Time sheet

  • Social media

  • Comments

  • Commit message

  • Personal note

Page 5: Can Development Work Describe Itself?

Previous Studies Showed Interesting Properties of Work Descriptions

Effort and Quality Issues

  • 5% of developers' time is spent describing work (30 min. per day)

  • 10% of the sessions have pseudo descriptions (either no time or no motivation)

Regularities in Content and Metadata

  • The overall vocabulary usage seems to be predictable

  • The vocabulary size is rather small

  • Different projects have a similar ranking of terms

To what extent can developers' work descriptions be automated?

Page 6: Can Development Work Describe Itself?

Outline

1. Motivation

2. Research Setting

3. Research Results

4. Work Description Automation

Page 7: Can Development Work Describe Itself?

Research Questions

Content of Work Descriptions: the semantics of information included in work descriptions

Information Entities: text fragments with similar semantics

  • Occurrences: Which information entities are included and how often?

  • Combinations: How are these entities combined?

  • Preferences: Do certain developers prefer certain information?

Information Granularity: the levels of detail included (abstraction levels)

  • Levels: What are granularity levels?

  • Causes: Which properties affect the granularity?

Page 8: Can Development Work Describe Itself?

Data Sets Collected in Different Contexts

Dataset  Summary                                                    Period     Developers  Entries

MyComp   Developers' personal notes at a German software company    2001–2009  25          38,005

Apache   Commit messages and code comments of all Apache projects   2001–2009  1,145       598,418

Unicase  Commit messages and code comments of the unicase project   2008–2009  18          5,097

Eureka   Personal notes in an observational study at 5 companies    2008       21          91

Page 9: Can Development Work Describe Itself?

The Data Analysis Process

Page 10: Can Development Work Describe Itself?

Outline

1. Motivation

2. Research Setting

3. Research Results

4. Work Description Automation

Page 11: Can Development Work Describe Itself?

Information Entities and Their Usage Frequencies

Occurrences (%)

Entity      Average  Apache  MyComp  Unicase  Eureka

Activity    71       69      76      71       67

Artifact    55       60      53      49       58

Problem     47       47      47      49       45

Rationale   28       30      29      25       31

New Work    24       24      20      28       22

Status      19       24      20      17       15

Reference   15       15      19      17       10

Solution    15       19      15      16       11

Experience  10       11      6       9        13

Page 12: Can Development Work Describe Itself?

Findings on Information Entities

1. The majority of information on performed activities (82%) is combined with concerned artifacts

2. Information on problems is used to describe work done, work that needs to be done, and the context of experiences

3. The combination patterns show that sharing knowledge and managing work are two goals of work descriptions

4. There are two clusters of developers: those who prefer to use artifacts and those who prefer to use problems to describe work

Page 13: Can Development Work Describe Itself?

Granularity Levels and Usage Frequencies

Occurrences (%)

Granularity  Level           Average  Apache  MyComp  Unicase  Eureka

Domain       Implementation  54       58      37      62       60

Domain       Project         31       29      34      29       30

Domain       Requirement     12       10      26      6        7

Object       Method          33       33      49      28       20

Object       Class           29       29      25      31       32

Object       Line            17       17      8       17       27

Object       Component       15       14      16      19       10

Activity     Edit            53       55      41      57       60

Activity     SE Process      36       34      42      30       39

Activity     Knowledge       12       13      15      11       9

Page 14: Can Development Work Describe Itself?

Findings on Information Granularity

1. The majority of work descriptions (62%) include information from a single granularity level

2. Developers think consistently (in a single abstraction level) when taking notes about artifacts

3. The shorter the session, the more fine-grained the described artifacts

4. Levels of activity granularity overlap (edit, process, and knowledge)
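The relation between session length and artifact granularity could feed a simple guess of the level of detail. The following is a minimal sketch under stated assumptions: the ordering of the object levels from finest to coarsest and the cut-off values in minutes are hypothetical and are not taken from the study.

```python
import bisect

# Object granularity levels named in the slides, ordered finest to coarsest
# (the ordering is an assumption made for this sketch).
LEVELS = ["line", "method", "class", "component"]

# Session-length cut-offs in minutes between adjacent levels.
# These numbers are hypothetical placeholders, not empirical values.
BOUNDS_MIN = [10, 60, 240]


def guess_granularity(session_minutes: float) -> str:
    """Map a session length to an object granularity level.

    Shorter sessions map to finer levels, longer sessions to coarser ones.
    """
    if session_minutes < 0:
        raise ValueError("session length must not be negative")
    return LEVELS[bisect.bisect_right(BOUNDS_MIN, session_minutes)]
```

In practice the cut-offs would have to be learned per developer or per project from earlier descriptions rather than fixed up front.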

Page 15: Can Development Work Describe Itself?

Outline

1. Motivation

2. Research Setting

3. Research Results

4. Work Description Automation

Page 16: Can Development Work Describe Itself?

Two Main Enablers for Automating Work Descriptions

  • Shared semantics of developers' working context, i.e. activities, artifacts, and problems

  • Heuristics derived from empirical findings on developers' behavior

Automating Work Description

Page 17: Can Development Work Describe Itself?

Shared Semantics to Annotate Context: Developers' Interactions

Page 18: Can Development Work Describe Itself?

Shared Semantics to Annotate Context: Developers' Artifacts

Page 19: Can Development Work Describe Itself?

Heuristics to Generate Work Descriptions

Four factors to generate a work description:

1. Developers' Preferences

  • Learn from previous behavior of developers which information they describe in which situation

2. Appropriate Granularity

  • Guess the appropriate level of detail

3. Relevant vs. Irrelevant Context

  • Only a subset of artifacts concerned by the interactions is included in the description

  • Useful metrics are accumulated usage duration, usage age, and usage frequency

4. Problem-Solution States

  • Detect if a developer is encountering a problem, searching for a solution, or applying a solution

  • Indicators are error messages, breakpoint usage, searches, or usage of particular keywords
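Two of these factors, relevance filtering and problem-solution state detection, can be sketched in a few lines. This is only an illustration under stated assumptions: the `ArtifactUsage` record, the equal weighting of the three metrics, the 0.5 threshold, and the event names in `STATE_BY_EVENT` are all hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class ArtifactUsage:
    """Interaction summary for one artifact in a session (hypothetical shape)."""
    name: str
    duration_s: float  # accumulated usage duration in seconds
    age_s: float       # seconds since the artifact was last used
    frequency: int     # number of interactions with the artifact


def relevance(u: ArtifactUsage, session_s: float) -> float:
    """Combine duration, age, and frequency into one score in [0, 1].

    Equal weights are an assumption of this sketch; session_s must be positive.
    """
    duration_part = min(u.duration_s / session_s, 1.0)
    recency_part = 1.0 - min(u.age_s / session_s, 1.0)
    frequency_part = u.frequency / (u.frequency + 1)  # saturates towards 1
    return (duration_part + recency_part + frequency_part) / 3.0


def relevant_artifacts(usages, session_s, threshold=0.5):
    """Return the names of artifacts scoring at least `threshold`, best first."""
    ranked = sorted(usages, key=lambda u: relevance(u, session_s), reverse=True)
    return [u.name for u in ranked if relevance(u, session_s) >= threshold]


# Mapping from observed events to states. Both the event names and the
# assignment of indicators to states are assumptions for illustration.
STATE_BY_EVENT = {
    "error_message": "encountering a problem",
    "breakpoint": "encountering a problem",
    "search": "searching for a solution",
    "edit": "applying a solution",
}


def detect_state(events):
    """Guess the current state from the most recent recognised event."""
    for event in reversed(events):
        if event in STATE_BY_EVENT:
            return STATE_BY_EVENT[event]
    return "unknown"
```

For example, in a one-hour session an artifact used for 20 minutes, touched one minute ago, and interacted with 14 times scores 0.75 and is kept, while a file opened once for 30 seconds early in the session falls below the threshold.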

Page 20: Can Development Work Describe Itself?

Summary of the Talk

  • Most information entities can be created automatically by observing the developer's context

  • For that we propose a set of ontologies and heuristics to be used

Information Entities

  • Information on activities, artifacts, problems, new work, and status is included for work management

  • Information on solutions, rationale, and experience is included to capture and share knowledge

Information Granularity

  • There are different levels of domain, object, and activity granularity

  • These are used consistently and with common patterns

Developers' Preference

  • Developers either think problem-centered or artifact-centered when describing their work

  • They use well-defined information patterns such as <activity concerns artifacts>
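The <activity concerns artifacts> pattern, together with the artifact-centered and problem-centered preferences, suggests a template-style generator. The sketch below is one possible reading, not the authors' tool: the function name `describe`, its parameters, and the output wording are hypothetical.

```python
from typing import List, Optional


def describe(activity: str,
             artifacts: List[str],
             problem: Optional[str] = None,
             style: str = "artifact") -> str:
    """Render one description line following <activity concerns artifacts>.

    style="artifact" names only the artifacts; style="problem" puts the
    problem first when one is known. Both templates are assumptions.
    """
    if not artifacts:
        raise ValueError("the pattern needs at least one artifact")
    where = ", ".join(artifacts)
    if style == "problem" and problem:
        return f"{activity} {problem} in {where}"
    return f"{activity} {where}"
```

Calling `describe("Refactored", ["Parser.parse()", "Lexer"])` yields an artifact-centered line, while passing `problem="missing null check"` with `style="problem"` yields a problem-centered one for the same interaction data.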

Page 21: Can Development Work Describe Itself?

Feedback, Questions, Suggestions, and Collaboration are Welcome!

Hans-Jörg Happel, FZI

[email protected]

Walid Maalej, TUM

[email protected]