Constructing Software Knowledge Graph from Software Text

Sirui Li (u5831882)
Supervisor: Dr. Zhenchang Xing
Team Members: Hongwei Li, Jiamou Sun
COMP8755 Individual Computing Project, Semester 1, 2018



What is Software Text?

• API documentation

• Stack Overflow post

• Bug report

Motivation (Formative Study)

• Accessibility issue of API usage directives (API caveats)

Summary: Our formative study shows that many programming issues could be avoided if developers were aware of the relevant API caveats in API documentation. Unfortunately, developers mostly discover API caveats post mortem, after something has gone wrong, rather than being aware of them beforehand and avoiding the mistakes in the first place.

Goal

To tackle the accessibility of API caveats by mining an API caveats knowledge graph from multiple sources of API documentation.

Approach

1. Preprocess input documentation
2. Build an API skeleton graph from semi-structured API reference documentation
3. Extract API caveat sentences from API textual descriptions
4. Construct the API-caveats knowledge graph by linking API caveat sentences to relevant APIs

Approach (Preprocess input documentation)

• Input documentation: API reference documentation and API tutorials, e.g. Android API reference, Android Developer Guides

• Software-specific tokenizer (Ye et al.) + Stanford CoreNLP, e.g. “void setOnBufferAvailableListener (Allocation.OnBufferAvailableListener callback) Set a notification handler for USAGE_IO_INPUT”

Approach (Build an API skeleton graph)

• From API reference documentation
1. Entities: classes, interfaces, fields, methods and parameters.
2. Relations: containment, inheritance/implementation, field data type, method return type, method parameter type, and method-thrown-exception.
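
One possible in-memory shape for such a skeleton graph is sketched below. The class `ApiSkeletonGraph`, its method names and the relation label `"contains"` are hypothetical choices for illustration; the slides do not say how the graph is actually stored.

```python
from collections import defaultdict


class ApiSkeletonGraph:
    """Tiny in-memory graph: typed API entities plus labeled relations."""

    def __init__(self):
        self.entities = {}                 # qualified name -> entity kind
        self.relations = defaultdict(set)  # (source, relation) -> targets

    def add_entity(self, name, kind):
        self.entities[name] = kind

    def add_relation(self, source, relation, target):
        # Both endpoints must already be known entities.
        for name in (source, target):
            if name not in self.entities:
                raise KeyError(f"unknown entity: {name}")
        self.relations[(source, relation)].add(target)

    def neighbors(self, source, relation):
        return sorted(self.relations.get((source, relation), ()))


# Illustrative entries only; the naming scheme for parameters is an assumption.
graph = ApiSkeletonGraph()
graph.add_entity("android.view.View", "class")
graph.add_entity("android.view.View.setId(int)", "method")
graph.add_entity("android.view.View.setId(int).id", "parameter")
graph.add_relation("android.view.View", "contains", "android.view.View.setId(int)")
graph.add_relation("android.view.View.setId(int)", "contains", "android.view.View.setId(int).id")

print(graph.neighbors("android.view.View", "contains"))
```

The other relation types listed above (inheritance, return type, thrown exception and so on) would be further labels on the same structure.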

Approach (Extract API Caveat Sentences)

Each pattern is defined by a regular expression:
1. Error/Exception: You must store a strong reference to the listener, otherwise it will be susceptible to garbage collection
2. Recommendation: you are better off using JobIntentService, which uses jobs instead of services...
3. Alternative: You must call release() when you are done using the camera, otherwise it will remain locked and be unavailable to applications
4. Imperative: Do not confuse this method with activity lifecycle callbacks such as onPause()...
5. Note: Note: For all activities, you must declare your intent filters in the manifest file
6. Conditional: When using a subclass of AsyncTask to run network operations, you must be cautious..
7. Temporal: This may be null if the service is being restarted after its process has gone away
8. Affirmative: The identifier does not have to be unique in View, but it should be positive
9. Negative: Any activities that are not declared there will not be seen by the system and will never be run
10. Emphasis: Only objects running on the UI thread have access to other objects on that thread
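
A minimal sketch of pattern-based extraction follows. The regular expressions in `CAVEAT_PATTERNS` are guesses written for illustration and cover only a few of the ten categories; the actual patterns used in the project are not given in the slides.

```python
import re

# Hypothetical patterns (assumptions); one regular expression per category.
CAVEAT_PATTERNS = {
    "Note": re.compile(r"^\s*note\b", re.IGNORECASE),
    "Imperative": re.compile(r"\bdo not\b|\bdon't\b", re.IGNORECASE),
    "Conditional": re.compile(r"\b(if|when)\b", re.IGNORECASE),
    "Affirmative": re.compile(r"\b(must|should|have to)\b", re.IGNORECASE),
    "Negative": re.compile(r"\b(will not|never|cannot)\b", re.IGNORECASE),
    "Emphasis": re.compile(r"\bonly\b", re.IGNORECASE),
}


def caveat_categories(sentence):
    """Return every category whose pattern matches the sentence."""
    return [name for name, pattern in CAVEAT_PATTERNS.items()
            if pattern.search(sentence)]


def is_caveat(sentence):
    """A sentence counts as a caveat candidate if any pattern matches."""
    return bool(caveat_categories(sentence))


print(caveat_categories(
    "Note: For all activities, you must declare your intent filters in the manifest file"))
```

A sentence can fall into several categories at once, which is why the helper returns a list rather than a single label.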

Approach (Build API Caveats Knowledge Graph)

1. Co-reference Resolution
a) Stanford CoreNLP
b) Declaration-based resolution: “this method”, “this class”, etc.

e.g. Activity.onActionModeStarted: “Activity subclasses overriding this method should call the superclass implementation. If you override this method you must call through to the superclass implementation.”

2. Linking API Caveats to API Entities
a) Hyperlink based

e.g. “It throws ActivityNotFoundException if...” is hyperlinked to the reference page of “ActivityNotFoundException”, so we can link this sentence to “ActivityNotFoundException”.

b) Declaration based

e.g. View.setId(int): “Do not pass a resource ID” and “The identifier should be a positive number”.

c) Open linking (Subject-Verb-Object triples)

e.g. “You must call release()...” links with “release()”
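
The declaration-based resolution and the linking step can be sketched as below. Everything here is an assumption made for illustration: the helper names, the short API names, and the choice of `Camera.open()` as the declaring API. In particular, the open-linking part substitutes a simple `name()` mention match for the Subject-Verb-Object triples mentioned above, and hyperlink-based linking is left out.

```python
import re

THIS_REF = re.compile(r"\bthis (method|class|interface|field)\b", re.IGNORECASE)
CALL_MENTION = re.compile(r"\b([A-Za-z_]\w*)\(\)")


def resolve_declaration_refs(sentence, declared_api, kind):
    """Replace "this <kind>" with the API in whose description the sentence appears."""
    def substitute(match):
        return declared_api if match.group(1).lower() == kind else match.group(0)
    return THIS_REF.sub(substitute, sentence)


def link_caveat(sentence, declared_api, known_apis):
    """Declaration-based link first, then links for explicit name() mentions."""
    links = [declared_api]
    for simple_name in CALL_MENTION.findall(sentence):
        for api in known_apis:
            # Simplified stand-in for open linking: match on the simple name.
            if api.split(".")[-1] == simple_name + "()" and api not in links:
                links.append(api)
    return links


resolved = resolve_declaration_refs(
    "If you override this method you must call through to the superclass implementation.",
    "Activity.onActionModeStarted", "method")
print(resolved)

print(link_caveat("You must call release() when you are done using the camera",
                  "Camera.open()", ["Camera.open()", "Camera.release()"]))
```

Matching on the simple name alone is ambiguous when several classes declare a method with the same name, which is one reason a real linker needs more context than this sketch uses.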

Evaluation

• RQ1: What is the abundance of different subcategories of API caveats in API documentation?
• RQ2: Can our approach accurately extract API caveat sentences, resolve co-references in these sentences, and link API caveat sentences to API entities?
• RQ3: Can our API caveats knowledge graph and API caveats search improve the accessibility of API caveats, compared with traditional documentation search?

Evaluation (RQ1: The Abundance of API Caveat)

• ACFOR scale (Abundant, Common, Frequent, Occasional, Rare)¹

¹ http://en.wikipedia.org/wiki/Abundance_(ecology)

Conditional, Affirmative, Negative and Emphasis API caveats are dominant in our knowledge graph. The fact that most API caveats do not have explicit caveat indicators could help to explain why API caveats are hard to notice in the API descriptions, especially the lengthy ones.

Evaluation (RQ2: The Quality of API Caveats Knowledge Graph)

• Accuracy of Extracting API Caveat Sentences
a) Cohen’s kappa metric: 88% -> almost perfect agreement
b) Accuracy: 100%

• Accuracy of Co-reference Resolution
a) Cohen’s kappa metric: 97% -> almost perfect agreement
b) Accuracy: 74.22%

• Accuracy of Caveat-Sentence-API Linking
a) Cohen’s kappa metric: 99.74% (declaration-based), 98.70% (open linking) -> almost perfect agreement
b) Accuracy: 99.48% (declaration-based), 98.44% (open linking)
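
For readers unfamiliar with the agreement metric, Cohen's kappa compares the observed agreement p_o between two annotators with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it for two label lists; the toy labels are made up for illustration and are not the study's annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels (1 = correct, 0 = incorrect); invented values, not study data.
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 1, 0, 1]
print(cohens_kappa(annotator_1, annotator_2))
```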

Evaluation (RQ3: The Improvement of API Caveats Accessibility)

Participants: 12 third- and fourth-year undergraduate students from our school (none of them has Android development experience).

1. Control Group: use the Google search engine to search for API caveats on the Android Developers website
2. Experimental Group: use our search tool to search for API caveats in the Android API caveats knowledge graph we construct

Experiment Procedure: A simple application records the participants’ answers and completion time, along with their ratings of question difficulty and of confidence in the submitted answer on a 5-point Likert scale.

Data Analysis: question-completion-time statistics, and the question-difficulty and answer-confidence ratings of the two groups

Evaluation (RQ3: The Improvement of API Caveats Accessibility)

The participants of the experimental group complete the questions faster than those of the control group (86.13 ± 22.26 seconds versus 142.33 ± 50.02 seconds), and the API caveats that the experimental group finds are more accurate than those found by the control group (62.91 ± 6.76% versus 40.94 ± 15.42%). The Wilcoxon Rank Sum Test shows that both the difference in question-completion time and the difference in correct-API-caveats percentage between the two groups are statistically significant at p-value < 0.05.
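
As a reminder of how a rank-sum comparison of two independent groups works, here is a minimal pure-Python sketch using the normal approximation, with average ranks for ties and no tie correction of the variance. The function name and the two small samples are hypothetical; the numbers are invented placeholders, not the study's measurements, and the slides do not say which implementation was used.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Ties receive average ranks; the variance is not
    tie-corrected, so treat p as approximate when ties are present.
    """
    combined = sorted((value, group)
                      for group, sample in enumerate((x, y))
                      for value in sample)
    n = len(combined)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of 1-based ranks i+1 .. j
        rank_sum_x += average_rank * sum(
            1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum_x - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


# Invented placeholder times in seconds, NOT data from the study.
experimental_times = [1.0, 2.0, 3.0]
control_times = [4.0, 5.0, 6.0]
z, p = rank_sum_test(experimental_times, control_times)
print(round(z, 3), round(p, 3))
```

With samples as small as the ones in this sketch, an exact test is usually preferred over the normal approximation.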

Evaluation (RQ3: The Improvement of API Caveats Accessibility)

(a) Question Difficulty

(b) Answer Confidence

The objective performance results and the “surprising” subject ratings reveal the accessibility issue of API caveats in API documentation, which could create an illusion of already knowing the right API usage. As our knowledge graph makes the API caveats more easily accessible, being aware of the API caveats would make developers realize that using an API properly may not be as easy as it looks. This improved awareness of API caveats could make developers more cautious when using an API, and thus potentially help them avoid some mistakes in the first place.

Questions? Suggestions? Thank you! ☺