Tackling Lack of Software Specifications
A Sustained, Sustainability and Productivity Crisis

Hridesh Rajan

Collaboration with Hoan A. Nguyen, Tien Nguyen, Gary Leavens, Samantha Khairunnesa, John Singleton, Hung Phan, Robert Dyer, and Vasant Honavar


Sustainability and productivity challenge

• To produce critical software infrastructure so that it is:
    – of highest quality and free of defects,
    – produced ethically and within budget, and
    – maintainable, upgradeable, portable, scalable, secure.

• The pervasiveness of software infrastructure in such critical areas as power, banking and finance, air traffic control, telecommunication, transportation, national defense, and healthcare requires us to address this challenge.


Software specifications* can help meet this sustainability and productivity challenge.

* Software specifications: formal, often machine-readable, descriptions of software's intended behavior, e.g. {Pre} S {Post} behavioral specifications
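As a rough illustration of a {Pre} S {Post} behavioral specification, the sketch below attaches a JML-style contract to a small Java method and mirrors it with run-time checks. The class `SpecExample` and the method `slice` are hypothetical names introduced only for this example; a JML tool would check such a contract statically or instrument it automatically rather than rely on hand-written checks.

```java
final class SpecExample {
    /*@ requires 0 <= begin && begin <= end && end <= s.length();
      @ ensures \result.length() == end - begin;
      @*/
    static String slice(String s, int begin, int end) {
        // {Pre}: the caller's obligation, checked here at run time for illustration.
        if (!(0 <= begin && begin <= end && end <= s.length())) {
            throw new IllegalArgumentException("precondition violated");
        }
        // S: the statement being specified.
        String result = s.substring(begin, end);
        // {Post}: what the method guarantees when the precondition held.
        if (result.length() != end - begin) {
            throw new IllegalStateException("postcondition violated");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(slice("specification", 0, 4)); // prints "spec"
    }
}
```

With the contract written down, a maintainer can see what callers must guarantee without reverse engineering the body.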


Sustainability and productivity challenge

• If specifications are widely available, a wide variety of techniques for addressing the sustainability and productivity crisis can be enabled.
    – Maintenance of code can become easier
    – Lower cost of code understanding & total lifecycle cost
    – Specification-guided code optimization
    – Prevent introducing new bugs during maintenance
    – Code reuse
    – Specification-guided synthesis
    – Modular analysis and verification, scalable tools


Sustainability and productivity challenge

• If specifications are widely available, a wide variety of techniques for addressing the sustainability and productivity crisis can be enabled.
    – Maintenance of code can become easier, because engineers will not need to spend time reverse engineering code.
    – Lower cost of code understanding, lower total lifecycle cost.
    – Optimization of code will be greatly facilitated
    – Prevent introducing new bugs during maintenance
    – Code reuse
    – Synthesis of code
    – Modular analysis and verification leading to scalable tools

Despite these benefits, useful, non-trivial specifications aren’t widely available


Why aren’t software specifications widely available?


    – Cost
    – Education
    – Tools
    – Libraries


Unspecified libraries are root cause
    - increase cost of specification
    - make education harder
    - make tool support difficult
    - make specifying libraries harder


How to Solve it? Specify key libraries
    - decrease cost of specification
    - make education easier (examples)
    - make tool support easier
    - make specifying libraries easier


How to Solve it? Specify key libraries

Challenge #1: lower manual cost of specifying libraries, infer most
Challenge #2: infer rich, but practical specifications, allow code evolution


Key Ideas

Preconditions can be mined from guarded conditions at the call sites of the code using the APIs

Preconditions mined from multiple projects in a large-scale code corpus can be used to filter out chaff

    void m(…) {
      …
      if (pred) lib.api();
      …
    }

Mining Preconditions of APIs in Large-scale Code Corpus, FSE’14.
Hoan Nguyen, Robert Dyer*, Tien N. Nguyen


Client code of API String.substring(int,int) in project SeMoA at revision 1929

    completePath_.substring(servletPathStart, extraPathStart)

Conditions guarding the call:

    servletPathStart >= 0
    extraPathStart >= 0
    servletPathStart <= completePath_.length()
    extraPathStart <= completePath_.length()
    servletPathStart <= extraPathStart


    completePath_.substring(servletPathStart, extraPathStart)

    completePath_.charAt(servletPathStart) == '/'
    completePath_.charAt(extraPathStart) == '/'


[Figure: mining pipeline for a call api(...) made from client methods M1, M2, ..., MN. Stages labeled in the figure: Build CFG and Extract and Normalize (once per client method), Infer, Candidate Preconditions, Filter and Rank, Preconditions. Example condition sets shown: {0 <= start, start <= end, end <= length, contains('@')}; {0 = start, start <= end, end <= length, starts('/')}; {0 < start, start <= end, end <= length, ends('\n')}; and, after Infer, {0 <= start, start <= end, end <= length}.]

Key Ideas

Preconditions can be mined from guarded conditions at the call sites of the code using the APIs

Preconditions mined from multiple projects in a large-scale code corpus can be used to filter out chaff: a. infer, b. filter and rank
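The filter-and-rank idea can be sketched as follows, assuming each project has already been reduced to a set of normalized guard conditions for one API. The class `PreconditionRanker`, the `minProjects` threshold, and the sample data are hypothetical; the ranking actually used in the FSE’14 work may differ, and this only shows how cross-project support separates general preconditions from project-specific chaff.

```java
import java.util.*;

final class PreconditionRanker {
    /** Counts, for each normalized condition, how many projects guard the API call with it. */
    static Map<String, Integer> projectSupport(List<Set<String>> conditionsPerProject) {
        Map<String, Integer> support = new HashMap<>();
        for (Set<String> conditions : conditionsPerProject) {
            for (String c : conditions) {
                support.merge(c, 1, Integer::sum);
            }
        }
        return support;
    }

    /** Keeps conditions seen in at least minProjects projects, most widely used first. */
    static List<String> filterAndRank(List<Set<String>> conditionsPerProject, int minProjects) {
        Map<String, Integer> support = projectSupport(conditionsPerProject);
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : support.entrySet()) {
            if (e.getValue() >= minProjects) {
                kept.add(e.getKey());
            }
        }
        // Descending support; ties broken alphabetically so the order is deterministic.
        kept.sort((a, b) -> {
            int bySupport = Integer.compare(support.get(b), support.get(a));
            return bySupport != 0 ? bySupport : a.compareTo(b);
        });
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative data only: three projects guarding the same API call.
        List<Set<String>> projects = List.of(
            Set.of("0 <= start", "start <= end", "end <= length", "contains('@')"),
            Set.of("0 <= start", "start <= end", "end <= length", "starts('/')"),
            Set.of("start <= end", "end <= length", "ends('\\n')"));
        System.out.println(filterAndRank(projects, 2));
    }
}
```

Conditions such as `contains('@')` appear in a single project and are dropped, while the index-range checks survive because several independent projects agree on them.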


Evaluation – Accuracy

Data collection

                           SourceForge        Apache
Projects                         3,413           146
Total source files             497,453       132,951
Total classes                  600,274       173,120
Total methods                4,735,151     1,243,911
Total SLOCs                 92,495,410    25,117,837
Total used JDK classes       806 (63%)     918 (72%)
Total used JDK methods     7,592 (63%)   6,109 (55%)
Total method calls          22,308,251     5,544,437
Total JDK method calls       5,588,487     1,271,210

Almost 120 million SLOC in total

Ground Truth

Preconditions extracted from the published formal specifications for JDK APIs on the JML website (www.jmlspecs.org)
• 797 methods
• 1155 preconditions

    /*@ public normal_behavior
      @ requires 0 <= beginIndex
      @       && beginIndex <= endIndex
      @       && endIndex <= length();
      @ …

    /*@ public behavior
      @ …
      @ signals (NoSuchElementException) isEmpty();
      @*/


Accuracy of Preconditions Mining

              Precision   Recall   Time
SourceForge   84%         79%      17h35m
Apache        82%         75%      34m
Both          83%         80%      18h03m

Performance
    - ~1 minute/condition
    - 5 preconditions are newly found for JDK API methods that already had JML specifications
    - Effective for new specs

Class          Method                    Suggest   Accept
StringBuffer   delete(int,int)           3         Y
               replace(int,int,String)   2         Y*
               setLength(int)            1         Y
               subSequence(int,int)      3         Y
               substring(int,int)        3         Y
LinkedList     add(int,Object)           2         Y
               addAll(int,Collection)    3         Y
               get(int)                  2         Y
               listIterator(int)         2         Y
               remove(int)               2         Y
               set(int,Object)           2         Y
2 classes      11 methods                25
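Under the standard definitions, the precision and recall figures above compare the mined preconditions of each method with the ground-truth preconditions from the JML specifications. The sketch below assumes both sides are already normalized to comparable strings and uses exact matching; `AccuracyMetrics` and the sample conditions are hypothetical, and the evaluation in the talk may have matched conditions differently.

```java
import java.util.*;

final class AccuracyMetrics {
    /** Fraction of mined preconditions that also appear in the ground truth. */
    static double precision(Set<String> mined, Set<String> groundTruth) {
        if (mined.isEmpty()) return 0.0;
        return (double) overlap(mined, groundTruth) / mined.size();
    }

    /** Fraction of ground-truth preconditions that were mined. */
    static double recall(Set<String> mined, Set<String> groundTruth) {
        if (groundTruth.isEmpty()) return 0.0;
        return (double) overlap(mined, groundTruth) / groundTruth.size();
    }

    private static int overlap(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    public static void main(String[] args) {
        // Illustrative data only: one spurious mined condition, nothing missed.
        Set<String> truth = Set.of(
            "0 <= beginIndex", "beginIndex <= endIndex", "endIndex <= length()");
        Set<String> mined = Set.of(
            "0 <= beginIndex", "beginIndex <= endIndex", "endIndex <= length()",
            "charAt(beginIndex) == '/'");
        // Precision is 3 of 4 mined conditions; recall is 3 of 3 ground-truth conditions.
        System.out.printf("precision=%.2f recall=%.2f%n",
            precision(mined, truth), recall(mined, truth));
    }
}
```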


Accuracy by size

[Figure: two plots, one for SourceForge and one for Apache, showing Precision, Recall, and F-score (0% to 100%) against data size in projects (1, 2, 4, 8, 16, 32, 64, Full).]


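Usefulness Evaluation

Web-based survey: http://boa.cs.iastate.edu/jml

[Figure: two pie charts. Correctness: slices of 63%, 19%, and 18%, with legend Correct, Good Starting Point, Incorrect. Usefulness: slices of 33%, 48%, 13%, and 6%, with legend Strongly Agree, Agree, Disagree, Strongly Disagree.]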


Key Ideas

Additional labels can be mined from implicit beliefs at the call sites of the code using the APIs

Implicit beliefs mined from multiple projects in a large-scale code corpus can be used to strengthen explicit labels

    void m(…) {
      …
      O o = new O();
      lib.api(o);
      …
    }

Exploiting Implicit Beliefs to Resolve Sparse Usage Problem in Usage-based Specification Mining, OOPSLA’17.
Hoan Nguyen, S. Khairunnesa, Tien N. Nguyen

Problem: Sparse labels in mined code corpus


Key Ideas

Strongest postcondition inference produces implicitly parallel formulas

Flattening and recombining parallel formulas can lead to much simpler inferred specifications.

An Algorithm and Tool to Infer Practical Postconditions, ongoing work.
John Singleton, Gary T. Leavens

Problem: Using extant work, e.g. strongest postcondition (sp), for postcondition inference produces impractical specs

    sp (IF B THEN S1 ELSE S2) P = (sp S1 (P ∧ B)) ∨ (sp S2 (P ∧ ¬B))
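To see how the conditional rule above produces disjunctions, here is a worked instance for an absolute-value statement. The program, the precondition true, and the simplified assignment rule (sp (y := e) Q = (y = e ∧ Q), valid here because y occurs in neither e nor Q) are illustrative assumptions, not taken from the talk.

```latex
\begin{align*}
& \mathit{sp}\ (\textbf{IF } x < 0 \textbf{ THEN } y := -x \textbf{ ELSE } y := x)\ \mathit{true} \\
&\quad = (\mathit{sp}\ (y := -x)\ (\mathit{true} \land x < 0))
        \lor (\mathit{sp}\ (y := x)\ (\mathit{true} \land \lnot (x < 0))) \\
&\quad = (y = -x \land x < 0) \lor (y = x \land \lnot (x < 0))
\end{align*}
```

Each conditional contributes one disjunct per branch, so a sequence of n conditionals can expand to as many as 2^n disjuncts, which is the kind of growth that makes unsimplified sp output impractical to read.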


Specification Reduction

Impact: 84% of specifications < ¼ page in length


We are overcoming lack of software specifications, a critical hurdle for high assurance SE, by combining program analysis and data mining.

boa.cs.iastate.edu
