APISan: Sanitizing API Usages through Semantic Cross-checking - … · 2019-12-18 · APISan:...

Preview:

Citation preview

APISan:SanitizingAPIUsagesthrough

SemanticCross-checking

Insu Yun,Changwoo Min,Xujie Si, Yeongjin Jang,Taesoo Kim,Mayur Naik

GeorgiaInstituteofTechnology

1

APIsintoday’ssoftwareareplentifulyetcomplex

•Example:OpenSSL- 3841 APIsin[v1.0.2h]- 3718in[v1.0.1t] ->3841in[v1.0.2h](+123 APIs)- OpenSSHuses158 APIsofOpenSSL

2

ComplexAPIsresultinprogrammers’mistakes

•Problemsindocumentation- Incomplete:e.g.,lowdetailsinhostnameverification- Long:e.g.,43K linesinOpenSSL documentation- Lack:e.g.,internalAPIs

•Lackofautomatictoolsupport- e.g.,missingformalspecificationandprecisesemantics

3

Problem:APImisusecancausesecurityproblems

4

àMITM

à Codeexecutionà PrivilegeEscalation

Today’spracticestohelpprogrammers

•Formalmethod- Problem:lackofspecification

•Modelchecking- Problem:manual,lackofsemanticcontext

•Symbolicexecution- Problem:failedtoscaleforlargesoftware

5

Promisingapproach:findingbugsbyusingexistingcode

•“Bugsasdeviantbehavior”[OSDI01]- Syntactictemplate:e.g.,checkNULLonmalloc()

•“Juxta”[SOSP15]-Inferringcorrectsemanticsfrommultipleofimplementations-Filesystemspecificbugfindingtool

6

Researchgoal:canweapplythismethodtoany kindofsoftwarewithoutmanualefforts?

Ouridea:comparingAPIusagesinvariousimplementation

•Example:findingOpenSSL APImisuses

7

APISan

Majorityuses(Likelycorrect)

Deviantuses(Likelybug)

…curlcurlcurlnmapcurlnginx

nginxcurlnmapnginxcurlhexchat

Ourapproachisverypromising

•EffectiveinfindingAPImisuses-76newbugs

•Scaletolarge,complexsoftware-Linuxkernel,OpenSSL,PHP,Python,etc.-Debian packages

8

TechnicalChallenges

•APIusesaretoodifferentfromimpl.toimpl.

•SubtlesemanticsofthecorrectAPIuses

•Large,complexcodeusingAPIs

9

Example:OpenSSL APIuses

• SSL_get_verify_result()-Getresultofpeercertificateverification-nopeercertificateà alwaysreturnsX509_V_OK

10

if(SSL_get_verify_result()==X509_V_OK){…}if(SSL_get_verify_result()==X509_V_OK&&SSL_get_peer_certificate()!=NULL ){…}

Example:acorrectimplementationusingOpenSSL API

11

cert =SSL_get_peer_certificate(handle);if (!cert){…}err =SSL_get_verify_result(handle);if (err ==X509_V_OK){…}

curl

CorrectSemanticallysamewithcorrectusage

if(SSL_get_verify_result()==X509_V_OK&&SSL_get_peer_certificate()!=NULL ){…}

Example:providingvariousimplementationsusingOpenSSL

12

cert =SSL_get_peer_certificate(handle);if(!cert){…}err =SSL_get_verify_result(handle);if(err ==X509_V_OK){…}

curl

if(SSL_get_verify_result(conn)!=X509_V_OK)returnNGX_OK;

cert=SSL_get_peer_certificate(conn);if(cert){…}

nginx

cert=SSL_get_peer_certificate(ssl);if(cert==NULL)return0;

if(SSL_get_verify_result(ssl)!=X509_V_OK){…}

nmap

err=SSL_get_verify_result(ssl);switch(err){caseX509_V_OK:cert=SSL_get_peer_certificate(ssl);

hexchat

Correct

Correct

Correct

Incorrect

//if(cert)ismissed

Canwedistinguishbetweencorrect implementationsandbuggy implementations?

Challenge1:APIusagesaredifferentfromeachother

13

cert =SSL_get_peer_certificate(handle);if(!cert){…}err =SSL_get_verify_result(handle);if(err ==X509_V_OK){…}

curl

if(SSL_get_verify_result(conn)!=X509_V_OK)returnNGX_OK;

cert=SSL_get_peer_certificate(conn);if(cert){…}

nginx

cert=SSL_get_peer_certificate(ssl);if(cert==NULL)return0;

if(SSL_get_verify_result(ssl)!=X509_V_OK){…}

nmap

err=SSL_get_verify_result(ssl);switch(err){caseX509_V_OK:cert=SSL_get_peer_certificate(ssl);

hexchat

//if(cert)ismissed

Correct

Correct

Correct

Incorrect

Challenge2:subtlesemanticsofthecorrectAPIusages

14

cert =SSL_get_peer_certificate(handle);if(!cert){…}err =SSL_get_verify_result(handle);if(err ==X509_V_OK){…}

curl

if(SSL_get_verify_result(conn)!=X509_V_OK)returnNGX_OK;

cert=SSL_get_peer_certificate(conn);if(cert){…}

nginx

cert=SSL_get_peer_certificate(ssl);if(cert==NULL)return0;

if(SSL_get_verify_result(ssl)!=X509_V_OK){…}

nmap

err=SSL_get_verify_result(ssl);switch(err){caseX509_V_OK:cert=SSL_get_peer_certificate(ssl);

hexchat

//if(cert)ismissed

Correct

Correct

Correct

Incorrect

Challenge3:Large,complexcodeusingAPIs

•Onaverage,morethan100KLoC-curl:110KLoC-nginx :127KLoC-nmap:169KLoC-hexchat:61KLoC

•Linux:>1MLoC

15

Challenge3:Large,complexcodeusingAPIs

16

cert =SSL_get_peer_certificate(handle);if(!cert){…}...len =BIO_get_mem_data(mem, (char**)&ptr);infof(data, "start date:%.*s\n",len,ptr);rc =BIO_reset(mem);…err =SSL_get_verify_result(handle);if(err ==X509_V_OK){…}

curl

cert =SSL_get_peer_certificate(handle);if(!cert){…}err =SSL_get_verify_result(handle);if(err ==X509_V_OK){…}

curl(simplified)

OverviewofAPISan

17

Returnvaluechecker

Argumentchecker

Causalitychecker

Conditionchecker

4Checkers

SourcecodeSourcecodeSourcecode

APIs Arguments

Constraints

Symbolicexecutiondatabase

RelaxedSymbolicExecution

:minor,butnotbug

:minorandbug

…Minorityuses

...Rankedminorityuses

OverviewofAPISan

18

Returnvaluechecker

Argumentchecker

Causalitychecker

Conditionchecker

4Checkers

SourcecodeSourcecodeSourcecode

APIs Arguments

Constraints

Symbolicexecutiondatabase

RelaxedSymbolicExecution

:minor,butnotbug

:minorandbug

…Minorityuses

...Rankedminorityuses

SymbolicexecutioncanberelaxedinfindingAPIcontexts

•Symbolicexecutionisnotscalable-Pathexplosion-SMTisexpensive,naturallyNP-complete

•Methodstorelaxsymbolicexecution-Limitinginter-proceduralanalysis-Removingbackedges-Range-based

19

Method1:Limitinginter-proceduralanalysis

•HowAPIsareused O•HowAPIsareimplemented X

20

cert =SSL_get_peer_certificate(handle);if(!cert){…}err =SSL_get_verify_result(handle);if(err !=X509_V_OK){…}

Method2:Removingbackedges•APIcontextscanbecapturedwithinloops-e.g.,malloc()andfree()arematchedinsidealoop

21

for(…){cert =SSL_get_peer_certificate(handle);if(!cert){…}err =SSL_get_verify_result(handle);if(err !=X509_V_OK){…}}

Method3:Range-based•Mostofarguments&returnvaluesareinteger

•Clangusesrange-basedsymbolicexecution

22

cert!=NULL∧ err==X509_V_OK

cert= {[-MAX,-1],[1,MAX]}err={[X509_V_OK,X509_V_OK]}

Buildingper-pathsymbolicabstractions

•Path-sensitive,context-sensitive

•Recordsymbolicabstractions-APIcalls-Symbolicexpressionofarguments-Constraints

23

Examples:Buildingper-pathsymbolicabstractionsfromsourcecode

24

Call SSL_get_peer_certificate(handle)

Constraint SSL_get_peer_certificate(handle)={[-MAX,-1],[1, MAX]}

Call SSL_get_verify_result(handle)

Constraint SSL_get_verify_result(handle)={[X509_V_OK, X509_V_OK]}

cert =SSL_get_peer_certificate(handle);if (!cert){…}err =SSL_get_verify_result(handle);if (err ==X509_V_OK){…}

Sourcecode

Symbolicabstractions

Examples:Buildingper-pathsymbolicabstractionsfromsourcecode

25

cert =SSL_get_peer_certificate(handle);if (!cert){…}err =SSL_get_verify_result(handle);if (err ==X509_V_OK){…}

Sourcecode

SymbolicAbstractions#1

….

SymbolicAbstractions#2

SymbolicAbstractions#3

OverviewofAPISan

26

Returnvaluechecker

Argumentchecker

Causalitychecker

Conditionchecker

4Checkers

SourcecodeSourcecodeSourcecode

APIs Arguments

Constraints

Symbolicexecutiondatabase

RelaxedSymbolicExecution

:minor,butnotbug

:minorandbug

…Minorityuses

...Rankedminorityuses

Foursemanticcontextshavesecurityimplications

•Orthogonal,essential,security-relatedcontexts-Returnvalue-Arguments-Causality-Condition

27

Context1:Returnvalue

•Returncomputationresultorexecutionstatus

•NULLdereference•Privilegeescalation-e.g,Windows,CVE-2014-4113

28

ptr =malloc(size)if(!ptr){…}

Context2:Arguments

• InputsforcallingAPIsandtheirrelationship

•Formatstringbug•Memorycorruption

29

printf(buf);

ptr =malloc(size1);memcpy(ptr,src,size2);

Context3:Causality

•CausalrelationshipbetweenAPIs

•Deadlock•Memoryleak

30

lock();unlock();

malloc();free();

Context4:Condition

• Implicitpre- andpostconditionforcallingAPIs

•MITM

31

if(SSL_get_verify_result()==X509_V_OK&&SSL_get_peer_certificate()!=NULL)

Extractcontextsfromsymbolicabstractions

•Symbolicabstractionscontains{APIs,Arguments,Constraints}

•Returnvalue ß Constraints•Arguments ß Arguments•Causality ß APIs•Condition ß Constraints+APIs

32

Example:extractconditioncontextsfromsymbolicabstractions

33

Call SSL_get_peer_certificate(handle)

Constraint SSL_get_peer_certificate(handle)={[-MAX,-1],[1, MAX]}

Call SSL_get_verify_result(handle)

Constraint SSL_get_verify_result(handle)={[X509_V_OK,X509_V_OK]}

curl

Event Line

SSL_get_verify_result={[X509_V_OK,X509_V_OK]} {curl}

Constraint Line

SSL_get_peer_certificate={[-MAX,-1],[1,MAX]} {curl}

… ….

Anyconstraintorcall

Linenumberswheneventiscalled

Example:extractconditioncontextsfromsymbolicabstractions

34

Event Line

SSL_get_verify_result={[X509_V_OK,X509_V_OK]} {curl,nginx}

Constraint Line

SSL_get_peer_certificate={[-MAX,-1],[1,MAX]} {curl,nginx}

… ….

Call SSL_get_verify_result(conn)

Constraint SSL_get_verify_result(handle)== {[X509_V_OK,X509_V_OK]}

Call SSL_get_peer_certificate(conn)

Constraint SSL_get_peer_certificate(conn)!= {[-MAX,-1],[1, MAX]}

nginx

Example:extractconditioncontextsfromsymbolicabstractions

35

Call SSL_get_peer_certificate(ssl)

Constraint SSL_get_peer_certificate(ssl)={[-MAX,-1],[1, MAX]}

Call SSL_get_verify_result(ssl)

Constraint SSL_get_verify_result(ssl)={[X509_V_OK,X509_V_OK]}

nmap

Event Line

SSL_get_verify_result={[X509_V_OK,X509_V_OK]} {curl,nginx,nmap}

Constraint Line

SSL_get_peer_certificate={[-MAX,-1],[1,MAX]} {curl,nginx,nmap}

… ….

Example:extractconditioncontextsfromsymbolicabstractions

36

Call SSL_get_verify_result(ssl)

Constraint SSL_get_verify_result(ssl)={[X509_V_OK,X509_V_OK]}

Call SSL_get_peer_certificate(ssl)

hexchat

Event Line

SSL_get_verify_result={[X509_V_OK,X509_V_OK]}

{curl,nginx,nmap,hexchat}

Constraint Line

SSL_get_peer_certificate={[-MAX,-1],[1,MAX]} {curl,nginx,nmap}

… ….

Example:findmajority&minorityusagesfromcontexts

37

Event Line

SSL_get_verify_result={[X509_V_OK,X509_V_OK]}

{curl,nginx,nmap,hexchat,…}

Constraint Line

SSL_get_peer_certificate={[-MAX,-1],[1,MAX]} {curl,nginx,nmap,…}

… ….

Majorityuses(Likelycorrect)

Deviantuses(Likelybug) =total_event – majority_use ={hexchat,…}

OverviewofAPISan

38

Returnvaluechecker

Argumentchecker

Causalitychecker

Conditionchecker

4Checkers

SourcecodeSourcecodeSourcecode

APIs Arguments

Constraints

Symbolicexecutiondatabase

RelaxedSymbolicExecution

:minor,butnotbug

:minorandbug

…Minorityuses

...Rankedminorityuses

Falsepositivescanbehappenedinmajorityanalysis

•Lackofinter-proceduralanalysis-e.g.,checkareturnvalueofmalloc()insideafunction

•Correlation≠ Causation-e.g.,fprintf()isusedforprintingdebugmessageswhenopen()isfailed

•Correctminoruses-e.g.,strcmp()==0,strcmp()>0

39

Rankingcanmitigatefalsepositives

•Moremajoritypatternrepeated,morebug-likely-e.g.,999majority,1minority>10majority,1minority

•Generalinformation-e.g.,mostofallocationfunctionshave“alloc”intheirnamesandarerequiredtochecktheirreturnvalues

•Domainspecificknowledge-e.g.,SSLAPIsstartwithastring“SSL”

40

Ourapproachisformalizedasageneralframework

41

ImplementationofAPISan

•9KLoC intotal-Symbolicdatabasegeneration:6KLoC ofC/C++(Clang3.6)-APISan library:2KLoC ofPython

•Checkers:1KLoC ofPython-Returnvaluechecker:131LoC-Argumentchecker:251LoC-…

42

Evaluationquestions

•HoweffectiveisAPISan infindingnewbugs?

•Howeasytouseandeasytoextend?

•HoweffectiveisAPISan’s rankingsystem?

43

APISan iseffectiveinfindingbugs

•Found76newbugs inlarge,complexsoftware-Linuxkernel,OpenSSL,PHP,Python,andDebian packages

•Securityimplication-e.g.,CVE-2016-5636:Pythonzipimporterheap overflow(CodeexecutioninGoogleAppEngine)

44

APISan iseasytousewithoutanymanualannotation

•Togeneratesymboliccontextdatabase$apisan make#useexistingbuildcommand

•Runachecker$apisan --checker=cpair #cpair :causalitychecker

•Runachecker(inter-application)$apisan --checker=cpair --db=app1,app2

45

APISan iseasytoextend

•e.g.,Integeroverflowcheck• IntegeroverflowsensitiveAPIs-Havesecurityimplicationswhenintegeroverflowhappens-e.g.,memoryallocationfunctions

• Integeroverflowß Arguments+Constraints-Ifargumentscontainsbinaryoperatorsà checkintegeroverflowwithingivenconstraints

46

CheckintegeroverflowwithAPISan

•Collectallintegeroverflows•Rankingstrategy-Moreintegeroverflowpreventedbyconstraintsà APIsarelikelyintegeroverflowsensitive

-Incorrectconstraints>Missingconstraints;Missingconstraintscanbecausedbylimitedanalysis

•Found6integeroverflows(167LoC)

47

APISan’s rankingsystemiseffective

• LinuxKernelwithReturnValueChecker

• Total2,776reports• Audited445reports• Found54bugs

48

30bugs in20APIs 24bugs in3APIs

15bugsin1APIs

Limitation•Nosoundness&Nocompleteness•Highfalsepositiverate:>80%•Tooslowtofrequentlyanalyze-32-coreXeonserverwith256GBRAM-ForLinuxkernel,

Generatingdatabase:8hoursEachchecker:6hours

•Notfullyresolvepathexplosion-stoppedinfunctionswhichhavepathexplosion

49

Conclusion

•APISan:anautomaticwayforfindingAPImisuse-Effective:Finding76newbugs-Scalable:TestedwithLinuxkernel,Debian packages,etc

•APISan *WILL*bereleasedasopensource-https://github.com/sslab-gatech

50

Thankyou!

Questions?

51