Choosing the right static code analyzers based on hard data

2020 AppSec California – Santa Monica, CA
24 Jan 2020

This material is based on research sponsored by the Department of Homeland Security (DHS) Science and Technology Directorate, Cyber Security Division (DHS S&T/CSD) via contract number HHSP233201600062C. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Department of Homeland Security.


About the speaker

Chris Horn
Senior Researcher
Secure Decisions

Secure Decisions
§ Cyber R&D, focusing on application security
§ Primarily serving DHS and DoD, some commercial companies

GrammaTech, Inc.
§ Prime contractor to DHS Science & Technology for STAMP


Outline of today’s talk

PART I: Introduction
§ Using static analysis is a good idea
§ Knowing which analyzer to use isn’t easy

PART II: Comparing static software analyzers
§ Seven (7) categories of capabilities
§ How we collect information for Kompar

PART III: Kompar system
§ Progress
§ Plans

Introduction

Static software analysis is a good idea, but knowing which analyzer to use isn’t easy

What is this static analysis?

Static analysis is a way to examine software without running / executing it
§ Typically analyzes source code
§ Some analyzers work on compiled binaries
§ Goal is to find quality issues

Analyzers for most languages / formats
§ Open source
§ Commercial / proprietary

ALSO KNOWN AS
§ Static code analysis
§ Static program analysis
§ Source code analysis


Like help from an expert

Photo: https://en.wikipedia.org/wiki/Pair_programming#/media/File:Pair_programming_1.jpg

What types of issues can static analysis find?

Finds reliability, security, performance, and maintainability issues
§ Capabilities depend on the analysis technology
§ Better at finding implementation issues than design issues
  – Buffer handling
  – Code quality
  – Control flow management
  – Encryption and randomness
  – Error handling
  – File handling
  – Formatting
  – Hardcoded secrets
  – Information leaks
  – Initialization and shutdown
  – Injection
  – Number handling
  – Pointer & reference handling

Willis, Chuck, and Kris Britton. “Sticking to the Facts.” Presented at Black Hat 2011, Las Vegas, Nevada, August 4, 2011. https://media.blackhat.com/bh-us-11/Willis/BH_US_11_WillisBritton_Analyzing_Static_Analysis_Tools_Slides.pdf
US-CERT. https://www.us-cert.gov/bsi/articles/tools/source-code-analysis/source-code-analysis-tools---overview

Using analyzers improves code quality & security

“…automated static analysis is an economical complement to other verification and validation techniques.”
- Nortel https://ieeexplore.ieee.org/document/1628970

“We have built a successful static analysis infrastructure at Google that prevents hundreds of bugs per day from entering the Google codebase”
- Google https://cacm.acm.org/magazines/2018/4/226371-lessons-from-building-static-analysis-tools-at-google/fulltext

“If you can find code, and the checked system is big enough, and you can compile (enough of) it, then you will always find serious errors.”
- Coverity https://cacm.acm.org/magazines/2010/2/69354-a-few-billion-lines-of-code-later/fulltext

“INFER is deployed and [run against] every code modification in Facebook’s mobile apps”
- Facebook https://research.fb.com/publications/moving-fast-with-software-verification/

Sign me up, you say

Java application
12,000 LOC
Maven build
IntelliJ IDE

Which analyzers can find SQL injection bugs?
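To make the question concrete, here is the kind of defect a SQL injection checker looks for. The slide's scenario is a Java application; this sketch uses Python and the standard `sqlite3` module purely for illustration, and the table, function names, and data are hypothetical.

```python
import sqlite3

def make_db():
    # In-memory database with a tiny, made-up users table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [("alice", "a-secret"), ("bob", "b-secret")])
    return conn

def find_user_vulnerable(conn, name):
    # Untrusted input is concatenated into the query text: injectable.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Parameterized query: the input is bound as data, never parsed as SQL.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()

conn = make_db()
payload = "x' OR '1'='1"
leaked = find_user_vulnerable(conn, payload)   # every row comes back
blocked = find_user_safe(conn, payload)        # no row has that literal name
```

An analyzer that tracks data flow would be expected to flag the concatenation in `find_user_vulnerable` and stay quiet on `find_user_safe`; whether a given tool actually does so is exactly the information that is hard to find today.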

There is no reliable source of information about software analyzers

Our vision

[image] or [image] for software analyzers

Build Kompar into a source of information about software analyzers, beginning with static tools

Drive adoption through education
§ Benefits of software analysis
§ Catalog available analyzers
§ Relative strengths & weaknesses
§ Set realistic expectations

Improve market transparency
§ Straightforward comparison
§ Standardized, comprehensive rating methodology
§ Informed consumers create pressure on proprietary tool makers to disclose performance

Kompar will benefit multiple stakeholders

More secure & reliable software

Consumers / adopters of analyzers
§ Assurance & security analysts, developers
§ What analyzers are available?
§ How should I evaluate analyzers?
§ Which analyzers will best meet my needs?

Analyzer makers / vendors
§ How can I drive adoption of my analyzer?

Analyzer researchers
§ NIST, MITRE, NSA, academia
§ Where are today’s analyzers weak?
§ Who is interested in my research?

Comparing static software analyzers

There’s more to analyzers than just finding defects

Seven categories of analyzer properties

1. Basic information
§ Software license
§ On-premise vs. cloud hosting
§ Last release date

2. Process integration
§ Analyzer location
§ Software inputs
§ Viewing & managing output

3. Coverage
§ Prog. language & framework
§ Weakness types

4. Speed & scalability
§ Scan time duration
§ Scannable codebase size
§ Number of codebases

5. Results quality
§ Recall, precision, etc.
§ Usability of warnings

6. Reporting

7. Support


1. Basic information

Common, foundational information about an analysis tool

1. Basic information

Is the tool mature & up-to-date?
§ Tool first release date
§ Version release date

Is it run in-house, or off-prem?
§ Self-hosted or SaaS
§ Which operating system will it run on?

Do I have to pay for it?

How is it licensed?

Where can I go to learn more?
§ Tool website


2. Process integration

Capabilities supporting different ways of using an analyzer in my development processes

2. Process integration

“Our most important insight is that careful developer workflow integration is key for static analysis tool adoption.”
- Google https://cacm.acm.org/magazines/2018/4/226371-lessons-from-building-static-analysis-tools-at-google/fulltext

“…desirable that the programmer does not have to do anything else than his/her normal job, they should see analysis results as part of their normal workflow rather than requiring them to switch to a different tool.”
- Facebook https://research.fb.com/publications/moving-fast-with-software-verification/


Process integration

How will the analyzer fit in my dev. environment?

[Diagram of the development environment: Managers, Issue tracker, IDE, Developers, Code, Source control, Build tools, Testing tools, Security & auditors]

When & where will the analyzer run?

Developer workstation
§ Live analysis while coding in IDE?
§ Pre-commit invocation?

Build server
§ CI integration?
§ Command line interface (CLI)?

Standalone server
§ Scheduled scans?
§ Source control integration?

What inputs does the analyzer require?

Incremental changes to code (commit, patch, pull request)

Source code during build process

Full project source code

Compiled binaries

How do people view findings?

Directly in IDE

Web-based graphical user interface (GUI)

API integration
§ SARIF/XML/JSON/CSV
§ Issue tracker
§ Requirements management system
§ Vulnerability management system
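As an illustration of consuming machine-readable output, the sketch below flattens a SARIF 2.1.0 log into simple records that could be pushed to an issue tracker. The sample log, the tool name `ExampleAnalyzer`, and the rule IDs are invented for this example; only the standard `json` module is used.

```python
import json

# Hypothetical SARIF 2.1.0 log with two results (one has no location).
SAMPLE_SARIF = """
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "ExampleAnalyzer"}},
      "results": [
        {
          "ruleId": "SQL_INJECTION",
          "level": "error",
          "message": {"text": "Query built from untrusted input"},
          "locations": [
            {"physicalLocation": {
              "artifactLocation": {"uri": "src/main/java/Login.java"},
              "region": {"startLine": 42}}}
          ]
        },
        {
          "ruleId": "HARDCODED_SECRET",
          "level": "warning",
          "message": {"text": "Password literal in source"}
        }
      ]
    }
  ]
}
"""

def list_findings(sarif_text):
    """Return one flat dict per SARIF result, using the first location if any."""
    log = json.loads(sarif_text)
    findings = []
    for run in log.get("runs", []):
        tool = run["tool"]["driver"]["name"]
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            findings.append({
                "tool": tool,
                "rule": result.get("ruleId"),
                "message": result.get("message", {}).get("text"),
                "file": physical.get("artifactLocation", {}).get("uri"),
                "line": physical.get("region", {}).get("startLine"),
            })
    return findings

findings = list_findings(SAMPLE_SARIF)
```

Real logs carry much more (rule metadata, code flows, fingerprints), so a production integration would likely read additional fields.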


4. Speed & scalability

The ability of an analyzer to handle anticipated workloads

4. Speed & scalability

How large a codebase can the analyzer scan?

How many projects can be scanned?

How long does the analyzer take to work?
§ Scan duration for <software project>, using default checkers
  – BodgeIt Store v1.4.0, Broadleaf Commerce v3.0.3, dotCMS v5.0.3, Hadoop v1.1.2, Jenkins v1.534, OWASP Benchmark v1.2beta, etc.

Does the analyzer have features to handle more work?
§ Parallelize on one host
§ Parallelize across multiple hosts
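One simple way to collect scan durations is to time the analyzer's command-line invocation over a few repeats. This is a minimal sketch, not the Kompar measurement harness; the command shown is a stand-in (a Python process that sleeps), and a real analyzer command line would replace it.

```python
import subprocess
import sys
import time

def time_scan(command, repeats=3):
    """Run `command` several times; return (fastest, mean) wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=False: many analyzers exit non-zero when they report findings,
        # which should not abort the timing run.
        subprocess.run(command, check=False,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return min(durations), sum(durations) / len(durations)

# Stand-in for a real analyzer invocation (assumption for illustration only).
fake_scan = [sys.executable, "-c", "import time; time.sleep(0.05)"]
best, mean = time_scan(fake_scan, repeats=2)
```

Wall-clock time also depends on hardware, caching, and configuration, so comparable numbers need a fixed host and the analyzer's default checkers, as the slide specifies.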


6. Reporting

Capabilities that support presenting information to create understanding and support decision making

6. Reporting

Graphical user interface (GUI)
§ Ability to search results
§ Results remediation workflow
§ Hierarchical reporting for multiple projects, teams, departments, etc.
§ Filter results by compliance standard

Centralized reporting
§ Role-based access

Results suppression even after code changes

Show differences between the current results set and the previous scan
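Suppression that survives code changes and scan-to-scan differences both depend on giving each warning a stable identity. The sketch below shows one possible approach, assumed for illustration rather than taken from any particular analyzer: hash the rule, file, and flagged line text while leaving the line number out. All rule names and source lines are hypothetical.

```python
import hashlib

def fingerprint(rule_id, path, source_lines, line_no):
    """Identify a warning by rule, file, and the flagged line's text.

    The line number is deliberately excluded so the fingerprint stays
    the same when unrelated edits shift the code up or down.
    """
    snippet = " ".join(source_lines[line_no - 1].split())  # normalize whitespace
    raw = "\x00".join([rule_id, path, snippet])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def diff_scans(previous, current):
    """Compare two sets of fingerprints from consecutive scans."""
    return {
        "new": current - previous,
        "fixed": previous - current,
        "unchanged": previous & current,
    }

old_src = ["import os", "os.system(cmd)"]
new_src = ["import os", "", "# run it", "os.system(cmd)", "eval(data)"]

old = {fingerprint("CMD_INJECTION", "app.py", old_src, 2)}
new = {
    fingerprint("CMD_INJECTION", "app.py", new_src, 4),   # same warning, moved
    fingerprint("CODE_INJECTION", "app.py", new_src, 5),  # newly introduced
}

suppressed = set(old)            # the reviewer dismissed the first warning
delta = diff_scans(old, new)
visible = new - suppressed       # only the new warning is shown
```

This scheme treats two identical lines in one file as the same warning, so a real implementation would likely add surrounding context to the hash.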


7. Support

Indicators of how much assistance and guidance is available

7. Support

Guides & documentation
§ Installation
§ User/operator
§ Integration

Open source project health
§ A major undertaking itself
  – Linux Foundation CHAOSS (Community Health Analytics OSS)
  – Black Duck Open Hub


3. Coverage

The extent to which an analyzer can examine my software

Coverage is mostly about two questions

Will the analyzer work on my software?
§ Programming language & framework
§ Binary format

Can the analyzer find the issues I care about?
§ Which issues does the analyzer claim to detect?

Static analyzers have limited weakness coverage

On average, one static analyzer will only detect 14% of all weakness types

Kris Britton and Chuck Willis, “Sticking to the Facts: Scientific Study of Static Analysis Tools”, Sept 2011: http://vimeo.com/32421617
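Weakness coverage can be expressed as the fraction of weakness types in some reference list that an analyzer detects. The sketch below uses a made-up list of ten CWE IDs and two hypothetical tools to show the arithmetic, including how combining analyzers raises coverage; none of the numbers come from the cited study.

```python
def coverage(detected, all_types):
    """Fraction of the reference weakness types that are detected."""
    return len(set(detected) & set(all_types)) / len(all_types)

# Hypothetical reference list of ten weakness types.
ALL_TYPES = {f"CWE-{n}" for n in (22, 78, 79, 89, 119, 190, 327, 476, 798, 835)}

# Hypothetical detection claims for two analyzers.
tool_a = {"CWE-89", "CWE-79"}
tool_b = {"CWE-119", "CWE-476", "CWE-89"}

cov_a = coverage(tool_a, ALL_TYPES)              # 2 of 10
cov_b = coverage(tool_b, ALL_TYPES)              # 3 of 10
cov_both = coverage(tool_a | tool_b, ALL_TYPES)  # 4 of 10; CWE-89 overlaps
```

Because the tools overlap on one type, the combined coverage is less than the sum of the individual figures.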


5. Results quality

How well an analyzer can detect issues and the utility of the reported warnings (how useful they are)

Results quality is also mostly about two questions

Can people understand, trust, and use the generated warnings?
§ Usability of discovered issues / warnings
  – Explanation of warning & suggested severity
  – Confidence information about warning
  – Context around warning
    » Source code
    » Control flow
    » Data flow

How well does the analyzer detect issues?
§ Technical performance measures derived using test suites
  – Precision, recall, discrimination rate
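The three measures can be computed from paired test cases, where each case has an intentionally flawed variant and a matching fixed variant. This is a sketch under stated assumptions: the pairing model and the definition of discrimination rate (flagging the flawed variant without flagging the fixed one) reflect how the term is commonly used with Juliet-style suites, and the `CaseResult` type and sample data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CaseResult:
    flagged_bad: bool    # analyzer warned on the intentionally flawed code
    flagged_good: bool   # analyzer warned on the matching fixed code

def quality_metrics(results):
    tp = sum(r.flagged_bad for r in results)        # real flaws reported
    fn = sum(not r.flagged_bad for r in results)    # real flaws missed
    fp = sum(r.flagged_good for r in results)       # fixed code reported anyway
    # A "discrimination": flaw reported AND the fixed variant left alone.
    discriminations = sum(r.flagged_bad and not r.flagged_good for r in results)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "discrimination_rate": discriminations / len(results) if results else 0.0,
    }

results = [
    CaseResult(flagged_bad=True, flagged_good=False),   # discriminated
    CaseResult(flagged_bad=True, flagged_good=True),    # cannot tell them apart
    CaseResult(flagged_bad=False, flagged_good=False),  # missed
    CaseResult(flagged_bad=True, flagged_good=False),   # discriminated
]
metrics = quality_metrics(results)
```

Here precision and recall are both 0.75 while the discrimination rate is 0.5, which shows why the third measure is useful: an analyzer that warns on everything would score well on recall but never discriminate.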

Kompar platform

Progress & plans

Kompar is more than just a website

Kompar is a set of technologies that work to collect, generate, store, and present information about software analyzers

§ Test suites used to determine
  – Execution speed
  – Results quality (precision, recall, etc.)
§ Automation to orchestrate execution of analyzers against test suites
§ Automation to label results with true/false positive status
§ Collection of tool properties
§ System to crowdsource collection of analyzer information
§ Content to educate consumers
§ Functionality to help people learn which analyzers meet their needs
  – Metrics and scores logic
  – Side-by-side comparisons
  – More
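To illustrate what "learning which analyzers meet their needs" could look like over cataloged properties, the sketch below filters a small in-memory catalog by language, weakness type, and IDE. The `Analyzer` record, its fields, and the three entries are hypothetical and do not describe Kompar's actual data model or any real tool.

```python
from dataclasses import dataclass, field

@dataclass
class Analyzer:
    name: str
    license: str
    languages: set = field(default_factory=set)
    weaknesses: set = field(default_factory=set)
    ide_plugins: set = field(default_factory=set)

def find_analyzers(catalog, language, weakness, ide=None):
    """Names of analyzers that cover the language and weakness (and IDE, if given)."""
    return [
        a.name for a in catalog
        if language in a.languages
        and weakness in a.weaknesses
        and (ide is None or ide in a.ide_plugins)
    ]

# Made-up catalog entries.
catalog = [
    Analyzer("Analyzer A", "open source", {"Java"}, {"CWE-89", "CWE-79"}, {"IntelliJ"}),
    Analyzer("Analyzer B", "commercial", {"Java", "C#"}, {"CWE-89"}, {"Eclipse"}),
    Analyzer("Analyzer C", "open source", {"Python"}, {"CWE-89"}, set()),
]

# The earlier scenario: Java project, SQL injection (CWE-89), IntelliJ IDE.
any_ide = find_analyzers(catalog, "Java", "CWE-89")
in_intellij = find_analyzers(catalog, "Java", "CWE-89", ide="IntelliJ")
```

A side-by-side comparison would then present the remaining property fields of the shortlisted analyzers next to each other.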


Kompar system overview

Analyzer info

Analyzer info crowdsourcing

Analyzer benchmarking

Test suites

Analyzer presentation

Find

View

Compare

Ratings & reviews

Requests to catalog

User analytics & reporting

Kompar website as of Jan 2020

Publicly available
§ https://kompar.tools
§ Catalogs up to 70 analyzer properties

73 tools

Seven categories
§ Basic information
§ Process integration
§ Coverage
§ Speed & scalability
§ Results quality
§ Reporting
§ Support

[Screenshots: Landing page, Analyzer list, Analyzer details]

Challenges ahead

Still much work to do:
§ Information collection
  – Refine questionnaire
  – Collect more details about each analyzer (esp. weakness coverage)
§ Results quality benchmarking
  – Select & augment test suites
  – Run analyzers against test suites
  – Automation to score analyzer warnings
    » “Give credit” for detecting real issues
    » “Deduct credit” for detecting non-issues
§ Expand documentation & comparison functionality on Kompar website
§ Make Kompar self-sustaining
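The "give credit / deduct credit" idea can be sketched as labeling each warning against a test suite's known issues and summing a score. The matching rule (exact file, line, and weakness), the weights, and the sample data are assumptions for illustration; they are not the scoring logic Kompar uses or plans to use.

```python
def score_warnings(warnings, known_issues, credit=1.0, penalty=1.0):
    """Label each warning and return (labeled warnings, total score).

    A warning counts as a true positive only if it exactly matches a known
    issue's (file, line, weakness) tuple; anything else is a false positive.
    """
    labeled = []
    score = 0.0
    for warning in warnings:
        is_true_positive = warning in known_issues
        label = "true positive" if is_true_positive else "false positive"
        labeled.append((warning, label))
        score += credit if is_true_positive else -penalty
    return labeled, score

# Hypothetical ground truth for a test suite.
known_issues = {
    ("Login.java", 42, "CWE-89"),
    ("Upload.java", 10, "CWE-22"),
}

# Hypothetical analyzer output: one real issue, one non-issue.
warnings = [
    ("Login.java", 42, "CWE-89"),
    ("Login.java", 77, "CWE-89"),
]

labeled, score = score_warnings(warnings, known_issues, credit=1.0, penalty=0.5)
```

Exact line matching is strict; analyzers often report a nearby line for the same flaw, so practical scoring would probably need a tolerance window or matching on data-flow endpoints.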


DRAFT Analyzer info. collection, in-progress submissions


DRAFT Analyzer info. collection, analyzer question form


DRAFT Analyzer info. collection, weakness coverage


DRAFT Analyzer info. collection, submit for review

Candidate benchmark test suites

Programming language | Candidate test suites to generate results quality info. | Source
Java | Juliet 1.3 | NIST SARD
Java | Benchmark | OWASP
Java | Application CVEs | NIST SARD, Vulnerability History Project
C/C++ | Juliet 1.3 | NIST SARD
C/C++ | Multiple others | NIST SARD
C/C++ | Application CVEs | NIST SARD, Vulnerability History Project
Python | Application CVEs | Needs curation
C# | C# Vulnerability Test Suite | NIST SARD
PHP | PHP Vulnerability Test Suite | NIST SARD
Visual Basic .NET | Unknown |
JavaScript | Juice Shop | OWASP

NIST SARD test suites: https://samate.nist.gov/SARD/testsuite.php
Vulnerability History Project (curated CVEs): http://vulnerabilityhistory.org/
OWASP Benchmark: https://owasp.org/www-project-benchmark/

Make a contribution

Kompar
https://kompar.tools

Request addition of an analyzer
https://www.surveygizmo.com/s3/4890321/Kompar-Tool-Request

Submit details about your analyzer
https://www.surveygizmo.com/s3/4885264/Kompar-Detailed-Tool-Request

Backlog for managing analyzer cataloging activity
https://trello.com/b/gzrRyvAE

Strike up a [email protected]
