
METRICS FOR SECURITY EFFORT PRIORITIZATION

Christopher Theisen

AGENDA

◦ Motivate Security Metrics

◦ Solution: Attack Surfaces

◦ Results

◦ Undergraduate Contributions

◦ Future Work


Where are the vulnerabilities?


[2] https://arstechnica.com/information-technology/2017/02/microsoft-hosts-the-windows-source-in-a-monstrous-300gb-git-repository/

Windows: 300GB repository, millions of source code files


780,000 cybersecurity jobs in the U.S. (2017)

3,500,000 estimated cybersecurity openings worldwide (2021)

350,000 cybersecurity openings worldwide (2017)

https://cybersecurityventures.com/jobs/


Tons of code…

Rare, expensive vulnerabilities…

Not enough people to find them…

How do we prioritize code for security testing/inspection?

http://www.classroomnook.com/2017/08/dealing-with-teacher-overwhelm.html

ATTACK SURFACE

Definition:

◦ All paths for data and commands in a software system

◦ The data that travels these paths

◦ The code that implements and protects both

Concept used for security effort prioritization.

Hard to measure in practice…


RISK-BASED ATTACK SURFACE APPROXIMATION

Crashes represent activity that puts the system under stress.

Stack Traces tell us what happened.

foo!foobarDeviceQueueRequest+0x68

foo!fooDeviceSetup+0x72

foo!fooAllDone+0xA8

bar!barDeviceQueueRequest+0xB6

bar!barDeviceSetup+0x08

bar!barAllDone+0xFF

center!processAction+0x1034

center!dontDoAnything+0x1030

Pull out individual code artifacts from traces.

If code appears on a crash dump stack trace, it's on the attack surface.
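To make the extraction step concrete, here is a minimal sketch (illustrative only, not the tooling used in this work) that parses frames in the module!function+0xoffset form shown above and records each binary and function seen on a crash stack trace as part of the approximated attack surface:

import re
from collections import defaultdict

# Frame format assumed from the example trace above: module!function+0xOFFSET
FRAME_RE = re.compile(r"^(\w+)!(\w+)\+0x[0-9A-Fa-f]+$")

def parse_frame(line):
    """Return (module, function) for one frame, or None if the line is not a frame."""
    match = FRAME_RE.match(line.strip())
    return (match.group(1), match.group(2)) if match else None

def attack_surface(stack_traces):
    """Map each module to the set of its functions seen on any crash stack trace."""
    surface = defaultdict(set)
    for trace in stack_traces:
        for line in trace.splitlines():
            frame = parse_frame(line)
            if frame:
                module, function = frame
                surface[module].add(function)
    return surface

example = "foo!fooDeviceSetup+0x72\nbar!barAllDone+0xFF\ncenter!processAction+0x1034"
print(dict(attack_surface([example])))  # e.g. {'foo': {'fooDeviceSetup'}, 'bar': {...}, 'center': {...}}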


Crashes are used by attackers!

Great source of forensic data.

HYPOTHESIS

◦ Crashes are empirical evidence of…

▫ Paths through the system with flaws

▫ Data paths through software

◦ Code appearing on crashes is therefore…

▫ More likely to have vulnerabilities

▫ Vulnerabilities more likely to be exploited

Expectation: A high percentage of code with vulnerabilities also crashes.


Use known vulnerabilities as an “oracle” to measure effectiveness

CRASHES – WINDOWS, FIREFOX, FEDORA

◦ Crashing code covers the majority of vulnerable files

▫ Focus testing and review effort on crashing files

▫ A large amount of code is irrelevant for security review

                         Code Coverage   Vulnerability Coverage
Windows (Binaries)       48.4%           94.8%
Firefox (Source Files)   14.8%           85.6%
Fedora (Packages)         8.9%           63.3%
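For reference, the two columns above can be computed with a sketch like the following, assuming three sets per system: all code artifacts, the artifacts that appeared on crash stack traces, and the oracle of artifacts with known vulnerabilities (all names below are illustrative):

def coverage_metrics(all_artifacts, crashing, vulnerable):
    """Code coverage: fraction of all artifacts flagged by crashes.
    Vulnerability coverage: fraction of known-vulnerable artifacts that were flagged."""
    flagged = crashing & all_artifacts
    code_coverage = len(flagged) / len(all_artifacts)
    vulnerability_coverage = len(vulnerable & flagged) / len(vulnerable)
    return code_coverage, vulnerability_coverage

# Example reading of the Windows row: 48.4% of binaries crashed,
# and those binaries contained 94.8% of the known-vulnerable binaries.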


DATA SCALE

◦ Windows – 10 million crashes

◦ Firefox – 1.2 million crashes

◦ Fedora – 47 million crashes

◦ Common complaint:

▫ “We don’t have that much data!”

◦ How does this approach scale down?


DATA SCALE

[Chart: Percent of Vulnerabilities on a Stack Trace (y-axis, roughly 71%–75%) vs. Random Sample Size (x-axis, 10%–100%)]
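One way to probe the scale-down question is a resampling experiment along these lines (a simplified sketch, not the study's actual pipeline; each trace is assumed to be the set of files it touches, and the set of known-vulnerable files is assumed non-empty):

import random

def vulnerability_coverage(traces, vulnerable_files):
    """Fraction of known-vulnerable files that appear on at least one trace."""
    seen = set()
    for trace in traces:
        seen.update(trace)
    return len(vulnerable_files & seen) / len(vulnerable_files)

def scale_down(traces, vulnerable_files, fractions=(0.1, 0.25, 0.5, 1.0), repeats=10):
    """Average vulnerability coverage over random crash samples of each size."""
    results = {}
    for fraction in fractions:
        size = max(1, int(fraction * len(traces)))
        runs = [vulnerability_coverage(random.sample(traces, size), vulnerable_files)
                for _ in range(repeats)]
        results[fraction] = sum(runs) / len(runs)
    return results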

COMPARING VULNERABILITY PREDICTION MODELS

What about other approaches?

◦ Crash Dump Stack Traces

◦ Software Metrics

▫ Lines of Code, Code Churn, # of Developers…

◦ Text Mining

▫ Do specific words/strings mean more vulnerabilities?


COMPARING VULNERABILITY PREDICTION MODELS

                               True Positives   Vulnerability Coverage
Crashes                         5%              86%
Text Mining                     1%              74%
Software Metrics               13%              42%

                               True Positives   Vulnerability Coverage
Crashes+Text Mining            22%              34%
Crashes+Software Metrics       18%              33%
Text Mining+Software Metrics    1%              85%
Crashes+Text+S. Metrics        23%              36%
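A hedged sketch of how such a comparison can be scored, assuming per-artifact flags from each model and reading "True Positives" as precision over flagged artifacts and "Vulnerability Coverage" as recall over known-vulnerable artifacts (the exact definitions and combination strategy in the study may differ; the union combination below is just one option):

def score(flagged, vulnerable):
    """Precision over flagged artifacts and recall over known-vulnerable artifacts."""
    true_positives = len(flagged & vulnerable)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(vulnerable) if vulnerable else 0.0
    return precision, recall

def combine(*model_flags):
    """Combine models by flagging anything that any single model flags."""
    combined = set()
    for flags in model_flags:
        combined |= flags
    return combined

# score(combine(crash_flags, text_mining_flags), vulnerable)  # e.g. a Crashes+Text Mining row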


CONCLUSIONS

◦ Other approaches need an "oracle" of vulnerabilities to build their model.

▫ A known set of vulnerabilities so the model "knows" what a vulnerability looks like.

◦ Crashes have no such restriction; a single metric beats or equals models with tens or hundreds of metrics.

◦ Better to optimize for Vulnerability Coverage.

▫ Professionals at Microsoft, etc. agree!


UNDERGRADUATE RESEARCH


What kinds of vulnerabilities are we covering?

Dawson Tripp – 3rd year undergraduate from NCSU

◦ Learned some basic exploits

◦ Developed classification scheme (CWE w/ caveats)

◦ Mined vulnerabilities from Firefox

◦ Classified vulnerabilities with two graduate students

UNDERGRADUATE CONTRIBUTIONS


What kinds of vulnerabilities are we covering?

Dawson Tripp – 3rd year undergraduate from NCSU

Still using his code today!

UNDERGRADUATE CONTRIBUTIONS


Practical application: visualize the data

Dalisha Rodriguez – 2nd year undergraduate from Hanover (now Fayetteville State)


NEXT STEPS


◦ More public vulnerability datasets

◦ We need metrics that model attacker behavior

◦ Dynamic (runtime) metrics!

▫ Think of crashes as dynamic: generated at runtime

▫ Count "messages" to/from/between objects (AspectJ; see the sketch after this list)

▫ Combine with techniques like fuzzing

◦ Chamoli et al. – dynamic metrics for defect prediction

▫ Better on vulnerabilities…?
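As a rough illustration of counting runtime "messages" (the talk mentions AspectJ for intercepting calls on the JVM), here is an analogous Python sketch using the interpreter's trace hook; it is only a stand-in for the idea, not the proposed tooling:

import sys
from collections import Counter

call_counts = Counter()  # calls ("messages") received per function

def count_calls(frame, event, arg):
    """Global trace hook: count every function call entered while tracing is active."""
    if event == "call":
        code = frame.f_code
        call_counts[(code.co_filename, code.co_name)] += 1
    return None  # no per-line tracing needed

def run_traced(workload):
    """Run a workload under the trace hook and return per-function call counts."""
    sys.settrace(count_calls)
    try:
        workload()
    finally:
        sys.settrace(None)
    return call_counts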


Tons of code…

Rare, expensive vulnerabilities…

Not enough people to find them…

http://www.classroomnook.com/2017/08/dealing-with-teacher-overwhelm.html

MASSIVE OPEN ONLINE COURSES

◦ “Flipped” NCSU’s Software Security course

▫ Iteration 1: ~400 students

▫ Iteration 2: ~1100 students

◦ Lessons Learned:

▫ Amplifies gulf between veterans and newbies

▫ Assume no more than 2-3 hours of work/week

▫ Important to “seed” discussions



SoftSec Materials publicly available!

https://tinyurl.com/ncsu-softsec


[email protected]

theisencr.github.io

RELATED WORK

Data in this talk:

In Submission: [SECDEV '18] Chris Theisen, Hyunwoo Song, Dawson Tripp, Laurie Williams, "What Are We Missing? Vulnerability Coverage by Type"

In Submission: [FSE '18] Chris Theisen, Hyunwoo Song, Dalisha Rodriguez, Laurie Williams, "Better Together: Comparing Vulnerability Prediction Models"

[ICSE – SEIP '17] Chris Theisen, Kim Herzig, Laurie Williams, "Risk-Based Attack Surface Approximation: How Much Data is Enough?"

[ICSE – SEIP '15] Chris Theisen, Kim Herzig, Pat Morrison, Brendan Murphy, and Laurie Williams, "Approximating Attack Surfaces with Stack Traces", in Companion Proceedings of the 37th International Conference on Software Engineering

Other works:

Revising: [IST] Chris Theisen, Nuthan Munaiah, Mahran Al-Zyoud, Jeffery C. Carver, Laurie Williams, "Attack Surface Definitions"

[HotSoS '17] Chris Theisen and Laurie Williams, "Advanced Metrics for Risk-Based Attack Surface Approximation", Proceedings of the Symposium and Bootcamp on the Science of Security

[FSE – SRC '15 – 1st Place] Chris Theisen, "Automated Attack Surface Approximation", in the 23rd ACM SIGSOFT International Symposium on the Foundations of Software Engineering – Student Research Competition, 2015

[SRC Grand Finals '15 – 3rd Place] Chris Theisen, "Automated Attack Surface Approximation", in the 23rd ACM SIGSOFT International Symposium on the Foundations of Software Engineering – Student Research Competition, 2015

IDENTIFYING VULNERABLE CODE

#include <stdio.h>

int main(int argc, char **argv)
{
    char buf[8];          // buffer for eight characters
    gets(buf);            // read a line from stdin; no bounds check, so longer input overflows buf
    printf("%s\n", buf);  // print out data in buf
    return 0;             // 0 as return value
}

https://www.owasp.org/index.php/Buffer_overflow_attack
