73
The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus <[email protected]> Thomas Zimmermann <[email protected]>

The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus Thomas Zimmermann

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

The Beauty and the BeastVulnerabilities in Red Hat’s Packages

Stephan Neuhaus <[email protected]>Thomas Zimmermann <[email protected]>

Page 2: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Vulnerabilities are important because fixing them costs a lot of money (2005 FBI study: 67 Bn $). There are 3241 packages (or were, by August 2008) offered by Red Hat. (There are certainly more being offered for Red Hat!)

Page 3: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Vulnerabilities are important because fixing them costs a lot of money (2005 FBI study: 67 Bn $). There are 3241 packages (or were, by August 2008) offered by Red Hat. (There are certainly more being offered for Red Hat!)

Page 4: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Vulnerabilities are important because fixing them costs a lot of money (2005 FBI study: 67 Bn $). There are 3241 packages (or were, by August 2008) offered by Red Hat. (There are certainly more being offered for Red Hat!)

Page 5: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Explain colours: white = no vulnerabilities, blue -> red: progressively more

Page 6: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Explain colours: white = no vulnerabilities, blue -> red: progressively more

Page 7: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Explain colours: white = no vulnerabilities, blue -> red: progressively more

Page 8: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Explain colours: white = no vulnerabilities, blue -> red: progressively more

Page 9: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Explain colours: white = no vulnerabilities, blue -> red: progressively more

Page 10: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Explain colours: white = no vulnerabilities, blue -> red: progressively more

Page 11: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann
Page 12: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann
Page 13: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Distribution of RHSAs

Number of RHSAs

Num

ber

of

Pac

kag

es

0 8 18 30 41 73 88 112 129

110

100

600

kernel, kernel-doc

php-related

top not shown2/3 of packages

Note logarithmic y-axis. 3241 packages in total, about 2/3 with no known vulnerabilities.

Page 14: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Properties of packages, not properties of the software in the package

Page 15: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Are there properties thatcorrelate with vulnerabilities?

Properties of packages, not properties of the software in the package

Page 16: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Are there properties thatcorrelate with vulnerabilities?

Are there properties thatincrease or decrease the risk?

Properties of packages, not properties of the software in the package

Page 17: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Are there properties thatcorrelate with vulnerabilities?

Are there properties thatincrease or decrease the risk?

Can we predict whether a packagecontains unknown vulnerabilities?

Properties of packages, not properties of the software in the package

Page 18: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Are there properties thatcorrelate with vulnerabilities?

Are there properties thatincrease or decrease the risk?

Can we predict whether a packagecontains unknown vulnerabilities?

✔ Dependencies

Properties of packages, not properties of the software in the package

Page 19: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Are there properties thatcorrelate with vulnerabilities?

Are there properties thatincrease or decrease the risk?

Can we predict whether a packagecontains unknown vulnerabilities?

✔ Dependencies

✔ Beauties and Beasts

Properties of packages, not properties of the software in the package

Page 20: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Are there properties thatcorrelate with vulnerabilities?

Are there properties thatincrease or decrease the risk?

Can we predict whether a packagecontains unknown vulnerabilities?

✔ Dependencies

✔ Machine Learning

✔ Beauties and Beasts

Properties of packages, not properties of the software in the package

Page 21: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Dependencies

Page 22: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

amanda-server

Dependencies

Page 23: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

amanda-server

glibc

Dependencies

Page 24: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

amanda-server

readline

amanda glibc xinetd

gnuplot

grep

libtermcapcoreutils

perl

Dependencies

Page 25: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Dependencies and Vulnerabilities

• Dependency A → B exists because A wants to use the services offered by B

• Vulnerability exists in A if

• A is in an insecure domain (domains are characterised by dependencies)

• B is insecure and fix in B spills over to A; or

• B is difficult to use securely

Packages in same domain will tend to have same dependencies.Domain examples are: compilers, games, office applications,

Page 26: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Red Hat Dependencies

Page 27: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

010

020

030

040

0

Distribution of Package Dependencies

Number of Packages

Num

ber o

f Dep

ende

ncie

s

0 4 8 13 19 25 31 37 43 50 56 62 75 81 88 96

kdebase

development packagescontaining headers

Distribution is apparently logarithmic with a long tail. This is not transitive closure. kdebase has 14 RHSAs (but 96 dependencies), kernel has 129 (but 0 dependencies), so number of dependencies is not a good predictor of number of RHSAs

Page 28: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Are there properties thatcorrelate with vulnerabilities?

Are there properties thatincrease or decrease the risk?

Can we predict whether a packagecontains unknown vulnerabilities?

✔ Dependencies

✔ Machine Learning

✔ Beauties and Beasts

Page 29: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Where does the addition of dependencies significantly increase/

decrease the risk?

Page 30: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Where does the addition of dependencies significantly increase/

decrease the risk?

1. Data structure: concept lattice

Page 31: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Where does the addition of dependencies significantly increase/

decrease the risk?

1. Data structure: concept lattice

2. Compute change in risk

Page 32: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Where does the addition of dependencies significantly increase/

decrease the risk?

1. Data structure: concept lattice

2. Compute change in risk

3. Include only statistically significant changes

Page 33: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Step 1: Data Structure

Start with no knowledge about dependencies (top node contains all packages). Add knowledge of glibc (node contains all packages depending on glibc), then qt (node contains all packages depending on qt and glibc), then xorg-x11-libs (node contains all packages depending on xorg-x11-libs and qt and glibc). Since we know the packages contained in each node, we can compute the probability of a package in this node being vulnerable.

Page 34: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Step 1: Data Structure

Start with no knowledge about dependencies (top node contains all packages). Add knowledge of glibc (node contains all packages depending on glibc), then qt (node contains all packages depending on qt and glibc), then xorg-x11-libs (node contains all packages depending on xorg-x11-libs and qt and glibc). Since we know the packages contained in each node, we can compute the probability of a package in this node being vulnerable.

Page 35: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Block 1: All packages depending on glibc

glibc

Step 1: Data Structure

Start with no knowledge about dependencies (top node contains all packages). Add knowledge of glibc (node contains all packages depending on glibc), then qt (node contains all packages depending on qt and glibc), then xorg-x11-libs (node contains all packages depending on xorg-x11-libs and qt and glibc). Since we know the packages contained in each node, we can compute the probability of a package in this node being vulnerable.

Page 36: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Block 1: All packages depending on glibc

kdelibs

glibc

Step 1: Data Structure

Start with no knowledge about dependencies (top node contains all packages). Add knowledge of glibc (node contains all packages depending on glibc), then qt (node contains all packages depending on qt and glibc), then xorg-x11-libs (node contains all packages depending on xorg-x11-libs and qt and glibc). Since we know the packages contained in each node, we can compute the probability of a package in this node being vulnerable.

Page 37: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Block 1: All packages depending on glibcBlock 2: All packages depending on glibc, qtBlock 3: All packages depending on glibc, qt, xorg-x11-libs

kdelibs

qt

xorg-x11-libs

glibc

Step 1: Data Structure

Start with no knowledge about dependencies (top node contains all packages). Add knowledge of glibc (node contains all packages depending on glibc), then qt (node contains all packages depending on qt and glibc), then xorg-x11-libs (node contains all packages depending on xorg-x11-libs and qt and glibc). Since we know the packages contained in each node, we can compute the probability of a package in this node being vulnerable.

Page 38: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Block 1: All packages depending on glibcBlock 2: All packages depending on glibc, qtBlock 3: All packages depending on glibc, qt, xorg-x11-libs

kdelibs

qt

xorg-x11-libs

glibc

Step 1: Data Structure

Start with no knowledge about dependencies (top node contains all packages). Add knowledge of glibc (node contains all packages depending on glibc), then qt (node contains all packages depending on qt and glibc), then xorg-x11-libs (node contains all packages depending on xorg-x11-libs and qt and glibc). Since we know the packages contained in each node, we can compute the probability of a package in this node being vulnerable.

Page 39: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

∅32.9% vulnerable

(1065 out of 3241)

glibc33.5% vulnerable(692 out of 2066)

kdelibs85.6% vulnerable(143 out of 167)

glibc, qt77.4% vulnerable(120 out of 155)

glibc, qt, xorg-x11-libs79.4% vulnerable

(27 out of 34)

Step 2:Compute Risk Change

Question: Is the rise of 43.9% when going from {glibc} to {glibc, qt} just some random fluctuation? We test this using statistical tests (Chi^2 or Fischer exact) and discard the “random fluctuation” hypothesis when the probability of such a increase happening by chance is 1% or less. So we expect that we wrongly attribute an increase to an actual effect 1% of the time.

Page 40: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

∅32.9% vulnerable

(1065 out of 3241)

glibc33.5% vulnerable(692 out of 2066)

kdelibs85.6% vulnerable(143 out of 167)

glibc, qt77.4% vulnerable(120 out of 155)

glibc, qt, xorg-x11-libs79.4% vulnerable

(27 out of 34)

+0.6% +52.7%

+43.9%

+2.0%

Step 2:Compute Risk Change

Question: Is the rise of 43.9% when going from {glibc} to {glibc, qt} just some random fluctuation? We test this using statistical tests (Chi^2 or Fischer exact) and discard the “random fluctuation” hypothesis when the probability of such a increase happening by chance is 1% or less. So we expect that we wrongly attribute an increase to an actual effect 1% of the time.

Page 41: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

∅32.9% vulnerable

(1065 out of 3241)

glibc33.5% vulnerable(692 out of 2066)

kdelibs85.6% vulnerable(143 out of 167)

glibc, qt77.4% vulnerable(120 out of 155)

glibc, qt, xorg-x11-libs79.4% vulnerable

(27 out of 34)

+0.6% +52.7%

+43.9%

+2.0%

Risk change by adding qtonly when already dependenton glibc! (glibc is the context)

Step 2:Compute Risk Change

Question: Is the rise of 43.9% when going from {glibc} to {glibc, qt} just some random fluctuation? We test this using statistical tests (Chi^2 or Fischer exact) and discard the “random fluctuation” hypothesis when the probability of such a increase happening by chance is 1% or less. So we expect that we wrongly attribute an increase to an actual effect 1% of the time.

Page 42: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

• Risk changes with significance p < 0.01

• No significant and more general context exists for this dependency

• Risk goes up: “beast”

• Risk goes down: “beauty”

Step 3: Include Only Significant Changes

Page 43: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Context Dependency Risk before Risk after Change

∅ openoffice.org-core 0.329 1.000 0.671

∅ kdelibs 0.329 0.856 0.527

∅ cups-libs 0.329 0.774 0.445

∅ libmng 0.329 0.769 0.440

glibc qt 0.335 0.774 0.439

glibc krb5-libs 0.335 0.769 0.434

Selected BeastsThe complete list can be found in the paper

Explain packages, don’t just list names

Page 44: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Context Dependency Risk before Risk after Change

glibc xorg-x11-server-Xorg 0.335 0.015 -0.320compat-

glibc, glibc, zlib

audiofile 0.613 0.359 -0.254

glibc, glibc-debug, zlib audiofile 0.590 0.351 -0.239

∅ gnome-keyring 0.329 0.101 -0.228

glibc, zlib gnome-libs 0.456 0.281 -0.175

∅ python 0.329 0.132 -0.197

Selected BeautiesThe complete list can be found in the paper

Explain possible consequences: new applications: choose less risky dependencies

Page 45: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Are there properties thatcorrelate with vulnerabilities?

Are there properties thatincrease or decrease the risk?

Can we predict whether a packagecontains unknown vulnerabilities?

✔ Dependencies

✔ Machine Learning

✔ Beauties and Beasts

Page 46: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Is it possible to predict…

• from the dependencies which packages are vulnerable (classification)?

• which packages will have the most vulnerabilities (ranking)?

Page 47: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Experiment

X Y

Dep

ende

ncie

s

Vuln

erab

ilitie

s

Repeat 50xThis “self-testing” is a standard evaluation technique for machine learning methods

Page 48: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Experiment

X Y

Dep

ende

ncie

s

Vuln

erab

ilitie

s

Repeat 50xThis “self-testing” is a standard evaluation technique for machine learning methods

Page 49: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Experiment

f

X YTrain

Model

Dep

ende

ncie

s

Vuln

erab

ilitie

s

Repeat 50xThis “self-testing” is a standard evaluation technique for machine learning methods

Page 50: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Experiment

f

X YTrain

Test

Y’

Model

Dep

ende

ncie

s

Vuln

erab

ilitie

s

Repeat 50xThis “self-testing” is a standard evaluation technique for machine learning methods

Page 51: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Experiment

f

X YTrain

Test

Y’

Model

Dep

ende

ncie

s

Vuln

erab

ilitie

s

Repeat 50xThis “self-testing” is a standard evaluation technique for machine learning methods

Page 52: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Indicators

Don’t mention -1. We want values near 1.

Page 53: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

IndicatorsC

lass

ifica

tion

Don’t mention -1. We want values near 1.

Page 54: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Indicatorsprecision =

true positives

true positives + false positives

recall = true positives

true positives + false negativesCla

ssifi

catio

n

Don’t mention -1. We want values near 1.

Page 55: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Indicatorsprecision =

true positives

true positives + false positives

recall = true positives

true positives + false negativesCla

ssifi

catio

n

0

1

Don’t mention -1. We want values near 1.

Page 56: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Indicatorsprecision =

true positives

true positives + false positives

recall = true positives

true positives + false negativesCla

ssifi

catio

nR

anki

ng

0

1

Don’t mention -1. We want values near 1.

Page 57: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Indicatorsprecision =

true positives

true positives + false positives

recall = true positives

true positives + false negatives

1234

1234

Cla

ssifi

catio

nR

anki

ng

0

1

1

Don’t mention -1. We want values near 1.

Page 58: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Indicatorsprecision =

true positives

true positives + false positives

recall = true positives

true positives + false negatives

1234

1234

1234

2413

Cla

ssifi

catio

nR

anki

ng

0

1

0 1

Don’t mention -1. We want values near 1.

Page 59: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Indicatorsprecision =

true positives

true positives + false positives

recall = true positives

true positives + false negatives

1234

4321

1234

1234

1234

2413

Cla

ssifi

catio

nR

anki

ng

0

1

-1 0 1

Don’t mention -1. We want values near 1.

Page 60: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Indicatorsprecision =

true positives

true positives + false positives

recall = true positives

true positives + false negatives

1234

4321

1234

1234

1234

2413

Cla

ssifi

catio

nR

anki

ng

0

1

-1 0 1

Don’t mention -1. We want values near 1.

Page 61: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

●●

●●●

●●

●●

●●

●●

●●

● ●●●

●●

● ●

●●

●● ●

0.4 0.5 0.6 0.7 0.8 0.9

0.4

0.5

0.6

0.7

0.8

0.9

Precision versus Recall

Recall

Prec

ision

● SVMDecision Tree

Results of 50 random splits: train with 2/3 of the packages, predict with the rest, record precision and recall.

Page 62: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

●●

●●●

●●

●●

●●

●●

●●

● ●●●

●●

● ●

●●

●● ●

0.4 0.5 0.6 0.7 0.8 0.9

0.4

0.5

0.6

0.7

0.8

0.9

Precision versus Recall

Recall

Prec

ision

● SVMDecision Tree

Results of 50 random splits: train with 2/3 of the packages, predict with the rest, record precision and recall.

Page 63: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

●●

●●●

●●

●●

●●

●●

●●

● ●●●

●●

● ●

●●

●● ●

0.4 0.5 0.6 0.7 0.8 0.9

0.4

0.5

0.6

0.7

0.8

0.9

Precision versus Recall

Recall

Prec

ision

● SVMDecision Tree

Results of 50 random splits: train with 2/3 of the packages, predict with the rest, record precision and recall.

Page 64: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

●●

●●●

●●

●●

●●

●●

●●

● ●●●

●●

● ●

●●

●● ●

0.4 0.5 0.6 0.7 0.8 0.9

0.4

0.5

0.6

0.7

0.8

0.9

Precision versus Recall

Recall

Prec

ision

● SVMDecision Tree

Decision Trees worse than SVMs

Results of 50 random splits: train with 2/3 of the packages, predict with the rest, record precision and recall.

Page 65: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

●●

●●●

●●

●●

●●

●●

●●

● ●●●

●●

● ●

●●

●● ●

0.4 0.5 0.6 0.7 0.8 0.9

0.4

0.5

0.6

0.7

0.8

0.9

Precision versus Recall

Recall

Prec

ision

● SVMDecision Tree

Decision Trees worse than SVMs

Results of 50 random splits: train with 2/3 of the packages, predict with the rest, record precision and recall.

Page 66: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

●●

●●●

●●

●●

●●

●●

●●

● ●●●

●●

● ●

●●

●● ●

0.4 0.5 0.6 0.7 0.8 0.9

0.4

0.5

0.6

0.7

0.8

0.9

Precision versus Recall

Recall

Prec

ision

● SVMDecision Tree

Decision Trees worse than SVMs

Results of 50 random splits: train with 2/3 of the packages, predict with the rest, record precision and recall.

Page 67: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

●●

●●●

●●

●●

●●

●●

●●

● ●●●

●●

● ●

●●

●● ●

0.4 0.5 0.6 0.7 0.8 0.9

0.4

0.5

0.6

0.7

0.8

0.9

Precision versus Recall

Recall

Prec

ision

● SVMDecision Tree

Predictions are correct83% of the time

65% of all vulnerablepackages predicted

Results of 50 random splits: train with 2/3 of the packages, predict with the rest, record precision and recall.

Page 68: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

0.52 0.54 0.56 0.58 0.60 0.62

0.0

0.2

0.4

0.6

0.8

1.0

Cumulative Rank Correlation

Rank Correlation Coefficient

Frac

tion

of S

plits

●●

●●

●●

●●●

●●●●●●●●

●●●

●●●●●●●

●●●●●●

●●●●●●●●

●●●

●●

●●

●●

Even though “self-evaluation” is a standard technique, what we realy want to know is if the method is able to predict the future... (next slide)

Page 69: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

January 1, 2008 August 31, 2008

predict evaluate

Top 25 out of 2181 73 new vulnerable

Package Name

mod_php

php-dbg

php-dbg-server

perl-DBD-Pg

kudzu

irda-utils

hpoj

libbdevid-python

mrtg

evolution28-evolution-data-server

lilo

ckermit

dovecot

kde2-compat

gq

vorbis-tools

k3b

taskjuggler

ddd

tora

libpurple

libwvstreams

pidgin

linuxwacom

policycoreutils-newrole

2156 further packages

Page 70: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

January 1, 2008 August 31, 2008

predict evaluate

Top 25 out of 2181 73 new vulnerable

Package Name

mod_php

php-dbg

php-dbg-server

perl-DBD-Pg

kudzu

irda-utils

hpoj

libbdevid-python

mrtg

evolution28-evolution-data-server

lilo

ckermit

dovecot

kde2-compat

gq

vorbis-tools

k3b

taskjuggler

ddd

tora

libpurple

libwvstreams

pidgin

linuxwacom

policycoreutils-newrole

2156 further packages

Page 71: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

January 1, 2008 August 31, 2008

predict evaluate

Top 25 out of 2181 73 new vulnerable

Patch published 2009-05-12

Package Name

mod_php

php-dbg

php-dbg-server

perl-DBD-Pg

kudzu

irda-utils

hpoj

libbdevid-python

mrtg

evolution28-evolution-data-server

lilo

ckermit

dovecot

kde2-compat

gq

vorbis-tools

k3b

taskjuggler

ddd

tora

libpurple

libwvstreams

pidgin

linuxwacom

policycoreutils-newrole

2156 further packages

Page 72: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Consequences

• When building new applications, choose less risky dependencies

– use GNU-SASL instead of cyrus-sasl, Gnome instead of KDE

• When maintaining existing applications, prioritise resources

– look at krb5-libs, not at gkermit

Page 73: The Beauty and the Beast...The Beauty and the Beast Vulnerabilities in Red Hat’s Packages Stephan Neuhaus  Thomas Zimmermann

Conclusions

• Vulnerabilities correlate with dependencies

• Identification of risky dependencies

• Prediction with high precision, recall, correlation

http://research.microsoft.com/projects/esm/http://www.artdecode.de/

* Have we worked with Red Hat: yes, have received positive feedback* Usage Data: nonexistent* Explain Correlation: See previous slide: domains* This is not causation: true, but we have high predictive value, so who cares?* Base Set: future work