
Marco Vieira, mvieira@dei.uc.pt

Department of Informatics Engineering, University of Coimbra - Portugal

BENCHMARKING THE SECURITY OF SOFTWARE SYSTEMS OR

TO BENCHMARK OR NOT TO BENCHMARK

QRS 2018, Lisbon, Portugal, July 19th, 2018

Marco Vieira, QRS 2018, Lisbon, Portugal, July 19th, 2018

BENCHMARKING

Assessing and comparing computer systems and/or components according to specific quality attributes

§ Performance benchmarking
  – Well established both in terms of research and application
  – Supported by organizations like TPC and SPEC
  – Mostly for marketing

§ Dependability benchmarking
  – Well established from a research perspective
  – No endorsement from the industry

BENCHMARKING

§ Security benchmarking
  – Several works can be found
  – No common approach available yet

[Figure: timeline of benchmarking milestones with tracks for performance benchmarks, dependability benchmarks, and security benchmarks. Year marks: 1972, 1983, 1985, 1987, 1988, 1999, 2000, 2017. Labels: Whetstone, Wisconsin Bench, TP1, DebitCredit, Orange Book, TPC & SPEC, EMBC, SIGDeB, Common Criteria, CIS. Annotation: release of commercial performance benchmarks… research projects on dependability & security benchmarks.]

OUTLINE

§ The past: Performance & Dependability Benchmarking

§ The present: Security Benchmarking

§ Benchmarking the Security of Systems
  – Approach: Qualification + Trustworthiness Assessment
  – Example: Benchmarking Web Service Frameworks

§ Benchmarking Security Tools
  – Approach: Vulnerability and Attack Injection
  – Example: Benchmarking Intrusion Detection Systems

§ Challenges and Conclusions

PERFORMANCE BENCHMARKING

Assessing and comparing computer systems and/or components in terms of performance

PERFORMANCE BENCHMARKING

[Diagram: Workload → SUB (system under benchmarking) → Metrics]

§ Workload:
  – Set of representative operations

§ Metrics:
  – Throughput
  – Response time
  – Latency
  – …
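A minimal sketch of how a workload driver might report two of the metrics above, throughput and response time. The names `run_benchmark` and `sample_operation` are hypothetical, and the operation is only a stand-in for one representative workload operation; real benchmarks such as TPC-C define much stricter run rules.

```python
import time
from statistics import mean


def run_benchmark(operation, n_operations):
    """Run `operation` n_operations times and report simple performance metrics."""
    response_times = []
    start = time.perf_counter()
    for _ in range(n_operations):
        t0 = time.perf_counter()
        operation()
        response_times.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "throughput_ops_per_s": n_operations / elapsed,
        "mean_response_time_s": mean(response_times),
    }


def sample_operation():
    # Hypothetical stand-in for one representative workload operation
    sum(range(1000))


if __name__ == "__main__":
    print(run_benchmark(sample_operation, 1000))
```

The measured values depend on the machine, so no expected output is shown.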


TPC-C (1992)

[Diagram: Workload → DBMS → Metrics]

§ Workload:
  – Database transactions

§ Metrics:
  – Transaction rate (tpmC)
  – Price per transaction ($/tpmC)

Although some integrity tests are performed, TPC-C assumes that nothing fails.

DEPENDABILITY BENCHMARKING

Assessing and comparing computer systems and/or components considering dependability attributes

DEPENDABILITY BENCHMARKING

[Diagram: Workload + Faultload → SUB → Experimental metrics; Experimental metrics + Parameters (fault rates, MTBF, etc.) → Models → Unconditional metrics]

§ Faultload:
  – Set of representative faults, injected into the system

§ Metrics:
  – Performance and/or dependability
    • Both baseline and in the presence of faults
  – Unconditional and/or direct

DBENCH-OLTP (2005)

[Diagram: Workload + Faultload → SUB → Experimental metrics]

§ Workload:
  – TPC-C transactions

§ Faultload:
  – Operator faults + Software faults + HW component failures

§ Metrics:
  – Performance: tpmC, $/tpmC, Tf, $/Tf
  – Dependability: Ne, AvtS, AvtC
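The slide names Tf and AvtS without defining them. The sketch below assumes that Tf is the transaction rate per minute measured while the faultload is applied and that AvtS is the percentage of time the server is available. The per-minute log format and the function name `dependability_metrics` are hypothetical.

```python
def dependability_metrics(minutes):
    """Compute (Tf, AvtS) from a run with faults.

    `minutes` is a list of (transactions_committed, server_available) tuples,
    one per minute of the run (assumed log format).
    """
    total = len(minutes)
    tf = sum(tx for tx, _ in minutes) / total                     # transactions per minute
    avts = 100.0 * sum(1 for _, up in minutes if up) / total      # % of minutes available
    return tf, avts


run_with_faults = [(100, True), (0, False), (50, True), (100, True)]
print(dependability_metrics(run_with_faults))  # (62.5, 75.0)
```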

DBENCH-OLTP (2005)

Faultload: Operator faults

DBENCH-OLTP (2005)

[Figures: results for systems A to K in three panels. Baseline Performance: tpmC and $/tpmC. Performance With Faults: Tf and $/Tf. Availability (%): AvtS (Server) and AvtC (Clients).]

Does not take into account malicious behaviors (faults = vulnerability + attack)


SECURITY BENCHMARKING

Assessing and comparing computer systems and/or components considering security aspects

§ Benchmarking the Security of Systems/Components
  – Systems that should implement security requirements
  – OS, middleware, server software, etc.

§ Benchmarking Security Tools
  – Tools used to improve the security of systems
  – Penetration testers, static analyzers, IDS, etc.

BENCHMARKING SECURITY OF SYSTEMS

[Diagram: Workload + Attackload → SUB → Experimental metrics; Experimental metrics + Parameters (vulnerability exposure, mean time between attacks, etc.) → Models → Unconditional metrics]

§ Attackload:
  – Representative attacks

§ Metrics:
  – Performance + dependability
  – Security (e.g., number of vulnerabilities, attack detection)

Attacking what? Do we know the vulnerabilities? What are representative attacks?

Does not work if one wants to benchmark how secure different systems are!

e.g., does the number of vulnerabilities of a system represent anything?

A DIFFERENT APPROACH…

[Diagram: SUBs → Security Qualification → Unacceptable (Security = 0)]

§ Security Qualification:
  – Apply state-of-the-art techniques and tools to detect vulnerabilities
  – SUBs with vulnerabilities are:
    • Disqualified!
    • Or vulnerabilities are fixed…

A DIFFERENT APPROACH…

[Diagram: SUBs → Security Qualification → Unacceptable (Security = 0) or Acceptable → Trustworthiness Assessment → Metrics]

§ Trustworthiness Assessment:
  – Gather evidence on how much one can trust
  – e.g., best coding practices, development process, bad smells

A DIFFERENT APPROACH…

§ Metrics:
  – Portray trust from a user perspective
  – Dynamic: may change over time
  – Depend on the type of evidence gathered
  – Different metrics for different attack vectors
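The two-phase flow (qualification gate, then trustworthiness assessment) can be sketched as follows. This is an illustration only: the `Candidate` type, the evidence names, and the weighted-average score are assumptions, not the quality model used in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    vulnerabilities: list = field(default_factory=list)  # findings from qualification
    evidence: dict = field(default_factory=dict)          # evidence name -> score in [0, 1]


def benchmark(candidates, weights):
    """Return a trust score per candidate; disqualified candidates get 0."""
    total_weight = sum(weights.values())
    results = {}
    for c in candidates:
        if c.vulnerabilities:
            results[c.name] = 0.0  # Unacceptable: Security = 0
            continue
        # Assumed scoring rule: weighted average of the gathered evidence
        weighted = sum(w * c.evidence.get(k, 0.0) for k, w in weights.items())
        results[c.name] = weighted / total_weight
    return results


candidates = [
    Candidate("A", vulnerabilities=["coercive parsing"]),
    Candidate("B", evidence={"coding_practices": 0.8, "process": 0.6}),
]
print(benchmark(candidates, {"coding_practices": 1.0, "process": 1.0}))
```

Changing the weights is one way to model the point that metrics portray trust from a particular user's perspective.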

EXAMPLE: WEB SERVICE FRAMEWORKS

[Diagram: WSFs → Qualification (testing) → Unacceptable (Security = 0) or Acceptable → Assessment (CPU + mem.) → Trust. Score]

§ Qualification
  – DoS Attacks
  – Coercive Parsing, Malformed XML, Malicious Attachment, etc.

§ Trustworthiness Assessment:
  – Quality model to compute a score


QUALITY MODEL


SYSTEMS UNDER BENCHMARKING


TRUSTWORTHINESS RESULTS

BENCHMARKING SECURITY TOOLS

[Diagram elements: Workload, Faultload (vulnerabilities + attacks), SUB, Data, Sec. Tool, Experimental metrics]

§ Faultload:
  – Vulnerabilities are injected
  – Attacks target the injected vulnerabilities

§ Data can be collected for benchmarking security tools
  – Penetration testers, static analyzers, IDS, etc.


VULNERABILITY AND ATTACK INJECTION

EXAMPLE: BENCHMARKING IDS

§ Security requires a defense in depth approach
  – Coding best practices
  – Testing
  – Static analysis
  – …

§ Vulnerability-free code is hard (or even impossible) to achieve...

§ Intrusion detection tools support a post-deployment approach
  – For protecting against known and unknown attacks


EVALUATION APPROACH

EXAMPLES OF VULNERABILITIES INJECTED

Original PHP code | Code with injected vulnerability | Operation performed
$id=intval($_GET['id']); | $id=$_GET['id']; | Removed the “intval” function, allowing also non-numeric values (i.e. SQL commands) in the “$id” variable
$page = urlencode($page); | $page = $page; | Removed the “urlencode” function, allowing also alphanumeric values (i.e. SQL commands) in the “$page” variable
… | … | …
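The two mutations in the table can be reproduced with a small text transformation. This is a sketch under assumptions: the function name `inject_vulnerability` and the regex-based rule are hypothetical and handle only calls whose argument has no nested parentheses; the injection tool used in the talk is not described on the slide.

```python
import re

# Assumed list of sanitizing functions whose call is removed
SANITIZERS = ("intval", "urlencode")


def inject_vulnerability(php_line, sanitizers=SANITIZERS):
    """Replace each sanitizer call with its bare argument, e.g. intval(x) -> x."""
    names = "|".join(re.escape(name) for name in sanitizers)
    # Matches name(arg) where arg contains no nested parentheses
    pattern = re.compile(r"\b(?:" + names + r")\(([^()]*)\)")
    return pattern.sub(lambda match: match.group(1), php_line)


print(inject_vulnerability("$id=intval($_GET['id']);"))   # $id=$_GET['id'];
print(inject_vulnerability("$page = urlencode($page);"))  # $page = $page;
```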

EXAMPLES OF ATTACKS

Attack payload | Expected result
' | Modifies the structure of the query; usually results in an error
or 1=1 | Modifies the structure of the query. Overrides the query restrictions by adding a statement that is always true.
' or 'a'='a | Modifies the structure of the query. Overrides the query restrictions by adding a statement that is always true.
+connection_id()-connection_id() | Modifies the query result to 0
+1-1 | Modifies the query result to 0
+67-ASCII('A') | Modifies the query result to 0
+51-ASCII(1) | Modifies the query result to 0
… | …
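To see why these payloads have the listed effects, the sketch below appends three of them to a numeric parameter that is concatenated into a query. It uses Python's built-in `sqlite3` with an in-memory table as a stand-in for the PHP/MySQL setup in the talk; the table, data, and function name `run_query` are assumptions, and MySQL-specific payloads such as `connection_id()` are not covered.

```python
import sqlite3


def run_query(user_input):
    """Run a deliberately vulnerable query built by string concatenation."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")
        conn.executemany(
            "INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")]
        )
        # Vulnerable on purpose: the input becomes part of the SQL text
        sql = "SELECT name FROM items WHERE id=" + user_input
        return [row[0] for row in conn.execute(sql)]
    finally:
        conn.close()


print(run_query("1"))           # legitimate request: one row
print(run_query("1 or 1=1"))    # always-true condition: every row
print(run_query("1+1-1"))       # arithmetic payload: same row as "1"
try:
    run_query("1'")             # stray quote breaks the query structure
except sqlite3.Error as exc:
    print("error:", exc)
```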

SYSTEMS UNDER BENCHMARKING

Tool | Architectural level monitored | Detection approach | Data source | Known technology limitations
ACD | Application | Anomaly based | Apache log | Only GET method
Apache Scalp | Application | Signature based | Apache log | Only GET method
ModSecurity | Application | Signature based | HTTP traffic | -
Snort (v2.8 and v2.9) | Network | Signature based | Network traffic | -
GreenSQL | Database | Signature based | SQL proxy traffic | MySQL data
DB IDS | Database | Anomaly based | SQL sniffer traffic | MySQL and Oracle data


EXPERIMENTAL SETUP


MAIN RESULTS

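Lvl | Tool | P | N | Pop | TP | TN | FN | FP | Prec. | Recall | Mark. | Infor.
App | ACD | 1051 | 224 | 1275 | 376 | 174 | 675 | 50 | 0.883 | 0.358 | 0.088 | 0.135
App | Scalp | 1051 | 224 | 1275 | 206 | 224 | 845 | 0 | 1.000 | 0.196 | 0.210 | 0.196
App | ModSecurity | 826 | 225 | 1051 | 236 | 225 | 590 | 0 | 1.000 | 0.286 | 0.276 | 0.286
Net | Snort 2.8 | 458 | 817 | 1275 | 0 | 817 | 458 | 0 | - | 0.000 | - | 0.000
DB | GreenSQL | 458 | 817 | 1275 | 244 | 813 | 214 | 4 | 0.984 | 0.533 | 0.775 | 0.528
DB | DB IDS | 458 | 817 | 1275 | 451 | 384 | 7 | 433 | 0.510 | 0.985 | 0.492 | 0.455
Net | Snort 2.9 | 173 | 878 | 1051 | 0 | 878 | 173 | 0 | - | 0.000 | - | 0.000

[The slide also carries the header labels "All", "Review", and "Reported"; their placement is not recoverable.]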
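The derived columns follow from the TP/TN/FN/FP counts. The sketch below uses the standard definitions (informedness = recall + specificity − 1, markedness = precision + negative predictive value − 1); the function name `detection_metrics` is hypothetical. Undefined ratios, as in the Snort rows with no reported positives, are returned as `None`.

```python
def detection_metrics(tp, tn, fn, fp):
    """Return precision, recall, markedness and informedness (None if undefined)."""

    def ratio(num, den):
        return num / den if den else None

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    npv = ratio(tn, tn + fn)  # negative predictive value
    markedness = None if None in (precision, npv) else precision + npv - 1
    informedness = None if None in (recall, specificity) else recall + specificity - 1
    return {
        "precision": precision,
        "recall": recall,
        "markedness": markedness,
        "informedness": informedness,
    }


# ACD counts from the results: TP=376, TN=174, FN=675, FP=50
acd = detection_metrics(tp=376, tn=174, fn=675, fp=50)
print({name: round(value, 3) for name, value in acd.items()})
# {'precision': 0.883, 'recall': 0.358, 'markedness': 0.088, 'informedness': 0.135}
```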


WHAT IS WRONG?

§ Established benchmarks are mostly for marketing!

§ Strict benchmarking conditions
  – Fixed workload & faultload + Small set of metrics

§ Workload & faultload:
  – May not be representative of the user scenario

§ Metrics:
  – Fixed! May not satisfy the user needs
  – Decision based on several metrics is difficult!

No security benchmark endorsed by any organization or industry

FIXED!

[Diagram: Activation → SUB → Metrics, with the metrics marked "Fixed!"]

§ Example:
  – Benchmarking vulnerability detection tools
  – Typical metric: F-Measure
  – Is this good in all scenarios?
    • Business critical: recall
    • Best effort: F-Measure
    • Minimum effort: Markedness
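A sketch of scenario-dependent metric selection, using the three scenario-to-metric pairings above. For illustration it ranks the two database-level IDS tools from the main results rather than vulnerability detection tools; the names `SCENARIO_METRIC` and `best_tool` are hypothetical.

```python
# TP/TN/FN/FP counts for the two database-level tools in the main results
TOOLS = {
    "GreenSQL": {"tp": 244, "tn": 813, "fn": 214, "fp": 4},
    "DB IDS": {"tp": 451, "tn": 384, "fn": 7, "fp": 433},
}


def recall(tp, tn, fn, fp):
    return tp / (tp + fn)


def f_measure(tp, tn, fn, fp):
    # Harmonic mean of precision and recall
    return 2 * tp / (2 * tp + fp + fn)


def markedness(tp, tn, fn, fp):
    return tp / (tp + fp) + tn / (tn + fn) - 1


SCENARIO_METRIC = {
    "business critical": recall,
    "best effort": f_measure,
    "minimum effort": markedness,
}


def best_tool(scenario, tools=TOOLS):
    """Pick the tool that maximizes the metric tied to the scenario."""
    metric = SCENARIO_METRIC[scenario]
    return max(tools, key=lambda name: metric(**tools[name]))


for scenario in SCENARIO_METRIC:
    print(scenario, "->", best_tool(scenario))
# business critical -> DB IDS
# best effort -> GreenSQL
# minimum effort -> GreenSQL
```

The winner changes with the scenario, which is the problem with a single fixed metric.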

A POTENTIAL APPROACH…

§ Benchmarking conditions adaptable to the user needs

§ Include multiple usage scenarios:
  – Metrics depend on the scenario
  – Adaptable workload and faultload

§ Use quality models instead of independent metrics
  – Quality models should also adapt to the scenario

SCENARIOS AND QUALITY MODELS

How to define scenarios?
How to define quality models?
How to adapt workloads and faultloads to the scenarios?

CHALLENGES

§ Satisfy industry requirements
  – Representativeness, portability, scalability, non-intrusiveness, low cost, …
  – Prevent “gaming”

§ Satisfy user requirements
  – Representativeness, usefulness, simplicity of use…
  – Adaptable (allow “gaming”)

§ Endorsement by TPC, SPEC, …
  – How to?

IS THERE A FUTURE?

§ Resilience Benchmarking
  – Assess and compare the behavior of components and computer systems when subjected to changes
  – Which resilience metrics?
    • Comparable, consistent, understandable, meaningful, …
  – Changeloads:
    • Representative, practical, portable, …

§ Trustworthiness Benchmarking
  – What evidence to collect?
  – What metrics?
  – Dynamicity of perception… social trust...


CONCLUSIONS

§ The benchmarking concept is well established!

§ Acceptance by “big” industry depends on perceived utility for marketing

§ Acceptance by users requires “adaptability”

§ From a research perspective, performance and dependability benchmarking are well known

§ Security benchmarking approaches are weak

§ New types of benchmarks will bring additional challenges!

QUESTIONS?

Marco Vieira
Department of Informatics Engineering
University of Coimbra
mvieira@dei.uc.pt

http://eden.dei.uc.pt/~mvieira
