benchmarking v2 - University of Texas at Dallas · 7/23/18 · Marco Vieira, mvieira@dei.uc.pt


Marco Vieira
mvieira@dei.uc.pt

Department of Informatics Engineering, University of Coimbra - Portugal

BENCHMARKING THE SECURITY OF SOFTWARE SYSTEMS
OR
TO BENCHMARK OR NOT TO BENCHMARK

QRS 2018, Lisbon, Portugal, July 19th, 2018


BENCHMARKING

Assessing and comparing computer systems and/or components according to specific quality attributes

§ Performance benchmarking
– Well established both in terms of research and application
– Supported by organizations like TPC and SPEC
– Mostly for marketing

§ Dependability benchmarking
– Well established from a research perspective
– No endorsement from the industry


§ Security benchmarking
– Several works can be found
– No common approach available yet

[Figure: timeline with marks at 1972, 1983, 1985, 1988 and 1999. Performance benchmarks: Whetstone, Wisconsin Bench, TP1, DebitCredit, TPC & SPEC. Dependability benchmarks: SIGDeB. Security benchmarks: Orange Book, Common Criteria. Annotations: release of commercial performance benchmarks … research projects on dependability & security benchmarks]

OUTLINE

§ The past: Performance & Dependability Benchmarking

§ The present: Security Benchmarking

§ Benchmarking the Security of Systems
– Approach: Qualification + Trustworthiness Assessment
– Example: Benchmarking Web Service Frameworks

§ Benchmarking Security Tools
– Approach: Vulnerability and Attack Injection
– Example: Benchmarking Intrusion Detection Systems

§ Challenges and Conclusions

PERFORMANCE BENCHMARKING

Assessing and comparing computer systems and/or components in terms of performance

PERFORMANCE BENCHMARKING

[Diagram: Workload → SUB (system under benchmarking) → Metrics]

§ Workload:
– Set of representative operations

§ Metrics:
– Throughput
– Response time
– Latency
– …
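A minimal sketch of how throughput and response time could be measured for a workload, assuming the workload is just a list of callables run one after another. The function name `run_workload` and the toy operation are hypothetical and only illustrate the workload/metrics split above.

```python
import time
from statistics import mean

def run_workload(operations):
    """Run each operation once and record its response time in seconds."""
    response_times = []
    start = time.perf_counter()
    for op in operations:
        t0 = time.perf_counter()
        op()
        response_times.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "throughput_ops_per_s": len(operations) / elapsed,
        "mean_response_time_s": mean(response_times),
        "max_response_time_s": max(response_times),
    }

# Hypothetical workload: 100 repetitions of a small CPU-bound operation.
workload = [lambda: sum(range(1000))] * 100
metrics = run_workload(workload)
```

A real benchmark would also fix the execution rules (warm-up, concurrency, run length), which this sketch leaves out.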


TPC-C (1992)

[Diagram: Workload → DBMS → Metrics]

§ Workload:
– Database transactions

§ Metrics:
– Transaction rate (tpmC)
– Price per transaction ($/tpmC)

Although some integrity tests are performed, it assumes that nothing fails
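The two TPC-C metrics reduce to simple ratios. The sketch below assumes tpmC is the number of completed New-Order transactions per minute and that the price is the total system cost; the input figures are made up for illustration.

```python
def tpmc(new_order_transactions, minutes):
    """Transaction rate: completed New-Order transactions per minute."""
    return new_order_transactions / minutes

def price_per_tpmc(total_system_cost, tpmc_value):
    """Price/performance: cost of the system divided by its transaction rate."""
    return total_system_cost / tpmc_value

# Hypothetical run: 120,000 New-Order transactions in a 10-minute measurement interval
# on a system priced at 60,000 (currency units).
rate = tpmc(120_000, 10)
price = price_per_tpmc(60_000, rate)
```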

DEPENDABILITY BENCHMARKING

Assessing and comparing computer systems and/or components considering dependability attributes

DEPENDABILITY BENCHMARKING

[Diagram: Workload and Faultload yield Experimental metrics; Models combine them with Parameters (fault rates, MTBF, etc.) to produce Unconditional metrics]

§ Faultload:
– Set of representative faults, injected into the system

§ Metrics:
– Performance and/or dependability
  • Both baseline and in the presence of faults
– Unconditional and/or direct
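One way to read the step from direct to unconditional metrics: recovery times measured under the faultload are combined with an externally supplied parameter (here MTBF) through a model. The sketch below uses the common steady-state availability formula as that model, treating MTBF as the mean up time between failures. The formula choice and all numbers are assumptions, not something the slides specify.

```python
from statistics import mean

def steady_state_availability(mtbf_hours, mttr_hours):
    """Model: availability = up time / (up time + repair time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Direct (experimental) metric: recovery time measured after each injected fault.
measured_recovery_hours = [0.4, 0.5, 0.6]  # hypothetical measurements

# Parameter that cannot be measured in the experiment itself.
assumed_mtbf_hours = 1000.0  # hypothetical field value

# Unconditional metric produced by the model.
availability = steady_state_availability(
    assumed_mtbf_hours, mean(measured_recovery_hours)
)
```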

DBENCH-OLTP (2005)

§ Workload:
– TPC-C transactions

§ Faultload:
– Operator faults + Software faults + HW component failures

§ Metrics:
– Performance: tpmC, $/tpmC, Tf, $/Tf
– Dependability: Ne, AvtS, AvtC

[Figure: Faultload: Operator faults]

DBENCH-OLTP(2005)

[Figure: results for systems A to K. Panels: Baseline Performance ($, tpmC, $/tpmC); Performance With Faults ($, Tf, $/Tf); Availability in % (AvtS (Server), AvtC (Clients))]

Does not take into account malicious behaviors (faults = vulnerability + attack)


SECURITY BENCHMARKING

Assessing and comparing computer systems and/or components considering security aspects

§ Benchmarking the Security of Systems/Components
– Systems that should implement security requirements
– OS, middleware, server software, etc.

§ Benchmarking Security Tools
– Tools used to improve the security of systems
– Penetration testers, static analyzers, IDS, etc.

BENCHMARKING SECURITY OF SYSTEMS

[Diagram: Attackload applied to the system; Models use Parameters (vulnerability exposure, mean time between attacks, etc.) to produce Unconditional metrics]

§ Attackload:
– Representative attacks

§ Metrics:
– Performance + dependability
– Security (e.g., number of vulnerabilities, attack detection)

Attacking what? Do we know the vulnerabilities? What are representative attacks?

Does not work if one wants to benchmark how secure different systems are!

e.g., does the number of vulnerabilities of a system represent anything?

A DIFFERENT APPROACH…

[Diagram: SUBs → Security Qualification → Unacceptable (Security = 0) or Acceptable SUBs → Trustworthiness Assessment → Metrics]

§ Security Qualification:
– Apply state-of-the-art techniques and tools to detect vulnerabilities
– SUBs with vulnerabilities are:
  • Disqualified!
  • Or vulnerabilities are fixed…

§ Trustworthiness Assessment:
– Gather evidence on how much one can trust a SUB
– e.g., best coding practices, development process, bad smells

§ Metrics:
– Portray trust from a user perspective
– Dynamic: may change over time
– Depend on the type of evidence gathered
– Different metrics for different attack vectors
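The two-stage flow can be sketched as a gate followed by a score. Everything below is illustrative: the `SUB` record, the evidence names and the plain average used as the trust metric are assumptions, since the slides do not fix how evidence is aggregated.

```python
from dataclasses import dataclass, field

@dataclass
class SUB:
    name: str
    # Findings reported by the qualification tools (empty list = none found).
    vulnerabilities: list = field(default_factory=list)
    # Hypothetical evidence scores in [0, 1], keyed by evidence type.
    evidence: dict = field(default_factory=dict)

def benchmark(subs):
    """Stage 1 disqualifies vulnerable SUBs; stage 2 scores the acceptable ones."""
    results = {}
    for sub in subs:
        if sub.vulnerabilities:
            results[sub.name] = 0.0  # Unacceptable: Security = 0
        elif sub.evidence:
            results[sub.name] = sum(sub.evidence.values()) / len(sub.evidence)
        else:
            results[sub.name] = 0.0  # no evidence gathered, so no basis for trust
    return results

subs = [
    SUB("A", vulnerabilities=["SQL injection"], evidence={"coding_practices": 0.9}),
    SUB("B", evidence={"coding_practices": 0.8, "development_process": 0.6}),
]
results = benchmark(subs)
```

A SUB whose vulnerabilities are fixed would simply be re-submitted with an empty findings list.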

EXAMPLE: WEB SERVICE FRAMEWORKS

[Diagram: WSFs → Qualification (testing) → Unacceptable (Security = 0) or Acceptable WSFs → Assessment (CPU + mem.) → Trust. Score]

§ Qualification
– DoS Attacks
– Coercive Parsing, Malformed XML, Malicious Attachment, etc.

§ Trustworthiness Assessment:
– Quality model to compute a score
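A quality model can be as simple as a weighted combination of normalized measurements. The sketch below assumes CPU and memory usage observed during the assessment are expressed as fractions in [0, 1], that lower usage means higher trust, and that both weigh equally. The weights, framework names and figures are hypothetical and are not the model used in the talk.

```python
def trust_score(cpu_fraction, mem_fraction, weights=(0.5, 0.5)):
    """Combine normalized resource usage (0..1) into a trust score in 0..1."""
    w_cpu, w_mem = weights
    return w_cpu * (1 - cpu_fraction) + w_mem * (1 - mem_fraction)

# Hypothetical measurements for two acceptable frameworks: (CPU, memory).
frameworks = {"WSF-1": (0.30, 0.20), "WSF-2": (0.90, 0.70)}

scores = {name: trust_score(cpu, mem) for name, (cpu, mem) in frameworks.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```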


QUALITY MODEL

SYSTEMS UNDER BENCHMARKING

TRUSTWORTHINESS RESULTS

BENCHMARKING SECURITY TOOLS

[Diagram: Faultload (vulnerabilities + attacks) → Sec. Tool]

§ Faultload:
– Vulnerabilities are injected
– Attacks target the injected vulnerabilities

§ Data can be collected for benchmarking security tools
– Penetration testers, static analyzers, IDS, etc.

VULNERABILITY AND ATTACK INJECTION

EXAMPLE: BENCHMARKING IDS

§ Security requires a defense in depth approach
– Coding best practices
– Testing
– Static analysis
– …

§ Vulnerability-free code is hard (or even impossible) to achieve...

§ Intrusion detection tools support a post-deployment approach
– For protecting against known and unknown attacks


EVALUATION APPROACH

EXAMPLES OF VULNERABILITIES INJECTED

Original PHP code | Code with injected vulnerability | Operation performed
$id=intval($_GET['id']); | $id=$_GET['id']; | Removed the “intval” function allowing also non numeric values (i.e. SQL commands) in the “$id” variable
$page = urlencode($page); | $page = $page; | Removed the “urlencode” function allowing also alphanumeric values (i.e. SQL commands) in the “$page” variable
… | … | …
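The two operations in the table amount to removing a sanitizing call while keeping its argument. A minimal sketch of such an injection operator, working on single lines of PHP source with regular expressions, is shown below. The operator names are hypothetical, and a pattern this simple does not handle nested parentheses inside the call.

```python
import re

# Hypothetical injection operators: each matches a sanitizing call and
# captures its argument so the call can be replaced by the bare argument.
OPERATORS = {
    "remove_intval": re.compile(r"intval\((.*?)\)"),
    "remove_urlencode": re.compile(r"urlencode\((.*?)\)"),
}

def inject(php_line, operator):
    """Return the line with the first matching sanitizing call removed."""
    return OPERATORS[operator].sub(r"\1", php_line, count=1)

vulnerable_id = inject("$id=intval($_GET['id']);", "remove_intval")
vulnerable_page = inject("$page = urlencode($page);", "remove_urlencode")
```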

EXAMPLES OF ATTACKS

Attack payload | Expected result
' | Modifies the structure of the query; usually results in an error
or 1=1 | Modifies the structure of the query. Overrides the query restrictions by adding a statement that is always true.
' or 'a'='a | Modifies the structure of the query. Overrides the query restrictions by adding a statement that is always true.
+connection_id()-connection_id() | Modifies the query result to 0
+1-1 | Modifies the query result to 0
+67-ASCII('A') | Modifies the query result to 0
+51-ASCII(1) | Modifies the query result to 0
… | …
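The effect of the first two payloads can be reproduced on a deliberately vulnerable query. The sketch below uses Python's built-in SQLite as a stand-in for the MySQL back end, so the MySQL-specific payloads (`connection_id()`, `ASCII()`) are not covered. The table and the `vulnerable_lookup` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

def vulnerable_lookup(user_input):
    # Deliberately vulnerable: the input is concatenated into the SQL text,
    # mirroring code where the intval() sanitization has been removed.
    query = "SELECT name FROM items WHERE id = " + user_input
    try:
        return [row[0] for row in conn.execute(query)]
    except sqlite3.Error as exc:
        return "error: " + type(exc).__name__

benign = vulnerable_lookup("1")            # one matching row
tautology = vulnerable_lookup("1 or 1=1")  # restriction overridden, all rows returned
quote = vulnerable_lookup("'")             # query structure broken, database error
```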

SYSTEMS UNDER BENCHMARKING

Tool | Architectural level monitored | Detection approach | Data source | Known technology limitations
ACD | Application | Anomaly based | Apache log | Only GET method
Apache Scalp | Application | Signature based | Apache log | Only GET method
ModSecurity | Application | Signature based | HTTP traffic | -
Snort (v2.8 and v2.9) | Network | Signature based | Network traffic | -
GreenSQL | Database | Signature based | SQL proxy traffic | MySQL data
DB IDS | Database | Anomaly based | SQL sniffer traffic | MySQL and Oracle data

EXPERIMENTAL SETUP

MAIN RESULTS

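[Column group labels on the slide: Review, Reported]

Lvl | Tool        | P   | N   | Pop  | TP  | TN  | FN  | FP  | Prec. | Recall | Mark. | Infor.
    | ACD         |     |     | 1275 | 376 | 174 | 675 | 50  | 0.883 | 0.358  | 0.088 | 0.135
    | Scalp       |     |     | 1275 | 206 | 224 | 845 | 0   | 1.000 | 0.196  | 0.210 | 0.196
    | ModSecurity | 826 | 225 | 1051 | 236 | 225 | 590 | 0   | 1.000 | 0.286  | 0.276 | 0.286
Net | Snort 2.8   |     |     | 1275 | 0   | 817 | 458 | 0   | -     | 0.000  | -     | 0.000
    | GreenSQL    |     |     | 1275 | 244 | 813 | 214 | 4   | 0.984 | 0.533  | 0.775 | 0.528
    | DB IDS      |     |     | 1275 | 451 | 384 | 7   | 433 | 0.510 | 0.985  | 0.492 | 0.455
Net | Snort 2.9   | 173 | 878 | 1051 | 0   | 878 | 173 | 0   | -     | 0.000  | -     | 0.000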
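The four reported metrics follow from the TP/TN/FN/FP counts using the standard definitions: precision = TP/(TP+FP), recall = TP/(TP+FN), markedness = precision + NPV − 1, informedness = recall + specificity − 1. The sketch below recomputes them for the ACD and Snort 2.8 rows; returning `None` for an undefined ratio is an assumption made here to mirror the "-" cells.

```python
def detection_metrics(tp, tn, fn, fp):
    """Precision, recall, markedness and informedness from a confusion matrix."""
    def ratio(num, den):
        return num / den if den else None  # undefined when the denominator is 0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    npv = ratio(tn, tn + fn)  # negative predictive value
    markedness = None if None in (precision, npv) else precision + npv - 1
    informedness = None if None in (recall, specificity) else recall + specificity - 1
    return {
        "precision": precision,
        "recall": recall,
        "markedness": markedness,
        "informedness": informedness,
    }

acd = detection_metrics(tp=376, tn=174, fn=675, fp=50)
snort_28 = detection_metrics(tp=0, tn=817, fn=458, fp=0)
```

With no alerts raised at all (TP = FP = 0), precision and markedness are undefined, which matches the Snort rows.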


WHAT IS WRONG?

§ Established benchmarks are mostly for marketing!

§ Strict benchmarking conditions
– Fixed workload & faultload + Small set of metrics

§ Workload & faultload:
– May not be representative of the user scenario

§ Metrics:
– Fixed! May not satisfy the user needs
– Decision based on several metrics is difficult!

No security benchmark endorsed by any organization or industry

FIXED!

[Diagram: Activation → SUB → Metrics; callout: Fixed!]

§ Example:
– Benchmarking vulnerability detection tools
– Typical metric: F-Measure
– Is this good in all scenarios?
  • Business critical: recall
  • Best effort: F-Measure
  • Minimum effort: Markedness
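A small sketch of why a fixed metric is a problem: the same two tools rank differently once the metric follows the scenario. The precision, recall and markedness values are the GreenSQL and DB IDS figures from the MAIN RESULTS slide, reused here only as sample numbers (they describe IDS tools, not vulnerability detectors). The scenario keys and the `best_tool` helper are hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

TOOLS = {
    "GreenSQL": {"precision": 0.984, "recall": 0.533, "markedness": 0.775},
    "DB IDS": {"precision": 0.510, "recall": 0.985, "markedness": 0.492},
}
for values in TOOLS.values():
    values["f_measure"] = f_measure(values["precision"], values["recall"])

# Scenario-to-metric mapping taken from the bullets above.
SCENARIO_METRIC = {
    "business_critical": "recall",
    "best_effort": "f_measure",
    "minimum_effort": "markedness",
}

def best_tool(scenario):
    """Pick the tool with the highest value of the metric tied to the scenario."""
    metric = SCENARIO_METRIC[scenario]
    return max(TOOLS, key=lambda name: TOOLS[name][metric])

winners = {scenario: best_tool(scenario) for scenario in SCENARIO_METRIC}
```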

A POTENTIAL APPROACH…

§ Benchmarking conditions adaptable to the user needs

§ Include multiple usage scenarios:
– Metrics depend on the scenario
– Adaptable workload and faultload

§ Use quality models instead of independent metrics
– Quality models should also adapt to the scenario

SCENARIOS AND QUALITY MODELS

How to define scenarios?
How to define quality models?
How to adapt workloads and faultloads to the scenarios?

CHALLENGES

§ Satisfy industry requirements
– Representativeness, portability, scalability, non-intrusiveness, low cost, …
– Prevent “gaming”

§ Satisfy user requirements
– Representativeness, usefulness, simplicity of use…
– Adaptable – allow “gaming”

§ Endorsement by TPC, SPEC, …
– How to?

IS THERE A FUTURE?

§ Resilience Benchmarking
– Assess and compare the behavior of components and computer systems when subjected to changes
– Which resilience metrics?
  • Comparable, consistent, understandable, meaningful, …
– Changeloads:
  • Representative, practical, portable, …

§ Trustworthiness Benchmarking
– What evidence to collect?
– What metrics?
– Dynamicity of perception… social trust...


CONCLUSIONS

§ The benchmarking concept is well established!

§ Acceptance by “big” industry depends on perceived utility for marketing

§ Acceptance by users requires “adaptability”

§ From a research perspective, performance and dependability benchmarking are well known

§ Security benchmarking approaches are weak

§ New types of benchmarks will bring additional challenges!

QUESTIONS?

Marco Vieira
Department of Informatics Engineering
University of Coimbra
mvieira@dei.uc.pt

http://eden.dei.uc.pt/~mvieira
