#RSAC
SESSION ID: EXP-T11
Advances in Cloud-Scale Machine Learning for Cyber-Defense
Mark Russinovich
CTO, Microsoft Azure, Microsoft Corporation
@markrussinovich

Intelligence in every software
• SQL Server + R, Microsoft R Server, Hadoop + R, Spark + R, Microsoft CNTK
• Azure Machine Learning
• Cortana Intelligence Suite
• R Tools / Python Tools for Visual Studio
• Azure Notebooks (Jupyter), Cognitive Services, Bot Framework, Cortana
• Office 365, Bing, Skype, Xbox 360, Dynamics 365, HoloLens

Microsoft’s daily cloud security scale
• 10s of PBs of logs
• 300+ million active Microsoft Account users
• 1.3+ billion Azure Active Directory logons
• Detected/reflected attacks: >10,000 location-detected attacks
• 1.5 million compromise attempts deflected

Blue Team Kill Chain for Attack Detection
Attacker: Recon → Delivery → Foothold → Persist → Move → Elevate → Exfiltrate
Blue team: Gather → Detect → Alert → Triage → Context → Plan → Execute

Blue Team Kill Chain for Attack Disruption
Attacker: Recon → Delivery → Foothold → Persist → Move → Elevate → Exfiltrate
Blue team: Gather → Detect → Alert → Triage → Context → Plan → Execute

False Positives: Lose ability to triage
Attacker: Recon → Delivery → Foothold → Persist → Move → Elevate → Exfiltrate
Blue team: Gather → Detect → Alert → Triage → Context → Plan → Execute

False Positives
FACT: You cannot salvage a false positive with just visualization. You need better solutions.

False Positives: Evolution of security detection techniques
TRADITIONAL PROGRAMMING
Data + Program/Rules → Output
• Hand-crafted rules by security professionals
• Con: Rules are static and don’t change with changes in the environment => False Positives!
MACHINE LEARNING
Data + Output/Labels → Program
• System adapts to changes in the environment as new data is provided and the model is re-trained
• Our supervised learning approach enables detection without generating many FPs
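The contrast between a static rule and a re-trained model can be sketched in a few lines. This is a minimal illustration, not the approach used in Azure: the failed-logon feature, the fixed threshold of 5, and the toy data are all assumptions made up for the example.

```python
def static_rule(failed_logons):
    """Traditional programming: a hand-written rule with a fixed threshold (5 is made up)."""
    return failed_logons > 5


def train_threshold(samples):
    """Machine learning in miniature: derive the 'program' (a threshold) from labeled data.

    samples is a list of (value, is_attack) pairs; the returned threshold is the
    observed value that classifies the most samples correctly with `value > threshold`.
    """
    best_threshold, best_correct = None, -1
    for candidate in sorted({value for value, _ in samples}):
        correct = sum((value > candidate) == label for value, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Hypothetical labeled data after the environment changed: a new service retries
# logons aggressively, so benign accounts now show up to 9 failed logons.
new_environment = [(2, False), (7, False), (9, False), (20, True), (25, True)]

learned = train_threshold(new_environment)
print(static_rule(7))   # the static rule still fires on benign traffic
print(7 > learned)      # the re-trained threshold does not
```

Re-running `train_threshold` whenever new labels arrive is the toy analogue of the re-training loop on the slide.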

Labels in Microsoft
For supervised learning, Azure gets labeled data through:
• Domain experts and customers who provide feedback from alerts
• Automated attack bots
• Labels from other product groups (including O365, Windows Seville)
• Surgical red team exercises (OneHunt)
• Bug Bounty
• MSRC

Manual Triage
Blue team: Gather → Detect → Alert → Triage → Context → Plan → Execute
For Attack Disruption, we need to think beyond detection
We need to change this

Properties of a Successful Machine Learning Solution
Successful Detection:
• Adaptable
• Explainable
• Actionable

Adaptable: Adaptable in Cloud is Difficult
Why?
EVOLVING LANDSCAPE
• Frequent deployments
• New services coming online
• Usage spikes
EVOLVING ATTACKS
Constantly changing environments lead to constantly changing attacks
• New services
• New features for existing services

Explainable: Explainability
Why?
• Surfacing a security event to an end user can be useless if there is no explanation
• Explainability of results should be considered at the earliest possible stage of development
• The best detection signal with no explanation might be dismissed or overlooked
Example: How do you explain this to an analyst?
UserId  Time              EventId  Feature1  Feature2  Feature3  Feature4  …  Score
1a4b43  2016-09-01 02:01  4688     0.3       0.12      3.9       20        …  0.2
73d87a  2016-09-01 03:15  4985     0.4       0.8       0         11        …  0.09
9ca231  2016-09-01 05:10  4624     0.8       0.34      9.2       7         …  0.9
5e9123  2016-09-01 05:32  4489     2.5       0.85      7.6       2.1       …  0.7
1e6a7b  2016-09-01 09:12  4688     3.1       0.83      3.6       6.2       …  0.1
33d693  2016-09-01 14:43  4688     4.1       0.63      4.7       5.1       …  0.019
7152f3  2016-09-01 19:11  4688     2.7       0.46      3.9       1.4       …  0.03
Results without explanation are hard to interpret
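One way to make a bare score easier to interpret is to return the features that contributed most alongside it. The sketch below assumes a simple linear scorer; the feature names and weights are hypothetical (the slide's Feature1 to Feature4 are unnamed), and real models may need a different attribution technique.

```python
# Hypothetical feature names and weights, for illustration only.
WEIGHTS = {
    "logons_from_new_country": 1.5,
    "failed_logons_last_hour": 0.4,
    "new_process_count": 0.2,
}


def score_with_reasons(features, top_k=2):
    """Return a linear score plus the top_k features that contributed most to it."""
    contributions = {
        name: WEIGHTS[name] * value
        for name, value in features.items()
        if name in WEIGHTS
    }
    score = sum(contributions.values())
    # Rank by absolute contribution so strong negative evidence is surfaced too.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    reasons = [f"{name} contributed {value:+.2f}" for name, value in ranked[:top_k]]
    return score, reasons


score, reasons = score_with_reasons(
    {"logons_from_new_country": 1, "failed_logons_last_hour": 5, "new_process_count": 2}
)
print(score, reasons)
```

An analyst reading "failed_logons_last_hour contributed +2.00" has something to verify, which a score of 0.9 on its own does not provide.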

Actionable: Actionable Detections
• Detections must result in downstream action
• Good explanation without being actionable is of little value
EXAMPLES
• Policy decisions
• Reset user password

Framework for a Successful Detection
Successful Detection: Adaptable, Explainable, Actionable
[Figure: Usefulness of Alerts (Less Useful to More Useful) plotted against Sophistication of Algorithms (Basic to Advanced). Regions: Outlier, Anomaly, Security Interesting Alerts. Annotations: Little domain knowledge, More domain knowledge, Security domain knowledge.]
Successful detections incorporate domain knowledge through disparate datasets and rules

Case Study 1: Successful detection through combining disparate datasets
PROBLEM STATEMENT: Detect compromised VMs in Azure
HYPOTHESIS: If the VM is sending spam, then it is most likely compromised.
SOLUTION: Use supervised machine learning to leverage labeled spam data from Office 365 and combine it with IPFIX data from Azure.

Case Study 1: Technique Overview
• IPFIX features come from Azure
• Spam tags come from O365!
EXAMPLES (automated)
• All ports with traffic
• Number of connections
• Which TCP flag combinations exist
• Many more…

Case Study 1: Technique Overview
[Figure: IPFIX data is split into spam-labeled IPFIX data and benign IPFIX data, which feed Machine Learning; a new case then goes through Automated Compromise Detection.]

Case Study 1: Dataset
FEATURE SOURCES
• External IPs
• External ports
• TCP flags
FEATURE TYPES
• Existence (binary)
• Counts
• Normalized counts
WHY IS NETWORK DATA GOOD FOR DETECTION?
✓ No installation required: running on all Azure tenants
✓ No overload on the VM
✓ Resilient: cannot be maliciously turned off
✓ OS independent
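The three feature types listed above (existence, counts, normalized counts) can be sketched for a single VM's flow records. The record layout with `dst_port` and `tcp_flags` keys is an assumption for illustration; it is not the actual IPFIX schema used in Azure.

```python
from collections import Counter


def flow_features(flows):
    """Build existence, count and normalized-count features from one VM's flows.

    flows is a list of dicts with hypothetical keys 'dst_port' and 'tcp_flags'.
    """
    total = len(flows)
    port_counts = Counter(flow["dst_port"] for flow in flows)
    flag_counts = Counter(flow["tcp_flags"] for flow in flows)

    features = {"connections": total}
    for port, n in port_counts.items():
        features[f"port_{port}_exists"] = 1          # existence (binary)
        features[f"port_{port}_count"] = n           # raw count
        features[f"port_{port}_share"] = n / total   # normalized count
    for flags, n in flag_counts.items():
        features[f"flags_{flags}_exists"] = 1
        features[f"flags_{flags}_count"] = n
    return features


# Hypothetical flows: three SMTP connections and one HTTPS connection.
example = flow_features([
    {"dst_port": 25, "tcp_flags": "SYN"},
    {"dst_port": 25, "tcp_flags": "SYN"},
    {"dst_port": 25, "tcp_flags": "SYN|ACK"},
    {"dst_port": 443, "tcp_flags": "SYN|ACK"},
])
print(example["port_25_share"])
```

Normalized counts let a busy VM and a quiet VM be compared on the same scale, which raw counts alone do not.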

Case Study 1: Machine Learning Deep Dive: Gradient Boosting
1st iteration
• Input data for 1st iteration
• Weak learner at 1st iteration
2nd iteration
• Input data: the data points that were incorrectly categorized by the weak learner in the first iteration (the positive examples) are now weighted more. Simultaneously, the correct points are down-weighted.
• Learner at 2nd iteration
3rd iteration
• Input data: the data points that were incorrectly categorized in the second iteration (the negative examples) are now weighted more. Simultaneously, the correct points are down-weighted.
• Learner at 3rd iteration
Results
Final result is a combination of learners from each iteration
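The reweighting walk-through above (misclassified points weighted up, correct points weighted down, learners combined at the end) is the classic AdaBoost scheme, so the sketch below implements AdaBoost with one-dimensional decision stumps rather than a gradient-boosting library. The toy data and three rounds are assumptions chosen to mirror the three iterations; this is not the production model.

```python
import math


def stump_predict(x, threshold, polarity):
    """A weak learner: predict `polarity` above the threshold, the opposite below."""
    return polarity if x > threshold else -polarity


def best_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Train `rounds` stumps, reweighting the data after each one."""
    weights = [1.0 / len(xs)] * len(xs)
    learners = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        learners.append((alpha, threshold, polarity))
        # Misclassified points are weighted up, correct points are weighted down.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return learners


def predict(learners, x):
    """Final result: a weighted vote over the learners from each iteration."""
    vote = sum(alpha * stump_predict(x, t, p) for alpha, t, p in learners)
    return 1 if vote >= 0 else -1


# Toy data that no single stump can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
learners = adaboost(xs, ys, rounds=3)
print([predict(learners, x) for x in xs])
```

On this toy set a single stump misclassifies three points, while the weighted vote of three stumps classifies all ten correctly.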

Case Study 1: Model Performance and Productization
Model trained in regular intervals
• Size of data: 360 GB per day
• Within minutes
Classification runs multiple times a day
• Completed within seconds
Dataset                          True Positive Rate  False Positive Rate
Only using Azure IPFIX data      55%                 1%
Using Azure IPFIX and O365 data  81%                 1%
26 points improvement
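For readers less familiar with the two metrics in the table, the sketch below shows how true positive rate and false positive rate are computed from confusion-matrix counts. The counts are invented so that the rates match the table; the slides do not give the underlying numbers.

```python
def rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of real compromises that were detected
    fpr = fp / (fp + tn)  # share of benign cases that were wrongly flagged
    return tpr, fpr


# Invented counts (not from the source) that reproduce the table's percentages.
ipfix_only = rates(tp=55, fn=45, fp=10, tn=990)
ipfix_plus_o365 = rates(tp=81, fn=19, fp=10, tn=990)

improvement_points = round((ipfix_plus_o365[0] - ipfix_only[0]) * 100)
print(ipfix_only, ipfix_plus_o365, improvement_points)
```

The improvement is stated in percentage points (81% minus 55%), not as a relative change.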

Case Study 2: Successful detection through combining rules and machine learning
PROBLEM STATEMENT: Rule-based malware detection places hard constraints on whether something is malware or not. While rules are specific, they produce a lot of false positives and false negatives, and they are not adaptable.
HYPOTHESIS: Can we combine the hard logic of rule-based detections with the soft logic of machine learning systems?
SOLUTION: Build two ML models: 1) Model 1, which baselines malware behavior; 2) Model 2, which incorporates rules as features. Combine the results of the two models.
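The idea of treating rule hits as features rather than as hard verdicts, then combining two model scores, can be sketched as follows. The rule names, weights, bias, averaging combiner, and 0.5 threshold are all assumptions for illustration; the slides do not say how the two models are actually combined.

```python
import math

# Hypothetical rule names and weights; in practice the weights would be learned.
RULE_WEIGHTS = {
    "yara_macro_dropper": 2.0,
    "yara_packed_pe": 1.0,
    "threat_intel_hit": 2.5,
}
BIAS = -3.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rules_as_features_score(rule_hits):
    """Soft logic: each rule hit nudges the score instead of deciding the verdict."""
    z = BIAS + sum(weight for name, weight in RULE_WEIGHTS.items() if name in rule_hits)
    return sigmoid(z)


def combined_verdict(model1_score, model2_score, threshold=0.5):
    """Assumed combiner: a plain average of the two model scores."""
    combined = (model1_score + model2_score) / 2.0
    return combined, combined >= threshold


# One weak rule hit alone is not convincing, but it tips the balance when the
# other model (score assumed here) is already suspicious.
rule_score = rules_as_features_score({"yara_packed_pe"})
print(combined_verdict(0.95, rule_score))
```

A hard rule would have to either fire or stay silent on `yara_packed_pe`; as a feature it contributes partial evidence that the combiner can weigh against the other model.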

Case Study 2: MALWARE DETECTION BACKGROUND
ATP Architecture
• Conventional A/V
• Detonation Chamber: spin up multiple VMs; multiple OS and Office versions; instrument attachment behavior
• Safelinks: protects against malicious URLs in real time (on click)

Case Study 2: Technique Overview
PRE-ANALYSIS
• Hash/Fuzzy Hash
• PE Analyzer
• File Type Analyzer
• Photo Similarity
• …
DETONATION
• SysMon
• ETW Logger
• API hooks
• Crash dump
• …
POST-ANALYSIS
• YARA
• Threat Intel
• Network Analysis
• Macro Evidence
• …
Fingerprint Model + Behavioral Model → Combiner → Combined Verdict

Case Study 2: Machine Learning Deep Dive: Fingerprint Model
Information gets more granular from Level 1 to Level 5
CallOrder  Level1    Level2          Level3               Level4            Level5
1          Process   LoadImage       SYSTEM               .exe              wscript
2          Api       CallFunction    CreateMutexA         _!MSFTHISTORY!_
3          Api       CallFunction    CreateMutexW         !IETld!Mutex
4          Registry  SetRegValue     Tracing              wscript_rasapi32  EnableTracing
5          Registry  DeleteRegValue  InternetOption       internetsettings  ProxyBypass
6          Process   CreateProcess   NOT_SANDBOX_CHECK    LaunchedViaCom
7          Network   AccessNetWork   Wininet_Getaddrinfo
8          Api       CallFunction    CreateMutexW         RANDOM_STR
9          Network   ResolveHost     piglyeleutqq.com     UNKNOWN
10         Api       CallFunction    Connect              UNKNOWN
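One plausible way to turn such an action chain into model features is to count every level prefix of each action, so coarse behavior is captured even when the most granular level changes between samples. This is an assumed encoding for illustration; the slides do not describe the exact feature extraction.

```python
from collections import Counter


def prefix_features(actions, max_depth=3):
    """Count each level prefix (Level1, Level1/Level2, ...) up to max_depth."""
    features = Counter()
    for levels in actions:
        for depth in range(1, min(max_depth, len(levels)) + 1):
            features["/".join(levels[:depth])] += 1
    return features


# Two hypothetical samples whose mutex names differ at the most granular level.
sample_a = [("Api", "CallFunction", "CreateMutexW", "RANDOM_STR")]
sample_b = [("Api", "CallFunction", "CreateMutexW", "!IETld!Mutex")]

features_a = prefix_features(sample_a)
features_b = prefix_features(sample_b)
print(features_a == features_b)  # identical once the granular level is cut off
```

Raising `max_depth` trades resilience to renamed artifacts for more specific features.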

Case Study 2: Machine Learning Deep Dive: Fingerprint Model Observations
Benefits of the Action-Chain prototype
• It can be RESILIENT to malware obfuscation because it captures the runtime semantics by considering the more IMPORTANT details
• Feature extraction is NON-PARAMETRIC
• Would generalize to many situations
Model
• Current: L1 Logistic Regression followed by L2 Logistic Regression; weighted samples through cross-validation

Case Study 2: Machine Learning Deep Dive: Behavioral Model
Incorporates security domain knowledge into the model
Source of features
• YARA rules
• Static analysis
• Aggregates from data:
  • Registry keys/values that are changed/created/deleted
  • Mutexes created
  • Number of spawned processes per process detail info
The model works well to detect new types of malware

Case Study 2: Model Performance and Productization
Model trained in regular intervals
• Size of data: 270 GB per day
• Completed within minutes
Classification runs multiple times a day
• Completed within milliseconds
Dataset                             True Positive Rate  False Positive Rate
YARA rules only                     82.6%               0.0178%
Machine Learning Model 1 + Model 2  93.6%               0.0127%
11 points improvement in true positive rate (82.6% to 93.6%)

For Attack Disruption, We Need to Think Beyond Detection
Blue team: Gather → Detect → Alert → Triage → Context → Plan → Execute
We need to change this

Triage incidents, not alerts
• Anomalous DLL: rundll32.exe launched as sposql11 on CFE110095
• New process uploading: rundll32.exe to 40.114.40.133 on CFE110095
• Large transfer: 50 MB to 40.114.40.133 from sqlagent.exe on SQL11006
alert type       process       user      remote host    host
anomalous dll    rundll32.exe  sposql11                 CFE110095
new proc upload  rundll32.exe            40.114.40.133  CFE110095
large transfer   sqlagent.exe            40.114.40.133  SQL11006
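The three alerts above share entities (a process, a host, a remote host), which is what lets them be triaged as one incident. A minimal sketch of that grouping, assuming alerts are linked whenever they share any entity value, uses a union-find over alert indices; the dictionary layout and the fourth, unrelated alert are made up for the example.

```python
ENTITY_KEYS = ("process", "user", "remote_host", "host")


def group_alerts(alerts):
    """Group alerts into incidents: alerts sharing any entity end up together."""
    parent = list(range(len(alerts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # (entity kind, value) -> index of first alert mentioning it
    for index, alert in enumerate(alerts):
        for key in ENTITY_KEYS:
            value = alert.get(key)
            if value is None:
                continue
            entity = (key, value)
            if entity in first_seen:
                parent[find(index)] = find(first_seen[entity])
            else:
                first_seen[entity] = index

    incidents = {}
    for index, alert in enumerate(alerts):
        incidents.setdefault(find(index), []).append(alert["type"])
    return list(incidents.values())


alerts = [
    {"type": "anomalous dll", "process": "rundll32.exe", "user": "sposql11", "host": "CFE110095"},
    {"type": "new proc upload", "process": "rundll32.exe", "remote_host": "40.114.40.133", "host": "CFE110095"},
    {"type": "large transfer", "process": "sqlagent.exe", "remote_host": "40.114.40.133", "host": "SQL11006"},
    # Hypothetical unrelated alert, not from the slides.
    {"type": "failed logon burst", "user": "guest", "host": "WEB00001"},
]
incidents = group_alerts(alerts)
print(incidents)
```

Linking on a common process name alone would over-merge in practice, so a real system would likely restrict which entity types are allowed to join alerts.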

Conclusion
Attack Disruption means shortening the blue team kill chain
• Speed: Real-time detection
• Quality: Reduce false positives
• React: Fast triage

Attack Disruption Checklist
• Combine different datasets
• Labels, Labels, Labels
• Scalable ML solution and expertise
Example Azure services you can leverage:
• Azure Event Hubs
• Azure Machine Learning
• Azure Data Lake