Data-Driven Threat Intelligence: Metrics on Indicator Dissemination and Sharing

(#ddti)

Alex Pinto
Chief Data Scientist, MLSec Project
@alexcpsec @MLSecProject

Alexandre Sieira
CTO, Niddel
@AlexandreSieira @NiddelCorp

Agenda

• Cyber War… Threat Intel – What is it good for?
• Combine and TIQ-test
• Measuring indicators
• Threat Intelligence Sharing
• Future research direction (i.e. will work for data)

Presentation Metrics!!

• 50-ish Slides
• 3 Key Takeaways
• 2 Heartfelt and genuine defenses of Threat Intelligence Providers
• 1 Prediction on “The Future of Threat Intelligence Sharing”

HT to @RCISCwendy

What is TI good for? (1) Attribution

What is TI good for anyway?

TY to @bfist for his work on http://sony.attributed.to

What is TI good for? (2) – Cyber Maps!!

TY to @hrbrmstr for his work on https://github.com/hrbrmstr/pewpew

What is TI good for anyway?

• (3) How about actual defense?
• Strategic and tactical: planning
• Technical indicators: DFIR and monitoring

Affirming the Consequent Fallacy

1. If A, then B.
2. B.
3. Therefore, A.

1. Evil malware talks to 8.8.8.8.
2. I see traffic to 8.8.8.8.
3. ZOMG, APT!!!

But this is a Data-Driven talk!

Combine and TIQ-Test

• Combine (https://github.com/mlsecproject/combine)
  • Gathers TI data (ip/host) from Internet and local files
  • Normalizes the data and enriches it (AS / Geo / pDNS)
  • Can export to CSV, “tiq-test format” and CRITs
  • Coming Soon™: CybOX / STIX / SILK / ArcSight CEF

• TIQ-Test (https://github.com/mlsecproject/tiq-test)
  • Runs statistical summaries and tests on TI feeds
  • Generates charts based on the tests and summaries
  • Written in R (because you should learn a stat language)

• https://github.com/mlsecproject/tiq-test-Summer2015

Using TIQ-TEST – Feeds Selected

• Dataset was separated into “inbound” and “outbound”

TY to @kafeine and John Bambenek for access to their feeds

Using TIQ-TEST – Data Prep

• Extract the “raw” information from indicator feeds
• Both IP addresses and hostnames were extracted

Using TIQ-TEST – Data Prep

• Convert the hostname data to IP addresses:
  • Active IP addresses for the respective date (“A” query)
  • Passive DNS from Farsight Security (DNSDB)

• For each IP record (including the ones from hostnames):
  • Add asnumber and asname (from MaxMind ASN DB)
  • Add country (from MaxMind GeoLite DB)
  • Add rhost (again from DNSDB) – most popular “PTR”
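The enrichment step above can be sketched roughly as follows. This is a minimal illustration, not the Combine or TIQ-Test code: the two in-memory prefix tables are hypothetical stand-ins for the MaxMind ASN and GeoLite databases, the `entity`, `source` and `date` field names are assumptions, and the `rhost` lookup against DNSDB is left out because it needs an external service.

```python
import ipaddress

# Hypothetical stand-ins for the MaxMind ASN and GeoLite databases.
# Real lookups would query the vendor database files instead.
ASN_TABLE = {
    "192.0.2.0/24": (64500, "EXAMPLE-NET-1"),
    "198.51.100.0/24": (64501, "EXAMPLE-NET-2"),
}
GEO_TABLE = {
    "192.0.2.0/24": "US",
    "198.51.100.0/24": "BR",
}


def lookup(table, ip):
    """Return the value of the first prefix in `table` containing `ip`, else None."""
    addr = ipaddress.ip_address(ip)
    for prefix, value in table.items():
        if addr in ipaddress.ip_network(prefix):
            return value
    return None


def enrich(record):
    """Return a copy of an indicator record with asnumber, asname and country added."""
    enriched = dict(record)
    asn = lookup(ASN_TABLE, record["entity"])
    enriched["asnumber"], enriched["asname"] = asn if asn else (None, None)
    enriched["country"] = lookup(GEO_TABLE, record["entity"])
    return enriched


row = enrich({"entity": "192.0.2.10", "source": "feed_a", "date": "2015-03-01"})
unknown = enrich({"entity": "203.0.113.5", "source": "feed_a", "date": "2015-03-01"})
print(row)
```

Addresses that match no prefix keep `None` in the added fields, so later tests can decide whether to drop or count them.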

Using TIQ-TEST – Data Prep Done

Novelty Test

Measuring added and dropped indicators
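One way to picture the added/dropped measurement is as set differences between two consecutive daily snapshots of the same feed. The sketch below is written under that assumption; the function name and the choice of denominators for the ratios are illustrative and may differ from what TIQ-Test computes.

```python
def novelty(prev_day, curr_day):
    """Compare two daily snapshots of one feed and count added/dropped indicators."""
    prev_day, curr_day = set(prev_day), set(curr_day)
    added = curr_day - prev_day      # present today, absent yesterday
    dropped = prev_day - curr_day    # present yesterday, absent today
    return {
        "added": len(added),
        "dropped": len(dropped),
        # Assumption: added is relative to today's size, dropped to yesterday's.
        "added_ratio": len(added) / len(curr_day) if curr_day else 0.0,
        "dropped_ratio": len(dropped) / len(prev_day) if prev_day else 0.0,
    }


day1 = {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"}
day2 = {"192.0.2.3", "192.0.2.4", "192.0.2.5"}
result = novelty(day1, day2)
print(result)
```

Running this over every pair of consecutive days gives one added/dropped series per feed, which is the kind of data a novelty chart would plot.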

Novelty Test - Inbound

Aging Test

Is anyone cleaning this mess up eventually?
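A rough way to answer the aging question is to measure how long each indicator stays on a feed. The sketch below assumes the feed is available as one set of indicators per day and defines age as the inclusive span between first and last sighting; both the data layout and that definition are assumptions for illustration.

```python
from datetime import date


def indicator_ages(snapshots):
    """Map each indicator to the number of days between its first and last sighting.

    `snapshots` maps a date to the set of indicators on the feed that day.
    The span is inclusive, so an indicator seen on a single day has age 1.
    """
    first, last = {}, {}
    for day in sorted(snapshots):
        for indicator in snapshots[day]:
            first.setdefault(indicator, day)
            last[indicator] = day
    return {ind: (last[ind] - first[ind]).days + 1 for ind in first}


snapshots = {
    date(2015, 3, 1): {"a.example", "b.example"},
    date(2015, 3, 2): {"a.example"},
    date(2015, 3, 3): {"a.example", "c.example"},
}
ages = indicator_ages(snapshots)
print(ages)
```

A feed where most ages equal the full observation window is probably never being cleaned up; a histogram of these ages makes that visible.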

INBOUND

OUTBOUND

Population Test

• Let us use the ASN and GeoIP databases that we used to enrich our data as a reference of the “true” population.
• But, but, human beings are unpredictable! We will never be able to forecast this!

Is your sampling poll as random as you think?

Can we get a better look?

• Statistical inference-based comparison models (hypothesis testing)
• Exact binomial tests (when we have the “true” pop)
• Chi-squared proportion tests (similar to independence tests)
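The exact binomial test mentioned above can be sketched in a few lines of standard-library Python. This version sums the probabilities of all outcomes no more likely than the observed one, which is the usual two-sided definition; it is an illustration rather than the TIQ-Test implementation, and it is only suitable for small sample sizes because it evaluates the probability mass function naively.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_test(k, n, p):
    """Two-sided exact binomial p-value.

    k: indicators observed in a country or ASN
    n: total indicators in the feed
    p: share of the "true" population held by that country or ASN
    """
    observed = binom_pmf(k, n, p)
    # Small relative tolerance so ties are not lost to rounding.
    threshold = observed * (1 + 1e-7)
    total = sum(
        prob
        for prob in (binom_pmf(i, n, p) for i in range(n + 1))
        if prob <= threshold
    )
    return min(1.0, total)


# Hypothetical numbers: 12 of 40 indicators fall in a country that holds
# 10% of the reference population.
p_value = exact_binomial_test(12, 40, 0.10)
print(p_value)
```

A small p-value suggests the feed over- or under-represents that country or ASN relative to the reference population, which is the signal the population test is looking for.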

Overlap Test

More data can be better, but make sure it is not the same data
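Checking whether two feeds carry "the same data" can be approximated with pairwise set intersections. The sketch below assumes each feed is a set of indicators and reports, for every ordered pair, the fraction of the first feed that also appears in the second; the feed names and the directional definition are illustrative assumptions.

```python
def overlap_matrix(feeds):
    """Return {(a, b): fraction of feed a's indicators that also appear in feed b}."""
    result = {}
    for name_a, set_a in feeds.items():
        for name_b, set_b in feeds.items():
            if name_a == name_b:
                continue
            shared = len(set_a & set_b)
            result[(name_a, name_b)] = shared / len(set_a) if set_a else 0.0
    return result


feeds = {
    "feed_a": {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"},
    "feed_b": {"192.0.2.3", "192.0.2.4"},
    "feed_c": {"198.51.100.7"},
}
overlaps = overlap_matrix(feeds)
print(overlaps[("feed_a", "feed_b")])  # 0.5
print(overlaps[("feed_b", "feed_a")])  # 1.0
```

The measure is deliberately asymmetric: here everything in feed_b is already in feed_a, so adding feed_b on top of feed_a contributes nothing new.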

Overlap Test - Inbound

Overlap Test - Outbound

Uniqueness Test

• “Domain-based indicators are unique to one list between 96.16% and 97.37%”
• “IP-based indicators are unique to one list between 82.46% and 95.24% of the time”
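A uniqueness figure of this kind can be computed by counting, across all feeds, how many distinct indicators appear on exactly one list. The sketch below assumes feeds are sets of indicators and uses invented toy data; it does not reproduce the percentages quoted above.

```python
from collections import Counter


def uniqueness(feeds):
    """Fraction of distinct indicators that appear in exactly one feed."""
    counts = Counter(ind for indicators in feeds.values() for ind in set(indicators))
    if not counts:
        return 0.0
    unique = sum(1 for seen_in in counts.values() if seen_in == 1)
    return unique / len(counts)


feeds = {
    "feed_a": {"192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"},
    "feed_b": {"192.0.2.3", "192.0.2.4"},
    "feed_c": {"198.51.100.7"},
}
share_unique = uniqueness(feeds)
print(share_unique)  # 0.6
```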

I hate quoting myself, but…

Key Takeaway #1

MORE != BETTER

Threat Intelligence Indicator Feeds

Threat Intelligence Program

Key Takeaway #1

Intermission

Key Takeaway #2

Key Takeaway #1

“These are the problems Threat Intelligence Sharing is here to solve!”

Right?

Herd Immunity, is it?

Source: www.vaccines.gov

Herd Immunity…

…would imply that others in your sharing community being immune to malware A meant you wouldn’t get it even if you were still vulnerable to it.

Threat Intelligence Sharing

• How many indicators are being shared?

• How many members do actually share and how many just leech?

• Can we measure that? What a super-deeee-duper idea!

Threat Intelligence Sharing

We would like to thank the kind contribution of data from the fine folks at Facebook ThreatExchange and ThreatConnect…

…and also the sharing communities that chose to remain anonymous. You know who you are, and we ❤ you too.

Threat Intelligence Sharing – Data

From a period of 2015-03-01 to 2015-05-31:

- Number of Indicators Shared
  § Per day
  § Per member

Not sharing this data – privacy concerns for the members and communities
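Since the real sharing data is not published, the per-day and per-member counts can only be illustrated on invented data. The sketch below assumes sharing activity is logged as (date, member, indicator) tuples and that a membership roster is available; the record layout, member names and the "leecher" definition (a member with zero shared indicators in the period) are all assumptions.

```python
from collections import Counter


def sharing_summary(events, members):
    """Count indicators shared per day and per member, and list members who shared nothing.

    `events` is an iterable of (day, member, indicator) tuples.
    `members` is the full roster of the sharing community.
    """
    events = list(events)
    per_day = Counter(day for day, _member, _indicator in events)
    per_member = Counter(member for _day, member, _indicator in events)
    leechers = sorted(set(members) - set(per_member))
    return per_day, per_member, leechers


# Invented toy data, not taken from any real sharing community.
events = [
    ("2015-03-01", "org_a", "192.0.2.1"),
    ("2015-03-01", "org_a", "evil.example"),
    ("2015-03-02", "org_b", "192.0.2.9"),
]
members = ["org_a", "org_b", "org_c"]

per_day, per_member, leechers = sharing_summary(events, members)
print(per_day, per_member, leechers)
```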

Update frequency chart

OVERLAP SLIDE

OVERLAP SLIDE

UNIQUENESS SLIDE

MATURITY?

“Reddit of Threat Intelligence”?

Key Takeaway #1

'How can sharing make me better understand what are attacks that “are targeted” and what are “commodity”?'

Key Takeaway #1

TELEMETRY > CONTENT

Key Takeaway #3 (Also Prediction #1)

More Takeaways (I lied)

• Analyze your data. Extract more value from it!
• If you ABSOLUTELY HAVE TO buy Threat Intelligence or data, evaluate it first.

• Try the sample data, replicate the experiments:
  • https://github.com/mlsecproject/tiq-test-Summer2015
  • http://rpubs.com/alexcpsec/tiq-test-Summer2015

• Share data with us. I’ll make sure it gets proper exercise!

Thanks!

• Q&A?
• Feedback!

“The measure of intelligence is the ability to change.” - Albert Einstein

Alex Pinto
@alexcpsec @MLSecProject

Alexandre Sieira
@AlexandreSieira @NiddelCorp