16
Early Detection of Malicious Flux Networks via Large Scale Passive DNS Traffic Analysis. FluxBuster Presented By Rajshekhar PATIL

Early Detection of Malicious Flux Networks via Large Scale Passive DNS Traffic Analysis

Embed Size (px)

Citation preview

Early Detection of Malicious Flux Networks via Large Scale Passive DNS Traffic Analysis.

FluxBuster

Presented By

Rajshekhar PATIL

•Introduction

•Research goals

•Flux buster

•Cross validation

•Safe browser

•Conclusion

•References

Contents

Introduction Nowadays, internet miscreant and cyber criminals are increasing a lot and detecting and stopping them has become a serious issue. Malicious flux networks have recently started to thrive. Malicious flux -networks are a type a illegitimate content delivery networks(CDNs). These types of networks are set up using fast flux domain names i.e., the set of resolved IP addresses associated to these networks change frequently, often after each DNS query, thereby making it difficult to detect them. To make it even more complicated, these set of resolved set of IP addresses, also known as flux agents, are spread across many different networks. FluxBuster is based on DNS traffic observation from ”above” .

Research Goals • Previous works on flux detection based mainly on active probing– Limited to known bad or suspicious domains– Domains treated independently– Possible data pollution by attackers • Passive Detection– Monitor “behavior” of all domains over time– Only focus on live domains

FluxBuster System Overview

DNS Message AggregatorFlux Buster receives in input a stream of DNS messages as provided by the ISC/SIE framework.ISC/SIE collects raw DNS query/response messages from a large number of RDNS sensors, and rebroadcasts these DNS messages in a deduplicated fashion.

For example, assume that there are three RDNS sensors S1, S2, and S3 that have reported a DNS query/response message regarding a domain name d to ISC/SIE. Suppose that S1 reported the mapping of domain d to three IP addresses, { IP1, IP2, IP3}, S2 reported the mapping of d to two addresses {IP1, IP4}, and S3 reported the mapping of d to one address {IP5}.

These raw messages will be combined within the ISC/SIE framework into a deduplicated message starting that d maps to {IP1, IP2, IP3, IP4, IP5}.

Characteristics of Flux Domain Names

1. short time-to-live;2. high frequency of change of the set of resolved IPs (i.e., the flux agents) returned at each query;

3. the overall set of resolved IPs obtained by querying the same domain name over time is often very large;

4. the resolved IPs are scattered across many different networks.

Message PrefilteringThe Message Prefiltering module performs data volume reduction by discarding domain names that are very unlikely to be part of a flux network.

Therefore only domains with a very large TTL, very low number of resolved Ips and a low value of diversity of the IP set will be discarded.

Summing up, the output of the Message Prefiltering module is a list of candidate flux domains and their related aggregated DNS information (i.e., resolved IP addresses, average TTL, etc.).

Domain Clustering

Group domains that are related to each other

– Hierarchical clustering algorithm– Similarity measure based on resolved IPs

perform domain clustering of flux domains that are related to each other single-linkage hierarchical clustering algorithm is used, which adopts a friends of friends clustering strategy.In order to apply the clustering algorithm to a set of domain names D ={d1, d2, . . . , dn},a measure of similarity between them is defined first.

Given two domains α and β, and their cumulative set of resolved IP addresses collected during an epoch E, respectively, Rand R, compute their similarity score as

|R∩R| 1 ——— · ——————————— € |R∪R| 1+ eγ−min(|R|,|R|)

Sim(α, β) =

Jaccard Index

ǫ[0, 1]

The first factor is the Jaccard index for sets R and R, which intuitively measures the relative overlap between the two cumulative sets of resolved IPs.

The second factor is a sigmoidal weight designed to measure the confidence in estimating the Jaccard index.

Supervised Classifier Input : Clusters of domains – Clusters are translated into feature vectorsSupervised Training: – Need labeled data(ground truth)

– We built a web interface to facilitate semi-manual labeling

Output : new (unlabeled) clusters are Labeled as either Flux or non--‐flux

Cross-Validation

• Label data Dataset

– semi--‐manual labeling process

– If no clear-cut decision exclude Cluster To minimize training noise

– 1,337 clusters labeled as flux

• 100,644 distinct 2LDs(113,580FQDs)

– 5,708 labeled as non-flux

• 2,116 distinct 2LDs(59,215 FQDs)

Safe Browsing

• Take flux domains and

– Check if port 80 is open

– Check for valid HTTP response/content

– Vet against Safe Browsing (SB) and malware BLs

• Most missed by SB are Rogue pharmacies, adult-related sites

• SB Only reports Known phishing And malware sites

Conclusion

The above evaluation showed that Flux Buster is capable of accurately detecting previously unknown flux networks days or even weeks in advanced before they appear in public blacklists.

The experimental results show that FluxBuster is able to accurately detect malicious flux networks with a low false positive rate.

our detection approach is not limited to the analysis of suspicious domainnames extracted from spam emails or precompiled domain blacklists.

REFERENCE

[1] Roberto Perdisci, Igino Corona, and Giorgio Giacinto. Early Detection of Malicious Flux Networks via Large-Scale Passive DNS Traffic Analysis. In IEEE Transactions on Dependable and Secure Computing, VOL. 9, NO. 5, September/October 2012.[2] M. Knysz, X. Hu, and K.G. Shin.Good Guys vs. Bot Guise: Mimicry Attacks against Fast Flux Detection Systems.In Proc. IEEE INFOCOM, 2011.

THANK YOU