1
Detecting Malicious Flux Service Networks through Passive Analysis of Recursive DNS Traces
Speaker: Jun-Yi Zheng
2010/03/29
2
Reference
Roberto Perdisci, Igino Corona, David Dagon, and Wenke Lee. "Detecting Malicious Flux Service Networks through Passive Analysis of Recursive DNS Traces." ACSAC'09
3
Outline
  Introduction
  System Architecture
  Experiments
  Conclusions
4
Introduction
  Fast-flux service networks (FFSNs)
    a new (~2007) technique to maximize botnet availability
    simple idea: add an additional indirection layer (i.e., a proxy) between victims and controlling elements
    a large number of proxy hosts (flux agents) relay requests to the back-end server (the mother-ship)
    the result is a decentralized botnet with constantly changing public DNS records
5
Fast-flux botnets Architecture
6
Characteristics of Flux Domain Names
  Short time-to-live (TTL)
  The set of resolved IPs returned at each query changes rapidly
  The overall set of resolved IPs obtained by querying the same domain name over time is often very large
  The resolved IPs are scattered across many different networks
7
Approach
  Passive analysis of recursive DNS
    Not limited to domains taken from email spam and precompiled domain blacklists
    Active probing may be detected by the attacker
  Classify domains
    In previous works, single domain names are considered independently from each other
8
System Overview
9
Notation
  $q^{(d)}$: a DNS query performed by a user at time $t_i$ to resolve the set of IP addresses owned by domain name $d$
  $Q_i^{(d)}$: the total number of DNS queries related to $d$ seen up to $t_i$
  $T^{(d)}$: the TTL of the DNS response
  $\check{T}_i^{(d)}$: the maximum TTL ever observed for $d$
  $P^{(d)}$: the set of resolved IPs returned by the RDNS server
  $\mathrm{prefix}(P^{(d)}, 16)$: the set of distinct /16 network prefixes extracted from $P^{(d)}$
  $R_i^{(d)}$: the cumulative set of all the resolved IPs ever seen for $d$ up to time $t_i$
  $G_i^{(d)}$: a sequence of pairs $\{(t_j, r_j^{(d)})\}_{j=1..i}$, where $r_j^{(d)} = |R_j^{(d)}| - |R_{j-1}^{(d)}|$
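The bookkeeping implied by this notation can be sketched as a small per-domain record. This is only an illustration: the class and field names are hypothetical, and it assumes that G records a pair only when new IPs appear (r_j > 0), which the slides do not state explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class DomainState:
    """Running state for one domain d (hypothetical names, not from the slides)."""
    queries: int = 0                              # Q_i: queries seen so far
    max_ttl: int = 0                              # maximum TTL observed so far
    resolved: set = field(default_factory=set)    # R_i: cumulative resolved IPs
    growth: list = field(default_factory=list)    # G_i: (t_j, r_j) pairs

    def update(self, t, ttl, ips):
        """Fold one query q = (t, T, P) into the running state."""
        before = len(self.resolved)
        self.resolved |= set(ips)
        self.queries += 1
        self.max_ttl = max(self.max_ttl, ttl)
        new_ips = len(self.resolved) - before     # r_j = |R_j| - |R_{j-1}|
        if new_ips > 0:                           # assumption: log growth events only
            self.growth.append((t, new_ips))
```

Each observed response is folded in with `update(t, ttl, ips)`, so the state stays small even when a domain is queried many times.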
10
Traffic Volume Reduction (F1)
  $q^{(d)} = (t_i, T^{(d)}, P^{(d)})$
  F1-a) $T^{(d)} \le 10800$ seconds (i.e., 3 hours)
    Domain names with a TTL above 10800 seconds are unlikely to be “fluxing”
  F1-b) $|P^{(d)}| \ge 3$ OR $T^{(d)} \le 30$
    Because the uptime of each flux agent is not easily predictable, flux domains need either
      a large set of resolved IPs, or
      a very low TTL (equal or close to zero)
  F1-c) $p = |\mathrm{prefix}(P^{(d)}, 16)| / |P^{(d)}| \ge 1/3$
    Flux agents are often spread across many different networks and organizations
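As a rough sketch, the F1 rules can be written as a single predicate over one DNS response. The function names are hypothetical, IPv4 dotted-quad strings are assumed, and the three rules are combined with AND, which the slide does not spell out.

```python
def prefix16(ips):
    """Distinct /16 prefixes of IPv4 addresses given as dotted-quad strings."""
    return {".".join(ip.split(".")[:2]) for ip in ips}


def passes_f1(ttl, ips):
    """True if a response with this TTL and resolved-IP set survives F1.

    Assumption: F1-a, F1-b and F1-c must all hold.
    """
    ips = set(ips)
    if not ips:
        return False
    f1a = ttl <= 10800                          # TTL of at most 3 hours
    f1b = len(ips) >= 3 or ttl <= 30            # many IPs, or a very low TTL
    f1c = 3 * len(prefix16(ips)) >= len(ips)    # p >= 1/3, in integer arithmetic
    return f1a and f1b and f1c
```

The ratio test is done with integers so that p = 1/3 is accepted exactly rather than depending on floating-point rounding.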
11
Periodic List Pruning (F2)
  $d = (t_i, Q_i^{(d)}, \check{T}_i^{(d)}, R_i^{(d)}, G_i^{(d)})$
  F2-a) $Q_i^{(d)} > 100$ AND $|G_i^{(d)}| < 3$ AND ( $|R_i^{(d)}| \le 5$ OR $p \le 0.5$ )
  Domain names that match this rule are removed from the list of candidate flux domains
  Such domain names are very unlikely to be related to flux services
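The pruning rule F2-a translates directly into a boolean test. In this sketch the function and parameter names are hypothetical, and p is assumed to be the /16 prefix ratio already computed by the caller.

```python
def should_prune(num_queries, num_growth_events, num_resolved_ips, p):
    """True if a candidate domain matches F2-a and should be dropped.

    num_queries       -- Q_i, total queries seen for the domain
    num_growth_events -- |G_i|, number of recorded growth pairs
    num_resolved_ips  -- |R_i|, size of the cumulative IP set
    p                 -- /16 prefix diversity ratio (assumed precomputed)
    """
    return (
        num_queries > 100
        and num_growth_events < 3
        and (num_resolved_ips <= 5 or p <= 0.5)
    )
```

A heavily queried domain whose IP set has barely grown is dropped, while a domain with few queries so far is kept for further observation.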
12
Domain Clustering
  IP-based Domain Clustering
    a number of fast-flux domain names all point to the same flux service
    single-linkage hierarchical clustering algorithm
      Input: a similarity matrix; Output: a dendrogram
      The length of the edges represents the distance between clusters

$$\mathrm{sim}(a, b) = \frac{|R^{(a)} \cap R^{(b)}|}{|R^{(a)} \cup R^{(b)}|} \cdot \frac{1}{1 + e^{\gamma - \min(|R^{(a)}|, |R^{(b)}|)}} \in [0, 1], \qquad \gamma = 3$$
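A minimal sketch of the similarity measure and of a flat clustering derived from it follows. Several parts are assumptions rather than details from the slides: the similarity is read as a Jaccard index damped by a sigmoid weight, the distance is taken as 1 - sim, and the `cut` height is a hypothetical parameter. Cutting a single-linkage dendrogram at a given height yields the connected components of the graph that links every pair within that distance, so union-find is used here instead of building the full dendrogram.

```python
import math
from itertools import combinations

GAMMA = 3  # the constant given on the slide


def sim(ra, rb):
    """Similarity of two resolved-IP sets: overlap ratio times a size weight."""
    ra, rb = set(ra), set(rb)
    union = ra | rb
    if not union:
        return 0.0
    jaccard = len(ra & rb) / len(union)
    # The sigmoid weight lowers the score when either set is very small.
    weight = 1.0 / (1.0 + math.exp(GAMMA - min(len(ra), len(rb))))
    return jaccard * weight


def single_linkage_clusters(resolved, cut):
    """Group domains whose distance (1 - sim) chains together below `cut`.

    resolved -- dict mapping domain name to its set of resolved IPs
    cut      -- hypothetical dendrogram cut height in [0, 1]
    """
    parent = {d: d for d in resolved}

    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]   # path halving
            d = parent[d]
        return d

    for a, b in combinations(resolved, 2):
        if 1.0 - sim(resolved[a], resolved[b]) <= cut:
            parent[find(a)] = find(b)

    clusters = {}
    for d in resolved:
        clusters.setdefault(find(d), set()).add(d)
    return list(clusters.values())
```

The size weight means that two domains sharing a single IP score low even though their overlap ratio is 1, which keeps tiny coincidental overlaps from merging clusters.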
13
Service Classifier
  “Passive” features -- collected by passively monitoring the DNS queries
    Ψ1-Number of resolved IPs
    Ψ2-Number of domains
    Ψ3-Avg. TTL per domain
    Ψ4-Network prefix diversity
      the ratio between the number of distinct /16 network prefixes and the total number of IPs
    Ψ5-Number of domains per network
      how many domains can be associated with the IPs in a cluster, throughout different epochs
    Ψ6-IP Growth Ratio

$$\Psi_6 = \frac{1}{|C_i|} \sum_{d \in C_i} \frac{|R^{(d)}|}{Q^{(d)}}$$
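Several of the passive features can be computed from per-domain summaries of a cluster, as in the sketch below. The input layout (a dict with `ips`, `queries` and `avg_ttl` per domain) and the output keys are hypothetical; Ψ5 is left out because it needs data from multiple epochs, and a non-empty cluster with at least one query per domain is assumed.

```python
def passive_features(cluster):
    """Compute Psi1-Psi4 and Psi6 for one cluster of domains.

    cluster -- dict: domain -> {"ips": set of IPv4 strings,
                                "queries": int, "avg_ttl": float}
    """
    domains = list(cluster.values())
    all_ips = set().union(*(d["ips"] for d in domains))
    prefixes = {".".join(ip.split(".")[:2]) for ip in all_ips}
    return {
        "num_ips": len(all_ips),                                    # Psi1
        "num_domains": len(domains),                                # Psi2
        "avg_ttl": sum(d["avg_ttl"] for d in domains) / len(domains),   # Psi3
        "prefix_diversity": len(prefixes) / len(all_ips),           # Psi4
        # Psi6: mean over domains of |R(d)| / Q(d)
        "ip_growth_ratio": sum(len(d["ips"]) / d["queries"] for d in domains)
        / len(domains),
    }
```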
14
Service Classifier
  “Active” features -- need some additional external information to be computed
    Ψ7-Autonomous System (AS) diversity
    Ψ8-BGP prefix diversity
    Ψ9-Organization diversity
    Ψ10-Country Code diversity
    Ψ11-Dynamic IP ratio
      a reverse (type PTR) DNS lookup for each IP, searching the result for keywords such as “dhcp”, “dsl”, “dial-up”, etc.
    Ψ12-Average Uptime Index
      actively probing each IP in a cluster about six times a day for a predefined number of days
  C4.5 decision-tree classifier
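The dynamic IP ratio (Ψ11) can be illustrated with a keyword match over PTR names that have already been looked up. This is a sketch under assumptions: the function name and input layout are hypothetical, only the three keywords quoted on the slide are used, and the reverse lookups themselves are left to the caller.

```python
DYNAMIC_KEYWORDS = ("dhcp", "dsl", "dial-up")  # keywords quoted on the slide


def dynamic_ip_ratio(ptr_names):
    """Fraction of IPs whose PTR name suggests a dynamically assigned address.

    ptr_names -- dict: IP -> PTR hostname, or None when the lookup failed
    """
    if not ptr_names:
        return 0.0
    hits = sum(
        1
        for name in ptr_names.values()
        if name and any(k in name.lower() for k in DYNAMIC_KEYWORDS)
    )
    return hits / len(ptr_names)
```

IPs without a PTR record count toward the denominator but not as hits, which is one possible convention among several.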
15
Collecting RDNS Traffic
  2009/3/1 ~ 2009/4/14
  two traffic sensors in front of two different RDNS servers of an ISP
  more than 4 million users
  about 1.3 billion DNS queries of type A and CNAME per sensor
  over 2.5 billion DNS queries per day, related to hundreds of millions of distinct domain names
16
Evaluation of the Service Classifier
  we manually inspected and labeled a fairly large number of clusters of domains

                     AUC             DR             FP
  All Features       0.992 (0.003)   99.7% (0.36)   0.3% (0.36)
  Passive Features   0.993 (0.005)   99.4% (0.53)   0.6% (0.53)
  Ψ6, Ψ3, Ψ5         0.989 (0.006)   99.3% (0.49)   0.7% (0.49)

Table I: Classification performance computed using 5-fold cross-validation. AUC=Area Under the ROC Curve; DR=Detection Rate; FP=False Positive Rate. The numbers in parentheses represent the standard deviation of each measure.
17
Can this Contribute to Spam Filtering?
  Intuition
    if the domain name of the website points to one or more previously detected flux agents, it is very likely that the website is malicious
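The intuition above amounts to a set-intersection check between a domain's resolved IPs and the IPs of known flux agents. The sketch below uses hypothetical names and assumes both sets have already been collected elsewhere.

```python
def flux_agent_overlap(resolved_ips, flux_agent_ips):
    """Return the resolved IPs that are previously detected flux agents."""
    return set(resolved_ips) & set(flux_agent_ips)


def looks_malicious(resolved_ips, flux_agent_ips):
    """Flag a domain if it points to at least one known flux agent."""
    return bool(flux_agent_overlap(resolved_ips, flux_agent_ips))
```

Returning the overlapping IPs, rather than only a flag, lets a filter report which flux agents triggered the match.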
18
Detection rate: 90% to 95%
  several of the domain names detected as malicious did not appear to have a “fluxing” behavior themselves, but resolved to a fixed set of IPs that partially intersected with the IPs of flux agents
19
Conclusions
  A passive approach for detecting malicious flux service networks in-the-wild
    Not limited to the analysis of suspicious domain names extracted from spam emails or precompiled domain blacklists
  Our passive detection and tracking of malicious flux service networks may benefit spam filtering applications