A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks
Yan Gao, Zhichun Li, Yan Chen
Lab for Internet and Security Technology (LIST)
Northwestern University
Existing Network IDSes Insufficient
• Signature-based IDSes cannot recognize unknown or polymorphic intrusions
• Statistical IDSes to the rescue, but
– Flow-level detection: unscalable
• Vulnerable to DoS attacks
e.g. TRW [IEEE SSP 04], TRW-AC [USENIX Security Symposium 04], Superspreader [NDSS 05] for port scan detection
– Overall traffic based detection: inaccurate, high false positives
e.g. Change Point Monitoring for flooding attack detection [IEEE Trans. on DSC 04]
• Key features missing
– Distinguish SYN flooding and various port scans for effective mitigation
– Aggregated detection over multiple vantage points
Our Solution: HiFIND System
Goal: accurate High-speed Flow-level INtrusion Detection (HiFIND) system
• Leverage our data streaming techniques: reversible sketches
• Select an optimal small set of metrics from TCP/IP headers for monitoring and detection
• Design efficient two-dimensional sketches to distinguish different types of attacks
• Aggregate compact sketches from multiple routers for distributed detection
Deployment of HiFIND
• Attached to a router/switch as a black box
• Edge network detection particularly powerful
[Figure: (a) Original configuration; (b) Monitor each port separately; (c) Monitor aggregated traffic from all ports. Elements shown: Internet, router, switches, LANs, splitters, scan ports, HiFIND systems.]
k-ary sketch
[Figure: H hash tables (rows 1 … j … H), each with K buckets (0, 1, …, K-1); key k maps to bucket hj(k) in table j]
Update (k, v): Tj [ hj(k)] += v (for all j)
Estimate v(S, k): sum of updates for key k
The first to monitor and detect flow-level heavy changes in massive data streams at network traffic speeds [IMC 03]
[Figure: S=Combine(,S1,,S2): sketch + sketch = sketch]
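The update, estimate and combine operations above can be sketched in a few lines. This is a minimal illustration, not the HiFIND implementation: the class name `KarySketch`, the BLAKE2b-based bucket hash, and the table sizes are assumptions made for the example, and the estimator is the median-of-tables form from the k-ary sketch work cited as [IMC 03].

```python
import hashlib
from statistics import median


class KarySketch:
    """Toy k-ary sketch: H hash tables with K buckets each."""

    def __init__(self, H=5, K=1024):
        self.H, self.K = H, K
        self.tables = [[0.0] * K for _ in range(H)]

    def _bucket(self, j, key):
        # Stand-in hash (assumption): BLAKE2b salted with the table index.
        digest = hashlib.blake2b(repr(key).encode(), digest_size=8,
                                 salt=j.to_bytes(8, "little")).digest()
        return int.from_bytes(digest, "little") % self.K

    def update(self, key, value):
        # Update (k, v): Tj[hj(k)] += v for all j
        for j in range(self.H):
            self.tables[j][self._bucket(j, key)] += value

    def estimate(self, key):
        # Every table holds the same total, so read it from table 0.
        total = sum(self.tables[0])
        per_table = [
            (self.tables[j][self._bucket(j, key)] - total / self.K)
            / (1 - 1 / self.K)
            for j in range(self.H)
        ]
        return median(per_table)

    @staticmethod
    def combine(a, s1, b, s2):
        # Linear combination a*S1 + b*S2, bucket by bucket.
        out = KarySketch(s1.H, s1.K)
        for j in range(out.H):
            out.tables[j] = [a * x + b * y
                             for x, y in zip(s1.tables[j], s2.tables[j])]
        return out


sketch = KarySketch()
sketch.update(("9.7.2.3", 80), 500)          # one heavy key
for i in range(200):                          # light background keys
    sketch.update((f"10.0.1.{i}", 80), 1)
heavy_estimate = sketch.estimate(("9.7.2.3", 80))
difference = KarySketch.combine(1, sketch, -1, sketch)
```

Here `heavy_estimate` should land close to 500, and `difference` is an all-zero sketch, which is the property that lets forecast-error sketches be computed by subtraction.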
Reversible Sketch
• Report keys with heavy changes
• Significantly improve its usage [IMC 2004, INFOCOM 2006, ACM/IEEE ToN to appear]
• Efficient data recording
For the worst case traffic, all 40-byte packet streams
• Software: 526 Mbps on a P4 3.2 GHz PC
• Hardware: 16 Gbps on a single FPGA board
[Figure: INFERENCE(S,t)]
Outline
• Motivation
• Background on sketches
• Design of the HiFIND system
– Architecture
– Sketch-based intrusion detection
– Intrusion classification with 2D sketches
– Feature analysis
• Evaluation
• Conclusion
Architecture of the HiFIND system
[Figure: HiFIND architecture. Recording stage: real traffic stream, sketch recording, reversible sketch & 2D sketch, plus reversible sketches & 2D sketches from other routers. Detection stage (Phase 1, Phase 2, Phase 3): aggregated reversible sketch, time series analysis methods, forecast sketch, forecast error sketch, threshold based detection, aggregated 2D sketch, intrusion classification, false positive reduction, attack mitigation.]
Architecture of the HiFIND system
• Threat model
– TCP SYN flooding (DoS attack)
– Port scan
• Horizontal scan
• Vertical scan
• Block scan
[Figure: scan types in the destination IP vs. port number plane: horizontal, vertical, block]
• Forecast methods
– EWMA
– Holt-Winters forecasting algorithm
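As a rough illustration of the EWMA forecast method named on this slide, the sketch below forecasts a single counter and flags intervals whose forecast error exceeds a threshold. HiFIND applies forecasting to whole sketches (forecast sketch, forecast error sketch); a single counter is used here only for brevity. The smoothing factor `alpha`, the `threshold` value, and the function name are assumptions for the example.

```python
def ewma_forecast_errors(observed, alpha=0.5):
    """Return errors e(t) = observed(t) - forecast(t) for t = 2, 3, ...

    The forecast for interval t is
    alpha * observed(t-1) + (1 - alpha) * forecast(t-1),
    seeded with the first observation.
    """
    forecast = observed[0]
    errors = []
    for value in observed[1:]:
        errors.append(value - forecast)
        forecast = alpha * value + (1 - alpha) * forecast
    return errors


# #SYN - #SYN/ACK for one bucket over five intervals (made-up numbers)
series = [10, 10, 10, 10, 200]
errors = ewma_forecast_errors(series)

threshold = 100  # hypothetical detection threshold
alarms = [t for t, e in enumerate(errors, start=2) if abs(e) > threshold]
```

With this input only the last interval raises an alarm, since the earlier values match their forecasts.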
Sketch-based Detection Algorithm
• RS({DIP, Dport}, #SYN - #SYN/ACK)
– Detect SYN flooding attacks
• RS({SIP, DIP}, #SYN - #SYN/ACK)
– Detect any intruder trying to attack a particular IP address
• RS({SIP, Dport}, #SYN - #SYN/ACK)
– Detect any source IP which causes a large number of uncompleted connections to a particular destination port
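The three recorded metrics can be illustrated as follows. This is a simplified sketch: plain `Counter` objects stand in for the reversible sketches, and the way a SYN/ACK is mapped back onto the connection it answers (swapping source and destination and using its source port) is an assumption, since the slides do not spell out that step.

```python
from collections import Counter

# Counters stand in for the three reversible sketches (assumption for brevity).
rs_dip_dport = Counter()   # RS({DIP, Dport}, #SYN - #SYN/ACK)
rs_sip_dip = Counter()     # RS({SIP, DIP},   #SYN - #SYN/ACK)
rs_sip_dport = Counter()   # RS({SIP, Dport}, #SYN - #SYN/ACK)


def record(src, dst, sport, dport, flags):
    """Apply one TCP packet to all three tables."""
    if flags == "SYN":
        sip, dip, port, delta = src, dst, dport, +1
    elif flags == "SYN/ACK":
        # Assumption: the reply travels server -> client, so flip it back
        # to the orientation of the connection it answers.
        sip, dip, port, delta = dst, src, sport, -1
    else:
        return  # other packets do not affect these metrics
    rs_dip_dport[(dip, port)] += delta
    rs_sip_dip[(sip, dip)] += delta
    rs_sip_dport[(sip, port)] += delta


# A completed handshake cancels out ...
record("1.1.1.1", "9.7.2.3", 40000, 80, "SYN")
record("9.7.2.3", "1.1.1.1", 80, 40000, "SYN/ACK")
# ... while unanswered SYNs from one source to many hosts accumulate.
for host in range(50):
    record("2.3.0.5", f"9.7.2.{host}", 40001, 1433, "SYN")
```

After these packets, the {SIP, Dport} entry for ("2.3.0.5", 1433) holds 50 uncompleted connections, while each individual {SIP, DIP} pair holds only 1, which is the horizontal scan pattern.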
Keys            SYN flooding   Hscan   Vscan   Score
{SIP, Dport}    non-spoofed    Yes     No      1.5
{DIP, Dport}    Yes            No      No      1
{SIP, DIP}      non-spoofed    No      Yes     1.5
{SIP}           non-spoofed    Yes     Yes     2.5
{DIP}           Yes            No      Yes     2
{Dport}         Yes            Yes     No      2
Intrusion Classification
• Major challenge
– Cannot completely differentiate different types of attacks
– E.g., if destination port distribution unknown, it is hard to distinguish non-spoofed SYN flooding attacks from vertical scans by RS({SIP, DIP}, #SYN - #SYN/ACK)
• Bi-modal distribution
[Figure: bi-modal distribution separating SYN floodings from vertical scans]
Two-dimensional (2D) Sketch
For example: differentiate vertical scan from SYN flooding attack
• The two-dimensional k-ary sketches
• An example of UPDATE operation
• Accuracy analysis
Examples: 5 hash tables, 3.2MB memory consumption
– Vertical scan detected with probability at least 99.56%
– SYN attack classified correctly with probability at least 99.99%
[Figure: H two-dimensional hash matrices, each with Kx columns indexed by hx(SIP,DIP) and Ky rows indexed by hy(Dport). UPDATE example: packet {2.3.0.5, 9.7.2.3, 80, SYN} adds +1 to cell (hx(2.3.0.5,9.7.2.3), hy(80)).]
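The UPDATE operation and the idea of separating the two attack types by how traffic spreads over destination ports can be sketched as below. This is an illustration under stated assumptions, not the classifier from the talk: the class name `Sketch2D`, the BLAKE2b stand-in hashes, the matrix sizes, and the concentration rule with its hypothetical parameters `top` and `phi` are all invented for the example. Only the per-matrix update and the majority vote over H matrices follow the slides.

```python
import hashlib


def _h(seed, key, buckets):
    # Stand-in hash (assumption): BLAKE2b salted with a per-function seed.
    d = hashlib.blake2b(repr(key).encode(), digest_size=8,
                        salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(d, "little") % buckets


class Sketch2D:
    """H matrices with Kx columns (by {SIP, DIP}) and Ky rows (by Dport)."""

    def __init__(self, H=5, Kx=1024, Ky=64):
        self.H, self.Kx, self.Ky = H, Kx, Ky
        self.m = [[[0] * Ky for _ in range(Kx)] for _ in range(H)]

    def update(self, sip, dip, dport, value=1):
        for j in range(self.H):
            x = _h(2 * j, (sip, dip), self.Kx)      # hx(SIP, DIP)
            y = _h(2 * j + 1, dport, self.Ky)       # hy(Dport)
            self.m[j][x][y] += value

    def classify(self, sip, dip, top=1, phi=0.8):
        # Hypothetical rule: a column whose `top` largest cells hold at
        # least a fraction `phi` of its mass looks like SYN flooding
        # (few ports); otherwise it looks like a vertical scan.
        votes = 0
        for j in range(self.H):
            column = sorted(self.m[j][_h(2 * j, (sip, dip), self.Kx)],
                            reverse=True)
            total = sum(column)
            if total > 0 and sum(column[:top]) >= phi * total:
                votes += 1
        return "SYN flooding" if votes > self.H // 2 else "vertical scan"


sk = Sketch2D()
for _ in range(1000):                       # many SYNs to one port
    sk.update("2.3.0.5", "9.7.2.3", 80)
for port in range(1, 1001):                 # one SYN to each of many ports
    sk.update("6.6.6.6", "9.7.2.4", port)

flood_label = sk.classify("2.3.0.5", "9.7.2.3")
scan_label = sk.classify("6.6.6.6", "9.7.2.4")
```

The flood concentrates in a single cell of its column, whereas the scan spreads across the Ky rows, so the two keys fall on opposite sides of the threshold in a majority of the matrices.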
DoS Resilience Analysis
HiFIND system is resilient to various DoS attacks as follows
• Send source spoofed SYN packets to a fixed destination
– Detected as SYN flooding attack
• Send source spoofed packets to random destinations
– Evenly distributed in the buckets of each hash table, no false positives
• Reverse-engineer the hash functions to create collisions
– Difficult to reverse-engineer the hash functions
• Unknown hash output of each hash function
• Multiple hash tables and different hash functions
• Even knowing the hash functions of the sketches
– Very hard to find collisions through exhaustive search
• E.g. given 6 hash functions, the probability of a collision of two random keys in 5 hash functions is 5.2×10^-18
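The collision figure quoted above can be reproduced with a short calculation. The slide does not state the table size or whether "in 5 hash functions" means exactly or at least five, so both choices here are assumptions: K = 2^12 buckets per table and "at least 5 of 6", with each table modeled as an independent uniform hash.

```python
from math import comb


def collision_probability(tables, at_least, buckets):
    """Probability that two random keys collide in >= at_least tables."""
    p = 1.0 / buckets  # chance of colliding in any single table
    return sum(comb(tables, r) * p ** r * (1 - p) ** (tables - r)
               for r in range(at_least, tables + 1))


# Assumed parameters: 6 tables, 2**12 buckets each.
prob = collision_probability(tables=6, at_least=5, buckets=2 ** 12)
print(f"{prob:.2e}")
```

Under these assumptions the result is dominated by the term 6 / K^5 and comes out at roughly 5.2×10^-18, which matches the number on the slide.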
Distributed Intrusion Detection
• Naive solution: Transport all the packet traces or connection states to the central site
• HiFIND: Summarize the traffic with compact sketches at each edge router, and deliver them to the central site
[Figure: packet paths labeled SYN1, SYN/ACK1, SYN2, SYN/ACK2]
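A small sketch of why shipping compact sketches to a central site is enough: sketches are linear, so the central site can add them bucket by bucket. The scenario below assumes that a connection's SYN and SYN/ACK may cross different edge routers, which is an assumption about the setting rather than something stated in the slide text. The sizes `H` and `K`, the BLAKE2b stand-in hash, and the helper names are likewise illustrative.

```python
import hashlib

H, K = 3, 16  # tiny sizes, for illustration only


def bucket(j, key):
    # Stand-in hash (assumption), shared by all routers.
    d = hashlib.blake2b(repr(key).encode(), digest_size=8,
                        salt=j.to_bytes(8, "little")).digest()
    return int.from_bytes(d, "little") % K


def new_sketch():
    return [[0] * K for _ in range(H)]


def update(sketch, key, value):
    for j in range(H):
        sketch[j][bucket(j, key)] += value


def aggregate(sketches):
    """Bucket-wise sum of equally shaped sketches."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*sketches)]


router1, router2 = new_sketch(), new_sketch()
key = ("9.7.2.3", 80)                  # {DIP, Dport}
for _ in range(100):
    update(router1, key, +1)           # SYNs seen at router 1
    update(router2, key, -1)           # matching SYN/ACKs seen at router 2

central = aggregate([router1, router2])
```

Each router alone sees a large imbalance for the key, but the aggregated sketch is all zeros, so the central site does not raise a false alarm for completed connections.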
Evaluation Methodology
• Router traffic traces
– Lawrence Berkeley National Laboratory
• One-day trace with ~900M netflow records
– Northwestern University
• One-day experiment in May 2005 with 239M netflow records, 1.8TB traffic and 1:1 packet samples
• Evaluation metrics
– Detection accuracy
– Online performance:
• Speed
• Memory consumption
• Memory access per packet
Highly Accurate
Detection Validation
• SYN flooding
– Backscatter [USENIX Security Symposium 2001]
• Hscans and Vscans
– The knowledge of port numbers

e.g. 5 major scenarios of the top 10 Hscans
Anonymized SIP     Dport   # DIP   Cause
204.10.110.38      1433    56275   SQLSnake scan
5.4.247.103        1433    54788   SQLSnake scan
109.132.101.199    22      45014   Scan SSH
95.30.62.202       3306    25964   MySQL Bot scans
15.192.50.153      4899    23687   Rahack worm

e.g. 5 major scenarios of the bottom 10 Hscans
Anonymized SIP     Dport   # DIP   Cause
98.198.251.168     135     64      Nachi or MSBlast worm
3.66.52.227        445     64      Sasser and Korgo worm
2.0.28.90          139     64      NetBIOS scan
98.198.0.101       135     64      Nachi or MSBlast worm
165.5.42.10        5554    62      Sasser worm
Online performance evaluation
• Few memory accesses per packet
– 16 memory accesses per packet with parallel recording
• Small memory consumption
• Recording speed
– Worst case: recording 239M items in 20.6 seconds, i.e., 11M insertions/sec
• Detection speed
– Detection on 1430 one-minute intervals
• Average detection time: 0.34 seconds
• Maximum detection time: 12.91 seconds
– Stress experiments in each hour interval
• Detecting top 100 anomalies with average 35.61 seconds and maximum 46.90 seconds
Conclusion
Proposed the first online DoS resilient flow-level IDS for high-speed networks
• Scalable to high-speed networks
• Highly accurate
• DoS attack resilient
• Distinguish SYN flooding and various port scans
• Aggregate detection over multiple vantage points
K-ary Sketch
[Figure: H hash tables (rows 1 … j … H), each with K buckets (0, 1, …, K-1); key k maps to bucket hj(k) in table j]
Update (k, u): Tj [ hj(k)] += u (for all j)
Estimate v(S, k): sum of updates for key k
$v^{est}(S, k) = \mathrm{median}_{j} \left\{ \frac{T_j[h_j(k)] - \mathrm{sum}/K}{1 - 1/K} \right\}$
Online data recording & estimation [IMC 2003]
[Figure: S=COMBINE(,S1,,S2): sketch + sketch = sketch]
Two-dimensional (2D) Sketch
• Accuracy analysis
– Given a key k of a vertical scan, the majority of the H hash matrices will classify k as a vertical scan attack with probability at least $\sum_{r=H/2+1}^{H} \binom{H}{r} S^{r} (1-S)^{H-r}$, where $S = \left(\frac{K_x-1}{K_x}\right)^{f} (1-\mathrm{error})$. ( [equation garbled in source: a bound of the form Pr[…] involving B, δ, φ, Ky and p] )
– Given a key k of a SYN flooding, the majority of the H hash matrices will classify k as a SYN flooding attack with probability at least $\sum_{r=H/2+1}^{H} \binom{H}{r} S^{r} (1-S)^{H-r}$, where $S = \left(\frac{K_x-1}{K_x}\right)^{vf}$.
Related work
• Threshold Random Walk (TRW) for port scan detection [J. Jung et al. 2004]
– Not DoS resilient
• TRW with approximate caches (TRW-AC) [N. Weaver et al. 2004]
– High false negatives under DoS attack
• Change Point Monitoring (CPM) [H. Wang et al. 2002]
– Detecting port scans as SYN floodings
• Backscatter [D. Moore et al. 2001]
– Only targeting randomly spoofed DoS attacks
• Superspreader [S. Venkataraman et al. 2005]
– High false positives with P2P traffic
• Partial Completion Filters (PCF) [R. Kompella et al. 2004]
– Not reversible