Towards High Speed Network Defense
Zhichun Li, EECS Department
Northwestern University
2
Agenda
• Briefly introduce my thesis work
• Dive into high-performance vulnerability signature matching
• Future research directions
3
Motivation
Botnets
Worms
Attackers
Professional attackers exploit the enterprise networks for profit $$$
4
Network Level Defense
• Network gateways/routers are the vantage points for detecting large-scale attacks
• Host-based detection/prevention alone is not enough for modern enterprise networks
– Some users do not apply host-based schemes because of reliability, overhead, and conflicts
– Many users do not update or patch their systems on time
– Enterprises cannot rely only on their end users for security protection
5
Challenges
• Scalable to high speed networks with a large number of users
• Need to be highly accurate
• Adapt fast to the emerging threats
• Have good attack coverage
6
Network-based Intrusion Detection, Prevention, and Forensics System
• Framework
– (I) Sketch based monitoring & detection
– (II) Polymorphic worm signature generation
– (III) Signature matching engines
– (IV) Network Situational Awareness
[Figure: framework diagram. Inputs: honeynets/honeyfarms and packet streams. Goal labels on the modules: Scalability; Accuracy & adapt fast (twice); Accuracy & Scalability & Coverage.]
7
Network-based Intrusion Detection, Prevention, and Forensics System (I)
• Online traffic monitoring and recording [INFOCOM 2006, ToN 2007] (cited by 30+)
– Reversible sketch for data streaming computation
– Record millions of flows (GB traffic) in a few hundred KB
– Small # of memory accesses per packet
– Scalable to large key space size (2^32 or 2^64)
• Online sketch-based flow-level anomaly detection [IEEE ICDCS 2006] [IEEE CG&A, Security Visualization 2006]
– Detect TCP SYN flooding, horizontal and vertical scans even when mixed
[Figure: sketch with H hash tables (rows 1, ..., j, ..., H) of K buckets each (columns 0 to K-1); a key k is mapped to bucket h_j(k) in table j.]
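The H-by-K layout in the figure can be illustrated with a plain k-ary sketch. This is only a minimal sketch under assumptions: it is not the reversible sketch from the cited papers, and the class name `KarySketch`, the BLAKE2b-based hash, and the default sizes are all illustrative choices. It shows why memory stays fixed (H x K counters) and why each update touches only H counters.

```python
import hashlib
from statistics import median


class KarySketch:
    """H hash tables of K counters each; one counter per table is touched per update."""

    def __init__(self, num_tables: int = 5, num_buckets: int = 1024) -> None:
        self.num_tables = num_tables      # H in the figure
        self.num_buckets = num_buckets    # K in the figure
        self.tables = [[0] * num_buckets for _ in range(num_tables)]
        self.total = 0                    # sum of all update values

    def _bucket(self, j: int, key: bytes) -> int:
        # h_j(k): a per-table hash (assumption: BLAKE2b salted with the table index).
        digest = hashlib.blake2b(key, digest_size=8, salt=j.to_bytes(8, "little")).digest()
        return int.from_bytes(digest, "little") % self.num_buckets

    def update(self, key: bytes, value: int = 1) -> None:
        # H memory accesses per update, independent of the number of distinct keys.
        self.total += value
        for j in range(self.num_tables):
            self.tables[j][self._bucket(j, key)] += value

    def estimate(self, key: bytes) -> float:
        # Per-table unbiased estimate, then the median across the H tables.
        k = self.num_buckets
        per_table = [
            (self.tables[j][self._bucket(j, key)] - self.total / k) / (1 - 1 / k)
            for j in range(self.num_tables)
        ]
        return median(per_table)


sketch = KarySketch()
sketch.update(b"10.0.0.1", 1000)          # e.g. 1000 SYN packets from one source
heavy_estimate = sketch.estimate(b"10.0.0.1")
```

With the defaults the structure holds 5 x 1024 counters no matter how many flows are recorded; recovering the heavy keys without enumerating the key space is what the reversible variant adds.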
8
Network-based Intrusion Detection, Prevention, and Forensics System (II)
• Polymorphic worm signature generation
– Token based Signature [IEEE Symposium on Security and Privacy 2006] (cited by 40+, code requested by Columbia U., UT Austin, Purdue, Georgia Tech, UC Davis, etc.)
– Network based Vulnerability Signature [IEEE ICNP 2007] [NSF Cyber Trust Award]
[Figure: bit-string traffic (1010101, 10111101, 11111100, 00010111) from the Internet passing a network gateway into our network.]
9
Network-based Intrusion Detection, Prevention, and Forensics System (III)
• NetShield: Vulnerability Signature based NIDS/NIPS [under submission] [NSF Cyber Trust Award] (interest from Cisco and Juniper)
Focus of this talk, details come later
10
Network-based Intrusion Detection, Prevention, and Forensics System (IV)
• Large-scale botnet and P2P misconfiguration event situational-aware forensics
– Botnet attack target/strategy inference [ASIACCS09]
– Root cause analysis of the P2P misconfiguration/poisoning traffic [under submission]
[Figure: DDoS attack scenario. Peers send file request flooding (misconfigured traffic) to an innocent victim.]
11
NetShield: Matching a Large Vulnerability Signature Ruleset for High Performance Network Defense
12
NetShield Overview
• NIDS/NIPS (Network Intrusion Detection/Prevention System) operation
[Figure: packets enter the NIDS/NIPS, which consults the signature DB and emits security alerts.]
• Accuracy
• Speed
• Attack Coverage
13
State of the art
Regular expression (regex) based approaches
Example: .*Abc.*\x90+de[^\r\n]{30}
Pros
• Can efficiently match multiple sigs simultaneously, through DFA
• Can describe the syntactic context
Cons
• Limited expressive power
• Cannot describe the semantic context
• Inaccurate
14
State of the art
Vulnerability Signature [Wang et al. 04]
Pros
• Directly describe semantic context
• Very expressive, can express the vulnerability condition exactly
• Accurate
Cons
• Slow!
• Existing approaches all use sequential matching
• Require protocol parsing
Example:
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
15
Motivation of NetShield
[Figure: speed (low to high) versus accuracy (low to high), positioning state-of-the-art regex signature IDSes, existing vulnerability signature IDSes, and NetShield, with the theoretical accuracy limitation of regex marked.]
16
Motivation
• Desired features for signature-based NIDS/NIPS
– Accuracy (especially for IPS)
– Speed
– Coverage: large ruleset

            Regular Expression    Vulnerability
Accuracy    Relatively poor       Much better
Speed       Good                  ??
Memory      OK                    ??
Coverage    Good                  ??

Annotations: Shield [SIGCOMM'04]; Focus of this work; Cannot capture vulnerability condition well!
17
Research Challenges
• Background
– Use protocol semantics to express the vulnerability
– Defined on a sequence of PDUs & one predicate for each PDU
– Example: ver==1 && method=="put" && len(buf)>300
• Challenges
– Matching thousands of vulnerability signatures simultaneously
  • Sequential matching → match multiple sigs simultaneously
– High speed parsing
18
Outline
• Motivation
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
19
A Vulnerability Signature Example
• Data representations
– For all the vulnerability signatures we studied, we only need numbers and strings
– Number operators: ==, >, <, >=, <=
– String operators: ==, match_re(.,.), len(.)
• Example signature for Blaster worm
BIND: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && context[0].abstract_syntax.uuid==UUID_RemoteActivation
BIND-ACK: rpc_vers==5 && rpc_vers_minor==1
CALL: rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00 && stub.RemoteActivationBody.actual_length>=40 && matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
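The Blaster signature above can be read as one predicate per PDU in a BIND, BIND-ACK, CALL sequence. The following is a minimal sketch of that reading, assuming the PDUs have already been parsed into dictionaries keyed by the field names on the slide; the `matches` helper, the dictionary layout, and the symbolic `UUID_REMOTE_ACTIVATION` placeholder are illustrative assumptions, and the matching is the naive sequential kind that the talk sets out to speed up.

```python
import re
from typing import Callable, Dict, List, Tuple

# Symbolic placeholder for the constant named on the slide (assumption).
UUID_REMOTE_ACTIVATION = "UUID_RemoteActivation"

Fields = Dict[str, object]
Signature = List[Tuple[str, Callable[[Fields], bool]]]

# One (PDU type, predicate) pair per PDU in the sequence.
BLASTER: Signature = [
    ("BIND", lambda f: f["rpc_vers"] == 5 and f["rpc_vers_minor"] == 1
        and f["packed_drep"] == b"\x10\x00\x00\x00"
        and f["context[0].abstract_syntax.uuid"] == UUID_REMOTE_ACTIVATION),
    ("BIND-ACK", lambda f: f["rpc_vers"] == 5 and f["rpc_vers_minor"] == 1),
    ("CALL", lambda f: f["rpc_vers"] == 5 and f["rpc_vers_minor"] == 1
        and f["packed_drep"] == b"\x10\x00\x00\x00"
        and f["stub.RemoteActivationBody.actual_length"] >= 40
        and re.match(rb"^\x5c\x00\x5c\x00", f["stub.buffer"]) is not None),
]


def matches(signature: Signature, pdus: List[Tuple[str, Fields]]) -> bool:
    """Sequential matching: the i-th PDU must satisfy the i-th predicate."""
    if len(pdus) < len(signature):
        return False
    for (want_type, predicate), (got_type, fields) in zip(signature, pdus):
        if want_type != got_type or not predicate(fields):
            return False
    return True


header = {"rpc_vers": 5, "rpc_vers_minor": 1, "packed_drep": b"\x10\x00\x00\x00"}
session = [
    ("BIND", {**header, "context[0].abstract_syntax.uuid": UUID_REMOTE_ACTIVATION}),
    ("BIND-ACK", dict(header)),
    ("CALL", {**header,
              "stub.RemoteActivationBody.actual_length": 64,
              "stub.buffer": b"\\\x00\\\x00" + b"A" * 60}),
]
# Same session, but the length field stays below the vulnerable threshold.
benign = session[:2] + [("CALL", {**session[2][1],
                                  "stub.RemoteActivationBody.actual_length": 8})]
attack_detected = matches(BLASTER, session)
benign_detected = matches(BLASTER, benign)
```

Only numbers and strings are compared, in line with the data representations listed on the slide.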
20
Matching Problem Formulation
• Consider single PDU matching first
• Suppose we have n signatures, defined on k matching dimensions (matchers)
– A matcher is a two-tuple (field, operation) or a four-tuple for the associative array elements
– Translate the n signatures to an n by k table
Rule 6: URI.Filename=="fp40reg.dll" && len(Headers["host"])>300
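The translation of signatures into an n by k table can be sketched as follows. This is an illustration under assumptions, not the NetShield data structure: matchers are modeled only as two-tuples (field, operation), the associative-array element is flattened into a single field name, `build_table` is a hypothetical helper, and `rule_a` is an invented companion rule so that the table has a "don't care" in each row.

```python
from typing import Dict, List, Optional, Tuple

Matcher = Tuple[str, str]              # (field, operation)
Condition = Tuple[str, str, object]    # (field, operation, operand)


def build_table(signatures: Dict[str, List[Condition]]):
    """Return the ordered matcher list (k columns) and one row per signature (n rows)."""
    matchers: List[Matcher] = []
    for conditions in signatures.values():
        for field, op, _ in conditions:
            if (field, op) not in matchers:
                matchers.append((field, op))

    table: Dict[str, List[Optional[object]]] = {}
    for name, conditions in signatures.items():
        row: List[Optional[object]] = [None] * len(matchers)   # None = don't care
        for field, op, operand in conditions:
            row[matchers.index((field, op))] = operand
        table[name] = row
    return matchers, table


signatures = {
    "rule_a": [("Method", "==", "POST")],                       # hypothetical rule
    "rule_6": [("URI.Filename", "==", "fp40reg.dll"),
               ('len(Headers["host"])', ">", 300)],             # Rule 6 from the slide
}
matchers, table = build_table(signatures)
```

Most cells in a real table would be `None`, which is the large number of "don't cares" that the matching algorithm has to exploit.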
21
Matching Problem Formulation
• Challenges for Single PDU matching problem (SPM)
– Large number of signatures n
– Large number of matchers k
– Large number of "don't cares"
– Cannot reorder matchers arbitrarily (buffering constraint)
– Field dependency
  • Arrays, associative arrays
  • Mutually exclusive fields
22
Matching Algorithms
Candidate Selection Algorithm
1. Pre-computation decides the rule order and matcher order
2. Compare against each matcher separately (divide and conquer) and iteratively combine the results efficiently
23
Step 1: Pre-Computation
• Matcher reorder: put the non-selective matchers later, based on buffering constraint & field arrival order
• Rule reorder: group the rules into rule blocks RB1, RB2, RB3, RB4, ...
[Figure: rule blocks RB1, RB2, RB3, RB4, ... with labels: Matcher 1; don't care of Matcher 1, extended by Matcher 2; don't care of both Matcher 1 & 2; don't care of all Matcher 1 to n.]
24
Step 2: Iterative Matching
PDU = {Method=POST, Filename=fp40reg.dll, VARs: name="file"; value~".*\.\./.*", Headers: name="host"; len(value)=450}
Rule blocks: RB1: 1 2 3; RB2: 4 5 6; RB3: 7; RB4: 8; RB5: 9
S1 = {3}
S2 = S1 ⊗ A2 + B2 = {3} ⊗ {} + {6} = {} + {6} = {6}
S3 = S2 ⊗ A3 + B3 = {6} ⊗ {} + {} = {6} + {} = {6}
S4 = S3 ⊗ A4 + B4 = {6} ⊗ {4} + {} = {6} + {} = {6}
S5 = S4 ⊗ A5 + B5 = {6} ⊗ {6} + {} = {6} + {} = {6}
(⊗ is the candidate merge operation; + is set union.)
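25
Candidate merge operation
[Figure: merging S_i with A_{i+1}. A candidate in S_i that doesn't care about matcher i+1 is kept; a candidate that requires matcher i+1 is kept only if it is in A_{i+1}.]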
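The iterative matching and the merge step can be sketched in a few lines. This is a simplified illustration under assumptions rather than the NetShield implementation: each table cell is a Python predicate or `None` for "don't care", every predicate is evaluated per rule (a real engine would answer each matcher with one lookup in a shared index), and the rule IDs, field names, and `candidate_selection` function are invented for the example.

```python
from typing import Callable, Dict, List, Optional, Set

Predicate = Callable[[dict], bool]
Row = List[Optional[Predicate]]        # one cell per matcher; None = don't care


def candidate_selection(rules: Dict[int, Row], pdu: dict, k: int) -> Set[int]:
    candidates: Set[int] = set()       # S_i: rules still consistent with matchers 1..i
    for i in range(k):
        # Rules that require matcher i and are satisfied by this PDU.
        matched = {rid for rid, row in rules.items()
                   if row[i] is not None and row[i](pdu)}
        # Merge: keep a candidate if it ignores matcher i or passed it.
        survivors = {rid for rid in candidates
                     if rules[rid][i] is None or rid in matched}
        # New candidates: rules whose first required matcher is i.
        fresh = {rid for rid in matched
                 if all(cell is None for cell in rules[rid][:i])}
        candidates = survivors | fresh
    return candidates


RULES: Dict[int, Row] = {
    1: [lambda p: p["method"] == "GET", None, None],
    2: [lambda p: p["method"] == "POST", lambda p: p["filename"] == "other.dll", None],
    3: [lambda p: p["method"] == "POST", None, None],
    6: [None, lambda p: p["filename"] == "fp40reg.dll", lambda p: p["host_len"] > 300],
    7: [None, None, lambda p: p["host_len"] > 1000],
}
PDU = {"method": "POST", "filename": "fp40reg.dll", "host_len": 450}
result = candidate_selection(RULES, PDU, k=3)

# Reference answer: check every required cell of every rule.
brute_force = {rid for rid, row in RULES.items()
               if all(cell is None or cell(PDU) for cell in row)}
```

The candidate set stays small at every step, which matches the low average number of candidates reported in the matching results.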
26
Refinement and Extension
• SPM improvement
– Allow negative conditions
– Handle array case
– Handle associative array case
– Handle mutually exclusive case
– Report the matched rules as early as possible
• Extend to Multiple PDU Matching (MPM)
– Allow checkpoints
27
Outline
• Motivation
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
28
Observations
[Figure: PDU parse tree containing an array node.]
• PDU parse tree
• Leaf nodes are integers or strings
• Vulnerability signatures mostly based on leaf nodes
• Observation 1: Only need to parse the fields related to signatures.
• Observation 2: Traditional recursive descent parsers, which need one function call per node, are too expensive.
29
Efficient Parsing with State Machines
• Studied eight protocols: HTTP, FTP, SMTP, eMule, BitTorrent, WINRPC, SNMP and DNS, as well as their vulnerability signatures.
• Pre-construct parsing state machines based on parse trees and vulnerability signatures.
• Common relationship among leaf nodes.
[Figure: relationships among leaf-node variables in four panels (a) to (d): Sequential, Branch, Loop, and Derive.]
30
Example for WINRPC
• Rectangles are states
• Parsing variables: R0 .. R4
• 0.61 instruction/byte for BIND PDU
[Figure: parsing state machine for the WINRPC Bind and Bind-ACK PDUs. States are labeled with a byte count and a field: 1 rpc_vers, 1 rpc_ver_minor, 1 ptype, 1 pfc_flags, 4 packed_drep, 2 frag_length, 6 merge1, 8 merge2, 1 ncontext, 3 padding, 2 ID, 1 n_tran_syn, 1 padding, 16 UUID, 4 UUID_ver, tran_syn, merge3. Transition labels: Header, Bind, Bind-ACK, R0, R1, R1-16, R2 <- 0, R3 <- ncontext, R2++, R2 ≤ R3, R4, 20*R4.]
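The "Sequential" part of such a parsing state machine can be sketched as a flat table of steps, each of which either extracts a field that some signature uses or skips bytes in one jump. This is a minimal sketch under assumptions: the step order follows the standard DCE/RPC connection-oriented header layout rather than the garbled label order above, integers are decoded as little-endian, and `HEADER_STEPS` and `parse_header` are hypothetical names. The branch and loop states of the full machine are not modeled.

```python
from typing import Dict, List, Optional, Tuple

# (byte length, field name) per state; None means "skip, no signature needs it".
HEADER_STEPS: List[Tuple[int, Optional[str]]] = [
    (1, "rpc_vers"),
    (1, "rpc_vers_minor"),
    (1, "ptype"),
    (1, "pfc_flags"),
    (4, "packed_drep"),
    (2, "frag_length"),
    (6, None),            # several unused fields merged into one skip ("merge1")
]


def parse_header(data: bytes) -> Dict[str, object]:
    """Walk the step table once; no per-node function calls, no parse tree."""
    fields: Dict[str, object] = {}
    pos = 0
    for length, name in HEADER_STEPS:
        if pos + length > len(data):
            raise ValueError("truncated PDU")
        if name == "packed_drep":
            fields[name] = data[pos:pos + length]            # kept as raw bytes
        elif name is not None:
            fields[name] = int.from_bytes(data[pos:pos + length], "little")
        pos += length
    return fields


bind_header = (bytes([5, 1, 11, 3]) + b"\x10\x00\x00\x00"
               + (72).to_bytes(2, "little") + b"\x00" * 6)
parsed = parse_header(bind_header)
```

Merging adjacent unused fields into a single skip is what keeps the per-byte instruction count low, since skipped bytes cost one pointer addition.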
31
Outline
• Motivation
• High Speed Matching for Large Rulesets
• High Speed Parsing
• Evaluation
• Research Contributions
32
Evaluation Methodology
• 26GB+ traces from Tsinghua Univ. (TH), Northwestern (NU) and DARPA
• Run on a P4 3.8GHz single-core PC w/ 4GB memory
• PDUs are preloaded in memory after TCP reassembly
• For HTTP we have 794 vulnerability signatures, which cover 973 Snort rules
• For WINRPC we have 45 vulnerability signatures, which cover 3,519 Snort rules
Fully implemented prototype
• 11,704 lines of C++ and 2,706 lines of Python
• Can run on both Linux and Windows
Deployed at a university DC with up to 106Mbps
33
Parsing Results

Trace                                TH DNS   TH WINRPC   NU WINRPC   TH HTTP   NU HTTP   DARPA HTTP
Binpac throughput (Gbps)             0.31     1.41        1.11        2.10      14.2      1.69
Our parser throughput (Gbps)         3.43     16.2        12.9        7.46      44.4      6.67
Speed up ratio                       11.2     11.5        11.6        3.6       3.1       3.9
Max. memory per connection (bytes)   15       15          15          14        14        14
34
Matching Results

Trace                                TH WINRPC   NU WINRPC   TH HTTP   NU HTTP   DARPA HTTP
Sequential throughput (Gbps)         10.68       9.23        0.34      2.37      0.28
CS matching throughput (Gbps)        14.37       10.61       2.63      17.63     1.85
Matching-only time speed up ratio    4           1.8         11.3      11.7      8.8
Avg # of candidates                  1.16        1.48        0.033     0.038     0.0023
Max. memory per connection (bytes)   27          27          20        20        20
35
Other Results
Compare with Regex
• Memory for 973 Snort rules: DFA 5.29GB (XFA, 863 rules: 1.08MB), NetShield 2.3MB
• Per-flow memory: XFA 36 bytes, NetShield 20 bytes
• Throughput: XFA 756Mbps, NetShield 1.9+Gbps
*XFA [SIGCOMM08][Oakland08]
Rule scaling results
[Figure: throughput (Gbps, 0 to 4) versus # of rules used (0 to 800). Performance decreases gracefully.]
36
Research Contributions
• Demonstrate vulnerability signatures can be applied to NIDS/NIPS, which can significantly improve the accuracy of current NIDS/NIPS
• Propose the candidate selection algorithm for matching a large number of vulnerability signatures efficiently
• Propose parsing state machine for fast protocol parsing
• Implement the NetShield
37
Future work
• Work in progress
– In collaboration with MSR: apply semantic-rich analysis to cloud Web service profiling, to understand why services are slow and how to improve them
• Future work
– Web security (browser security, web server security)
– Data center security
– High speed network intrusion prevention system with hardware support
38
Long Term Research Challenges
• Combat the professional profit-driven attackers.
• Online applications (including Web 2.0 applications) become more complex and vulnerable.
• Network speed keeps increasing, which demands highly scalable approaches.
39
Q & A
Thanks!
40
Backup Slides
41
Measure Snort Rules
• Semi-manually classify the rules
1. Group by CVE-ID
2. Manually look at each vulnerability
• Results
– 86.7% of rules can be improved by protocol semantic vulnerability signatures
– Most of the remaining rules (9.9%) are web DHTML and scripts related, which are not suitable for a signature-based approach
– On average, 4.5 Snort rules are reduced to one vulnerability signature
– For binary protocols the reduction ratio is much higher than for text-based ones
  • For netbios.rules the ratio is 67.6
42
Motivation
• Network security was named the single most important attribute of their networks by 395 senior executives in a survey conducted by AT&T
• Many new emerging threats make the situation even worse
43
System Framework
[Figure: system framework. Streaming packet data feeds four parts; the legend distinguishes data path from control path, and modules on the critical path from modules on the non-critical path.]
• Part I: Sketch-based monitoring & detection
– Reversible k-ary sketch monitoring
– Sketch based statistical anomaly detection (SSAD)
– Local sketch records (sent out for aggregation); remote aggregated sketch records
• Part II: Polymorphic worm signature generation
– Token Based Signature Generation (TOSG)
– Length Based Signature Generation (LESG)
• Part III: Signature matching engines
– Content-based signature matching
– Protocol semantic signature matching
• Part IV: Network Situational Awareness
– Honeynets/honeyfarms (traffic to unused IP blocks)
• Goal labels in the figure: Scalability; Accuracy & adapt fast; Accuracy & Scalability & Coverage
44
Example of Vulnerability Signatures
• At least 75% of vulnerabilities are due to buffer overflow
Sample vulnerability signature
• Field length corresponding to vulnerable buffer > certain threshold
• Intrinsic to buffer overflow vulnerability and hard to evade
[Figure: a field of the protocol message overflowing the vulnerable buffer.]
45
Old Slides
46
Conclusions
• A novel network-based vulnerability signature matching engine
– Through a measurement study on the Snort ruleset, show that vulnerability signatures can improve most of the signatures in NIDS/NIPS
– Propose a parsing state machine for fast parsing
– Propose a candidate selection algorithm for matching a large number of vulnerability signatures simultaneously
48
Outline
• Motivation
• Feasibility Study: a measurement approach
• Problem Statement
• High Speed Parsing
• High Speed Matching for massive vulnerability signatures
• Evaluation
• Conclusions
52
Limitations of Regular Expression Signatures
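[Figure: traffic filtering between the Internet and our network with signature 10.*01 applied to bit-string traffic (1010101, 10111101, 11111100, 00010111); filtered traffic is marked X.]
• A polymorphic attack (worm/botnet) might not have an exact regular expression based signature
Polymorphism!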
53
What we do?
• Build a NIDS/NIPS with much better accuracy than, and similar speed to, regular expression based approaches
– Feasibility: of the Snort ruleset (6,735 signatures), 86.7% can be improved by vulnerability signatures
– High speed parsing: 2.7~12 Gbps
– High speed matching:
  • Efficient algorithm for matching massive vulnerability rules
  • HTTP, 791 vulnerability signatures at ~1Gbps
54
Problem Formulation
• Parsing problem formulation
– Given a PDU and the protocol specification as input, output the set of fields which are required by matching
55
Publications
• Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE ICNP 2007.
• Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible Sketches: Enabling Monitoring and Analysis over High Speed Data Streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007.
• Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006.
• Zhichun Li, Yan Chen and Aaron Beach, Towards Scalable and Robust Distributed Intrusion Alert Fusion with Good Load Balancing, in Proc. of ACM SIGCOMM LSAD 2006.
• Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE ICDCS 2006.
• Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006.
56
Current Status
• Part I: Sketch based monitoring & detection
– Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reversible Sketches: Enabling Monitoring and Analysis over High Speed Data Streams, in IEEE/ACM Transactions on Networking, Volume 15, Issue 5, Oct. 2007
– Robert Schweller, Zhichun Li, Yan Chen, Yan Gao, Ashish Gupta, Elliot Parsons, Yin Zhang, Peter Dinda, Ming-Yang Kao, and Gokhan Memik, Reverse Hashing for High-speed Network Monitoring: Algorithms, Evaluations, and Applications, in Proc. of IEEE INFOCOM 2006 (252/1400=18%)
– Yan Gao, Zhichun Li and Yan Chen, A DoS Resilient Flow-level Intrusion Detection Approach for High-speed Networks, in Proc. of IEEE International Conference on Distributed Computing Systems (ICDCS) 2006 (75/536=14%) (alphabetical order)
• Part II: Polymorphic worm signature generation
– TOSG: Zhichun Li, Manan Sanghi, Brian Chavez, Yan Chen and Ming-Yang Kao, Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience, in Proc. of IEEE Symposium on Security and Privacy, 2006 (23/251=9%)
– LESG: Zhichun Li, Lanjia Wang, Yan Chen and Zhi (Judy) Fu, Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms, in Proc. of IEEE International Conference on Network Protocols (ICNP) 2007 (32/220=14%)
57
Current Status
• Part III: Signature matching engines
– Work in progress, will be the focus of this talk
– Zhichun Li, Gao Xia, Yi Tang, Jian Chen, Ying He, Yan Chen and Bin Liu, NetShield: Towards High Performance Network-based Semantic Signature Matching, in submission
• Part IV: Network Situational Awareness
– Work in progress
– Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson, Towards Situational Awareness of Large-Scale Botnet Events using Honeynets, in preparation
– Zhichun Li, Anup Goyal, Yan Chen and Aleksandar Kuzmanovic, P2P Doctor: Measurement and Diagnosis of Misconfigured Peer-to-Peer Traffic, in submission
58
Current Status
• Part I: Sketch based monitoring & detection
– Result in [Infocom06, ToN, ICDCS06]
• Part II: Polymorphic worm signature generation
– Result in [Oakland06, ICNP07]
• Part III: Signature matching engines
– Work in progress, will be the focus of this talk
• Part IV: Network Situational Awareness
– Work in progress
59
Limitations of Exploit Based Signature
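[Figure: traffic filtering between the Internet and our network with signature 10.*01 applied to bit-string traffic (1010101, 10111101, 11111100, 00010111); filtered traffic is marked X.]
• A polymorphic worm might not have an exact exploit based signature
Polymorphism!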
60
Vulnerability Signature
• Works for polymorphic worms
• Works for all the worms which target the same vulnerability
[Figure: vulnerability signature traffic filtering between the Internet and our network; traffic that targets the vulnerability is marked X.]