Polygraph: Automatically Generating Signatures for Polymorphic Worms Presented by: Devendra Salvi Paper by: James Newsome, Brad Karp, Dawn Song

Polygraph: Automatically Generating Signatures for Polymorphic Worms

Presented by: Devendra Salvi

Paper by: James Newsome, Brad Karp, Dawn Song

Introduction

Why an automated signature generation technique?

Learning from previous worm detection implementations

What is a polymorphic worm?

Polymorphic Worm design

Characteristics of a polymorphic worm: invariant bytes, wildcard bytes, and code bytes

Creating a polymorphic worm

Assumption: perfectly obfuscated code

Code obfuscation

Polymorphic Worm design

The two chief sources of invariant content: exploit framing (reserved keywords) and exploit payload (alters control flow)

Invariant content in polymorphic worm: Apache multiple-host-header vulnerability

Apache-Knacker exploit

Figure legend: unshaded area = wildcard bytes; lightly shaded = code bytes; heavily shaded = invariant content bytes

Invariant content in polymorphic worm (contd.): BIND TSIG vulnerability

Exploited by the Lion worm.

Figure legend: unshaded area = wildcard bytes; lightly shaded = code bytes; heavily shaded = invariant content bytes

Invariant content in polymorphic worm (contd.): CodeRed, AdmWorm, Slapper, and the Clet polymorphic engine

Boxed bytes are found in at least 20% of Clet’s outputs; shaded bytes are found in all of Clet’s outputs.

Polymorphic Signatures

Are substring signatures insufficient? They rely on two assumptions:

A single invariant substring exists across payload instances for the same worm; that is, the substring is sensitive, in that it will match all worm instances.

The invariant substring is sufficiently long to be specific; that is, the substring does not occur in any non-worm payloads destined for the same IP protocol and port.

Signature classes for polymorphic worms: conjunction signatures, token-subsequence signatures, and Bayes signatures

Polygraph

The Polygraph monitor incorporates the Polygraph signature generator.

Polygraph (contd.)

Design goals for the Polygraph signature generator: signature quality; efficient signature generation; efficient signature matching; generation of small signature sets; robustness against noise and multiple worms; robustness against evasion and subversion.

Algorithm for signature generation

Preprocessing: token extraction

All distinct substrings of at least a minimum length that occur in at least k of the suspicious flows are extracted.

E.g., if “http” occurs in k flows, “ttp” is not considered a distinct token unless it also occurs in at least k flows where it is not a substring of “http”.

This is the first step of the algorithm. It filters irrelevant tokens out of the suspicious flows.
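
The token-extraction rule can be illustrated with a small sketch. This is a naive, quadratic illustration and not the paper's implementation; the function name `extract_tokens`, the use of Python strings instead of raw bytes, and the NUL character used for masking are all assumptions made for the example.

```python
def extract_tokens(samples, min_len=2, k=3):
    """Return distinct tokens of length >= min_len found in >= k samples.

    Hypothetical helper for illustration only. Assumes the samples do not
    themselves contain the NUL character, which is used as a mask.
    """
    # Count, for every substring, the number of samples that contain it.
    counts = {}
    for s in samples:
        subs = {s[i:j]
                for i in range(len(s))
                for j in range(i + min_len, len(s) + 1)}
        for sub in subs:
            counts[sub] = counts.get(sub, 0) + 1

    # Visit frequent substrings longest first (ties broken alphabetically).
    frequent = sorted((t for t, c in counts.items() if c >= k),
                      key=lambda t: (-len(t), t))

    tokens = []
    masked = list(samples)
    for t in frequent:
        # A shorter substring only counts where it is not hidden inside
        # a longer token that has already been accepted.
        if sum(t in m for m in masked) >= k:
            tokens.append(t)
            masked = [m.replace(t, "\0") for m in masked]
    return tokens


flows = ["GET /a HTTP/1.1", "GET /bb HTTP/1.1", "GET /ccc HTTP/1.1"]
print(extract_tokens(flows, min_len=2, k=3))
```

Masking accepted tokens before testing shorter candidates is one simple way to apply the "not as a substring of another token" condition; a suffix-tree approach would be the efficient choice for real traffic.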

Algorithm for signature generation (contd.)

Generating single signatures

Generating conjunction signatures: an unordered list of tokens

Generating token-subsequence signatures: an ordered list of tokens (a regular expression)

E.g., “.*one.*two.*” and “.*o.*n.*e.*z.*”

Generating Bayes signatures

$\Pr[L(x)=\text{worm}\mid x]$ and $\Pr[L(x)=\neg\text{worm}\mid x]$ are compared through their ratio:

$$\frac{\Pr[L(x)=\text{worm}\mid x]}{\Pr[L(x)=\neg\text{worm}\mid x]} = \frac{\Pr[L(x)=\text{worm}]\,\prod_{1\le i\le n}\Pr[x_i=1\mid L(x)=\text{worm}]}{\Pr[L(x)=\neg\text{worm}]\,\prod_{1\le i\le n}\Pr[x_i=1\mid L(x)=\neg\text{worm}]}$$
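
As a rough sketch of how the three signature classes differ when matching a flow: a conjunction signature ignores token order, a token-subsequence signature requires the tokens in order, and a Bayes signature sums per-token scores. The function names, the log-ratio scoring, and the add-one smoothing below are assumptions chosen for illustration, not details taken from the slides.

```python
import math


def matches_conjunction(tokens, flow):
    """Unordered token list: every token must occur somewhere in the flow."""
    return all(t in flow for t in tokens)


def matches_subsequence(tokens, flow):
    """Ordered token list, equivalent to the regex .*t1.*t2.* and so on."""
    pos = 0
    for t in tokens:
        idx = flow.find(t, pos)
        if idx < 0:
            return False
        pos = idx + len(t)  # the next token must start after this one
    return True


def train_bayes(tokens, worm_flows, innocuous_flows):
    """Score each token by log(Pr[token | worm] / Pr[token | not worm]).

    Add-one smoothing (an assumption of this sketch) avoids division by zero.
    """
    scores = {}
    for t in tokens:
        p_worm = (sum(t in f for f in worm_flows) + 1) / (len(worm_flows) + 2)
        p_inno = (sum(t in f for f in innocuous_flows) + 1) / (len(innocuous_flows) + 2)
        scores[t] = math.log(p_worm / p_inno)
    return scores


def bayes_score(scores, flow):
    """Sum the scores of the tokens present; compare against a threshold."""
    return sum(s for t, s in scores.items() if t in flow)


print(matches_conjunction(["one", "two"], "xxtwoxxonexx"))
print(matches_subsequence(["one", "two"], "xxtwoxxonexx"))
```

The flow "xxtwoxxonexx" contains both tokens but in the wrong order, so it satisfies the conjunction signature and fails the token-subsequence signature.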

Practical signature generation

Generating multiple signatures

The suspicious flow pool could contain more than one type of worm, and could contain innocuous flows.

Bayes algorithm implementation

Conjunction and token-subsequence algorithms require clustering

Each cluster contains similar flows

Hierarchical clustering

Practical signature generation: hierarchical clustering

Clusters are merged iteratively. With s clusters, each of the O(s^2) pairs of clusters is evaluated by the signature that merging the pair would produce.

The two clusters whose merged signature has the lowest false positive rate are merged.

Example: S1 S2 S3 S4 S5 S6 becomes S1 S2-S3 S4 S5-S6
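
The greedy merge loop can be sketched as follows. This is a minimal illustration under stated assumptions: each cluster's signature is taken to be a conjunction of the candidate tokens shared by all of its flows, the false positive rate is measured against a small innocuous pool, and merging stops at a hypothetical threshold `max_fp`. None of these names come from the slides.

```python
from itertools import combinations


def conjunction_signature(flows, tokens):
    """Tokens (from a candidate list) that occur in every flow of a cluster."""
    return frozenset(t for t in tokens if all(t in f for f in flows))


def false_positive_rate(signature, innocuous):
    """Fraction of innocuous flows matched by a conjunction signature."""
    if not signature:
        return 1.0  # an empty signature matches every flow
    if not innocuous:
        return 0.0
    hits = sum(all(t in f for t in signature) for f in innocuous)
    return hits / len(innocuous)


def hierarchical_cluster(flows, tokens, innocuous, max_fp=0.1):
    clusters = [[f] for f in flows]  # start with one flow per cluster
    while len(clusters) > 1:
        # Evaluate every pair by the signature its merge would produce.
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            sig = conjunction_signature(clusters[i] + clusters[j], tokens)
            fp = false_positive_rate(sig, innocuous)
            if best is None or fp < best[0]:
                best = (fp, i, j)
        fp, i, j = best
        if fp > max_fp:
            break  # every remaining merge would be too general
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return clusters


suspicious = ["GET /x AAAA HTTP", "GET /y AAAA HTTP", "GET /z AAAA HTTP",
              "DNS BBBB q1", "DNS BBBB q2", "DNS BBBB q3"]
candidate_tokens = ["GET ", "AAAA", "HTTP", "DNS ", "BBBB"]
innocuous_pool = ["GET /index HTTP", "DNS query", "GET /a HTTP", "DNS other"]
for cluster in hierarchical_cluster(suspicious, candidate_tokens, innocuous_pool):
    print(cluster)
```

In this toy input the two worm families share no tokens, so merging them would give an empty signature that matches everything, and the loop stops with two clusters.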

Performance of each Polygraph signature generation algorithm

Experimental setup:

The token-extraction threshold is k = 3, the minimum token length is α = 2, and the minimum cluster size is 3.

All experiments were run on desktop machines with 1.4 GHz Intel® Pentium® III processors, running Linux kernel 2.4.20.

Signatures are generated for polymorphic versions of three real-world exploits: the Apache-Knacker exploit, the ATPhttpd exploit, and the BIND-TSIG exploit.

Network traces: several HTTP and DNS network traces are used as input to Polygraph signature generation and to evaluate it.

Results

Single polymorphic worm: Apache-Knacker signatures.

For each algorithm, the correct signature is generated 100% of the time for all experiments where the suspicious pool size is greater than 2, and 0% of the time where the suspicious pool size is only 2.

Results (contd.)

Single polymorphic worm: BIND-TSIG signatures.

These signatures were successfully generated for suspicious pools containing at least 3 worm samples.

Results (contd.)

Single Polymorphic Worm Plus Noise

False negatives: The clustering-based algorithms produce 0% false negatives, as does the Bayes algorithm until noise grows beyond 80%, at which point its signatures cause 100% false negatives.

Figures (a) and (b) show the additional false positives that result from the addition of noise.

Results (contd.)

Multiple polymorphic worms plus noise

False negatives are similar to those for a single polymorphic worm plus noise.

False positives are very similar to those for a single polymorphic worm plus noise when there is only one type of worm in the suspicious pool.

Potential attacks on Polygraph

Overtraining attacks: The conjunction and token-subsequence algorithms are designed to extract the most specific signature possible from a worm. An attacker may attempt to exploit this property to prevent the generated signature from being sufficiently general.

Innocuous pool poisoning: An attacker could determine what signatures Polygraph would generate for his worm. He could then create otherwise innocuous flows that match these signatures, and try to get them into Polygraph’s innocuous flow pool.

Long-tail attack: An exploit could have already occurred by the time we see a full signature match.

Strengths

The paper introduces a preventive measure, ready should a polymorphic worm appear.

The signature generation technique is automated.

The approach is practical too, since the algorithms work efficiently for a polymorphic worm as well as in situations where more than one worm may be present in the data flow.

Weaknesses

Any of the signature generation algorithms, when applied individually, can be evaded.

By the time Polygraph comes up with a signature, the vulnerable host might already be infected.

Improvisation

All three of the algorithms could be run simultaneously, and the signature with the fewest false positives and false negatives could then be used.
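
One way to picture this suggestion is a selection step that scores each candidate signature on labeled flows and keeps the one with the lowest combined error. The sketch below assumes each signature is available as a matcher function returning true or false; the names `error_rates` and `pick_best`, and the equal weighting of false positives and false negatives, are assumptions for illustration.

```python
def error_rates(matcher, worm_flows, innocuous_flows):
    """Return (false positive rate, false negative rate) for one matcher."""
    fn = sum(not matcher(f) for f in worm_flows) / len(worm_flows)
    fp = sum(bool(matcher(f)) for f in innocuous_flows) / len(innocuous_flows)
    return fp, fn


def pick_best(matchers, worm_flows, innocuous_flows):
    """Name of the matcher with the lowest fp + fn (equal weights assumed)."""
    return min(
        matchers,
        key=lambda name: sum(error_rates(matchers[name], worm_flows, innocuous_flows)),
    )


candidates = {
    "conjunction": lambda f: "GET" in f and "AAAA" in f,
    "too_general": lambda f: "GET" in f,
}
worms = ["GET AAAA", "GET x AAAA"]
innocuous = ["GET /", "POST /"]
print(pick_best(candidates, worms, innocuous))
```

In practice false positives and false negatives rarely cost the same, so the weighting would be a deployment decision.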