Metamorphic Malware 1
Metamorphic Malware Research
Metamorphic Malware 2
Metamorphic Malware
Metamorphic software changes “shape”
o But each instance has the same function
o In contrast, most software is “cloned”
Metamorphism used by virus writers to evade signature detection
Lots of interesting research problems
We look at some here…
Metamorphic Malware 3
Metamorphic Research
How metamorphic are hacker produced generators?
How to detect metamorphic viruses?
The “ultimate” metamorphic generator?
How to make metamorphic code that “carries its own generator”?
Related questions/issues?
Metamorphic Malware 4
Metamorphic Generators
To analyze metamorphic generators…
First problem is, how to compare code?
We developed a “similarity index”
o Based on extracted opcodes
o Can be represented graphically
o Also gives a numerical score
Metamorphic Malware 5
Similarity
Suppose we want to compare exe files
o Say, file X and file Y
Extract opcodes from each
o x0, x1, …, xn and y0, y1, …, ym
Compare all 3-opcode subsequences
o If they agree (in any order), plot a point on the axes at the appropriate point
Filter noise with window of length 5
Metamorphic Malware 6
Similarity
That is, matches of length 5 or greater are added to the score
o Lengths were determined experimentally
o Scores range from 0 to 1, where 0 == no match, 1 == perfect match
Gives us a graphical view and a score
In graph, what is a perfect match?
o Main diagonal, or segments parallel to it
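The similarity index described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slides do not give the exact scoring formula, so the score here (fraction of opcodes covered by retained matches, averaged over both files) is an assumption, as is reading the "window of length 5" as a run of 5 consecutive matching points along a diagonal. The function names are hypothetical.

```python
def match_points(x, y, k=3):
    """Points (i, j) where the k-opcode windows at x[i:] and y[j:] agree in any order."""
    xs = [tuple(sorted(x[i:i + k])) for i in range(len(x) - k + 1)]
    ys = [tuple(sorted(y[j:j + k])) for j in range(len(y) - k + 1)]
    return {(i, j) for i, a in enumerate(xs) for j, b in enumerate(ys) if a == b}

def filter_runs(points, min_run=5):
    """Noise filter: keep only diagonal runs of at least min_run consecutive points."""
    kept = set()
    for (i, j) in points:
        if (i - 1, j - 1) in points:
            continue  # not the start of a diagonal run
        run, a, b = [], i, j
        while (a, b) in points:
            run.append((a, b))
            a, b = a + 1, b + 1
        if len(run) >= min_run:
            kept.update(run)
    return kept

def similarity(x, y, k=3, min_run=5):
    """Score in [0, 1]; the normalization is an assumption, see the note above."""
    kept = filter_runs(match_points(x, y, k), min_run)
    if not kept:
        return 0.0
    # Each retained point (i, j) covers opcodes i..i+k-1 of x and j..j+k-1 of y.
    covered_x = {i + d for i, _ in kept for d in range(k)}
    covered_y = {j + d for _, j in kept for d in range(k)}
    return 0.5 * (len(covered_x) / len(x) + len(covered_y) / len(y))

# Toy opcode sequences, invented for illustration.
x = ["mov", "push", "pop", "add", "sub", "xor", "jmp", "call", "ret", "nop"]
same = similarity(x, x)
none = similarity(x, ["inc"] * 10)
partial = similarity(x, x[:7] + ["inc", "dec", "neg"])
```

Plotting the retained points with i on one axis and j on the other gives the graphical view: identical files produce the main diagonal, and shared code that has moved produces segments parallel to it.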
Metamorphic Malware 7
Normal Files
Similarity of typical “normal” files
Metamorphic Malware 8
Metamorphic Generators
A typical “metamorphic” generator
Metamorphic Malware 9
Metamorphic Generators
Highly metamorphic generator
Metamorphic Malware 10
Metamorphic Generators
We measured metamorphism of metamorphic generators
What did we find?
Generally, not very metamorphic…
We did find one exception:
o Next Generation Virus Creation Kit (NGVCK)
Can we detect NGVCK viruses?
Metamorphic Malware 11
Metamorphic Detection
We “trained” a hidden Markov model
o Based on a bunch of “family” viruses
o Using extracted opcode sequences
Then used the trained model for detection
Next, we discuss HMMs
o Other techniques could be used
o Neural nets, data mining, etc.
Metamorphic Malware 12
Hidden Markov Models
HMMs --- a machine learning technique
Widely used in speech recognition, bioinformatics, and other areas
We can train an HMM
Then use the resulting trained model to score unknown data
o High score? Data matches training data
o Low score? Does not match training data
Metamorphic Malware 13
Hidden Markov Models
What are HMMs?
Consider an example…
Suppose we want to know average annual temperature in the past
We cannot go back in time
o So what to do?
Suppose we know that tree ring size is related to temperature
Metamorphic Malware 14
Hidden Markov Models
We consider 2 possible temperatures
o Hot (H) and cold (C)
We consider 3 tree ring sizes
o Small (S), medium (M), large (L)
Based on measurements, we find:
Metamorphic Malware 15
HMM
Also, based on historical record:
Then transitions between hot and cold years form a Markov process (order 1)
For the past, we cannot observe temp
But, we can measure tree ring sizes
Metamorphic Malware 16
HMMs
HMMs give us efficient algorithms to solve problems like:
o Given a series of tree ring sizes, can we say anything about temperatures?
Metamorphic Malware 17
HMMs
The generic picture is like this…
Note, there is a Markov process
And a series of observations
Metamorphic Malware 18
HMMs
HMM model denoted as: λ=(A,B,π)
o A is state transition matrix
o B gives probabilities of observations, depending on state of Markov process
o π contains initial state probabilities
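As a concrete illustration of λ=(A,B,π), the temperature example could be written down as below. The numeric values are illustrative assumptions, since the matrices from the slides are not reproduced here, and `is_row_stochastic` is a hypothetical helper name.

```python
# Hidden states and observation symbols from the tree ring example.
STATES = ["H", "C"]        # hot, cold
SYMBOLS = ["S", "M", "L"]  # small, medium, large tree ring

# NOTE: every probability below is an illustrative assumption.
A = [[0.7, 0.3],           # P(next state | current state = H)
     [0.4, 0.6]]           # P(next state | current state = C)
B = [[0.1, 0.4, 0.5],      # P(ring size | state = H)
     [0.7, 0.2, 0.1]]      # P(ring size | state = C)
pi = [0.6, 0.4]            # P(initial state)

def is_row_stochastic(rows, tol=1e-9):
    """Each row must be a probability distribution: non-negative, summing to 1."""
    return all(all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
               for row in rows)

model_ok = (is_row_stochastic(A) and is_row_stochastic(B)
            and is_row_stochastic([pi]))
```

A is N x N and B is N x M, where N is the number of hidden states and M the number of distinct observation symbols.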
For HMMs there are efficient algorithms to solve 3 problems
o Next slide…
Metamorphic Malware 19
The 3 HMM Problems
1. Given a model and observations, we can score the sequence of observations
o How well does observed data fit model?
2. Given model and observations, we can find optimal state sequence
o Here, we uncover the hidden states
3. Given observation sequence, we can train a model to best fit the data
o Only assumption is size of the A matrix
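Problems 1 and 2 can be sketched for the tree ring example as follows. This is a minimal illustration under stated assumptions: the model values are illustrative, the forward algorithm is used for problem 1, and problem 2 is solved with the Viterbi algorithm, which finds the single most likely state path. The slides do not say which notion of "optimal" they intend, so Viterbi is one possible reading.

```python
# Illustrative model (assumed values): states 0 = H, 1 = C; symbols 0 = S, 1 = M, 2 = L.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.1, 0.4, 0.5], [0.7, 0.2, 0.1]]
pi = [0.6, 0.4]

def forward_score(A, B, pi, obs):
    """Problem 1: P(observations | model) via the forward algorithm."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

def viterbi(A, B, pi, obs):
    """Problem 2 (one interpretation): most likely single hidden state path."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    back = []
    for o in obs[1:]:
        # Best predecessor for each state j at this step.
        prev = [max(range(n), key=lambda i: delta[i] * A[i][j]) for j in range(n)]
        delta = [delta[prev[j]] * A[prev[j]][j] * B[j][o] for j in range(n)]
        back.append(prev)
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for prev in reversed(back):
        state = prev[state]
        path.append(state)
    return path[::-1]

rings = [0, 1, 0, 2]                       # observed ring sizes: S, M, S, L
likelihood = forward_score(A, B, pi, rings)
hidden = ["HC"[s] for s in viterbi(A, B, pi, rings)]
```

For long observation sequences the unscaled products underflow, so a practical version would rescale at each step or work with logarithms.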
Metamorphic Malware 20
HMM Training:
English Text Example
Assuming 2 hidden states
Here, we show the B matrix…
Metamorphic Malware 21
HMMs and Metamorphic Generators
So, what’s the game plan?
a) Extract opcodes from several metamorphic viruses from same family
b) Train HMM model on these opcodes (problem 3 from the earlier slide)
c) Given unknown file, score extracted opcodes using the trained HMM model (problem 1)
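Steps (a) to (c) can be sketched end to end as below. This is a toy illustration under several assumptions that are not in the slides: the opcode sequences are invented, the family's sequences are pooled by concatenation, training uses Baum-Welch with a small smoothing constant `eps`, unseen opcodes share one extra symbol, and the score is the log-likelihood per opcode. No detection threshold is given in the slides, so none is chosen here.

```python
import math
import random

def forward_scaled(A, B, pi, obs):
    """Scaled forward pass. Returns (scaled alphas, scale factors c_t)."""
    n = len(pi)
    alphas, scales, prev = [], [], None
    for t, o in enumerate(obs):
        if t == 0:
            a = [pi[i] * B[i][o] for i in range(n)]
        else:
            a = [sum(prev[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)]
        c = 1.0 / sum(a)
        prev = [v * c for v in a]
        alphas.append(prev)
        scales.append(c)
    return alphas, scales

def train_hmm(obs, n_states, n_symbols, iters=50, seed=0, eps=1e-6):
    """Problem 3: Baum-Welch re-estimation on one integer-coded sequence."""
    rng = random.Random(seed)
    def rand_row(size):  # near-uniform random start to break symmetry
        row = [1.0 + 0.1 * rng.random() for _ in range(size)]
        return [v / sum(row) for v in row]
    n, m, T = n_states, n_symbols, len(obs)
    A = [rand_row(n) for _ in range(n)]
    B = [rand_row(m) for _ in range(n)]
    pi = rand_row(n)
    for _ in range(iters):
        alphas, c = forward_scaled(A, B, pi, obs)
        betas = [None] * T                      # scaled backward pass
        betas[T - 1] = [c[T - 1]] * n
        for t in range(T - 2, -1, -1):
            betas[t] = [c[t] * sum(A[i][j] * B[j][obs[t + 1]] * betas[t + 1][j]
                                   for j in range(n)) for i in range(n)]
        digamma = [[[alphas[t][i] * A[i][j] * B[j][obs[t + 1]] * betas[t + 1][j]
                     for j in range(n)] for i in range(n)] for t in range(T - 1)]
        gamma = [[sum(row) for row in dg] for dg in digamma] + [alphas[T - 1][:]]
        pi = gamma[0][:]
        for i in range(n):
            den_a = sum(gamma[t][i] for t in range(T - 1))
            den_b = den_a + gamma[T - 1][i]
            # eps smoothing (an assumption) keeps every probability nonzero.
            A[i] = [(sum(digamma[t][i][j] for t in range(T - 1)) + eps)
                    / (den_a + eps * n) for j in range(n)]
            B[i] = [(sum(gamma[t][i] for t in range(T) if obs[t] == k) + eps)
                    / (den_b + eps * m) for k in range(m)]
    return A, B, pi

def score(model, obs):
    """Problem 1: log-likelihood per opcode under the trained model."""
    A, B, pi = model
    _, scales = forward_scaled(A, B, pi, obs)
    return -sum(math.log(c) for c in scales) / len(obs)

# (a) Toy "family" opcode sequences, invented purely for illustration.
family = [["mov", "add", "mov", "sub"] * 25, ["mov", "sub", "mov", "add"] * 25]
vocab = {op: k for k, op in enumerate(sorted({op for seq in family for op in seq}))}
OTHER = len(vocab)  # shared symbol for opcodes unseen in training

def encode(opcodes):
    return [vocab.get(op, OTHER) for op in opcodes]

# (b) Pool the family's opcodes and train.
training = [sym for seq in family for sym in encode(seq)]
model = train_hmm(training, n_states=2, n_symbols=len(vocab) + 1)

# (c) Score unknown files; a higher score means a closer match to the family.
family_like = score(model, encode(["mov", "add", "mov", "sub"] * 10))
unrelated = score(model, encode(["push", "call", "pop", "ret"] * 10))
```

In this toy run the family-like sequence scores well above the unrelated one. Real opcode data would be far noisier, and a threshold would have to be chosen from scores on known family and non-family files.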
Metamorphic Malware 22
HMM Detection of NGVCK
Trained model works for detection
Effective to the point of practical…
Metamorphic Malware 23
Why Does this Work?
NGVCK viruses are highly metamorphic
But they have some common statistical properties
o These are automatically extracted by HMM
NGVCK differs from normal code
o So HMM can distinguish between the two
How to make a “better” metamorphic generator?
Hold that thought…
Metamorphic Malware 24
What Next?
Can we extract opcodes (or approximation) efficiently?
Are “profile hidden Markov models” better?
Similarity index for detection?
o Better ways to measure similarity?
o Statistical tests versus similarity?
HMMs to detect the “undetectable”?
HMM compared to other proposed methods?
Metamorphism for software watermarking?
Metamorphic Malware 25
Ultimate Metamorphic?
How to evade signature detection and HMM detection?
o Metamorphic code evades signature detection
o But how to also evade HMM detection?
Make the code highly metamorphic and similar to normal code
o Then trained HMM will confuse the two
Metamorphic Malware 26
Ultimate Metamorphic?
Insert dead code from normal programs
Figure panels: Before, After
Metamorphic Malware 27
What Now?
How to detect the “ultimate” metamorphic generator?
o Remove the dead code
How to remove dead code?
o Emulation can help, but…
Can we “improve” the generator?
Can we improve the detection?
Can we say something more general?
Metamorphic Malware 28
References
Revealing introduction to HMMs
Hunting for metamorphic engines
Profile hidden Markov models
Approximate disassembly
Detecting “undetectable” metamorphic viruses
Hunting for undetectable metamorphic viruses
And lots more work in progress…