Finding Diversity in Remote Code Injection Exploits Justin Ma, John Dunagan, Helen J. Wang, Stefan Savage, Geoffrey M. Voelker *University of California, San Diego *Microsoft Research

Page 1: Finding Diversity in Remote Code Injection Exploits

Finding Diversity in Remote Code Injection Exploits

Justin Ma, John Dunagan, Helen J. Wang, Stefan Savage, Geoffrey M. Voelker

University of California, San Diego

Microsoft Research

Page 2: Finding Diversity in Remote Code Injection Exploits

Encountering new malware

Have I seen this before?

How closely related is it to what I have seen before?

Page 3: Finding Diversity in Remote Code Injection Exploits

Practical considerations

?

New defense?

Page 4: Finding Diversity in Remote Code Injection Exploits

Theoretical considerations

?

?

Evolutionary relationship?

Page 5: Finding Diversity in Remote Code Injection Exploits

Grouping similar malware together…

• Ultimately, construct malware families

• Anti-virus industry is active in this area

Page 6: Finding Diversity in Remote Code Injection Exploits

Motivation

710 new families

40,000 new variants

Family and variant defined in ad-hoc fashion…

Is there a systematic way to determine the nature of this diversity?

Page 7: Finding Diversity in Remote Code Injection Exploits

Exploit diversity

Attacker

MS RPC Request Exploit

Page 8: Finding Diversity in Remote Code Injection Exploits

Polymorphism

Attacker

Encrypted

Page 9: Finding Diversity in Remote Code Injection Exploits

Behind the encryption…

Attacker

Page 10: Finding Diversity in Remote Code Injection Exploits

Differing constants

Attacker

Different IP address

Page 11: Finding Diversity in Remote Code Injection Exploits

Functional differences

Attacker

Waiting for a connection

Page 12: Finding Diversity in Remote Code Injection Exploits

Different code base

Attacker

Calling “tftp.exe”

Page 13: Finding Diversity in Remote Code Injection Exploits

ISystemActivator vulnerability

1,561 exploit attempts

90 unique payloads

How different are they?

Page 14: Finding Diversity in Remote Code Injection Exploits

Our goal

• Automatically construct a phylogeny, or family tree, of exploits

Page 15: Finding Diversity in Remote Code Injection Exploits

Outline for this talk

• On classifying shellcodes

• Steps for systematically studying shellcodes
  – Trace collection
  – Shellcode extraction
  – Shellcode decryption
  – Comparing samples
  – Cluster analysis

• Post-hoc manual inspection to validate
  – Look at the code!

Page 16: Finding Diversity in Remote Code Injection Exploits

Why shellcodes?

• Our study focuses on exploits

• Shellcodes are packaged with the exploit
  – First foreign code that executes on a newly infected machine
  – Part of the exploit with the most leeway for variation

• Primary challenge: collecting and analyzing shellcodes

Page 17: Finding Diversity in Remote Code Injection Exploits

Remote code injection attacks

[Figure: the victim’s stack memory, drawn from high to low addresses. Labels: MS RPC request exploit, vulnerable buffer, shellcode, decrypted shellcode, flow of execution.]

Page 18: Finding Diversity in Remote Code Injection Exploits

Trace collection

• Studying 5 vulnerabilities

• Residential trace
  – 2-day trace
  – Windows XP SP2
  – 29 unused DSL IP addresses
  – 4,400 exploit samples

• Enterprise trace
  – 1 hour
  – Active responders
  – Five /24 subnets
  – 1,500 exploit samples

Page 19: Finding Diversity in Remote Code Injection Exploits

Shellcode extraction

• Shield (SIGCOMM ’04)
  – Framework for specifying network-based protocols and vulnerabilities
  – Extracts shellcodes from raw network packets

Page 20: Finding Diversity in Remote Code Injection Exploits

Shellcode decryption

• Shellcode is encrypted
  – Use the shellcode’s own decryption loop!

• Limited emulation
  – Similar to the generic decryption technique used for viruses
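
The slides do not say which cipher the samples use or how the emulator is built. As a rough illustration only, the sketch below assumes a repeating 4-byte XOR key (prompted by the "4-byte decoding key" label on a later slide) and stands in for limited emulation with a simple step budget. The names `run_decoder` and `max_steps` are hypothetical.

```python
def run_decoder(encoded: bytes, key: bytes, max_steps: int = 10_000) -> bytes:
    """Toy stand-in for running a decoder loop under limited emulation.

    Assumption: the decoder XORs each payload byte with a repeating key.
    The step budget mimics an emulator that gives up on runaway loops.
    """
    if not key:
        raise ValueError("key must not be empty")
    out = bytearray(encoded)
    steps = 0
    for i in range(len(out)):
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before decoding finished")
        out[i] ^= key[i % len(key)]  # one iteration of the decoder loop
        steps += 1
    return bytes(out)


# XOR is its own inverse, so the same routine both encodes and decodes.
key = bytes.fromhex("deadbeef")          # hypothetical 4-byte key
plain = b"harmless demo payload"
encoded = run_decoder(plain, key)
decoded = run_decoder(encoded, key)
print(decoded == plain)
```

A real analysis would execute the sample's own instructions in an emulator rather than reimplement the cipher; the point here is only that the payload carries everything needed to undo its own encoding.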

Page 21: Finding Diversity in Remote Code Injection Exploits

Comparing samples: Candidate metrics

• Edit distance
  – Too specific: non-code portions of payload made related exploits unnecessarily distant

• Structural distance
  – Control flow graph over basic blocks
  – Basic blocks summarized with a color/hash
  – Too general: did not capture subtle instruction variations between exploit families
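
To make the first candidate concrete, here is a minimal sketch of plain edit distance (Levenshtein distance) over raw payload bytes. It is a textbook implementation, not the authors' code, and the sample payloads are made up for illustration.

```python
def edit_distance(a: bytes, b: bytes) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))       # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Two hypothetical payloads with identical code bytes but different embedded data.
code = bytes([0x31, 0xC0, 0x50, 0x68])
payload_a = code + b"10.0.0.1"
payload_b = code + b"192.168.7.42"
print(edit_distance(payload_a, payload_b))  # nonzero even though the code is identical
```

The nonzero result for payloads that share all of their code is the "too specific" problem: differences in data alone push related samples apart.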

Page 22: Finding Diversity in Remote Code Injection Exploits

Comparing samples: Final metric

• Exedit distance metric
  – Edit distance over executed parts of shellcode

• Distinguishes code from data

• Maintains instruction-level details

Canonical string for shellcode
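
One way to read the exedit idea is sketched below, under several assumptions the slides do not confirm: the executed parts are given as a set of byte offsets (for example, recorded during emulation), the canonical string is simply those bytes in address order, and the distance is normalized by the longer canonical string. The function names are hypothetical.

```python
def edit_distance(a: bytes, b: bytes) -> int:
    """Levenshtein distance between two byte strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (0 if x == y else 1)))
        prev = curr
    return prev[-1]


def canonical_string(payload: bytes, executed_offsets) -> bytes:
    """Keep only the bytes that were executed, in address order (assumed form)."""
    return bytes(payload[i] for i in sorted(set(executed_offsets)))


def exedit_distance(a: bytes, a_exec, b: bytes, b_exec) -> float:
    """Edit distance over executed bytes, normalized to [0, 1] (assumed normalization)."""
    ca, cb = canonical_string(a, a_exec), canonical_string(b, b_exec)
    longest = max(len(ca), len(cb))
    return edit_distance(ca, cb) / longest if longest else 0.0


# Hypothetical samples: the first four bytes are executed, the rest is data.
code = bytes([0x31, 0xC0, 0x50, 0x68])
variant = bytes([0x31, 0xC9, 0x50, 0x68])   # one instruction byte differs
a = code + b"10.0.0.1"
b = code + b"192.168.7.42"
c = variant + b"10.0.0.1"
executed = range(4)
print(exedit_distance(a, executed, b, executed))  # 0.0: only the data differs
print(exedit_distance(a, executed, c, executed))  # 0.25: one of four code bytes differs
```

Restricting the comparison to executed bytes removes the data-only differences that inflate plain edit distance, while a single changed instruction byte still registers.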

Page 23: Finding Diversity in Remote Code Injection Exploits

Cluster analysis

• Need to group samples using the exedit distance metric

• Agglomerative clustering
  – Each iteration, merge closest pair of clusters
  – Cluster distance = distance between the furthest samples of the two clusters
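
The clustering rule above is commonly called complete-linkage agglomerative clustering. The sketch below is a minimal, unoptimized version. Stopping once the closest pair exceeds a threshold is an assumption suggested by the clustering threshold shown with the results, and the names `agglomerate` and `threshold` are hypothetical.

```python
from itertools import combinations


def complete_linkage(c1, c2, dist):
    """Cluster distance: the distance between the furthest pair of samples."""
    return max(dist(a, b) for a in c1 for b in c2)


def agglomerate(samples, dist, threshold):
    """Repeatedly merge the closest pair of clusters until none is within threshold."""
    clusters = [[s] for s in samples]            # start with singleton clusters
    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), complete_linkage(clusters[i], clusters[j], dist))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:                        # closest pair is too far apart
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Toy data: integers stand in for samples, absolute difference for exedit distance.
groups = agglomerate([0, 5, 8, 90, 95], lambda a, b: abs(a - b), threshold=10)
print(groups)  # [[0, 5, 8], [90, 95]]
```

Using the furthest pair as the cluster distance means a merge happens only when every sample in one cluster is close to every sample in the other, which keeps loosely chained samples from collapsing into one family.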

Page 24: Finding Diversity in Remote Code Injection Exploits

Results

• Caught exploits for 5 vulnerabilities over traces

• Summary for residential trace

Vulnerability       Exploits  Unique exploits  Families
SQL Resolution           767                2         1
LSASS                  1,769               56         5
ISystemActivator       1,561               90         6
RemoteActivation         338              338         2

Page 25: Finding Diversity in Remote Code Injection Exploits

ISystemActivator

10% clustering threshold

Need to manually verify this…

6 families

Page 26: Finding Diversity in Remote Code Injection Exploits

ISystemActivator

4-byte decoding key

Kernel-address loading function

Function-finding block

Page 27: Finding Diversity in Remote Code Injection Exploits

ISystemActivator

4-byte decoding key

Kernel-address loading function

Function-finding block

4-byte encoding key

Kernel base loader

Function finder

Page 28: Finding Diversity in Remote Code Injection Exploits

ISystemActivator

Longest payload

Many function blocks in middle of payload

Page 29: Finding Diversity in Remote Code Injection Exploits

ISystemActivator

Command-line call to “tftp.exe”

Page 30: Finding Diversity in Remote Code Injection Exploits

ISystemActivator

Different instructions in parts, otherwise very similar

Page 31: Finding Diversity in Remote Code Injection Exploits

ISystemActivator

“Bind” version

“Connect-back” version

Page 32: Finding Diversity in Remote Code Injection Exploits

Conclusions

• Systematic method for classifying exploits
  – Exploit collection
  – Shellcode extraction and decryption
  – Shellcode comparison using exedit distance
  – Group exploits with clustering

• Similarity between samples in computed phylogenies corresponded well with observed differences

• Useful step toward automating malware classification