Malware Detection

Slides courtesy of Mihai Christodorescu

University of Wisconsin, Madison Mihai Christodorescu – “Behavior-based Malware Detection” 2

The Rising Malware Tide

• Malware is software with unwanted functionality: viruses, trojans, backdoors, bots, adware, spyware, browser hijackers, downloaders, droppers, keyloggers, password stealers, ...

• “Blended” threats

100,000,000 machines are infected. [Vint Cerf, World Economic Forum 2007]


Organized Cyber-Crime

• Boom in online fraud:
  – Spamming
  – Trade in stolen data
  – Financial fraud
  – ID theft

Malware is the tool of the trade.


The Changing Threat Landscape

1995: Hobby malware, for fun
• Show programming prowess
• Single author

2007: Professional malware, for profit
• Collaborative development
• Bug-fix releases, code reuse

Botnets: distributed computing has finally arrived.

[Image captions: “Creator of the Melissa worm” and “?”]


Failure of Signature Detectors

Malware detectors still use signatures.

Malware is obfuscated/transformed easily.
Software diversity is used successfully by malware.

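[Figure: files arriving from the Internet pass through a virus scanner that holds byte signatures of known malware; the diagram also shows New Malware 1 and New Malware 2.]

Sample signature data shown in the figure:
ac028c0e86009d8edfac0ac075fbe81cfd72ef50b91000f7f15052b90:*:504b03040a0001000800*...*:188420:181779:*:8ad6900f5088cab9356678e43c...3:*:3e3c623e6c696e6b3c2f6...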

Paradigm shift in malware creation, yet no change in malware detection!


Focus On Behavior

[Figure: new malware and malware families over time, 2001–2006, on a logarithmic axis from 1 to 100,000. Source: Kaspersky Labs, Symantec.]

New malware per year, in chart order (2001–2006): 8,821; 11,136; 20,731; 31,726; 53,950; 86,876
Malware families, as labeled on the chart: 325; 335; 274; 202 (est.)

Family = malware with a common code base.

A family is a collection of behaviors.
A behavior can be shared by many families.

Number of families stays constant.
Number of variants grows exponentially.


Main thesis

Detection of obfuscated malware requires a semantic analysis of program behavior.

Program verification provides the techniques necessary to perform malware detection effectively and efficiently.


Specifying Behavior

Syntactic end of the spectrum: byte signatures allow for fast detection.
– But not resilient to obfuscation.

Semantic end of the spectrum: high-level descriptions require expensive detection.
– Resilient to obfuscation.

Example of a high-level description: Execution of program M causes the system to reach a state where a copy of M has been sent by email.
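The weakness at the syntactic end can be illustrated with a small sketch. The signature format here (hex runs separated by `*` wildcards) and the helper names `compile_signature` and `scan` are assumptions made for this example, not the format of any real scanner. The toy "malware body" uses the standard x86 encodings of `push 10h`, `push eax`, `push edi`, and the obfuscated copy only inserts a one-byte `nop`.

```python
import re

def compile_signature(sig: str) -> re.Pattern:
    """Compile a hex signature into a bytes regex.

    Hypothetical format for this sketch: hex runs separated by '*',
    where '*' matches any (possibly empty) run of bytes.
    """
    parts = [re.escape(bytes.fromhex(p)) for p in sig.split("*")]
    return re.compile(b".*?".join(parts), re.DOTALL)

# push 10h; push eax; push edi  ->  6A 10, 50, 57
original = bytes.fromhex("6a105057")
# Same behavior, with a one-byte NOP (0x90) inserted as junk code.
obfuscated = bytes.fromhex("6a10905057")

signature = compile_signature("6a105057")

def scan(data: bytes) -> bool:
    """Return True if the exact-byte signature occurs anywhere in data."""
    return signature.search(data) is not None

print(scan(original))    # True
print(scan(obfuscated))  # False: one junk byte defeats the signature
```

A wildcard at the right position (for instance `6a10*5057`) would catch this particular variant, but each new transformation needs another signature, which is the scaling problem the slides describe.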


Malspec: Self-Propagation by Email

Malspec: Connect, then Send

Netsky.B code fragment:

    push 10h
    push eax
    push edi
    call connect
    push esi
    push eax
    push [ebp+hMem]
    call wsprintfA
    add esp, 0Ch
    push [ebp+hMem]
    call lstrlenA
    push 0
    push eax
    push [ebp+hMem]
    push ebx
    push eax
    push ecx
    push edi
    call send


Malspec: Self-Propagation by Email

Malspec = syntactic component + semantic component

• Syntactic component describes temporal constraints: Connect, then Send.
• Semantic component describes dependency constraints:
  – Connect: X := Arg1
  – Send: Arg1 = X & Arg2 = “EHLO.*”

[Figure: the Netsky.B code fragment from the previous slide, shown next to the malspec.]
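As a rough illustration of how the two components combine, the sketch below checks the two-node malspec against a recorded sequence of API calls. The `Call` type, the function name, and the idea of working on a call trace (rather than on the binary) are assumptions of this example; the bindings `X := Arg1` and `Arg1 = X & Arg2 = "EHLO.*"` are taken from the slide.

```python
import re
from typing import NamedTuple

class Call(NamedTuple):
    name: str
    args: tuple

def matches_email_malspec(trace: list[Call]) -> bool:
    """True if some connect(...) is later followed by a send(...) on the
    same first argument whose second argument matches "EHLO.*"."""
    bound = set()  # candidate values for X, one per observed connect
    for call in trace:
        if call.name == "connect" and call.args:
            bound.add(call.args[0])                      # X := Arg1
        elif call.name == "send" and len(call.args) >= 2:
            arg1, arg2 = call.args[0], call.args[1]
            # Arg1 = X & Arg2 = "EHLO.*"
            if arg1 in bound and isinstance(arg2, str) and re.match(r"EHLO.*", arg2):
                return True
    return False

worm_like = [
    Call("connect", (3, "mail.example:25")),
    Call("wsprintfA", ("buf",)),           # unrelated calls may be interleaved
    Call("send", (3, "EHLO host\r\n")),
]
wrong_socket = [
    Call("connect", (3, "mail.example:25")),
    Call("send", (4, "EHLO host\r\n")),    # dependency on X is violated
]
print(matches_email_malspec(worm_like))     # True
print(matches_email_malspec(wrong_socket))  # False
```

The temporal constraint is enforced by scanning the trace in order; the dependency constraint is enforced by requiring the value bound at `connect` to reappear at `send`.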


Building a Real Malspec

“Send Email”:
    X:=socket()
    connect(X)
    send(X,“EHLO”)
    send(X,“DATA”)
    send(X,T)

“Read Own Exe. Image”:
    S:=process_name()
    Z:=open(S)
    Y:=read(Z)


Building a Real Malspec

[Figure: the “Send Email” and “Read Own Exe. Image” operation sequences from the previous slide, now linked at send(X,T).]

send(X,T) with StringEqual(T, Base64(Y))

Malspec Constraints

[Figure: the self-propagation-by-email malspec from the previous slides, including send(X,T) with StringEqual(T, Base64(Y)), annotated with constraint labels.]

Constraint labels in the figure:
• Local constraint
• Dependence constraint: X after socket = X before connect
• Dependence constraint
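One way to picture these constraints is as variable bindings that must stay consistent across operations. The sketch below matches the full malspec against a trace of `(name, args, result)` events. The predecessor edges in `MALSPEC`, the event format, and the helper names are assumptions of this example (the slides show the graph only as a figure); the operations and the `StringEqual(T, Base64(Y))` check come from the slides.

```python
import base64

# (id, call name, result variable or None, argument patterns, predecessor ids)
# Plain strings are malspec variables; ("lit", v) is a literal argument.
MALSPEC = [
    ("sock", "socket",       "X",  [],                       []),
    ("conn", "connect",      None, ["X"],                    ["sock"]),
    ("ehlo", "send",         None, ["X", ("lit", "EHLO")],   ["conn"]),
    ("data", "send",         None, ["X", ("lit", "DATA")],   ["ehlo"]),
    ("name", "process_name", "S",  [],                       []),
    ("open", "open",         "Z",  ["S"],                    ["name"]),
    ("read", "read",         "Y",  ["Z"],                    ["open"]),
    ("body", "send",         None, ["X", "T"],               ["data", "read"]),
]

def unify(pattern, value, env):
    """Bind or check one value; return the (possibly extended) env or None."""
    if isinstance(pattern, tuple):                 # literal argument
        return env if pattern[1] == value else None
    if pattern in env:                             # dependence: same value as before
        return env if env[pattern] == value else None
    return {**env, pattern: value}

def match(trace, i=0, env=None, pos=None):
    """Backtracking search for trace events that satisfy every operation."""
    env = {} if env is None else env
    pos = {} if pos is None else pos
    if i == len(MALSPEC):
        # StringEqual(T, Base64(Y))
        return env["T"] == base64.b64encode(env["Y"]).decode("ascii")
    op_id, name, ret_var, pats, preds = MALSPEC[i]
    earliest = max((pos[p] for p in preds), default=-1) + 1
    for k in range(earliest, len(trace)):
        ev_name, ev_args, ev_ret = trace[k]
        if ev_name != name or len(ev_args) != len(pats) or k in pos.values():
            continue
        e = env
        for p, v in zip(pats, ev_args):
            e = unify(p, v, e)
            if e is None:
                break
        if e is not None and ret_var is not None:
            e = unify(ret_var, ev_ret, e)
        if e is not None and match(trace, i + 1, e, {**pos, op_id: k}):
            return True
    return False

image = b"MZ\x90\x00"
trace = [
    ("process_name", (), "worm.exe"),
    ("socket", (), 7),
    ("open", ("worm.exe",), 11),
    ("connect", (7,), None),
    ("read", (11,), image),
    ("send", (7, "EHLO"), None),
    ("send", (7, "DATA"), None),
    ("send", (7, base64.b64encode(image).decode("ascii")), None),
]
print(match(trace))  # True
```

Because only the predecessor edges constrain order, the two chains (“Send Email” and “Read Own Exe. Image”) may interleave freely in the trace, as they do above.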

Automating Malspec Creation: Malspec Mining

[Figure: a malware sample alongside several benign programs.]

Malspec Benefits

[Figure: the self-propagation-by-email malspec from the preceding slides.]

• Choice of security-sensitive operations
• Constraint-based execution order
• Dependences free of obfuscation artifacts

Expressive enough to describe even obfuscated behavior.


Malspec Detection Strategies

• Static analysis

• Dynamic analysis

• Host-based IDS

• Inline Reference Monitors

[Figure: the self-propagation-by-email malspec from the preceding slides.]

Malspecs are independent of detection method.


Detection of Malicious Behavior

[Figure: a binary file and the malspec from the preceding slides are the inputs to the malware detector.]

Goal: Find a program path that matches the malspec.
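A minimal sketch of this goal, under simplifying assumptions: the control-flow graph is a dictionary mapping each node to an optional call name and its successors, the malspec is reduced to a linear list of operations, and constraints are ignored. The function name and graph encoding are hypothetical. The search runs breadth-first over pairs (CFG node, number of operations matched so far), so loops in the graph do not cause non-termination.

```python
from collections import deque

def has_matching_path(cfg, entry, ops):
    """cfg maps node -> (call name or None, list of successor nodes).

    Returns True if some path from entry performs the calls in ops,
    in order, possibly with other nodes in between.
    """
    start = (entry, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        node, k = queue.popleft()
        call, succs = cfg[node]
        if k < len(ops) and call == ops[k]:
            k += 1                      # this node performs the next operation
        if k == len(ops):
            return True
        for s in succs:
            state = (s, k)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False

cfg = {
    "entry": (None, ["a", "b"]),
    "a":     ("socket", ["c"]),
    "b":     ("open", ["exit"]),
    "c":     ("connect", ["junk"]),
    "junk":  (None, ["junk", "d"]),     # self-loop, e.g. inserted junk code
    "d":     ("send", ["exit"]),
    "exit":  (None, []),
}
print(has_matching_path(cfg, "entry", ["socket", "connect", "send"]))  # True
print(has_matching_path(cfg, "entry", ["connect", "socket"]))          # False
```

Matching operation names alone over-approximates: a path found this way still has to satisfy the malspec's constraints before it counts as a match.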


Find A Malicious Program Path

[Figure: the malspec from the preceding slides next to the program's interprocedural control-flow graph.]


1) Match Malspec Operations

[Figure: the malspec from the preceding slides.]


2) Match Malspec Constraints

[Figure: the malspec from the preceding slides.]

Malspec Constraint: Z after open = Z before read

Program Constraint: The program fragment preserves the program expression bound to Z.

Like a semantic def-use constraint.


2) Match Malspec Constraints

Program Constraint: The program fragment preserves the program expression bound to Z.

Semantic nop with respect to E = a program fragment that preserves the expression E.
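The definition can be made concrete on a toy machine. The sketch below is an illustration only: the instruction set, the 4-bit registers, and the names `run` and `is_semantic_nop` are assumptions of this example, and brute-force enumeration stands in for a real decision procedure. A fragment is a semantic nop with respect to E if E has the same value before and after the fragment for every initial state.

```python
from itertools import product

MASK = 0xF  # 4-bit registers keep the exhaustive check tiny

def run(fragment, regs):
    """Execute a list of toy instructions on a copy of regs; return the result."""
    regs = dict(regs)
    stack = []
    for ins in fragment:
        op = ins[0]
        if op == "push":
            stack.append(regs[ins[1]])
        elif op == "pop":
            regs[ins[1]] = stack.pop()
        elif op == "add":
            regs[ins[1]] = (regs[ins[1]] + ins[2]) & MASK
        elif op == "sub":
            regs[ins[1]] = (regs[ins[1]] - ins[2]) & MASK
        elif op == "mov":
            regs[ins[1]] = regs[ins[2]]
        else:
            raise ValueError(f"unknown instruction {ins!r}")
    return regs

def is_semantic_nop(fragment, expr, reg_names=("eax", "ebx")):
    """Check that expr has the same value before and after the fragment
    for every initial register state."""
    for values in product(range(MASK + 1), repeat=len(reg_names)):
        before = dict(zip(reg_names, values))
        after = run(fragment, before)
        if expr(after) != expr(before):
            return False
    return True

# Junk code that clobbers ebx but restores eax.
junk = [("push", "eax"), ("add", "eax", 3), ("mov", "ebx", "eax"), ("pop", "eax")]

print(is_semantic_nop(junk, lambda r: r["eax"]))  # True
print(is_semantic_nop(junk, lambda r: r["ebx"]))  # False
```

Note that the same fragment is a semantic nop with respect to one expression and not another, which is why the definition is relative to E.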


2) Match Malspec Constraints

Program Constraint: The program fragment preserves the program expression bound to Z.

Need an Oracle...


Advances in Decision Procedures

Dramatic improvements in SAT solvers:
– SATO [Zhang, CADE 1997]
– GRASP [Marques-Silva & Sakallah, 1999]
– zChaff [Moskewicz et al., DAC 2001]
– BerkMin [Goldberg & Novikov, DATE 2002]

SAT-based Bounded Model Checking [Clarke et al., FMSD 2001]:
– SAT-specific speedups [Strichman, CHARME 2001]
– Richer logics [Seshia et al., DAC 2003]

A decision procedure can approximate an oracle.


Using Decision Procedures

Program Constraint: The program fragment preserves the program expression bound to Z.

[Figure: the code fragment “add esp, 0Ch; push [ebp+hMem]” is turned into a query P over esp, ebp, hMem, and memory; the decision procedure answers True/False.]


Semantics-Aware Malware Detector

[Figure: detector pipeline. Binary file → disassembler (IDA Pro) → CFG constructor → CFG → semantics-aware detector → Yes / No. Inside the detector, graph matching handles the malspec operations and constraint satisfaction handles the malspec constraints, using Simplify [Detlefs et al., “Simplify,” 2004] and UCLID [Lahiri & Seshia, CAV 2004].]


Effective Detection

With hard-coded semantic-nop patterns:

                        Commercial AV   SAFE
    Known malware       100%            100%
    Obfuscated variants 0%              100%

With decision procedures:

    Malspec source   Variants detected   # of AV signatures   # of SAMD malspecs
    Netsky.B         C,D,O,P,T,W         7                    1
    Bagle.I          J,N,O,P,R,Y         7                    1

Semantic-Nop Detection Benefits

Semantic-Nop features:

• Flow sensitivity

• Binding procedure

• Decision procedures

• Rich constraints

Obfuscation resilience:

• Code reordering

• Register renaming

• Junk code

• Code substitution


Detection Performance

Powerful decision procedures are expensive.

• Simplify theorem prover, UCLID bounded model checker: 300–800 s
• SAFE pattern matching: 1–9 s

Idea: Use expensive decision procedures only if cheap decision procedures do not provide a definitive answer.


Stack of Decision Procedures

• Simplify theorem prover
• UCLID bounded model checker
• SAFE pattern matching
• Random execution

Average cost, same decision power.

[Figure: a constraint and a program fragment are fed to the stack of procedures; the outputs are labeled Yes, No, “?”, and “No, code does not satisfy constraint!”.]
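The stacking idea can be sketched as a cascade in which each procedure answers Yes, No, or "don't know", and the next, more expensive procedure runs only on "don't know". Everything below is a toy stand-in chosen for this illustration: a "program fragment" is a Python function on 16-bit values, the constraint is "the fragment preserves its input", and the three procedures only mimic the roles of random execution, pattern matching, and a complete (expensive) checker. The cheap-first ordering is an assumption based on the idea stated on the previous slide.

```python
import random
from typing import Callable, Optional

Fragment = Callable[[int], int]                 # toy fragment on 16-bit values
Procedure = Callable[[Fragment], Optional[bool]]

def identity(x: int) -> int:
    return x

def random_execution(f: Fragment, trials: int = 32, seed: int = 0) -> Optional[bool]:
    """Cheap: can only refute. A failing run is a definitive No."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randrange(1 << 16)
        if f(x) != x:
            return False
    return None                                 # passing tests proves nothing

def pattern_matching(f: Fragment) -> Optional[bool]:
    """Cheap: can only confirm fragments it already knows (hypothetical pattern set)."""
    return True if f is identity else None

def exhaustive_check(f: Fragment) -> Optional[bool]:
    """Expensive but complete for this toy domain; always answers."""
    return all(f(x) == x for x in range(1 << 16))

STACK = [
    ("random execution", random_execution),
    ("pattern matching", pattern_matching),
    ("exhaustive check", exhaustive_check),
]

def decide(f: Fragment, stack=STACK):
    """Return (answer, name of the procedure that produced it)."""
    for name, proc in stack:
        answer = proc(f)
        if answer is not None:
            return answer, name
    raise RuntimeError("the last procedure must always answer")

print(decide(lambda x: (x + 1) & 0xFFFF))          # (False, 'random execution')
print(decide(identity))                            # (True, 'pattern matching')
print(decide(lambda x: (x ^ 0x5A5A) ^ 0x5A5A))     # (True, 'exhaustive check')
```

Only the third query pays for the expensive check, while the final answers are the same as running the complete procedure on every query, which is the sense of "average cost, same decision power".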


Performance Results

Detection times in seconds:

    Malware                       Minimum   Average   Maximum
    Netsky (B,C,D,O,P,T,W)        60.56     99.57     140.08
    Bagle (I,J,N,O,P,R,Y)         36.00     56.41     97.13
    Bagle (obfuscated variants)   74.81     140.14    186.50

Test setup: 1 GHz CPU, 1 GB RAM

Comparison:
• Commercial signature-based detector: <1 s
• Decision procedure-based detector: >300 s