Silvio Cesare Deakin University [email protected]

Simseer - A Software Similarity Web Service

Page 1: Simseer - A Software Similarity Web Service

Silvio Cesare, Deakin University

[email protected]

Page 2: Simseer - A Software Similarity Web Service

Who am I and where did this talk come from?

PhD student at Deakin University.

Research focus includes malware detection and automated vulnerability detection.

Software similarity is the focus of this talk.

This talk is an overview of the core topics, how they are approached in academia, and a web service that identifies software similarity.

Page 3: Simseer - A Software Similarity Web Service

Introduction

Many applications of software similarity and classification:

Malware Detection

Software Theft Detection

Plagiarism Detection

Software Clone Detection

Page 4: Simseer - A Software Similarity Web Service

Problem Formulation

Extract features, fingerprints, or 'birthmarks' from programs p and q.

If birthmark(p) is similar to birthmark(q), then the programs are similar.
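
The formulation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by Simseer: the birthmark here is assumed to be the sequence of instruction mnemonics taken from a textual disassembly, the similarity score comes from the standard library's difflib.SequenceMatcher, and the 0.6 threshold and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def birthmark(disassembly: str) -> list:
    """Hypothetical birthmark: the sequence of instruction mnemonics.

    Each non-empty line is assumed to hold one instruction, with the
    mnemonic as its first whitespace-separated token.
    """
    return [line.split()[0] for line in disassembly.splitlines() if line.strip()]


def similarity(a: list, b: list) -> float:
    """Similarity of two birthmarks in [0, 1], using difflib's ratio."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def are_similar(p: str, q: str, threshold: float = 0.6) -> bool:
    """Programs are called similar when their birthmarks are similar enough.

    The threshold is an arbitrary value chosen for this sketch.
    """
    return similarity(birthmark(p), birthmark(q)) >= threshold


# Two variants that differ only in operands, plus an unrelated snippet.
p = "push %ebp\nmov %esp,%ebp\nsub $0x24,%esp\ncall 4011b0\nret"
q = "push %ebp\nmov %esp,%ebp\nsub $0x18,%esp\ncall 4011c0\nret"
r = "xor %eax,%eax\ninc %eax\nnop"

print(are_similar(p, q), are_similar(p, r))
```

Because this birthmark ignores operands, p and q compare as similar even though their raw text differs. That is the point of choosing a feature that survives small changes.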

Page 5: Simseer - A Software Similarity Web Service

Software Similarity Problem

Page 6: Simseer - A Software Similarity Web Service

Taxonomy of Program Features

Raw Code

Abstract Syntax Trees

Variables

Pointers

Instructions

Basic Blocks

Procedures

API Calls

Control Flow Graphs

Call Graphs

Data Flow

Procedure Dependency Graphs

System Dependency Graphs

Object Inheritance and Dependency

Page 7: Simseer - A Software Similarity Web Service

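Program Features Examples

AST (left) and Control Flow (right)

[Figure, left: an abstract syntax tree rooted at an "if" node with three children: condition (x == 0), then (return), else (x = 1).]

[Figure, right: a control flow graph of _main made up of the following basic blocks.]

    lea 0x4(%esp),%ecx
    and $0xfffffff0,%esp
    pushl -0x4(%ecx)
    push %ebp
    mov %esp,%ebp
    push %ecx
    sub $0x24,%esp
    call 4011b0 <___main>
    movl $0x0,-0x8(%ebp)
    jmp 40115f <_main+0x2f>

    cmpl $0x9,-0x8(%ebp)
    jle 40114f <_main+0x1f>

    movl $0x4020a0,(%esp)
    call 4011b8 <_puts>
    addl $0x1,-0x8(%ebp)

    add $0x24,%esp
    pop %ecx
    pop %ebp
    lea -0x4(%ecx),%esp
    ret

[Figure: a graph of nodes labelled Proc_0, Proc_1, Proc_2, Proc_3, and Proc_4.]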

Page 8: Simseer - A Software Similarity Web Service

Taxonomy of Features in Program Binaries

Headers

Object Code

Symbols

Debugging Information

Relocations

Dynamic Linking Information

Page 9: Simseer - A Software Similarity Web Service

Program Transformations

Compiler Optimisation and Recompilation

Program Obfuscation

Plagiarism, Software Theft, and Derivative Works

Malware packing, polymorphism and metamorphism

Page 10: Simseer - A Software Similarity Web Service

Traditional Malware Packing

[Figure: three stages, left to right, joined by a "Packing" step and a "Runtime" step.]

Original Executable: Original Code

(Packing)

Packed Executable: Restoration Routine; Hidden Code = f(Original Code)

(Runtime)

Memory Image at Runtime: Remnant Restoration Routine; Original Code = g(Hidden Code)
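
The relationship between f and g in the packing diagram can be shown with a toy example. The sketch below assumes a single-byte XOR as the hiding function, chosen only because it is easy to read; real packers typically use compression or encryption, and the key value and the sample bytes are made up for illustration.

```python
def f(original_code: bytes, key: int) -> bytes:
    """Packing step: hide the code by XOR-ing every byte with a one-byte key."""
    return bytes(b ^ key for b in original_code)


def g(hidden_code: bytes, key: int) -> bytes:
    """Restoration step run at load time: XOR is its own inverse."""
    return bytes(b ^ key for b in hidden_code)


KEY = 0x5A  # hypothetical key, must be in the range 0..255

# push %ebp; mov %esp,%ebp; sub $0x24,%esp
original = bytes.fromhex("5589e583ec24")

hidden = f(original, KEY)   # what is stored in the packed executable
restored = g(hidden, KEY)   # what appears in memory at runtime

print(hidden.hex(), restored == original)
```

The stored bytes no longer resemble the original code, which is why a similarity service has to unpack a sample before extracting features from it.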

Page 11: Simseer - A Software Similarity Web Service

Processing Program Features

Treat features or birthmark as a mathematical object.

Strings

Vectors

Sets

Sets of Vectors

Trees

Graphs

Page 12: Simseer - A Software Similarity Web Service

Software Birthmark Similarity

Strings: edit distance, etc.

Vectors: cosine similarity, Euclidean distance, etc.

Set similarity: Jaccard distance, etc.

Set of vectors similarity: minimum matching distance

Trees and graphs: edit distances, etc.
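
Several of the measures listed above are short enough to write out. The following is a minimal sketch using only the Python standard library; the function names are chosen for this example, and the minimum matching distance and the tree and graph edit distances are left out because they need considerably more machinery.

```python
import math


def edit_distance(s, t):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector is all zeros (a convention for this sketch).
    """
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.dist(u, v)


def jaccard_distance(a, b):
    """1 minus the Jaccard index of two sets; 0.0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


print(edit_distance("kitten", "sitting"))        # 3
print(euclidean_distance([0, 0], [3, 4]))        # 5.0
print(jaccard_distance({1, 2, 3}, {2, 3, 4}))    # 0.5
```

Note that cosine similarity is a similarity (higher means closer) while the other three are distances (lower means closer), so they cannot be swapped without adjusting the comparison.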

Page 13: Simseer - A Software Similarity Web Service

Software Indexing and Searching

The nearest neighbour is the closest program in the database to the query.

Based on 'distance' – a measure of dissimilarity between objects.

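Distances that are 'metric' can be indexed and searched more efficiently, because properties such as the triangle inequality let a search rule out candidates without computing every distance.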
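
One way to see why a metric helps is pivot-based pruning. The sketch below is a simplified illustration, not the index used by any particular system: it assumes a single pivot (the first item), stores each item's distance to that pivot, and uses the triangle inequality bound |d(q, pivot) - d(pivot, x)| <= d(q, x) to skip items that cannot beat the best match found so far. The class name PivotIndex is hypothetical.

```python
class PivotIndex:
    """Toy nearest-neighbour index for any metric distance function."""

    def __init__(self, items, distance):
        self.items = list(items)
        if not self.items:
            raise ValueError("the index needs at least one item")
        self.distance = distance
        self.pivot = self.items[0]
        # Computed once, when the index is built.
        self.to_pivot = [distance(self.pivot, x) for x in self.items]

    def nearest(self, query):
        """Return (item, distance, number of distance evaluations)."""
        d_qp = self.distance(query, self.pivot)
        best_item, best_dist = self.pivot, d_qp
        evaluated = 1
        for item, d_px in zip(self.items[1:], self.to_pivot[1:]):
            # Lower bound on distance(query, item) from the triangle inequality.
            if abs(d_qp - d_px) >= best_dist:
                continue  # cannot be strictly closer than the current best
            d = self.distance(query, item)
            evaluated += 1
            if d < best_dist:
                best_item, best_dist = item, d
        return best_item, best_dist, evaluated


# Numbers with absolute difference stand in for programs and a metric.
index = PivotIndex([0, 10, 20, 30, 100], lambda a, b: abs(a - b))
print(index.nearest(12))
```

In this run the search evaluates the distance function twice instead of five times. The pruning is only valid when the distance really is a metric. With a non-metric measure the lower bound can be wrong and the true nearest neighbour may be skipped.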

Page 14: Simseer - A Software Similarity Web Service

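rNN (Range Nearest Neighbour)

[Figure: a query q is compared against a known malware sample p using distance(p, q). A query that falls within radius r of the malware is labelled malicious; a query outside that radius is labelled benign.]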
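
The range query in the figure can be sketched as a linear scan. This is an assumption-laden illustration: birthmarks are modelled as sets of feature names, the distance is the size of the symmetric difference, the boundary case distance == r is treated as inside the range, and the database contents and the radius are made up.

```python
def set_distance(a, b):
    """Number of features present in exactly one of the two sets."""
    return len(a ^ b)


def range_query(query, database, distance, r):
    """Return every database entry p with distance(p, query) <= r."""
    return [p for p in database if distance(p, query) <= r]


def classify(query, malware_db, distance, r):
    """Label a query malicious if any known malware lies within radius r."""
    return "malicious" if range_query(query, malware_db, distance, r) else "benign"


# Hypothetical malware birthmarks, each a set of feature names.
malware_db = [{"a", "b", "c"}, {"x", "y"}]

print(classify({"a", "b", "d"}, malware_db, set_distance, r=2))  # close to the first entry
print(classify({"m", "n"}, malware_db, set_distance, r=2))       # far from both entries
```

The radius r trades false positives against false negatives: a larger r catches more variants of known malware but also pulls in more unrelated programs.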

Page 15: Simseer - A Software Similarity Web Service

Wiki on Software Similarity and Classification

Book on Software Similarity and Classification

Simseer – A Software Similarity Web Service

Page 16: Simseer - A Software Similarity Web Service

Wiki on Software Similarity and Classification

Reviews of academic papers.

http://www.foocodechu.com/wiki

Page 17: Simseer - A Software Similarity Web Service

Book on ‘Software Similarity and Classification’

Academic-style survey of the topic.

Published by Springer.

100 pages.

Available in April.

http://www.springer.com/computer/security+and+cryptology/book/978-1-4471-2908-0

Page 18: Simseer - A Software Similarity Web Service

Simseer – A Software Similarity Web Service

An online service to identify similarity between programs.

Performs unpacking.

Renders an evolutionary tree to show program relationships.

Free to use!

http://www.foocodechu.com/?q=simseer-a-software-similarity-web-service

Page 19: Simseer - A Software Similarity Web Service
Page 20: Simseer - A Software Similarity Web Service
Page 21: Simseer - A Software Similarity Web Service

Conclusion

Presented a review of software similarity.

Demonstrated a new web service.

Try it!

http://www.foocodechu.com

Questions?