
Ariu - Workshop on Artificial Intelligence and Security - 2011

Page 1: Ariu - Workshop on Artificial Intelligence and Security - 2011

Pattern Recognition and Applications Group (PRA)
Department of Electrical and Electronic Engineering
University of Cagliari, Italy

Machine Learning in Computer Forensics (and the Lessons Learned from Machine Learning in Computer Security)

D. Ariu, G. Giacinto, F. Roli

AISEC: 4th Workshop on Artificial Intelligence and Security

Chicago, October 21, 2011

Page 2: Ariu - Workshop on Artificial Intelligence and Security - 2011

What can be analyzed… (during an investigation)

October 21 - 2011 Davide Ariu - AISEC 2011 2

Page 3: Ariu - Workshop on Artificial Intelligence and Security - 2011

Role of Computer Forensics (with respect to Computer Security)

Cyber Attack (or Crime) Progress:

• Prevention: Security

• Detection: Security, (live) Forensics

• Truth Assessment: Forensics

Page 4: Ariu - Workshop on Artificial Intelligence and Security - 2011

Goals

• To provide a small snapshot of ML research applied to Computer Forensics

• To clarify the ML approach to Computer Forensics

Page 5: Ariu - Workshop on Artificial Intelligence and Security - 2011

Historical Perspective

Computer Security

• Early ’70s – The first Computer Security research papers appear

• 1988 – The first known Internet-wide attack occurs (the “Morris Worm”)

• Early 2000s – Slammer and its friends in the wild: the resulting security issues reach TV channels and newspapers

Computer Forensics

• 1984 – The FBI Laboratory begins developing programs to examine computer evidence

• 1993 – International Law Enforcement Conference on Computer Evidence

• 1999-2007 – Computer Forensics “Golden Age” [Garfinkel, 2010]

Page 6: Ariu - Workshop on Artificial Intelligence and Security - 2011

Computer Security Research

• Strong Research Community

– Research groups and centers exist (almost) worldwide

• Well defined main research directions

– Malware and Botnet analysis and detection

– Web Applications Security

– Intrusion Detection

– Cloud Computing

• Well defined methodologies

– Research results can have an immediate practical impact

Page 7: Ariu - Workshop on Artificial Intelligence and Security - 2011

Computer Forensics Research

• Not a particularly strong research community (at least in terms of results achieved)

– Mostly people with a computer security background (like me)

• Research directions are not well defined

• Approaches and methods are not well defined

– Digital forensics research results are difficult to reproduce [Garfinkel, 2009]

Page 8: Ariu - Workshop on Artificial Intelligence and Security - 2011

How can machine learning be useful in Computer Forensics?

• “Machine Learning methods are the best methods in applications that are too complex for people to manually design the algorithm” [Mitchell, 2006]

• Reasoning is a fundamental step of the investigation

– Computer forensics is conceptually different from Intrusion Detection

• The huge mass of data to be analyzed (TB scale) makes intelligent analysis methods necessary

– There are also situations where there is no time for an in-depth analysis (e.g. Battlefield Forensics)

Page 9: Ariu - Workshop on Artificial Intelligence and Security - 2011

ML applications to CF

• Machine Learning techniques have been proposed for several Computer Forensics applications

– Textual Documents and E-mail forensics

– Network Forensics

– Events and System Data Analysis

– Automatic file (fragment) classification
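
To make the last item concrete, here is a minimal sketch of what automatic file fragment classification can look like. It is an illustration only and not a method taken from the works surveyed in this talk. It assumes byte-frequency histograms as features and a nearest-centroid rule as the classifier. The function names, the two classes and the toy training data are all hypothetical.

```python
import math
from collections import Counter


def byte_histogram(fragment: bytes) -> list[float]:
    """Normalized frequency of each of the 256 possible byte values."""
    counts = Counter(fragment)
    total = len(fragment) or 1  # avoid division by zero on empty fragments
    return [counts.get(value, 0) / total for value in range(256)]


def centroid(histograms: list[list[float]]) -> list[float]:
    """Component-wise mean of a list of histograms."""
    n = len(histograms)
    return [sum(h[i] for h in histograms) / n for i in range(256)]


def train(labelled: dict[str, list[bytes]]) -> dict[str, list[float]]:
    """Build one centroid per class from labelled example fragments."""
    return {
        label: centroid([byte_histogram(f) for f in fragments])
        for label, fragments in labelled.items()
    }


def classify(model: dict[str, list[float]], fragment: bytes) -> str:
    """Return the class whose centroid is closest (Euclidean distance)."""
    features = byte_histogram(fragment)
    return min(model, key=lambda label: math.dist(features, model[label]))


# Toy training set (hypothetical): ASCII text versus uniformly spread bytes.
model = train({
    "text": [
        b"the quick brown fox jumps over the lazy dog ",
        b"machine learning in computer forensics ",
    ],
    "binary": [bytes(range(256)), bytes(reversed(range(256)))],
})

label_for_text = classify(model, b"forensic analysis of a seized laptop ")
label_for_binary = classify(model, bytes(range(0, 256, 2)))
```

A realistic evaluation would need far more data and more file types than this toy example, which is exactly the concern raised about datasets on the next slide.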

Page 10: Ariu - Workshop on Artificial Intelligence and Security - 2011

Computer Forensics Research Drawbacks

• The reported experimental results are not completely convincing…

– Network forensics solutions evaluated on the DARPA dataset only

– Email forensics algorithms evaluated on a corpus of 156 emails (from 3 different authors)

– Automatic file classification algorithms evaluated on a 500 MB dataset (best case…)

• In addition, the approach adopted was the same as the one adopted in Computer Security…

Page 11: Ariu - Workshop on Artificial Intelligence and Security - 2011

How to improve existing tools?

• Useful solutions can be developed only if the focus is:

– On the investigator and on the knowledge of the case that he or she has

– On the organization and categorization of the information provided to the investigator

• Data sorting and categorization

• Prioritisation of results [Garfinkel, 2010; Beebe, 2009]

Page 12: Ariu - Workshop on Artificial Intelligence and Security - 2011

Putting knowledge into the tool…

• Computer Security tools (e.g. IDS) are based on well-defined criteria that are used to detect attacks

• In other contexts, where it is difficult to explicitly define a search criterion, the feedback provided by the user is exploited to achieve more accurate results

– E.g. Content-based Image Retrieval with relevance feedback [Zhouand,2003]

• This can definitely be the case for Computer Forensics applications
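
The relevance feedback idea mentioned above can be sketched as follows. This is an assumption-laden illustration and not the method of the cited work. It uses the classic Rocchio update from text retrieval, with cosine similarity for ranking. The item names, the two-dimensional feature vectors and the weights alpha, beta and gamma are hypothetical.

```python
import math


def mean_vector(vectors: list[list[float]], dim: int) -> list[float]:
    """Component-wise mean; a zero vector if no examples were marked."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query towards items marked relevant and away from the rest."""
    dim = len(query)
    pos = mean_vector(relevant, dim)
    neg = mean_vector(non_relevant, dim)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 when either vector has zero length."""
    norm = math.hypot(*a) * math.hypot(*b)
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def rank(query: list[float], items: dict[str, list[float]]) -> list[str]:
    """Item names sorted from most to least similar to the query."""
    return sorted(items, key=lambda name: cosine(query, items[name]),
                  reverse=True)


# Hypothetical feature vectors for three items found during an investigation.
items = {"doc_a": [1.0, 0.1], "doc_b": [0.1, 1.0], "doc_c": [0.7, 0.7]}

initial_query = [1.0, 0.0]
before = rank(initial_query, items)

# The investigator marks doc_b as relevant and doc_a as not relevant.
updated_query = rocchio_update(initial_query,
                               relevant=[items["doc_b"]],
                               non_relevant=[items["doc_a"]])
after = rank(updated_query, items)
```

The point of the sketch is that the search criterion is never written down by hand: it is refined from the investigator's judgments on the results already shown.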

Page 13: Ariu - Workshop on Artificial Intelligence and Security - 2011

Organizing data and results

• Sifting through the huge mass of data is an extremely time-consuming task for investigators

– E.g. Filtering the results obtained after file carving

– E.g. Inspecting all the pictures found on a laptop

• A tool can definitely be useful even if it is only able to sort results and contents according to a relevance criterion (most relevant first)

– The tool only assigns “scores”; the analyst still inspects the results
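
A minimal sketch of such a score-and-sort tool is shown below. It is an illustration under stated assumptions, not a tool described in the talk. The relevance score is assumed to be the fraction of words in an item that match keywords supplied by the investigator. The file names, texts and keywords are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str  # e.g. the name given to a carved file
    text: str  # textual content extracted from it


def relevance_score(item: Item, keywords: set[str]) -> float:
    """Fraction of the item's words that match a case keyword."""
    words = item.text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in keywords)
    return hits / len(words)


def rank_items(items: list[Item],
               case_keywords: set[str]) -> list[tuple[float, str]]:
    """Return (score, name) pairs, most relevant first."""
    keywords = {k.lower() for k in case_keywords}
    scored = [(relevance_score(item, keywords), item.name) for item in items]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical results of a file carving run.
items = [
    Item("carved_001.txt", "invoice for wire transfer to offshore account"),
    Item("carved_002.txt", "holiday pictures from the beach"),
    Item("carved_003.txt", "transfer confirmed"),
]

ranked = rank_items(items, {"transfer", "offshore", "account"})
ranked_names = [name for _, name in ranked]
```

The tool makes no decision here. It only orders the items, and every item stays available for the analyst to inspect.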

Page 14: Ariu - Workshop on Artificial Intelligence and Security - 2011

To summarize..

• We investigated the problem of applying ML to Computer Forensics

• We provided a short overview of the literature related to ML applications in Computer Forensics

• We proposed several guidelines to profitably apply machine learning to Computer Forensics

Page 15: Ariu - Workshop on Artificial Intelligence and Security - 2011

Questions or Comments?

Thank you for your attention!

[email protected]
