Pattern Recognition and Applications Group Department of Electrical and Electronic Engineering University of Cagliari, Italy
Machine Learning in Computer Forensics (and the Lessons Learned from Machine Learning in Computer Security)
D. Ariu G. Giacinto F. Roli
AISEC
4th Workshop on Artificial Intelligence and Security
Chicago – October 21, 2011
What can be analyzed… (during an investigation)
October 21 - 2011 Davide Ariu - AISEC 2011 2
Role of Computer Forensics (with respect to Computer Security)
[Figure: Prevention (Security), Detection (Security), (live) Forensics, and Truth Assessment (Forensics), placed along the cyber attack (or crime) progress]
Goals
• To provide a small snapshot of ML research
applied to Computer Forensics
• To clarify the ML approach to Computer Forensics
Historical Perspective
Computer Security
• Early ’70s – The first Computer Security research papers appear
• 1988 – The first known internet-wide attack occurs (the “Morris Worm”)
• Early 2000s – Slammer and its friends are in the wild: the consequent security issues reach TV channels and newspapers
Computer Forensics
• 1984 – The FBI Laboratory begins developing programs to examine computer evidence
• 1993 – International Law Enforcement Conference on Computer Evidence
• 1999-2007 – Computer Forensics “Golden Age” [Garfinkel, 2010]
Computer Security Research
• Strong Research Community
– Research groups and centers exist (almost) worldwide
• Well defined main research directions
– Malware and Botnet analysis and detection
– Web Applications Security
– Intrusion Detection
– Cloud Computing
• Well defined methodologies
– Research results can have an immediate practical impact
Computer Forensics Research
• The research community is not particularly strong (at least in terms of results achieved)
– Mostly people with a computer security background (like me)
• Research directions are not well defined
• Approaches and methods are not well defined
– Digital forensics research results are difficult to reproduce [Garfinkel, 2009]
How can machine learning be useful in Computer Forensics?
• “Machine Learning methods are the best
methods in applications that are too complex for
people to manually design the algorithm” [Mitchell,2006]
• The “reasoning” is a fundamental step during the
investigation
– Computer forensics is conceptually different from Intrusion Detection
• The huge mass of data to be analyzed (TB scale) makes intelligent analysis methods necessary
– Situations also exist where there is no time for an in-depth analysis (e.g. Battlefield Forensics)
ML applications to CF
• Machine Learning techniques have been proposed for several Computer Forensics applications
– Textual Documents and E-mail forensics
– Network Forensics
– Events and System Data Analysis
– Automatic file (fragment) classification
Computer Forensics Research Drawbacks
• The experimental results reported are not completely convincing…
– Network forensics solutions evaluated on the DARPA dataset only
– Email forensics algorithms evaluated on a corpus of 156 emails (and 3 different authors)
– Automatic file classification algorithms evaluated on a 500 MB dataset (best case…)
• In addition, the approach adopted was the same as the one adopted in Computer Security…
How to improve existing tools?
• Useful solutions can be developed only if the focus is:
– On the investigator and on their knowledge of the case
– On the organization and categorization of the information provided to the investigator
• Data sorting and categorization
• Prioritisation of results [Garfinkel, 2010; Beebe, 2009]
Putting knowledge into the tool…
• Computer Security tools (e.g. IDS) are based on well-defined criteria that are used to detect attacks
• In other contexts, where it is difficult to explicitly define a search criterion, the feedback provided by the user is exploited to achieve more accurate results
– E.g. Content-based Image Retrieval with relevance feedback [Zhouand,2003]
• This can definitely be the case for Computer Forensics applications.
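A minimal sketch of how user feedback can refine a search criterion. It uses a Rocchio-style update, a classical relevance feedback scheme from information retrieval, chosen here only for illustration; it is not taken from the slides or from the cited work. The function names, the feature vectors, and the weights alpha, beta and gamma are assumptions.

```python
def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward items the investigator marked as
    relevant and away from items marked as non-relevant.

    All vectors are plain lists of floats with the same length.
    The default weights are illustrative assumptions.
    """
    dim = len(query)

    def centroid(vectors):
        # Mean vector of a group; a zero vector if the group is empty.
        if not vectors:
            return [0.0] * dim
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel_c, non_c)]


def score(query, item):
    """Dot product between the current query and an item's feature vector."""
    return sum(q * x for q, x in zip(query, item))
```

After each round of feedback, the items can be re-scored with the updated query, so the criterion is shaped by the investigator rather than fixed in advance.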
Organizing data and results
• Sifting through the huge mass of data is an extremely time-consuming task for investigators
– E.g. Filtering the results obtained after file carving
– E.g. Inspecting all the pictures found in a laptop
• A tool can definitely be useful even if it is only able to sort results and contents according to a relevance criterion (most relevant first)
– The tool only assigns “scores”; the analyst will inspect them.
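The "scores, most relevant first" idea can be sketched as follows. This is only an illustration under stated assumptions: the `Item` type, the keyword-overlap score, and the notion of "case terms" supplied by the investigator are hypothetical and not part of the original slides.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A recovered artifact (hypothetical type), e.g. a carved file."""
    name: str
    text: str


def relevance_score(item, case_terms):
    """Fraction of the case terms that occur in the item's text.

    A deliberately simple, assumed scoring rule; any model producing
    a number per item could take its place.
    """
    if not case_terms:
        return 0.0
    words = set(item.text.lower().split())
    hits = sum(1 for term in case_terms if term.lower() in words)
    return hits / len(case_terms)


def rank_items(items, case_terms):
    """Return (score, item) pairs sorted with the most relevant first.

    The tool only orders the items; nothing is filtered out, so the
    analyst can still inspect every result.
    """
    scored = [(relevance_score(it, case_terms), it) for it in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

Calling `rank_items(items, ["invoice", "transfer"])` would place items mentioning both terms ahead of those mentioning one or none.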
To summarize..
• We investigated the problem of applying ML to
Computer Forensics
• We provided a short overview of the literature
related to ML applications in Computer Forensics
• We proposed several guidelines to profitably
apply machine learning to Computer Forensics
Questions or Comments?
Thank you for your attention!