23
2009/9/15 1 Rishi : Identify Bot Contaminated Hosts By IRC Nickname Evaluation Reporter : Fong-Ruei , Li Machine Learning and Bioinformatics Lab In Proceedings of USENIX Workshop on Hot Topics in Understanding Botnets (HotBots), 2007

2009/9/151 Rishi : Identify Bot Contaminated Hosts By IRC Nickname Evaluation Reporter : Fong-Ruei, Li Machine Learning and Bioinformatics Lab In Proceedings

Embed Size (px)

Citation preview

12009/9/15

Rishi : Identify Bot Contaminated Hosts By IRC Nickname Evaluation

Reporter : Fong-Ruei , Li

Machine Learning and Bioinformatics Lab

In Proceedings of USENIX Workshop on Hot Topics in Understanding Botnets (HotBots), 2007

2

Outline

Introduction Background Communication Channel Detection Results and Evaluation Conclusion

2009/9/15 Machine Learning and Bioinformatics Lab

3

Introduction

Currently stop a given botnet is to disable the

communication channel for the bots However

the hosts stay infected and are in most cases still backdoored, allowing an attacker to reclaim the machine at any time.

2009/9/15 Machine Learning and Bioinformatics Lab

4

Background

Internet Relay Chat(IRC) Each of the different servers hosts a

number of different chat rooms called channels

Every user connected to an IRC server has its own unique username called nickname

2009/9/15 Machine Learning and Bioinformatics Lab

5

Background

BotMaster communicate with the botnet is to use

IRC Bots

join a specific channel on a public or private IRC server

to receive further instructions

2009/9/15 Machine Learning and Bioinformatics Lab

6

Communication Channel Detection

All bots have one characteristic in common: they need a communication channel

Our approach focuses on detecting the communication channel

between the bot and the botnet controller it is possible to detect a bot even before it

performs any malicious actions

2009/9/15 Machine Learning and Bioinformatics Lab

7

Project Rishi

Every captured packet extracts : Time of suspicious connection IP address and port of suspected source

host IP address and port of destination IRC

server Channels joined Utilized nickname

2009/9/15 Machine Learning and Bioinformatics Lab

8

Network setup of Rishi

2009/9/15 Machine Learning and Bioinformatics Lab

9

Basic Concept - Rishi

2009/9/15 Machine Learning and Bioinformatics Lab

10

Scoring Function

Checks for the occurrence of several criteria : suspicious substrings

the name of a bot (e.g., RBOT or l33t-) special characters

like [ , ] , and | long numbers.

nickname consists of many digits: for each two consecutive digits

2009/9/15 Machine Learning and Bioinformatics Lab

1 point

11

Scoring Function

True signs for an infected host raise the final score by more than one point a match with one of the regular

expressions a connection to a blacklisted server the use of a blacklisted nickname

2009/9/15 Machine Learning and Bioinformatics Lab

> 1 points

12

Regular Expression

Each nickname is tested against several regular expressions which match known bot names

For example the following expression: \[[0-9]\|[0-9]{4,} like [0|1234] like |1234

2009/9/15 Machine Learning and Bioinformatics Lab

10 points

13

Whitelisting

The software utilizes : hard coded whitelist dynamic whitelist

Each nickname, which receives zero points is added to the dynamic whitelist

2009/9/15 Machine Learning and Bioinformatics Lab

14

Blacklisting

Two blacklists: the first blacklist is hard coded

in the configuration file the second one is a dynamic list

with nicknames added to it automatically according to the final score

2009/9/15 Machine Learning and Bioinformatics Lab

15

Example

Imagine that the nickname RBOT|DEU|XP-1234 was added to the

blacklist The next captured nickname

RBOT|CHN|XP-5678

2009/9/15 Machine Learning and Bioinformatics Lab

1 point each due to the suspicious substrings RBOT,CHN, and XP

1 points each due to the two occurrences of the special character |

1 point each due to two occurrences of consecutive digits

7points 10 points for more than 50% congruence with a

name stored on the dynamic blacklist

17points

16

Example

1 point each due to the suspicious substrings RBOT,CHN, and XP

1 points each due to the two occurrences of the special character |

1 point each due to two occurrences of consecutive digits

2009/9/15 Machine Learning and Bioinformatics Lab

7points

17points

10 points for more than 50% congruence with a name stored on the dynamic blacklist

17

Results and Evaluation

RWTH Aachen university 30,000 computer users to support Rishi runs on a Quad-CPU Intel Xeon

3,2Ghz system with 3GB of memory installed

we are monitoring a 10 GBit network

2009/9/15 Machine Learning and Bioinformatics Lab

18

Results and Evaluation

2009/9/15 Machine Learning and Bioinformatics Lab

19

Results and Evaluation

2009/9/15 Machine Learning and Bioinformatics Lab

20

Results and Evaluation

2009/9/15 Machine Learning and Bioinformatics Lab

21

Conclusion

Based on characteristics of the communication channel observe protocol messages use n-gram analysis together with a

scoring function black-/whitelists

2009/9/15 Machine Learning and Bioinformatics Lab

22

Bot Nicknames

2009/9/15 Machine Learning and Bioinformatics Lab

23

Thank you for listening

2009/9/15

The end

Machine Learning and Bioinformatics Lab