Life Cycle And Detection Of Bot Infections Through Network Traffic Analysis

Taming Botnets

Life cycle and detection of bot infections through network traffic analysis

agenda

● Introduction
● Bots and botnets: short walk-through
● Taming botnets: Detection and Evasion
● Our approach
● Case studies
● Conclusion
● Disclaimer: we steal our images from Google Images :)

Introduction

● Why are we doing this research?
● Objectives
● Our data sources
● Our environment: a bunch of code in node.js and Python, a customized sandboxing platform (Cuckoo-based), and data indexed in Solr

Introduction: bots

● “bot”: a software program installed on target machine(s) in order to use that machine's computational/network resources or to collect information

● A typical bot is controlled by an external party and therefore needs a communication channel to receive commands and pass information back

● Bots typically are used for malicious purposes ;-)

Introduction: bots (lifecycle)

● Installation (infection) phase: often by means of a software exploit or a social engineering technique (fake antivirus, fake software update)

● Post-infection phase: communication (C&C, peer etc)

Introduction

● Our basic assumption is that a bot needs to be able to communicate back in order to be useful.

● Our analysis is primarily “blackbox” by observing network traffic of a large network infrastructure in order to identify possible infections and “communication” links

● We also utilize sandboxing techniques to observe behavior (mainly from the network side)

● We do not attempt to reverse engineer (manually or automatically) botnet software

Botnets

● Infection vectors → often target end-user machines (clients) in large numbers by exploiting a software vulnerability in the browser or related components

● C&C communication:
  ● Remember IRC bots? :)
  ● Over HTTP (most common)
  ● Proprietary protocol
  ● Centralized or P2P infrastructure

Botnets: lifecycle

● C&C hosting itself is another interesting research area ;-)

So how do you get bots on your machine? :)

How do you get bots on your machine? ;-)

● Compromised servers: most widespread, often through silly vulns (e.g. WordPress!), but high-profile web sites are also affected, or domains get taken over (DNS poisoning and more)

● Placing a JavaScript iframe on a compromised high-traffic machine is way more profitable than defacing (hacktivism is only for hippies? ;)

How do you get bots (pt 2)

● SEO poisoning/manipulation.

How do you get bots (pt 3)

● Advertisements and malvertisements: whole new ecosystem:

OpenX is a huge security hole ;)

Anyways

● Once infected, the bot talks back...

Let's look at some real-life cases (the data is very recent, mostly from the past few months).

Old-school bots (still active. For real! ;-))

May/2012: IRC bots still real :-D

Carberp

● Bot Infection: Drive-By-HTTP

● Payload and intermediate malware domains: normal, just registered/DynDNS

● Distributed via: Many many compromised web-sites, top score > 100 compromised resources detected during 1 week.

● C&C domains usually generated, but some special cases below ;-).

● C&C and Malware domains located on the same AS (from bot point of view). Easy to detect.

● Typical bot activity: Mass HTTP Post

Domain | URL | Referrer | Payload | Size
beatshine.is-saved.org | /g/18418362672595167.js | www.*****press.ru | javascript | 9414
activatedreplacing.is-very-evil.org | /index.php?28d9000e56c2a63080ff89c6f5357591 | www.*****press.ru | html | 45443
- | //images/r/785cee8be7f1da9a9d60820cbf8b1840.jar | - | application/x-jar | 4135
- | /server_privileges.php?91370f5f009a815950578cb539f28b58=3 | - | application/executable | 155529

Activity and update

Another attack attempt and update URLs

Time | Domain | URL | IP
10/Apr/2012:10:29:09 | nod32-matrosov-pideri.org | //images/785cee8be7f1da9a9d60820cbf8b1840.jar | 62.122.79.42
10/Apr/2012:10:29:10 | nod32-matrosov-pideri.org | /expl0it/At00micArray.class | 62.122.79.42
10/Apr/2012:10:29:11 | nod32-matrosov-pideri.org | /expl0it/At00micArray/class.class | 62.122.79.42
02/May/2012:08:42:59 | rgn7er8yafh89cehuighv.org | /bxlkizmfgtlfwcdmljmrjlunqkvsslfiru.tpl | 91.228.134.210
02/May/2012:08:42:59 | avast-pidersiy-gandon.com | /crypt/files/crypted/config.bin | 62.122.79.52
02/May/2012:08:43:00 | rgn7er8yafh89cehuighv.org | /aDHfNt8w43yYGM.tiff | 91.228.134.210

Detection during infection and by postinfection activity

● Infection: executable transfer from just-registered domains (example: lifenews-sport.org) or DynDNS domains (like uphchtxmji.homelinux.com)

● Updates: executable transfer from just registered or DynDNS domain

● Postinfection activity: Mass HTTP Post to generated domains like n87e0wfoghoucjfe0id.org, URL ends with different extensions
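The infection and update indicators above boil down to "an executable was served from a DynDNS host". A minimal sketch of that check, assuming proxy records are already parsed into a host and a content type. The suffix list only contains the two DynDNS examples named in this talk, and the function name is made up for illustration:

```python
# Suffixes taken from the DynDNS examples in this talk; a real list would be longer.
DYNDNS_SUFFIXES = (".homelinux.com", ".ddns.us")

# Content types seen in the case studies for payloads and updates.
EXECUTABLE_TYPES = {"application/executable", "application/octet-stream"}


def is_suspicious_download(host, content_type):
    """Return True when an executable payload is served from a DynDNS host."""
    host = host.lower().rstrip(".")
    is_executable = content_type.lower() in EXECUTABLE_TYPES
    on_dyndns = host.endswith(DYNDNS_SUFFIXES)  # str.endswith accepts a tuple
    return is_executable and on_dyndns


print(is_suspicious_download("uphchtxmji.homelinux.com", "application/executable"))
```

Just-registered domains need a WHOIS lookup instead of a suffix match, so they are not covered by this sketch.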

Netprotocol.exe

● Bot infection: was Drive-By-FTP; now Drive-By-FTP and Drive-By-HTTP

● Payload and intermediate malware domains: normal, obfuscated

● Distributed via: compromised web-sites

● C&C domains usually generated, many domains in .be zone.

● C&C and malware domains are located on different ASes. The bot updates its payload via HTTP

● Typical bot activity: HTTP Post, payload updates via HTTP.

Domain | URL | Referrer | Payload | Size
3645455029 | /1/s.html | Infected site | html | 997
Java.com | /js/deployJava.js | 3645455029 | javascript | 4923
3645455029 | /1/exp.jar | - | application/x-jar | 18046
3645455029 | /file1.dat | - | application/executable | 138352

Attack analysis

- Script from www.java.com used during the attack.

- Applet exp.jar loaded via FTP.

- FTP server IP address obfuscated to avoid detection.
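The "obfuscated" server name is just the IPv4 address written as one decimal integer. A small sketch of decoding it with Python's standard ipaddress module (the helper name is ours, not part of any tool from the talk):

```python
import ipaddress


def decode_integer_host(host):
    """Return the dotted-quad form if host is a bare decimal integer, else None."""
    if not (host.isascii() and host.isdigit()):
        return None
    value = int(host)
    if value > 0xFFFFFFFF:  # does not fit in 32 bits, so not an IPv4 address
        return None
    return str(ipaddress.IPv4Address(value))


print(decode_integer_host("3645455029"))  # 217.73.58.181
```

A URL whose host part is a bare integer is rare in normal browsing, so the mere presence of such a host may already be worth flagging.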

Interesting modifications

GET http://java.com/ru/download/windows_ie.jsp?host=java.com%26returnPage=ftp://217.73.58.181/1/s.html%26locale=ru HTTP/1.1

Key feature example

Date/Time 2012-04-20 11:11:49 MSD

Tag Name FTP_Pass

Target IP Address 217.73.63.202

Target Object Name 21

:password Java1.6.0_30@

:user anonymous
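The key feature here is the anonymous FTP password, which carries the Java version. A hedged sketch of a matcher for that pattern; the regular expression is inferred from the two examples in this talk (Java1.6.0_30@ and Java1.6.0_29@), not from any Java specification:

```python
import re

# "Java" + dotted version + optional "_update" + "@", as in the observed passwords.
JAVA_FTP_PASSWORD = re.compile(r"^Java(\d+(?:\.\d+)*(?:_\d+)?)@$")


def java_version_from_ftp_password(password):
    """Return the Java version embedded in an FTP password, or None if absent."""
    match = JAVA_FTP_PASSWORD.match(password)
    return match.group(1) if match else None


print(java_version_from_ftp_password("Java1.6.0_30@"))  # 1.6.0_30
```

Combined with ":user anonymous" and a .jar or .dat download, a match suggests that a Java applet, not a human, opened the FTP session.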


Activity example

Date/Time 2012-04-29 02:05:48 MSD
Tag Name HTTP_Post
Target IP Address 217.73.60.107
:server rugtif.be
:URL /check_system.php
Domain registered: 2012-04-21

Date/Time 2012-04-29 02:06:08 MSD
Tag Name HTTP_Post
Target IP Address 208.73.210.29
:server eksyghskgsbakrys.com
:URL /check_system.php
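In the first event the C&C domain was registered only days before the bot posted to it. A minimal sketch of that "domain age at first contact" check; the 30-day threshold is an assumption chosen for illustration, not a value from our tooling:

```python
from datetime import date


def domain_age_days(registered, first_seen):
    """Days between WHOIS registration and the first observed contact."""
    return (first_seen - registered).days


def is_newly_registered(registered, first_seen, max_age_days=30):
    """True when the domain was contacted within max_age_days of registration."""
    return 0 <= domain_age_days(registered, first_seen) <= max_age_days


# rugtif.be: registered 2012-04-21, HTTP POST observed 2012-04-29
print(domain_age_days(date(2012, 4, 21), date(2012, 4, 29)))  # 8
```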

On-host detection and activity

Payload: usually netprotocol.exe, located in Users\USER_NAME\AppData\Roaming, which periodically downloads other malware.

Further payload loaded via HTTP: http://64.191.65.99/view_img.php?c=4&k=a4422297a462ec0f01b83bc96068e064

Detection by AV: sample from May 09 2012, detection ratio 1/42

● (demos, recorded as videos)


● Infection: .jar and .dat files downloaded via FTP; server name = obfuscated IP address (example: ftp://3645456330/6/e.jar); Java version in FTP password (example: Java1.6.0_29@)

● Updates: executable transfer from some Internet host, example GET http://184.82.0.35/f/kwe.exe

● Postinfection activity: Mass HTTP Post to normal and generated domains with URL: check_system.php

09:04:46 POST http://hander.be/check_system.php
09:05:06 POST http://aratecti.be/check_system.php
09:06:48 POST http://hander.be/check_system.php
09:07:11 POST http://aratecti.be/check_system.php
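The requests above share one URL path across several domains, which is what makes the "mass HTTP POST" pattern easy to spot. A rough sketch that groups POSTs by path and flags paths hit repeatedly on more than one host. The "time method url" log format and both thresholds are assumptions for illustration:

```python
from collections import defaultdict
from urllib.parse import urlsplit

LOG_LINES = [
    "09:04:46 POST http://hander.be/check_system.php",
    "09:05:06 POST http://aratecti.be/check_system.php",
    "09:06:48 POST http://hander.be/check_system.php",
    "09:07:11 POST http://aratecti.be/check_system.php",
]


def posts_by_path(lines):
    """Map URL path -> {host: number of POST requests}."""
    result = defaultdict(lambda: defaultdict(int))
    for line in lines:
        _time, method, url = line.split(maxsplit=2)
        if method != "POST":
            continue
        parts = urlsplit(url)
        result[parts.path][parts.hostname] += 1
    return {path: dict(hosts) for path, hosts in result.items()}


def shared_callback_paths(lines, min_hosts=2, min_posts=2):
    """Paths POSTed to at least min_posts times on at least min_hosts hosts."""
    flagged = []
    for path, hosts in posts_by_path(lines).items():
        repeated = [host for host, count in hosts.items() if count >= min_posts]
        if len(repeated) >= min_hosts:
            flagged.append(path)
    return flagged


print(shared_callback_paths(LOG_LINES))  # ['/check_system.php']
```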


Noproblemslove.com, whoismistergreen.com, etc...

● Bot infection: Drive-By-HTTP

● Payload and intermediate malware domains: normal/DynDNS

● Distributed via: compromised web-sites

● C&C domains: normal

● C&C and malware domains are located on different ASes. Sophisticated attack scheme. Timeout before activity.

● Typical bot activity: Mass HTTP Post

Noproblemslove.com, whoismistergreen.com, etc...

Interesting domains from range 184.82.149.178-184.82.149.180 (Feb 2012)

Domain Name IP

www.google-analylics.com 184.82.149.179

google-anatylics.com 184.82.149.178

www.google-analitycs.com 184.82.149.180

webmaster-google.ru 184.82.149.178

paged2.googlesyndlcation.com 184.82.149.179

googlefilter.ru 184.82.149.179

rambler-analytics.ru 184.82.149.179

site-yandex.net 184.82.149.180

paged2.googlesyndlcation.com 184.82.149.179

www.yandex-analytics.ru 184.82.149.178

googles.4pu.com 184.82.149.178

googleapis.www1.biz 184.82.149.178

syn1-adriver.ru 184.82.149.178

HOSTER RANGE AND AS

www.google-analylics.com looks good,

BUT

Google, Rambler and Yandex together on 184.82.149.176/29 ?

Hoster range and autonomous system (AS) are useful when you analyze suspicious events.
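The "Google, Rambler and Yandex together on one /29" observation can be automated. A sketch under stated assumptions: the brand keyword list is made up for illustration, matching is a naive substring test, and the domain/IP pairs are a subset of the table above:

```python
import ipaddress
from collections import defaultdict

OBSERVED = {
    "www.google-analylics.com": "184.82.149.179",
    "webmaster-google.ru": "184.82.149.178",
    "rambler-analytics.ru": "184.82.149.179",
    "site-yandex.net": "184.82.149.180",
    "www.yandex-analytics.ru": "184.82.149.178",
}

BRANDS = ("google", "rambler", "yandex")  # illustrative keyword list


def brands_per_network(observed, prefix_len=29):
    """Map each enclosing network to the set of brand keywords hosted in it."""
    networks = defaultdict(set)
    for domain, ip in observed.items():
        # strict=False masks the host bits, e.g. .179/29 -> .176/29
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for brand in BRANDS:
            if brand in domain:
                networks[str(net)].add(brand)
    return dict(networks)


def suspicious_networks(observed, min_brands=2):
    """Networks where lookalikes of several unrelated brands live side by side."""
    return {net: brands
            for net, brands in brands_per_network(observed).items()
            if len(brands) >= min_brands}


print(suspicious_networks(OBSERVED))
```

The same grouping works per AS once an IP-to-ASN source (for example the Team Cymru whois service mentioned later) is plugged in.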

What happens next?

Other domains but owner is the same

What's common

whoismistergreen.com

IP address: 213.5.68.105

Create: 2011-07-26

Registrant Name: JOHN ABRAHAM

Address: ul. Dubois 119

City: Lodz

noproblemslove.com

213.5.68.105

Created: 2011-12-07

Registrant Contact:

Whois Privacy Protection Service

Whois Agent [email protected]

noproblemsbro.com

176.65.166.28

Created: 2011-12-07

Registrant Contact:

Whois Privacy Protection Service

Whois Agent [email protected]

patr1ckjane.com

IP Was 176.65.166.28

IP Now 213.5.68.105

Create: 2011-07-21

Registrant Name: patrick jane

Address: ul. Dubois 119

City: Lodz
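The four records above overlap in IP address, street address, city and creation date. A hedged sketch of how such overlaps can be counted automatically; the record layout (a list of attribute/value pairs per domain) is our assumption, and the privacy-service registrant is left out because it is shared by countless unrelated domains:

```python
from collections import defaultdict
from itertools import combinations

RECORDS = {
    "whoismistergreen.com": [("ip", "213.5.68.105"), ("created", "2011-07-26"),
                             ("address", "ul. Dubois 119"), ("city", "Lodz")],
    "noproblemslove.com": [("ip", "213.5.68.105"), ("created", "2011-12-07")],
    "noproblemsbro.com": [("ip", "176.65.166.28"), ("created", "2011-12-07")],
    "patr1ckjane.com": [("ip", "176.65.166.28"), ("ip", "213.5.68.105"),
                        ("created", "2011-07-21"),
                        ("address", "ul. Dubois 119"), ("city", "Lodz")],
}


def shared_attributes(records):
    """Map each (attribute, value) pair to the domains using it (2+ domains only)."""
    index = defaultdict(set)
    for domain, attrs in records.items():
        for pair in attrs:
            index[pair].add(domain)
    return {pair: domains for pair, domains in index.items() if len(domains) > 1}


def linked_pairs(records):
    """Count how many attribute values each pair of domains has in common."""
    links = defaultdict(int)
    for domains in shared_attributes(records).values():
        for a, b in combinations(sorted(domains), 2):
            links[(a, b)] += 1
    return dict(links)


for pair, score in sorted(linked_pairs(RECORDS).items()):
    print(pair, score)
```

Pairs with a higher count are better candidates for "same owner"; a single shared value on its own is weak evidence.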


● Infection: executable transfer from just-registered or DynDNS domains, like fx58.ddns.us

● Updates: application/octet-stream bulk data load from C&C

● Postinfection activity: Mass HTTP Post to normal-looking domains, e.g. noproblemslove.com, whoismistergreen.com, etc...

Detection

Detection

● What we are building ;)

Cross-correlation data sources

● WHOIS (including Team Cymru whois)
● Our own DNS index, also talking to ISC about possibilities of data swaps
● Sandbox farm (mainly to detect compromised websites automagically and study behavior)
● Public “malicious IP address” databases
● Public reputation (i.e. ToS) databases
● (still work in progress)

Detection

● Manual and automated
● Automated detection is largely based on analysis of network traffic:
  ● Anomaly detection
  ● Pattern-based analysis
  ● Signatures (Snort!)
  ● Traffic profiling (DNS traffic profiling, HTTP traffic profiling etc.)

Detection

● Detecting malicious botnet activity is very popular in academia (interesting problem).

● In our research we do not claim extreme novelty but rather will demonstrate our experience and a few practical solutions that seem to work :-)

Detection: loooots of papers!~

Detection: interesting bits

● Botnet detection evolved from a pattern-based approach (hardcoding bot CMD patterns and capturing them with Snort) to a complex field of generic detection of automated “call-back” communication channels.

Detection

● Different “callback” methods, as seen in the wild, possess interesting properties, such as:
  ● Large number of failed DNS requests
  ● Large number of DNS requests for IP addresses which are offline
  ● Connection attempts to mostly dead IP addresses
  ● Traffic pattern (differs from regular browsing)
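The first property, a large share of failed DNS requests, is simple to measure per client. A minimal sketch, assuming DNS responses are available as (client IP, rcode) pairs; the thresholds and the sample addresses are illustrative assumptions:

```python
from collections import Counter

SERVFAIL = 2  # DNS rcode 2: server failure
NXDOMAIN = 3  # DNS rcode 3: non-existent domain


def dns_failure_stats(responses):
    """Map client -> (failed responses, total responses)."""
    total, failed = Counter(), Counter()
    for client, rcode in responses:
        total[client] += 1
        if rcode in (SERVFAIL, NXDOMAIN):
            failed[client] += 1
    return {client: (failed[client], total[client]) for client in total}


def noisy_clients(responses, min_queries=10, min_ratio=0.5):
    """Clients with enough traffic and a high share of failed lookups."""
    return sorted(client
                  for client, (bad, seen) in dns_failure_stats(responses).items()
                  if seen >= min_queries and bad / seen >= min_ratio)


sample = ([("10.0.0.5", NXDOMAIN)] * 9 + [("10.0.0.5", 0)] * 3 +
          [("10.0.0.7", 0)] * 12 + [("10.0.0.7", NXDOMAIN)])
print(noisy_clients(sample))  # ['10.0.0.5']
```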

Cat and mouse game

● Of course, all of this is easy to evade once you know the method. But security is always a cat-and-mouse game ;-)

Detection

● Detecting botnet activities by analyzing DNS traffic
  ● Analyzing DNS names (dictionary comparison, alphanumeric characters, detection of “generated” domain names (similarities/patterns))
  ● Analyzing failed DNS queries
  ● DNS “ranking” (based on whois information)
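One cheap way to approximate "detection of generated domain names" is character entropy on the registered label. This is a swapped-in textbook heuristic, not the method behind our tooling; the thresholds are guesses tuned only on the handful of domains quoted in this talk, and the label extraction assumes a single-label TLD:

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    length = len(text)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())


def registered_label(domain):
    """Label left of the TLD (assumes a single-label TLD such as .org or .be)."""
    parts = domain.lower().rstrip(".").split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]


def looks_generated(domain, min_length=15, min_entropy=3.5):
    """Flag long labels whose characters are spread almost uniformly."""
    label = registered_label(domain)
    return len(label) >= min_length and shannon_entropy(label) >= min_entropy


for name in ("rgn7er8yafh89cehuighv.org", "n87e0wfoghoucjfe0id.org",
             "whoismistergreen.com", "hander.be"):
    print(name, looks_generated(name))
```

Short generated names such as rugtif.be slip through this check, which is why it only makes sense combined with the failed-query and WHOIS signals.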

Detection: rcode: 3 (Non-existing domains)


Detection: rcode: 2 (server failure)

Rcode: 2 domains (failed servers)

Detection

● WHOIS cross-correlation – easily automated.

Detection

● Further step: cross-correlation to domain names which have the same WHOIS attributes

● Sandboxing (we use a modified version of Cuckoo Sandbox with user event simulation; not perfect, but it works)
● Challenges:
  – Simulate complex user behavior (mouse movements)
  – Simulate complex user browsing patterns (visiting X with a search engine (image?) as referer)

Detection flow

Detection (visualization)

● Parallel coordinates (also see the recent talk by Alexandre Dulaunoy from CIRCL.LU and Sebastien Tricaud from Picviz Labs at CanSecWest)

Detection

● (demos, let's look at some videos :)

Conclusions

● Detection is still trivial, but keep your methods “private” ;-)

● Detecting 'advanced' botnets (name your favourite traffic profiling evasion method!) is out of the question here, unless such evasion becomes widespread

● Cat and mouse game is still fun! ;-)

Tips and recommendations

● For infected machines: boot from clean media and periodically do OFFLINE AV checking

● Monitor network traffic for any unusual activity

● Default-deny firewall policies + block any active executable content

questions

● Contact us at:
  ● [email protected]
  ● [email protected]

http://github.com/fygrave/dnslyzer for some code

