Intrusion Detection and Malware Analysis - Malware …...In Recent Adances in Intrusion Detection (RAID), pages 165–184, 2006. Corrado Leita, Marc Dacier, and Frédéric Massicotte

Intrusion Detection and Malware AnalysisMalware collection

Pavel LaskovWilhelm Schickard Institute for Computer Science

Motivation for malware collection

Understanding vulnerabilities and attack techniquesDevelopment of protection and neutralization toolsUnderstanding the attacker communities and their “businessmodels”.

Malware collection tools

Honeypot: an isolated, unprotected and monitored system,containing seemingly valuable for attacker resources, aimedat collecting examples of malicious activity.Honeyclient: an automated client-side vulnerable systemexecuted in a controlled environment.Honeynet: a distributed collection of honeypots and emailfilters intended for a large-scale collection and observationofmalware

Honeypot taxonomy

Low-interaction: simple daemonssimulating network services; noexploitation.Medium-interaction: emulatedvulnerabilities for attracting andexecuting malware in a controlledenvironment.High-interaction: real systemscommunicating with malware in acontrolled environment.

Low-interaction honeypots

(+) Low security risks due to emulation(+) Simple installation and recovery(+) Suitable for analysis of automatic attacks(+) High scalability(−) Not suitable for detection of interactive attacks due to limited

emulated functionality(−) Hardly suitable for acquisition of malware binaries

High-interaction honeypots

(+) Suitable for detection and acquision of any malware kinds(−) Time and resource consuming installation and maintenance(−) High security risks: additional security mechanisms are

required(−) Virtualization can be detected by malware

Medium-interaction honeypots

(+) A relatively wide exploit coverage(+) Extensive monitoring and collection functionality(+) Full virtualization not necessary(+) Relative ease of deployment and maintenance(+) Low to moderate security risks (egress outbreak)(−) Manual emulation of vulnerabiliies still necessary(−) Detection of novel exploits not always reliable

A honeypot example: Nepenthes170 P. Baecher et al.

Fig. 1. Concept behind nepenthes platform

– Submission modules take care of the downloaded malware, e.g., by savingthe binary to a hard disc, storing it in a database, or sending it to anti-virusvendors.

– Logging modules log information about the emulation process and help ingetting an overview of patterns in the collected data.

In addition, several further components are important for the functionalityand efficiency of the nepenthes platform: shell emulation, a virtual filesystem foreach emulated shell, geolocation modules, sniffing modules to learn more aboutnew activity on specified ports, and asynchronous DNS resolution.

The schematic interaction between the different components is depicted inFigure 1 and we introduce the different building blocks in the next paragraphs.

Vulnerability modules are the main factor of the nepenthes platform. They en-able an effective mechanism to collect malware. The main idea behind these mod-ules is the following observation: in order to get infected by autonomous spreadingmalware, it is sufficient to only emulate the necessary parts of a vulnerable ser-vice. So instead of emulating thewhole service,we only need to emulate the relevantparts and thus are able to efficiently implement this emulation. Moreover, this con-cepts leads to a scalable architecture and the possibility of large-scale deploymentdue to only moderate requirements on processing resources and memory. Often theemulation can be very simple: we just need to provide some minimal information atcertain offsets in the network flow during the exploitation process. This is enoughto fool the autonomous spreading malware and make it believe that it can actu-ally exploit our honeypot. This is an example of the deception techniques used inhoneypot-based research. With the help of vulnerability modules we trigger an in-coming exploitation attempt and eventually we receive the actual payload, whichis then passed to the next type of modules.

Shellcode parsing modules analyze the received payload and extract automat-ically relevant information about the exploitation attempt. Currently, only one

Vulnerability modules: emulate vulnerable parts of networkservices.Shellcode parsers: analyse shellcode to locate its source.Fetch modules: download binaries from remote locations.Submission modules: store binaries in a specified location.

Nepenthes vulnerability modules

Poor man’s implementation of the original vulnerabilitySend N fixed strings, random junks, exploit “stages”Dismiss intermediate received stagesRecord final stage and use in payload

Example:

ConsumeLevel LSASSDialogue::incomingData(Message *msg) {

m_buffer->add(msg->getMsg(), msg->getSize());

char reply[512];

for (int32_t i = 0; i < 512; i++) {

reply[i] = rand() % 256;

}

}

Nepenthes shellcode analysis

Analyze the incoming payload and extract malware locationshellcode-signature module

Signature-controlled shellcode analyzerPerl-compatible RE patterns for commonly seen shellcodeIdentify parameters of shellcode (ports, URIs, . . .)

Shell emulator with arbitrary commands

Nepenthes download module

Download the actual malware from previously generatedURLSeveral modules for various protocol:

HTTP(S), (T)FTP, RCP, . . .“Proprietary” malware protocols:

CSend and CReceive from AgoBotLinkBind and LinkConnectback from linkbot

RFC-incompliant implementations of HTTP and FTP

Honeyclients

Objective: detection of attacks directed at client-sidesoftware, mostly web browsers:

browser exploits“drive-by downloads”typo-squatting

Applications:security analysis of web sitesfinding malicious content distribution sitesdetection of new browser exploitsmalware collection

Browser exploitation via redirection

Source: Yi-Min Wang, Microsoft Research

Honeyclient example: HoneyMonkey

A VM-based high interactionhoneyclient, running a vulnerablebrowser.Automatic detection of redirectionrelationships between contentdistribution sitesDetection of zero-day attacks

HoneyMonkey architecture

Stage 1: N URLs per VM,unpatched WinXP,

no redirection analysis





Stage 2: 1 URL per VM,unpatched WinXP SP2,

redirection analysis

Stage 3: 1 URL per VM,patched WinXP SP2,redirection analysis

“Interesting” URLs

Zero-day exploits

Exploit URLs

Exploit URL topologygraph

Analysis of exploitURL density

Access blocking

Vulnerability patching

HoneyMonkey deployment results

Results were obtained in May-June 2005 on a list of 16,190 URLswith known bad content (pornography, adware distribution, someshopping and freeware screensaver sites).

HoneyMonkey configuration Exploit num./freq.Stage 1, fully unpatched 207 (1.3%)Stage 2, fully unpatched (SP1) 688 (4.2%)Stage 2, fully unpatched (SP2) 204 (1.3%)Stage 3, SP2 partially patched 17 (0.1%)Stage 3, SP2 fully patched 0 (0%)

In July 2005, 27 URLs were discovered that distributed azero-day exploit.

From honeypots to honeynets

Goals:Wide coverage of up-to-date “malware landscape”Fast discovery new malware strains

Challenges:Maintenance: deployment by less qualified administratorsSecurity: avoid potential infection of host systemsAutomation: adjust to potentially unknown vulnerabilitiesScalability: infrastructure for storing massive amounts ofmalwareUtility: interface for analysis toolsStealth: should not be detectable by malware

Honeynet example: SGNET

Message extraction from TCP flowsGeneration and refinement of a finite state machine modelfor a communication protocol used by malwareGeneration of a honeyd-compatible script for implementationof a finite-state machine.Communication interface for interaction with the repositoryand analysis components.

Lessons learned

Malware collection is a crucial prerequisite for understandingnew malware threats and development of appropriateprotection tools.The main difficulty of malware collection lies in having to dealwith highly dynamic and heterogeneous exploitationtechniques.Special attention has to be paid to stealthy operation andsecurity features for malware collection.

Recommended reading

Paul Baecher, Markus Koetter, Thorsten Holz, Maximillian Dornseif, andFelix C. Freiling.The Nepenthes platform: An efficient approach to collect malware.In Recent Adances in Intrusion Detection (RAID), pages 165–184, 2006.

Corrado Leita, Marc Dacier, and Frédéric Massicotte.Automatic handling of protocol dependencies and reaction to 0-day attackswith ScriptGen based honeypots.In Recent Adances in Intrusion Detection (RAID), pages 185–205, 2006.

Niels Provos and Thorsten Holz.Virtual Honeypots: From Botnet Tracking to Intrusion Detection.Addison Wesley, 2007.

Yi-Ming Wang, Doug Beck, Xuxuan Jiang, and Roussi Roussev.Automated web patrol with Strider HoneyMonkeys: Finding web sites thatexploit browser vulnerabilities.In Proc. of Network and Distributed System Security Symposium (NDSS),2006.

Documents

Intrusion Detection and Malware Analysis - Malware …...In Recent Adances in Intrusion Detection (RAID), pages 165–184, 2006. Corrado Leita, Marc Dacier, and Frédéric Massicotte