
[IEEE 2014 Fourth International Conference on Advances in Computing and Communications (ICACC) - Cochin, India (2014.8.27-2014.8.29)] 2014 Fourth International Conference on Advances


Automatic Detection and Rectification of DNS Reflection Amplification Attacks with Hadoop MapReduce and Chukwa

Ancy Sherin Jose
Department of Information Technology
Rajagiri School of Engineering & Technology
Kochi, India
[email protected]

Binu A
Department of Information Technology
Rajagiri School of Engineering & Technology
Kochi, India
[email protected]

Abstract: DNS reflection amplification attacks are an increasingly common form of Distributed Denial of Service (DDoS) attack. The attack uses publicly hosted DNS servers, which are accessible to everyone, to send large responses to victims. The victims are overloaded by responses to requests they never sent, which exhausts their resources. Chukwa is a Hadoop subproject designed for distributed log monitoring and analysis. This paper concentrates on detecting DNS reflection amplification attacks by using Chukwa for log collection, HDFS for log storage, and Chukwa Demux jobs, which are MapReduce jobs, for detecting orphan responses, and on defending automatically by updating the firewall rules.

Keywords: DNS Reflection Amplification Attacks, Hadoop, HDFS, Chukwa, Chukwa Demux Jobs, MapReduce

I. INTRODUCTION

A DNS amplification attack is a Distributed Denial of Service (DDoS) attack [1]. It works by directing large responses at the victim. The victim is overloaded by these responses and its resources, such as CPU, bandwidth, and memory, are exhausted. Consider a victim like Amazon, which requires high availability and whose revenue drops significantly even if the site has a downtime of 5 ms. It is not easy to trace the attacker. Several countermeasures [1] are used against this attack, such as updating firewall rules, Response Rate Limiting (RRL), DNS dampening, and BCP38. This paper concentrates on a firewall-based countermeasure.

Because any machine or node in an organization is susceptible to attack, every machine needs to be monitored to detect an attack. Since logs or tcpdump captures from many machines have to be examined for DNS amplification attacks, the need for Hadoop arises. Hadoop is a distributed framework that can store large volumes of data and process them using the MapReduce paradigm [2]. HDFS is the file system component of Hadoop. Hadoop partitions the storage and processing of data across multiple nodes [8]. Hadoop is best suited to processing a small number of large files. The Hadoop framework therefore fits our requirement of processing tcpdump/IDS logs to detect DNS amplification attacks.

The large log files must first be stored in HDFS for processing. Chukwa is a Hadoop subproject under the Apache License that is used for monitoring and analyzing logs. Chukwa Agents can be configured to send logs from the nodes to a Chukwa Collector, which writes these large log files to HDFS. Chukwa Demux MapReduce jobs can then process the logs with the DNS amplification attack detection logic. If an attack is detected, a script is run to update the firewall rules.

II. DNS AMPLIFICATION REFLECTION ATTACK

The DNS amplification attack is launched in the following manner [1]. The attacker spoofs the IP address of the victim [1] and sends relatively small DNS queries to the DNS server. Because the source address is the victim's (reflection), the DNS server directs its responses to the victim. The attack is made easier by the fact that DNS runs over UDP, so no handshake is required between the client and the server. To direct large replies at the victim, the attacker can send ANY, DNSKEY, NS, or RRSIG queries, or queries that request DNSSEC data, to the DNS server.

DNS amplification attacks can be of three types: repeating queries, varying queries, and distributed attacks. A repeating query attack repeats the same query many times. The query can be an ANY query, which requests all the records for a particular domain. The response is large, which produces a high amplification ratio. The amplification ratio is calculated by the following formula.

Amplification Ratio = Response Size / Request Size

Varying query attacks are launched by requesting different domain names, so the responses are also distinct. This makes it harder to recognize that an attack is taking place. If a domain name does not exist, an NXDOMAIN response is returned. DNS Security (DNSSEC) is achieved by cryptographically signing the zone files with a zone key. If DNSSEC is configured, a client querying the DNS server can verify that the information comes from the DNS server that signed the zone file with its private key, and the client can use the DNS server's public key to validate the information. A DNS request can be issued with the +dnssec option, which returns the cryptographic information in the response and therefore makes the response very large.

2014 Fourth International Conference on Advances in Computing and Communications

978-1-4799-4363-0/14 $31.00 © 2014 IEEE

DOI 10.1109/ICACC.2014.54

Distributed attacks make use of multiple DNS servers and multiple zone files, making the attack more difficult to trace.
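
As a small illustration of the amplification ratio defined in this section, the following Python sketch computes the ratio from a request size and a response size in bytes. The function name and the sample sizes are assumptions made for illustration; they are not measurements from the experiments in this paper.

```python
def amplification_ratio(request_size: int, response_size: int) -> float:
    """Return response size divided by request size (both in bytes)."""
    if request_size <= 0:
        raise ValueError("request_size must be positive")
    return response_size / request_size


# Hypothetical sizes chosen only for illustration, not measured values.
ratio = amplification_ratio(request_size=64, response_size=3200)
print(ratio)  # 50.0
```

A ratio well above 1 means that each small spoofed request costs the victim far more bandwidth than it costs the attacker.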

III. HADOOP

Hadoop is a framework written in Java for the storage and processing of large data sets. Hadoop partitions data and computation across many hosts [2]. It scales in both storage and computation by adding nodes. HDFS (Hadoop Distributed File System) is the file system part of Hadoop. Data is processed with the MapReduce programming model. Yahoo developed and contributed 80% of Hadoop, including HDFS and MapReduce. Hadoop has many subprojects, developed by many companies, all under the Apache License.

A. Hadoop and our design

This work requires monitoring many workstations. These workstations may be part of a LAN or may be all the workstations of an organization. They can be monitored by analyzing their Intrusion Detection System logs, their tcpdump output, or the traffic on their UDP ports. Because many workstations need to be monitored, the task exceeds the computational capability of a single machine. A cluster running the Hadoop Distributed File System can store the large volume of log data, and MapReduce can process it.

IV. CHUKWA

Chukwa is a distributed log collection and analysis system. MapReduce is the standard tool for data-intensive processing jobs [3]. MapReduce can process the large data sets loaded on HDFS or other distributed file systems, but automated processing requires the data sources to be connected to HDFS. Moreover, MapReduce works best on a small number of large files, and incremental log or alert data is not a good fit for MapReduce processing [4]. Incremental log data has to be collected from the monitored nodes and loaded into HDFS as large files. This is the role Chukwa plays. Chukwa can automatically collect logs from the machines that need to be monitored and store the data in a single file in HDFS. MapReduce can then process this log data with the desired computation logic.

Snort is a widely used signature-based Intrusion Detection System [5]. Snort can be run in packet logger mode, and the alert logs can be used for detecting attacks [6]. This paper uses Chukwa's CharFileTailingAdaptorUTF8, which monitors particular files on the victims, and uses this information to detect attacks.

A. Chukwa Architecture

Chukwa has four main components [4]: Chukwa Agents, Chukwa Collectors, MapReduce/Demux jobs, and the Hadoop Infrastructure Care Center (HICC). Chukwa Agents run on the nodes that are to be monitored. Agents have dynamically controllable modules called Adaptors, which read log data from applications or from the file system. The stream of bytes from an Adaptor can come from executing a Unix command, listening on UDP ports, or tailing files. Agents can start a suitable Adaptor, query the Adaptor status, and send the data across the network over HTTP [5].

Chukwa Collectors receive data from multiple Agents and write it to a sink file in the sink directory [3]. The Collector thus acts as a mediator between the Agents, which produce incremental log data, and HDFS, which favors a small number of large files. Collectors periodically close the files, rename them, and mark them as available for processing.

MapReduce jobs process the data from the available sink files by executing Archiving or Demux jobs. Archiving jobs return grouped or ordered output, while Demux jobs produce key-value pairs as output. This project uses custom Demux Mappers and Reducers to detect orphan responses.

B. Chukwa and our Design

In our design, a Chukwa Agent with the filetailer.CharFileTailingAdaptorUTF8 module collects the tcpdump logs from the workstations that have to be monitored for DNS amplification attacks. The Chukwa Agents send these logs to the Chukwa Collectors, which store the data in HDFS. We can run Demux jobs on this data to detect whether a DNS amplification attack has taken place.

Figure 1. Flowchart to detect DNS Amplification Attack


V. ARCHITECTURE OF THIS SYSTEM

This paper concentrates on detecting DNS amplification attacks by checking for orphan DNS responses at the monitored workstations. During a DNS amplification attack, the targeted victim receives responses for which it has not sent DNS requests. In normal DNS operation there is a one-to-one mapping between DNS requests and responses [7], whereas no such mapping exists during a DNS amplification attack. To find the one-to-one mapping we can use the query ID of the DNS response. If both a request and a response appear for a query ID, the operation is normal. If only a response appears for a query ID, it may be an attack. The flowchart for detecting the DNS amplification attack is shown in Fig. 1.
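
The query ID check described above can be sketched in a few lines of Python. This is a minimal illustration of the decision logic only, not the Demux code used in the project; the function name and the (query ID, kind) record shape are assumptions made for the example.

```python
from collections import defaultdict


def find_orphan_query_ids(records):
    """Return the query IDs that have a response but no matching request.

    records: iterable of (query_id, kind) pairs, where kind is
    'request' or 'response' (a hypothetical record shape).
    """
    kinds_seen = defaultdict(set)
    for query_id, kind in records:
        kinds_seen[query_id].add(kind)
    # A query ID seen only as a response is an orphan.
    return sorted(qid for qid, kinds in kinds_seen.items() if kinds == {"response"})


sample = [
    (101, "request"),
    (101, "response"),   # normal operation: request and response both present
    (202, "response"),   # orphan: no request was sent for this ID
    (303, "request"),    # unanswered request, not an orphan response
]
print(find_orphan_query_ids(sample))  # [202]
```

DNS query IDs are only 16 bits wide, so a practical detector would probably also compare addresses and timestamps before treating a response as an orphan.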

In this system, Chukwa Agents are installed on every workstation that has to be monitored. Fig. 2 shows the architecture of the system. The Agents run the tcpdump command to capture network traffic on port 53 and store it in a file. CharFileTailingAdaptorUTF8 watches this file, which acts as the data source. The Agents send the data to the Chukwa Collector, which periodically stores the files in HDFS. A MapReduce Demux job can be executed on the arrived data to find the one-to-one mapping of DNS requests and responses. The Demux job finds the key-value pairs for the request and response of a particular DNS query ID. If the number of orphan responses exceeds a threshold, we can execute a script that updates the firewall rules to block the traffic from the particular DNS server.

Figure 2. Architecture of the System
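
The threshold check and firewall update described in this section could look roughly like the following Python sketch. It only builds the iptables command as an argument list and does not execute it. The function names, the threshold value, the sample addresses, and the choice of the FORWARD chain are assumptions; the paper does not give the exact rule that its script installs.

```python
from collections import Counter


def servers_to_block(orphan_sources, threshold):
    """Return the source IPs whose orphan-response count exceeds threshold.

    orphan_sources: list with one source IP per orphan response.
    """
    counts = Counter(orphan_sources)
    return sorted(ip for ip, count in counts.items() if count > threshold)


def iptables_block_command(server_ip, chain="FORWARD"):
    """Build (but do not run) a rule dropping UDP traffic from port 53 of server_ip.

    The chain is an assumption: FORWARD suits a firewall that routes
    traffic to the victims, INPUT suits a rule on the victim itself.
    """
    return ["iptables", "-A", chain, "-s", server_ip,
            "-p", "udp", "--sport", "53", "-j", "DROP"]


# Hypothetical orphan-response sources, for illustration only.
sources = ["10.0.0.5"] * 4 + ["10.0.0.9"]
blocked = servers_to_block(sources, threshold=3)
for ip in blocked:
    print(" ".join(iptables_block_command(ip)))
```

Keeping the rule as an argument list makes it straightforward to pass to a process launcher on the firewall without shell quoting issues.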

VI. EXPERIMENTAL SETUP

The experimental setup contains three attacker machines, or bots: B1, B2, and B3. One of these three runs a DNS server as a separate process. Two victims, V1 and V2, are placed behind a firewall with Chukwa Agents installed on them. One machine runs the Chukwa Collector as a process, and a Hadoop single-node cluster is installed on the same machine. All the machines run Ubuntu 10.04.

A. Installing DNS Server - Bind

Bind9 is used as the DNS server for this project. Bind installation is straightforward with apt-get install. The only additional step is to add a zone file to the DNS server to represent the domain. To create an amplified response, it helps to configure DNSSEC, which stands for DNS Security Extensions. DNSSEC is configured by generating a Key Signing Key (KSK) and a Zone Signing Key (ZSK) and signing the zone file with these keys. When DNSSEC is configured, signature information is generated and added to the response, which amplifies the response. This can be verified by issuing the command dig goodmorning.com ANY +dnssec | grep -i "size", provided goodmorning.com is the hosted domain. Adding distinct record types to the DNS server also increases the response size when the dig ANY command is issued.
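
The size check done with grep above can also be scripted. The Python sketch below extracts the byte count from the "MSG SIZE  rcvd" line that dig prints at the end of its output. The function name and the sample text are assumptions for illustration; the sample is not output captured from the experimental DNS server.

```python
import re


def response_size_from_dig(dig_output: str):
    """Return the byte count from dig's 'MSG SIZE  rcvd: N' line, or None if absent."""
    match = re.search(r"MSG SIZE\s+rcvd:\s*(\d+)", dig_output)
    return int(match.group(1)) if match else None


# Hypothetical tail of a dig run, for illustration only.
sample_output = ";; Query time: 2 msec\n;; MSG SIZE  rcvd: 1187\n"
print(response_size_from_dig(sample_output))  # 1187
```

Comparing this value with and without the +dnssec option gives a quick view of how much DNSSEC enlarges the response.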

B. Simulating DNS Amplification Attack

The tools used to simulate the DNS amplification attack are Tcpdump and Tcpreplay. Tcpdump is a command-line tool for capturing and analyzing network traffic. Tcpreplay replays network traffic that was captured with Tcpdump to a .pcap file. Generating the attack involves two main steps: running the tcpdump command in one shell to capture the network traffic to the DNS server, and running dig goodmorning.com ANY +dnssec against the DNS server in another shell. The tcpdump command has options to filter the captured traffic by IP address and port, and options to save the captured data to .pcap files. Once the DNS requests are saved in the .pcap file, the next step is to replace the requester's IP address with the victim's IP address. The Tcpreplay suite allows this with its tcprewrite tool. Similarly, the requester port has to be changed to an open port on the victim. The open ports on the victim machines are found by port scanning, for example with Nmap. These steps can be automated with shell scripts, which ultimately form the bot. The bots can be run on all three attacker machines, which send spoofed requests to the DNS server. The amplified responses reach the victims and can exhaust their resources.

C. Configuring Hadoop Single Node Cluster, HBase, Pig and Chukwa Collector

The Hadoop single-node cluster was configured under a new user, 'hduser'. Hadoop 0.22.0 was used, along with HBase 0.92.2, which is compatible with Hadoop 0.22.0, and Pig 0.9.1. The Chukwa 0.5.0 Collector process was also installed on this machine. The processes that run on this machine include the Hadoop NameNode, DataNode, MapReduce JobTracker and TaskTracker, the HBase Master and slaves, and the Chukwa Collector. This can be verified by issuing the 'jps' command from a shell. Integrating Hadoop, HBase, Pig, and Chukwa is somewhat tricky and requires placing compatible jars on the classpaths.

The Chukwa Collector writer has to be configured as the SeqFileWriter class for streaming data to HDFS, or as the HBaseWriter class for streaming data to HBase. SeqFileWriter was configured in chukwa-collector-conf.xml with the necessary properties so that data is streamed to HDFS. The Chukwa Collector can then be started to receive data over HTTP on port 80.


D. Configuring the Victims

Two things have to be done on the victim machines. The first is to configure the network setup of the victims. All traffic to the victims has to be routed through the firewall; to achieve this, the gateway of each victim is set to the IP address of the firewall. The second is to configure the victims with Chukwa Agents [4]. The CHUKWA_IDENT_STRING of V1 and V2 is set to Victim-1 and Victim-2 respectively. Chukwa was installed under a new user, 'hduser', and the 'sudo tcpdump' privilege was granted to 'hduser'. The next step was to issue the tcpdump command in a shell to capture the DNS response traffic. The captured tcpdump output was redirected to a text file called Response.txt. Chukwa Adaptors, which are part of the Chukwa Agents, were then added. Chukwa Agents listen on port 9093, and Adaptors can be added by connecting to that port with telnet. In this project, filetailer.CharFileTailingAdaptorUTF8 was added to watch the file Response.txt with the command: add filetailer.CharFileTailingAdaptorUTF8 DNSResponse Response.txt 0. Here DNSResponse is the data type.

This Adaptor thus acts as a data source that sends the data to the Chukwa Collector. The Chukwa Agents must be configured with the Collectors to which they should send the data.
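
The telnet step above can also be scripted. The following Python sketch builds the same add command and, optionally, sends it to the Agent control port over a plain TCP socket. The function names, the default host, and the timeout are assumptions, and the Agent's reply format is not described in this paper, so the sketch simply returns whatever bytes arrive first.

```python
import socket


def build_add_command(adaptor, datatype, path, offset=0):
    """Build the Agent control command in the form quoted in the text."""
    return f"add {adaptor} {datatype} {path} {offset}"


def send_agent_command(command, host="localhost", port=9093, timeout=5.0):
    """Send one command line to a Chukwa Agent and return its first reply chunk.

    Assumes the control port accepts newline-terminated text commands,
    as it does when driven interactively through telnet.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall((command + "\n").encode("utf-8"))
        return conn.recv(4096).decode("utf-8", errors="replace")


command = build_add_command(
    "filetailer.CharFileTailingAdaptorUTF8", "DNSResponse", "Response.txt"
)
print(command)
```

Calling send_agent_command(command) on a victim would register the Adaptor without an interactive telnet session, which helps when several victims have to be configured the same way.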

E. Writing Custom Demux Mappers and Reducers for finding out Orphan Responses at Victims

When the bot script is executed on the attacker machines, the spoofed requests reach the DNS server, which starts to send responses to the victims. The victims write the tcpdump capture of port 53 traffic to Response.txt. The CharFileTailingAdaptorUTF8 Adaptor started on the victims watches Response.txt and sends the data to the configured Collector process over HTTP. The Chukwa Collector receives the captured tcpdump data, and the orphan DNS responses that have reached the victims then have to be detected. Chukwa has a Demux framework that helps process the arrived data with MapReduce. In Chukwa these are called Demux jobs, and they can be started by executing the command bin/chukwa demux. To configure Demux for our data type 'DNSResponse', a mapper class has to be written, and the mapping between the mapper class and the data type (DNSResponse in this case) has to be specified in chukwa-demux-conf.xml. The custom mapper class 'TcpDumpParser' was written in the package org.apache.hadoop.chukwa.extraction.demux.processor.mapper. This mapper class parses every line of tcpdump data. The query ID of the tcpdump record line forms the ChukwaRecordKey, and a ChukwaRecord is created with the data from tcpdump. Some of the keys that form the ChukwaRecord are SourceIP, SourcePort, DestinationIP, and DestinationPort. The values are extracted from the tcpdump record line using Java's substring().

The next step is to write the custom Demux reducer. The reducer has to group the data by ChukwaRecordKey, which is the query ID in this case. The custom reducer class 'TcpDumpReducer' was added to the org.apache.hadoop.chukwa.extraction.demux.processor.reducer package. This reducer class groups the data by query ID and checks for a one-to-one mapping of DNS request and DNS response. Because the data is keyed by query ID and the corresponding ChukwaRecord contains the data associated with that query ID, checking the one-to-one mapping of request and response is straightforward in the reducer code. Once a response is detected for which no request was sent, the behavior is treated as suspicious, and scripts are executed on the firewall to block the traffic from the particular DNS server. Java Runtime was used to execute the script. The scripts update the iptables rules to block the traffic, which protects the victims from the attack.
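
To make the mapper and reducer roles concrete, the following Python sketch mimics them outside Hadoop. It is not the TcpDumpParser or TcpDumpReducer code, which is written in Java against the Chukwa Demux API. It uses a regular expression instead of substring(), assumes tcpdump was run with numeric addresses, and classifies a line as a request or response by which side uses port 53. The sample lines are invented for illustration.

```python
import re
from collections import defaultdict

# Matches e.g. "... IP 10.0.0.5.53 > 10.0.0.20.33333: 4242 ..." (tcpdump -n style).
LINE_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+): (\d+)"
)


def map_line(line):
    """Mapper role: emit (query_id, record) for a DNS line, or None otherwise."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    src_ip, src_port, dst_ip, dst_port, query_id = match.groups()
    if dst_port == "53":
        kind = "request"
    elif src_port == "53":
        kind = "response"
    else:
        return None
    record = {"SourceIP": src_ip, "SourcePort": src_port,
              "DestinationIP": dst_ip, "DestinationPort": dst_port,
              "Kind": kind}
    return int(query_id), record


def reduce_orphans(pairs):
    """Reducer role: group by query ID and list (query_id, server_ip) orphans."""
    grouped = defaultdict(list)
    for query_id, record in pairs:
        grouped[query_id].append(record)
    orphans = []
    for query_id, records in sorted(grouped.items()):
        if {r["Kind"] for r in records} == {"response"}:
            orphans.extend((query_id, r["SourceIP"]) for r in records)
    return orphans


# Invented sample lines in tcpdump's text format, for illustration only.
lines = [
    "12:00:00.000000 IP 10.0.0.20.33333 > 10.0.0.5.53: 4242+ A? goodmorning.com. (33)",
    "12:00:00.000900 IP 10.0.0.5.53 > 10.0.0.20.33333: 4242* 1/0/0 A 10.0.0.80 (49)",
    "12:00:01.000000 IP 10.0.0.5.53 > 10.0.0.20.44444: 5151* 12/0/1 A 10.0.0.80 (2900)",
]
pairs = [pair for pair in map(map_line, lines) if pair is not None]
print(reduce_orphans(pairs))  # [(5151, '10.0.0.5')]
```

The orphan list carries the source address of each unmatched response, which is the DNS server that the firewall script would block.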

VII. CONCLUSION

This paper discussed detecting the DNS reflection amplification attack with Chukwa and a Hadoop single-node cluster. The same project can be extended to a multi-node cluster. Other distributed file systems such as GFS, PVFS, or Lustre can be considered as alternatives to HDFS. Chukwa can also be used to analyze Snort logs or IPTraf logs for detecting DNS reflection amplification attacks.

ACKNOWLEDGMENT

Heartfelt thanks to Rajagiri School of Engineering and Technology for providing the infrastructure for the project. Sincere thanks to the Head of Department, Prof. Kuttyamma A.J, for her support. Special thanks to Binu A, Assistant Professor, Rajagiri, for his valuable suggestions and guidance. Hearty thanks to Varghese Mathew, Bobcares.com, for his constant support.

REFERENCES

[1] Thijs Rozekrans and Matthijs Mekking, "Defending against DNS reflection amplification attacks," University of Amsterdam, System and Network Engineering RP1, pp. 1-29, Feb. 14, 2013.

[2] K. Shvachko, Hairong Kuang, S. Radia, and R. Chansler, "The Hadoop Distributed File System," Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium, May 2010, pp. 1-10.

[3] Ariel Rabkin and Randy Katz, "Chukwa: a system for reliable large-scale log collection," ACM LISA'10 Proceedings of the 24th International Conference on Large Installation System Administration, Article No. 1-15, 2010.

[4] Chukwa Design Documentation and Admin Guide, http://chukwa.apache.org/docs/r0.5.0/design.html, https://chukwa.apache.org/docs/r0.5.0/admin.html

[5] Prathibha P. G. and Dileesh E. D., "Design of a Hybrid Intrusion Detection System using Snort and Hadoop," International Journal of Computer Applications (0975-8887), vol. 73, no. 10, July 2013.

[6] JeongJin Cheon and Tae Young Choe, "Distributed Processing of Snort Alert Log using Hadoop," International Journal of Engineering and Technology (IJET), vol. 5, no. 3, Jun-Jul 2013.

[7] G. Kambourakis, T. Moschos, D. Geneiatakis, and S. Gritzalis, "A fair solution to DNS amplification attacks," in Digital Forensics and Incident Analysis, 2007. WDFIA 2007. Second International Workshop on, 2007, pp. 38-47.

[8] T. White, "Hadoop: The Definitive Guide," O'Reilly Media, Yahoo! Press, June 5, 2009.
