19
DoS and DDoS Detection, Defense and Deterrence Marinos Bernitsas Department of Electrical & Computer Engineering Carnegie Mellon University [email protected] www.bernitsas.com Abstract DoS is the most debilitating attack in the Internet and yet it still remains an open research topic. In this paper, we storm through a sea of publications to examine the sophistication of DoS attacks, schemes to detect them, ways to defend against them and credible means to de- ter their occurrence. As the Internet provides no means of detection, defense, resilience and deterrence for DoS, we describe both commercial solutions and research pro- posals for addressing each of these security properties. We conclude that no incremental modification can effi- ciently address DoS, and emphasize the need for a clean- slate approach that integrates all of the necessary secu- rity properties to eliminate DoS. 1 Introduction Above all else, the primary goal of every network is availability. It is the number one feature users demand from their providers. When the availability of the Inter- net is disrupted critical transactions fail and millions of dollars are lost in business. For instance, in 2008, go- ing dark for 2 hours cost Amazon $31,000 per minute [36]. Today this cost is estimated to be over $100,000 per minute. Denial of Service (DoS) and Distributed Denial of Service (DDoS) are the primary means of attacking availability on the Internet. DoS first became a major scare in February 2000, where a series of massive DoS attacks incapacitated several high-visibility sites, includ- ing Yahoo, Ebay and Etrade [15]. Figure 1 demonstrates the disruption these attacks caused in an Internet-wide basis. In January 2001, Microsoft faced a similar assault and was forced to pay Akamai to host part of its network for years. Since the first public DoS scares, millions of websites have been attacked, ranging from commercial institu- tions to governmental organizations. In the past months, 10 years after the first publicized DDoS attacks, the in- ternational community has been a witness to large-scale DDoS attacks by Hacktivist group “Anonymous” in sup- port of Wikileaks founder Julian Assange [5]. “Opera- tion Payback”, as it was dubbed, performed large-scale DDoS attacks on popular websites such as Mastercard, Visa, Paypal and PostFinance bank, rendering them un- usable for hours. As more and more of our daily lives transition to the digital realm, the effects of such attacks become more and more devastating. But what is it that makes this attack still possible? How have researchers not provided a solution for this problem, after all these years? To answer this question, we need to go back to when the Internet was invented, almost 3 decades ago. When the Internet was designed it was comprised by a number of hosts where each one knew and trusted each other. Sending packets that ap- peared to have come from another host (IP spoofing) was not an issue, because the hosts had no reason to do so. For this reason, the Internet never built in safeguards to prevent malicious hosts from affecting other hosts in the Internet. It also did not provide any accountability mech- anism for reliably identifying where traffic came from. In today’s Internet we are still victims of the universal trust and lack of accountability design decisions. First, a host needs to trust that another host on the network will not attempt to launch an attack against it. Second, a host needs to trust that hosts have the IP addresses their pack- ets claim they have. Today’s Internet is a chaotic mass of devices across different geographical, legal, political and social institutions. As a result, neither of these lev- els of trust is attainable. In addition, packets flow so fast that it is extremely expensive and slow to verify or in- spect each packet in a flow. For all the above reasons, attackers can inflict serious damage with impunity. In the remainder of the paper, we first give the nec- essary background on DoS in section 2. We then exam- ine different types of DoS attacks in section 3. We then proceed to perform a security analysis across three axes: Detection in section 4, Defense and Resilience in section 5 and Deterrence in section 6. We show that unless we move towards a clean-slate approach, by redesigning the Internet [7], DDoS attacks will always be a concern. 1

DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

DoS and DDoS Detection, Defense and DeterrenceMarinos Bernitsas

Department of Electrical & Computer EngineeringCarnegie Mellon University

[email protected]

www.bernitsas.com

Abstract

DoS is the most debilitating attack in the Internet andyet it still remains an open research topic. In this paper,we storm through a sea of publications to examine thesophistication of DoS attacks, schemes to detect them,ways to defend against them and credible means to de-ter their occurrence. As the Internet provides no meansof detection, defense, resilience and deterrence for DoS,we describe both commercial solutions and research pro-posals for addressing each of these security properties.We conclude that no incremental modification can effi-ciently address DoS, and emphasize the need for a clean-slate approach that integrates all of the necessary secu-rity properties to eliminate DoS.

1 IntroductionAbove all else, the primary goal of every network isavailability. It is the number one feature users demandfrom their providers. When the availability of the Inter-net is disrupted critical transactions fail and millions ofdollars are lost in business. For instance, in 2008, go-ing dark for 2 hours cost Amazon $31,000 per minute[36]. Today this cost is estimated to be over $100,000per minute.

Denial of Service (DoS) and Distributed Denial ofService (DDoS) are the primary means of attackingavailability on the Internet. DoS first became a majorscare in February 2000, where a series of massive DoSattacks incapacitated several high-visibility sites, includ-ing Yahoo, Ebay and Etrade [15]. Figure 1 demonstratesthe disruption these attacks caused in an Internet-widebasis. In January 2001, Microsoft faced a similar assaultand was forced to pay Akamai to host part of its networkfor years.

Since the first public DoS scares, millions of websiteshave been attacked, ranging from commercial institu-tions to governmental organizations. In the past months,10 years after the first publicized DDoS attacks, the in-ternational community has been a witness to large-scaleDDoS attacks by Hacktivist group “Anonymous” in sup-

port of Wikileaks founder Julian Assange [5]. “Opera-tion Payback”, as it was dubbed, performed large-scaleDDoS attacks on popular websites such as Mastercard,Visa, Paypal and PostFinance bank, rendering them un-usable for hours. As more and more of our daily livestransition to the digital realm, the effects of such attacksbecome more and more devastating.

But what is it that makes this attack still possible?How have researchers not provided a solution for thisproblem, after all these years? To answer this question,we need to go back to when the Internet was invented,almost 3 decades ago. When the Internet was designedit was comprised by a number of hosts where each oneknew and trusted each other. Sending packets that ap-peared to have come from another host (IP spoofing) wasnot an issue, because the hosts had no reason to do so.For this reason, the Internet never built in safeguards toprevent malicious hosts from affecting other hosts in theInternet. It also did not provide any accountability mech-anism for reliably identifying where traffic came from.

In today’s Internet we are still victims of the universaltrust and lack of accountability design decisions. First, ahost needs to trust that another host on the network willnot attempt to launch an attack against it. Second, a hostneeds to trust that hosts have the IP addresses their pack-ets claim they have. Today’s Internet is a chaotic massof devices across different geographical, legal, politicaland social institutions. As a result, neither of these lev-els of trust is attainable. In addition, packets flow so fastthat it is extremely expensive and slow to verify or in-spect each packet in a flow. For all the above reasons,attackers can inflict serious damage with impunity.

In the remainder of the paper, we first give the nec-essary background on DoS in section 2. We then exam-ine different types of DoS attacks in section 3. We thenproceed to perform a security analysis across three axes:Detection in section 4, Defense and Resilience in section5 and Deterrence in section 6. We show that unless wemove towards a clean-slate approach, by redesigning theInternet [7], DDoS attacks will always be a concern.

1

Page 2: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

Date Average (s) Load Time (s) ∆

7 Feb 5.66 5.98 -5.7%8 Feb 5.53 5.96 -7.8%9 Feb 5.26 6.67 -26.8%10 Feb 4.97 4.86 +2.2%

Figure 1: Internet Performance for 40 important busi-ness sites during Feb 2000. Measurements are loadtimes in seconds. Source: Keynote Systems

2 BackgroundBefore proceeding to our analysis of DoS, it is importantto define several key terms.

DoS DoS attacks occur when a large amount of traf-fic from one or more hosts is directed at some networkresource. The artificially high load denies or severely de-grades service to legitimate users of that resource [21].DoS can either exploit a design flaw in a particular pro-tocol or an implementation flaw (logical bug). The Inter-net, by default, has no way of distinguishing legitimatefrom illegitimate flows and thus provides no detectionof the attack. Since the attackers flood the victim withtraffic which the victim has no way of halting, the In-ternet provides no defense to DDoS. Because protocolsbehave non-linearly to congestion, the Internet does notprovide resilience either. Last, since attackers can spooftheir IP addresses and remain undetected, there exists nodeterrence.

DDoS When DoS happens in a distributed fashion, i.e.multiple hosts sending attack packets at the same time,it is called a DDoS. DDoS can occur without the knowl-edge of the owner of the host, in which case the host isa zombie, part of a botnet. The DDoS becomes moreeffective as its hosts become more uniformly distributedacross the Internet, as they become harder to identify.Since DDoS is a subset of DoS, the paper uses bothterms interchangeably.

Flash Crowd When a major event attracts massivetraffic to a particular server, that server experiences aflash crowd. While in a flash crowd the hosts sendingthe packets are well-intentioned, from the perspective ofthe server the view is similar to a DDoS, since it is re-ceiving massive traffic from diversely distributed hostsacross the Internet.

Zombie When a host has been infected by stealthymalware that opens remote root access, it is said to bea zombie, since the attacker can, at any time activate thehost and perform instructions on it. The attacker needsto be careful not to create a noticeable load, since theuser might have the computer checked and the malwareremoved.

Botnet A botnet is a collection of zombies. Mod-ern botnets have hundreds of thousands of zombies attheir disposal. The zombies are controlled through com-mand and control servers that provide them with re-mote instructions to run. Botnets are crucial in perform-ing DDoS attacks, since they allow the attacker to sendpackets from a huge number of uniformly distributedhosts simultaneously, while concealing his presence.

Security Analysis In this paper we perform a securityanalysis across four axes:

1. Attack: The most important DoS attacks as well asthe resources the attacker needs to succeed

2. Detection: Commercial and research schemes thevictim use to detect and trace the DoS attack

3. Defense and Resilience: Commercial and researchschemes the victim can use to defend against DoSattacks and remain resilient after it has becomecompromised

4. Deterrence: Commercial and research schemes theservers can use to deter DoS attacks from happen-ing in the first place

3 AttacksEvery attack on the Internet is based on a design flaw oron an implementation flaw (software bug). In the mostfundamental sense, DoS leverages the Internet’s funda-mental design vulnerabilities to launch attacks with im-punity. However, DoS is not just a matter of congestinga link. Attackers have become a lot more sophisticatedand are able to perform attacks that are specifically tai-lored for breaking certain protocols. Below, we will seea wide breadth of attacks of varying sophistication, somevery common and some that (currently) only exist in theresearch realm, but could have devastating consequencesif ever attempted.

3.1 Bandwidth DoSBandwidth DoS is perhaps the simplest and most tra-ditional DoS attack. The basic idea is pointing a largenumber of clients to a certain resource simultaneously.Since the requests arrive from different hosts, they aredistributed uniformly across the Internet and the requestscongest the ingress link to the victim’s network. Becauseof the congestion, packets are dropped and connectionsare either very slow or impossible to maintain. In ad-dition, for transactions where the response is larger thanthe request (e.g. HTTP), the victim server might congestits own egress link, and thus self-inflict Bandwidth DoSthrough its own responses.

This attack lacks sophistication and is simply basedon the fact that networks are provisioned for a certainbandwidth based on their average traffic and not theirpeak traffic. Note that flash crowds essentially are also a

2

Page 3: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

type of Bandwidth DoS since many users visit a certainwebsite at the same time. The term “getting slashdotted”has been coined to denote the effect experienced when acertain website is linked to on slashdot.com, creat-ing a flash crowd that cripples the linked server. Popularevents (Michael Jackson’s death) have also been knownto cause such DoS attacks [11].

3.2 Resource Starvation DoSIn computational DoS the attacker depletes a resourcethat the victim critically relies on by flooding it withrequests. This resource can be: computational power(Computational DoS) by forcing the CPU to performcomplicated computations; memory, by forcing the vic-tim to allocate all of its RAM, and therefore enter into athrashing vicious circle; or storage, by making the vic-tim deplete its allocated storage.

Computational DoS is particularly effective on serversthat perform complex processing, e.g. a travel searchengine, which has to mine and analyze large sets of data.Unless the victim has significantly over-provisioned, afew requests could bring down the entire system.

In memory DoS the victim’s RAM is depleted, forcingit to store elements in secondary storage, which incursa huge performance penalty because of thrashing. Inaddition, servers might automatically reboot after theirmemory is depleted, as they are unable to perform vitalfunctions. If an attacker manages to reboot the victim,he can perform DoS by continuously forcing it to reboot.

In storage DoS, the victim is forced to deplete its stor-age on its own (i.e. without the attacker having to sendgigabytes of storage, which is not realistic). This attackis particularly relevant to devices whose storage is lim-ited, e.g. routers, firewalls, proxies and IPS. In orderto force a victim to deplete its own storage the attackercan generate packets that will generate a large numberof storage actions per packet. For example, the attackercan send packets that will create a large number of logentries in the victim’s storage, or can force a firewall tocreate a larger number of flow filters than it can handle.The device will then either reboot or stop responding,creating a DoS.

A lot of DoS attacks that exploit application-layer vul-nerabilities to perform DoS fall under Resource Starva-tion DoS. For example, the billion laughs DoS exploitsweaknesses in XML parsers. ReDoS exploits regular ex-pressions that take a long time to calculate. Buffer over-flows can cause a victim to crash and experience DoS.Such vulnerabilities are typically fixed in a faster devel-opment cycle, as they are software fixes.

3.3 Algorithmic Complexity DoSAlgorithmic Complexity DoS is a sophisticated kind ofResource Starvation DoS attack which exploits specific

vulnerabilities in the data structures used by network el-ements. Network devices need very efficient data struc-tures to minimize the delay introduced by their compo-nent. While those data structures (usually Hash Tablesand Binary Trees) have very desirable average runtimeproperties, in their worst case, they degenerate to linkedlists, in which add becomes O(n) and lookup be-comesO(n2). In addition, the structures can be forced toconstantly resize, incurring a huge performance penaltythat during average runtime would be amortized. An at-tacker can specifically choose his packets to force theseefficient data structures to reach their worst-case perfor-mance, thus bringing the victim down to its knees.

In [12] Crossby and Wallach demonstrate the effec-tiveness of such attacks on Perl, the Squid web proxyand the Bro intrusion detection system. Using band-width less than a dialup modem, after six minutes theBro server was dropping as much as 71% of traffic andconsuming all of its CPU. Figure 2 shows the cumulativedropped packets by the Bro intrusion detection system,with carefully selected traffic of only 16kb/s. At time A,only 6 minutes after the attack, Bro’s processing latencystarts increasing more than the packet interarrival time.At time B, Bro is catching up on its backlog and essen-tially dropping every packet. At time C the hash table isresized, and the attack commences again.

30

40

50

60

70

80

90

0 5 10 15 20 25

Del

ay in

sec

onds

Minutes into the Attack

AB

C

Figure 3: Packet processing latency, 16kb/s.

0

2

4

6

8

10

12

14

16

18

20

0 5 10 15 20 25

Thou

sand

s dr

oppe

d

Minutes into the Attack

A BC

Figure 4: Cumulative dropped packets, 16kb/s.

Bro’s drop rate is not constant. In fact, Bro mani-fests interesting oscillations in its drop rate, whichare visible in Figures 3 through 6. These graphspresent Bro’s packet processing latency and cu-mulative packet drop rate for attack packets beingtransmitted at 16 kb/sec and 64 kb/sec.

At time A, the latency (time between packet ar-rival and packet processing) starts increasing as to-tal processing cost per packet begins to exceed thepacket inter-arrival time.

At time B, Bro is sufficiently back-logged that thekernel has begun to drop packets. As a result, Brostarts catching up on its backlogged packets. Dur-ing this phase, the Bro server is dropping virtuallyall of its incoming traffic.

At time C, Bro has caught up on its backlog, andthe kernel is no longer dropping packets. The cycle

20

40

60

80

100

120

140

0 1 2 3 4 5 6

Del

ay in

sec

onds

Minutes into the Attack

A

B

C

Figure 5: Packet processing latency, 64kb/s.

0

5

10

15

20

25

30

35

40

45

0 1 2 3 4 5 6

Thou

sand

s dr

oppe

d

Minutes into the Attack

A B

C

Figure 6: Cumulative dropped packets, 64kb/s.

can now start again. However, the hash chain underattack is now larger then it was at time A. This willcause subsequent latencies to rise even higher thanthey were at time B.

This cyclic behavior occurs because Bro only addsentries to this hash table after it has determinedthere will be no response to the SYN packet. Bronormally uses a five minute timeout. We reducedthis to 30 seconds to reduce our testing time andmake it easier to illustrate our attacks. We antic-ipate that, if we were to run with the default 5-minute timeout, the latency swings would have alonger period and a greater amplitude, do to the tentimes larger queues of unprocessed events whichwould be accumulated.

Figure 2: Cumulative dropped packets by the Bro IDS,16kb/s [12]

When increasing the attacker’s bandwidth to only thatof an ISDN connection 64kb/s, point B starts after only1 minute into the attack and more than 45,000 packetsare dropped in only 6 minutes.

3.4 Reflector AttackA Reflector Attack allows a single host with limited pro-cessing power and bandwidth to force a legitimate inter-mediary network to perform DoS on the victim, whilesimultaneously framing the intermediary. The reflector

3

Page 4: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

attack is more of a method than an attack itself, as itrequires a protocol that generates a reply when specificpackets are sent to it. The attacker generates a packetthat seems to have come from the victim and addressesit to a multicast address in a network. This packet isspecifically crafted to induce a response from the inter-mediary to the victim. When each host in the reflectingnetwork receives the packet on its multicast interface, itfollows the protocol’s specification and generates a replydirected to the source of the packet, which in this case isthe victim.

With a single packet and a single machine the attackerhas managed to perform an amplification of the size ofthe reflecting network. When the packet flood reachesthe victim, either the victim’s bandwidth or resourcesare depleted, and DoS is experienced. Note that thistechnique is attractive because the victim thinks that theattack is coming from the reflecting network, so the at-tacker can conceal his presence.

3.5 Smurf DDoSA Smurf attack takes the Reflector attack even further.Because in a traditional reflector attack the responsesseem to come from a single network, it is easy to per-form filtering for one specific flow aggregate to blockall requests from the attacker’s network. In order tocircumvent this problem, the attacker uses a botnet todirect reflector packets to a large variety of intermedi-aries. Each of the intermediaries perform the reflection,but since the attack packets are coming from so manydifferent networks, it is impossible for the victim to fil-ter them and effectively distinguish attack packets fromlegitimate packets. Therefore, the victim experiencesDoS.

Because of its simplicity and ability to conceal the at-tacker, Smurf DDoS has been demonstrated to be ex-tremely effective and hard to defend from. Even worse,because of its distributed nature, this attack may leadto Bandwidth DoS on links that are before the victim’snetwork, forming a bottleneck before the traffic evenreaches the victim. As a result, the victim can’t evendefend from the attack since he never sees it.

3.6 TCP DoSSince TCP was created in order to provide reliable com-munications and was not designed with security in mind,there are fundamental weaknesses in the protocol thatcan be used as attack vectors for DDoS. Its most fre-quent exploitation is forcing a victim to respond withRST messages as part of a Smurf DDoS attack. All theRSTs congest the egress link of the router creating DoS.

But there are more sophisticated and effective attacksagainst TCP. It is important to note that the only securityavailable to TCP is the 232-bit Sequence Number (SN),

which the attacker has to guess in order to inject boguspackets into a connection.

SYN Flooding Upon a SYN request, a server needsto reserve state for establishing a connection. This canbe the Initial Sequence Number (ISN) of the connec-tions, connection options/preferences, the IP and portof the client etc. Typical TCP implementations allow1024 or 2048 connections in their SYN buffer (i.e. in-complete connections for which only a SYN has beenreceived). By flooding the victim with SYN requests,the attacker can exhaust the SYN queue and thus evictall legitimate connections. Assuming a uniform proba-bility and a 1024 size queue, the attacker only needs tosend

1023∑n=0

10241024− n = 7689.4 (3.1)

packets. As a result, SYN Flooding performs DoS bynever allowing new connections to be established.

0

10

20

30

40

50

60

70

80

90

100

0 500 1000 1500 2000 2500 3000

% o

f con

nect

ions

com

plet

ed

microseconds

Time needed to connect() to RELENG_4 system

backlog = 128backlog = 256backlog = 512backlog = 768

backlog = 1024

Figure 2: Time needed to connect() to a RELENG 4 system under a SYN flood attack. The kern.ipc.somaxconnparameter on the remote machine was set to 1024, and the size of the listen backlog was varied for each run.

cache. If an existing entry in the cache needs to beevicted, a sysctl tunable controls the optional behaviorof sending back a SYN cookie instead of evicting the en-try from the cache. In the following discussion, first theimplementation of the syncache will be presented, inde-pendent of syncookies, with the next section explaininghow syncookies modify the behavior of the syncache.

5 SYN Cache

The syncache implementation replaces the per-socketlinear chain of incomplete queued connections with aglobal hashtable, which provides two forms of protec-tion against running out of resources. These are a limiton the the total number of entries in the table, which pro-vides an upper bound on the amount of memory that thesyncache takes up, and a limit on the number of entries ina given hash bucket. The latter limit bounds the amountof time that the machine needs to spend searching for amatching entry, as well as limiting replacement of thecache entries to a subset of the entire cache. A global ta-ble was chosen instead of a per-socket table as it was feltthis would be a more efficient use of system resources. Acurrent implementation restriction that all kernel virtualaddress space for the memory used at interrupt time mustbe pre-allocated was also a factor in this decision.One of the major bottlenecks in the original code was

the random drop implementation from the linear list,

which did not scale. This bottleneck avoided in thesyncache, since the queue is split among hash buckets,which are then treated as FIFO queues instead of usingrandom drop. Another way of viewing this is to con-sider the original linear list partitioned up into a numberof sublists equivalent to the size of the hash table, wherechoosing a bucket enables us to choose which section ofthe list to drop. Since the hash distribution across thebuckets should be uniform, this is an approximate modelof choosing a random list entry to drop.The hash value is computed on the incoming packet

using the source and destination addresses, the sourceand destination port, and a randomly chosen secret. Thisvalue is then used as an index into a hash table, wheresyncache entries are kept on a linked list in each bucket.The secret is used to perturb the hash value so that anattacker cannot target a specific hash bucket and denyservice to a specific machine.While on the surface it may appear that an attacker

could implement a DoS by targeting a hash bucket sothat a legitimate connection does not reside on the queuelong enough to establish a connection, the risks are migi-tated by the use of the hash secret. Additionally, sincethe port number of the connecting machine is used inthe hash calculations, a second connection attempt fromthe client machine tends to result in a second hash bucketchosen, further styming any attempt by an attacker to tar-get a specific bucket.

Figure 3: Time needed to connect() to a RELENG 4system under SYN Flooding [19]

Figure 3 shows how after a SYN flood that causesa 1024-byte backlog the machine’s TCP performancecompletely deteriorates. Because of TCP’s congestioncontrol, DoS happens very abruptly and non-linearly.

TCP Reset The attacker can send TCP resets to con-stantly reset existing connections. When the TCP pro-tocol receives a RST packet from a client, it checksthat the Sequence Number is within acceptable bounds(SN+TCP Window) and if it is, it terminates the connec-tion given by the < port, src > pair.

As pointed out in [6], if the victim’s TCP implemen-tation uses a predictable algorithm for determining SNs,the attacker can simply open a legitimate connection,look at the SN and then create RSTs for every SN thatcomes after it, denying access to every subsequent com-munication request.

4

Page 5: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

But even if the SNs are not predictable, in [35] Watsonshows that because the typical TCP window is 216, only232

216 different SNs need to be generated for a RST packetto succeed. As shown in Figure 4, it only takes half asecond for a 45mbps line to reset a connection with aknown port, and 25 seconds for 50 source ports.

This allows an attacker to very quickly evict estab-lished TCP connections and perform DoS.

Fragment Flooding Both TCP and IP allow singlepackets to be fragmented into smaller units (fragmen-tation and reassembly). This allows packets to passthrough networks that may require smaller packet sizesthan the sender estimated. In order to reassemble thestream, the attacker has to first allocate space for theoriginal TCP packet and then place each fragment in theright memory slot. Only after all fragments have arrivedcan he read the packet received. But the attacker cansimply break a TCP packet into n fragments each of sizesn bytes and only send the n-th packet. In that way, hecan achieve the storage allocation of having sent an n∗sn

byte packet, with only sending sn bytes. The server’s re-sources are quickly exhausted, leading to DoS.

Attacking TCP Congestion Control One of TCP’sfundamental operations is its ability to perform conges-tion control and dynamically adjust the throughput of aconnection. During a DoS attack, the TCP stack candynamically throttle an attacker who is sending a floodof data through TCP, therefore limiting the potential forexploitation. However, in [29], researchers present TCPDaytona, three ways by which an attacker can receivemore than its fair share of bandwidth in TCP. These arebased on logic and implementation errors in most TCPstacks.

With ACK division, an attacker can send n ACKs forevery packet received. The server misinterprets the extraACKs as a sign of lower congestion, thus allowing theattacker to send and receive more data. With DupACK,an attacker sends multiple ACKs for every packet, againtricking the server into accepting larger throughput fromthe server. Last, with Optimistic ACKing the attackersends ACKs for packets it has not yet received, againincreasing its available throughput.

As shown in Figure 5, TCP Daytona can hinder thesender’s ability to determine the congestion of a linebased on the ACKs it receives. This makes a very pow-erful DDoS, since the server under DDoS will be forcedto increase its send rate instead of throttle it.

3.7 ICMP DoSICMP provides various control and maintenance com-mands for networks. Since the echo and ping commandsinduce responses by the hosts they are sent to, they pro-vide ideal candidates for Smurf and Reflector attacks,

0

10000

20000

30000

40000

50000

60000

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

Sequ

ence

num

ber (

Byte

s)

Time (sec)

Data SegmentsACKs

Data Segments (normal)ACKs (normal)

Figure 4: The TCP Daytona ACK division attack convinces the TCPsender to send all but the first few segments of a document in a single burst.

3 Implementation experience

To exploit the vulnerabilities described above, we made three mod-ifications to the TCP subsystem of Linux 2.2.10. This resultingTCP implementation, which we refer to facetiously as “TCP Day-tona”, provides extremely high performance at the expense of itscompetitors. We demonstrate these abilities with time sequenceplots of packet traces for both normal and modified receiver TCP's.Needless to say, our implementation is intentionally not “stable”,and would likely lead to congestion collapse if it were widely de-ployed.

3.1 ACK division

The TCP Daytona ACK division algorithm adds 24 lines of codethat divide each new outgoing ACK into many ACKs for smallerextents of the sequence space. Half of the new code is dedicatedto ensuring that the number of outgoing ACKs is no more thanshould be needed to coerce a sender in slow start to saturate ourtest machine's 100Mbps Ethernet interface.

Figure 4 shows client-side TCP sequence number plots of ourtest machine making an HTTP request for the index.html ob-ject from cnn.com, with and without our ACK division attack en-abled. This figure spans the entire transaction, beginning with theTCP handshake that starts at 0ms and ends at around 70ms, whenthe HTTP request is sent. The first HTTP data from the server ar-rives at around 140ms.

This figure shows that, when this attack is enabled, the manysmall ACKs sent around 140ms convince the Web server to un-leash the entire remainder of the document in a single burst; thisdata arrives exactly one round-trip time later. By contrast, with thenormal TCP implementation, the server spreads out the data overthe next four round-trip times. In general, as this figure suggests,this attack can convince a TCP sender to send all of its data in asingle burst.

3.2 DupACK spoofing

The TCP Daytona DupACK spoofing attack is implemented by 11lines of code that cause the receiver to send sufficient duplicateACKs such that the sender (re-)enters fast recovery and fills thereceiver's advertised flow control window each round-trip time.

Figure 5 shows another client-side plot of the same HTTP re-quest, this time with the DupACK spoofing attack superimposed

0

10000

20000

30000

40000

50000

60000

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

Sequ

ence

num

ber (

Byte

s)

Time (sec)

Data SegmentsACKs

Data Segments (normal)ACKs (normal)

Figure 5: The TCP Daytona DupACK spoofing attack, like the ACK divi-sion attack, convinces the TCP sender to send all but the first few segmentsof a document in a single burst.

0

10000

20000

30000

40000

50000

60000

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

Sequ

ence

num

ber (

Byte

s)

Time (sec)

Data SegmentsACKs

Data Segments (normal)ACKs (normal)

Figure 6: The TCP Daytona optimistic ACK attack, by sending a streamof early ACKs, convinces the TCP sender to send data much earlier than itnormally would.

on a normal transfer. The many duplicate ACKs that the receiversends at around 140ms cause the sender to enter fast recovery andtransmit the rest of the data, which arrives at around 210ms. Werethere more data, the flurry of duplicate ACKs sent at 210ms-230mswould elicit another burst from the sender. Since there is no morenew data, the sender simply fills in the hole it perceives; this seg-ment arrives at around 290ms. This figure illustrates how the Du-pACK spoofing attack can achieve performance essentially equiva-lent to the ACK division attack – namely, both attacks can convincethe sender to empty its entire send buffer in a single burst.

3.3 Optimistic ACKing

The TCP Daytona implementation of optimistic ACKing consistsof 45 lines of code. Because acknowledging data that has not ar-rived is a fundamentally tricky business, we chose a very simpleimplementation as a proof of concept. When a TCP connectionfor an HTTP or FTP client receives its first data, we set a timerto expire every 10ms. Any interval would do, but we chose 10msbecause it is the smallest interval that Linux 2.2.10 supports on theIntel PC platform. Whenever this periodic timer expires, or a newdata segment arrives, our receiver sends a new optimistic ACK forone MSS beyond the previous optimistic ACK.

Figure 5: TCP Daytona using ACK division [29]

leading to the term “Ping Flood”.However, ICMP packets are vulnerable to more so-

phisticated DoS attacks. Since they are not authenti-cated, any attacker can spoof them. As a result, the at-tacker can send Destination Unreachable or Time to LiveExceeded commands to reset existing TCP connections,as long as it knows the source and destination port [6].In addition, he could force routers to make other kindsof routing changes that would deny the victim a route tothe Internet. Past implementation vulnerabilities also in-clude the ping of death, where an attacker could send amalformed ping packet to crash a host, and the teardropattack, where a host would crash when receiving over-lapping TCP fragments.

All of the implementation vulnerabilities have beenfixed, but ICMP still remains an attractive target for at-tacks due to its lack of authentication and its ability toprovoke responses by victims.

3.8 Coremelt AttackCoremelt [31] is an attack that attempts to perform DoSnot just on a single target, but on the Internet Core, i.e.the core routers at the heart of the Internet that the ma-jority of traffic passes through. Such an attack disruptsthe operation of the entire Internet and the authors showit is possible to do so with a cotnet of average size. Thisis especially dangerous in the wake of the emergence ofa form of Internet terrorism, which could use it to dis-rupt millions of communications, banking transactionsand time-critical transmissions.

In Coremelt, an attacker generates traffic between thehosts on a botnet. Since the zombies are topologicallyuniformly distributed, the traffic flows pass through thecore of the Internet, creating a large load across all thecore routers. Among N routers there are O(N2) con-nections, which can cause significant congestion on thenetwork core, creating an Internet-wide DoS.

In Coremelt, the attacker is limited by the size of thebotnet, the uniformity of the distribution of bots and the

5

Page 6: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

Operating System Initial Window Size Packets Required

Windows 2000 5.00.2195 SP4 64,512 66,576Windows XP Home Edition SP1 64,240 66,858HP-UX 11 32,768 131,071Nokia IPSO 3.6-FCS6 16,384 262,143Cisco 12.2(8) 16,384 262,143Cisco 12.1(5) 16,384 262,143Cisco 12.0(7) 16,384 262,143Cisco 12.0(8) 16,384 262,143Windows 2000 5.00.2195 SP1 16,384 262,143Windows 2000 5.00.2195 SP3 16,384 262,143Linux 2.4.18 5,840 735,439Efficient Networks 5861 (DSL) v5.3.20 4,096 1,048,575

Figure 4: Initial Window Sizes for different OSs

amount of traffic each bot can generate. In the most re-alistic simulation performed (step network model), theattackers are able to create an Internet-wide DDoS usingabout 330,000 bots of the CodeRed botnet dataset whereeach bot has 128 kbps, as shown in Figure 6.

3.9 Control Plane AttacksSo far we have only seen DoS from the perspective ofthe Data Plane, in which we generate messages fromthe Data Plane to attack the Data Plane. However, wecould also use the control plane to perform DoS attacksin a more sophisticated and hard to recover from man-ner. Since the control plane determines how traffic isrouted, attacks on the control plane can affect the globalreachability of a host instantly. Control Plane attacks aretherefore the largest threat to the Internet.

3.9.1 Using the Control Plane to attack theControl Plane

The default control plane protocol for Inter-AS routing isBGP. Someone who manages to attack BGP can createDoS by: Blackholing the victim, i.e. forcing all of itstraffic to be lost; Redirection of the victim’s traffic toanother AS; and inducing Instability, whereby the routeconstantly changes [27].

Attacks launched through BGP can be deadly, partic-ularly because BGP updates are Internet-wide and takea long time to recover from. The attackers can performprefix hijacking, where they claim to have a longer pre-fix to the victim, thus attracting all of the victim’s traf-fic [10], preferably into a black hole, creating DoS. Inaddition, they can perform link flapping, where a routeis withdrawn and re-announced a sufficient number oftimes to cause all routers to avoid that route (a pro-cedure called route dampening), creating DoS throughthe supposed failure of critical links. Other instabilitiesas well as congestion-induced BGP session failures can

also lead to DoS.

3.9.2 Using the Data Plane to attack theControl Plane

We have shown that we can use the Control Plane to at-tack the Data Plane and perform DoS. It is even morealarming, however, that we can also use the Data Planeto attack the Control Plane and achieve DoS. Since thedata plane is where all of user traffic lies, this essentiallymeans that simple users can perform control-plane dis-ruptions that lead to devastating DoS attacks.

The Coordinated Cross Plane Session Termination at-tack (CXPST) [30] can melt the current Internet core byperforming targeted attacks on BGP links in a strategicmanner, through the data plane. Since BGP relies on theannouncement of routing updates to all neighbors, up-dates propagate globally. By flapping certain carefully-chosen links with a large volume of data-plane traffic(through the use of a botnet), we can induce such a highrate of updates and queuing delays that BGP control-plane traffic is evaluated at a delay of 100 minutes, es-sentially breaking down the BGP protocol. CXPST re-lies on the fact that control and data planes are oftensharing the same medium, and therefore it is possibleto manipulate one plane through the other plane. Thisis because usually control and data traffic are separateflows sharing the same physical link.

The most important advantage of this attack is itsstrategic design: blindly flooding the Internet with pack-ets will cause congestion at the border ASs and leavethe core unaffected (in addition, the core will have thechance to route around such bottlenecks). But by strate-gically congesting certain links, the attack can veryquickly bring the core routers to a standstill. The at-tack is immune to almost all countermeasures currentlyimplemented (graceful restart, flap dampening and min-imum advertisement intervals), as well as denial of ser-

6

Page 7: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

0

0.2

0.4

0.6

0.8

1

100000 200000 300000 400000 500000

Des

truct

iven

ess

Botnet Size

CcodeRedGT-DDoS

0

10

20

30

40

50

60

70

100000 200000 300000 400000 500000

Col

late

ral A

Ses

Botnet Size

CodeRedGT-DDoS

(a) Destructiveness (b) Congested Collateral ASes

Fig. 5. Results when simulating an attacker with 128 kbps per bot when ASes havestep based resources.

bots and 48 congested collateral ASes. To achieve the same destructiveness, anattacker with the GT-DDoS distribution needs an additional 308,000 bots, andcongests 128 collateral ASes.

0

0.2

0.4

0.6

0.8

1

500000 750000 1e+06

Des

truct

iven

ess

Botnet Size

CodeRedGT-DDoS

0

20

40

60

80

100

120

500000 750000 1e+06

Col

late

ral A

Ses

Botnet Size

CodeRedGT-DDoS

(a) Destructiveness (b) Congested Collateral ASes

Fig. 6. Results when the top ten ASes double their resources. (attacker tra!c genera-tion = 128 kbps per bot)

These results indicate that an attacker with a realistically distributed botnetunder realistic tra!c and network settings can launch a focused Coremelt attackwhich causes core links to fail. This attacker can launch such an attack withoutraising suspicion by congesting few tributary links.

Figure 6: Step Network Simulation 128 kbps per bot [31]

vice defenses (because the botnet units that serve as thesinks actually requested the traffic).

0.00.10.20.30.40.50.60.70.80.91.0

0 500 1000 1500 2000 2500 3000

CDF

Factors of normal load

64k Nodes125k Nodes250k Nodes500k Nodes

(a)

0.00.10.20.30.40.50.60.70.80.91.0

0 50 100 150 200 250 300 350

CDF

1000’s of messages per 5-seconds

64k Nodes125k Nodes250k Nodes500k Nodes

(b)

0.00.10.20.30.40.50.60.70.80.91.0

0 200 400 600 800 1000 1200

CDF

1000’s of messages per 5-seconds

64k Nodes125k Nodes250k Nodes500k Nodes

(c)

Figure 4: Median router load of targeted routers under attack as a factor of normal load (a); and 75th percentile (b) and 90th percentile (c)of message loads experienced by routers under attack, measured in BGP updates seen in 5-second windows.

0.0

0.2

0.4

0.6

0.8

1.0

0 100 200 300 400 500 600 700 800

CDF

1000’s of messages per 5-seconds

AS 209AS 1239AS 3356AS 4766AS 7018

Figure 5: Update messages received during 5-second windows fora collection of specific AS under attack by 250, 000 bots.

4.3.3 Time to Process Updates

The end results of CXPST can be seen by examining thetime required to process a BGP update message. Routersprocess BGP update messages at a roughly constant rate. Ifthe rate they are received at surpasses the rate of compu-tation, messages will need to be buffered, and processingdelays will occur. Using performance figures from a routerbenchmarking study [70] we computed the delay betweenwhen core routers received BGP updates and when they fi-nally finished processing those updates while under attack.

We term the average delay between when a BGP updatearrives and when it completes being processed the time-to-process or TTP.3 The TTP for core ASes under attack byvarious sizes of botnets is graphed in Figure 6. CXPST suc-cessfully triggers the first BGP session failures 180 secondsinto the attack. From this point onward the average TTP forupdates arriving to core ASes increases dramatically. Forexample, in the case of a 250, 000 node attacker, after 10

3This is also known as makespan.

0.020.040.060.080.0

100.0120.0140.0160.0180.0200.0

0 200 400 600 800 1000 1200

Ave

rage

Tim

e to

Pro

cess

B

GP

Upd

ates

(min

s)

Simulated Time (secs)

64k bots125k bots250k bots500k bots

Figure 6: The average time to process a BGP update for core ASesunder attack by botnets of various sizes. Attack traffic starts at time0, the first link failures occur at time 180.

minutes of attack the backlog of updates is large enoughto delay processing for roughly 45 minutes. Once 20 min-utes of attack time have passed the wait has increased by anadditional hour, to just over 100 minutes. The reason forthis constant increase in TTP was discussed in Section 2.2.Routers under this amount of computational load are re-source exhausted, and can only recover if they are receivingupdate messages at a low rate. However, updates are nearlyconstantly arriving as a result of CXPST. This means thatthe affected routers are never given a chance to recover.

Anecdotal evidence suggests that routers placed in re-source constrained states behave unstably [17]. It is notoutside the realm of possibility that, when confronted withupdate queues thousands of messages long and processingdelays measured in minutes rather then microseconds, thatrouters will exhibit undefined behavior. This undefined be-havior adds a new dynamic to the system. We leave studyof this for future work.

Figure 7: The average time to process a BGP update forcore ASes under attack by botnets of various sizes [30]

As seen in Figure 7, the speed at which this attack canfunction is impressive: in 20 minutes, a 500k botnet cre-ates a processing delay of 180 minutes and a 250k botnetcreates a processing delay of 100 minutes. Since theseeffects will tend to multiply globally, recovery from suchan attack would have to involve rebooting essential com-ponents of the Internet or shutting down links until theystabilize, and would create a global, sustained disruptionof service.

Such targeted attacks on certain BGP links can be thenext-generation way of performing DDoS. While this at-tack only exists in the research domain, its severity couldbe huge once it reaches the real-world.

4 DetectionThe first step in defending against a DoS attack is detect-ing it. This is particularly hard, for a number of reasons.

First, DoS may congest some of the links before the vic-tim’s routers, and therefore the victim will have no ideathat its reachability by the world has been affected. Sec-ond, even when there is a perceived increase in traffic atthe victim’s network, it is hard to tell if it due to the re-sult of a flash crowd or a DoS attack. Third, systems thatare designed to detect such attacks may easily be trickedinto not detecting the incident and administrators havebeen trained to ignore alarms due to their large numberof false positives.

4.1 Intrusion Detection SystemsThe easiest way for a host to detect if its network is avictim of a DoS or any other attack is through a Net-work Intrusion Detection System (NIDS). Such systemsare usually installed in a non-inline manner (i.e. trafficdoes pass through them but beside them) so they sniffthe network in a fail-open manner. Commercial NIDSare available by Cisco, Juniper, Barracuda and most net-work vendors. There are also open-source versions, themost popular being Bro, Honeypot and Snort.

The NIDS will look for specific attack vectors andcompare them against a database. If the attack is in thedatabase, it will raise an alarm and notify the networkadministrator of the intrusion. The administrator shouldthen proceed to one of the defensive measures outlinedin the next section.

NIDS effectiveness The most important question isto look at how effective NIDS systems are in terms ofcorrectly predicting an attack. As shown in [4], NIDSare vulnerable to the base-rate fallacy experienced byBayesian statistics, i.e. in order to increase their effec-tiveness (P (I|A)), we have to get the false positive rateto an unattainably low level.

To illustrate this, consider an example IDS with0.01% false positive rate and 0.01% false negative rate.The box logs 1.1 million total records and each actual

7

Page 8: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

attack is reflected in 10 records in the NIDS.

P (A|I) = 10−4 (4.1)

P (A|I) = 10−4 (4.2)P (A|I) = .9999 (4.3)

P (I) =10 ∗ 3

1.1 ∗ 106= 2.7 ∗ 10−5 (4.4)

P (I) = 1− P (I) = .99997 (4.5)

⇒ P (I|A) =P (A|I)P (I)

P (A|I)P (I) + P (A|I)P (I)= .21259 (4.6)

As shown by (4.6), only one out of 5 alarms denotean actual intrusion. Bringing the false positives to theunattainably low 0.0001 rate gives a large improvement.

P (A|I) = 10−5 (4.7)⇒ P (I|A) = .72972 (4.8)

As a result, NIDS fail to provide a trusted means of re-liably detecting a legitimate attack, and may be ignoredby network admins.

Insertion/Evasion Attacks NIDS can be circum-vented. Insertion/Evasion attacks [28] take advantageof the fact that the NIDS is unaware of the state of theend-host and thus may be manipulated into interpretingthe packets differently from the end-host. The NIDSwould either generate a false alarm or generate no alarmfor an attack packet. In an insertion attack, the attackermanages to insert data that the NIDS sees but the victimdrops, thus obfuscating the attack signature. In an eva-sion attack, the NIDS drops data that the victim keeps.

There are many ways to create an insertion/evasionattack. The attacker can use a TTL that will reach theNIDS but not the victim, thereby creating a packet thatwill contain more segments when seen by the IDS thanby the victim. This allows the attacker to insert garbagethat will not match an attack signature, but when thepackets reach the host, the garbage is removed throughthe low TTL and the host experiences an attack. Anotherpopular evasion technique is setting the ECN bit (Ex-plicit Congestion Notification). Some hosts will droppackets that have this bit set, while others will keepthem. If the IDS and the victim host differ in the be-havior, the attacker can inject garbage with the ECN set,which will evade the IDS but attack the host.

IP fragmentation and TCP stream reassembly intro-duce serious storage requirements for IDSs, since apacket can only be examined after all fragments fromboth the TCP and IP layer have been received. If the IDSdoes not allocate storage efficiently, an attacker could

craft the packets to exhaust the IDS’s storage resourcesas shown in section 3.2.

In addition, both types of fragmentation have vary-ing behavior across operating systems, especially whenit comes to overlapping transmissions: for IP fragmenta-tion, Windows NT and Solaris always favor old data foroverlapping fragments, while other systems favor newdata for forward overlap; for TCP fragmentation, Win-dows NT always favors old data, while other systemsfavor new data for forward overlap. These differences inbehavior leave the IDS vulnerable to insertion and eva-sion attacks, in which packets reassemble differently onthe IDS than the host, and thus the intrusion is unde-tected.

Practically every element in the IP and TCP headercan lead to an insertion/evasion attack if the host andthe NIDS interpret it differently. The most recent ofthem is the TCP split handshake attack [32]. In the con-text of DoS and DDos, an attacker can leverage inser-tion/evasion attacks to successfully avoid the IDS fromreporting the attack. This will make administrators be-lieve that this is a simple flash crowd.

Attacking the NIDS itself The first thing an attackerwill do is try to take down the NIDS itself. BecauseNIDS are mostly created in order to detect specific ex-ploits instead of DoS attacks, they will crash under theheavy load of a DoS, and therefore go offline. Algo-rithmic Complexity attacks such as the ones shown in3.3 are particularly effective in terms of knocking out anIDS.

4.2 Traffic Measurement and AccountingWhile a host can detect DoS attacks through a NIDS andbandwidth monitoring, ASs also want to be able to de-tect and perhaps trace DoS attacks. This is essential inorder for them to be able to provide any kind of throt-tling and protect their own routers and customers. De-tecting DoS from an AS standpoint is particularly hard,because each backbone router incurs an average of 10million concurrent flows per hour. Simply counting thebandwidth each flow consumes is impossible. As In-ternet routers become capable of attaining higher andhigher throughputs, the issue of actually measuring thistraffic and classifying it becomes harder and harder tosolve, especially when taking into account the memoryrequirements necessary.

As a result, ASs rely on probabilistic counting toolsto identify heavy-hitters (flows that consume a lot ofbandwidth and are thus possible DoS targets). The sim-plest approach is Cisco’s Netflow, which samples everys packets and collects the data into a table. However, itis incurs sampling bias and memory overhead, and iron-ically upon a DoS network administrators have to switchit off because of the resources it depletes.

8

Page 9: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

In [13] Estan and Varghese propose two new methodsfor measuring Internet traffic generated by large flows orflow classes:

Sample and Hold in which traffic is randomly sam-pled, but after an entity has been sampled its countis updated for every subsequent hit of the packet.Sample and Hold is able to achieve a low proba-bility of false negatives (with 99% probability wemeasure a 5% variation in the flow’s traffic). Sam-ple and Hold dramatically improves the SampledNetFlow approach, which is widely used in indus-try, as it “holds” on to the flow after it has beensampled, instead of just randomly sampling

Multistage Filters in which each flow is hashed usingvarious hash functions (stages), and if the countcorresponding to a flow ID in every stage is abovea threshold, the flow is added into flow memory forfurther analysis. Multistage filters have no falsenegatives, while the multiple stages exponentiallyattenuate the false positives. To limit the false pos-itives even further, the authors provide additionaloptimizations that can be used: “conservative up-date of counters”, where counters are not updatedif the flows are added to memory, and “serial mul-tistage filters”, where the stages occur in series in-stead of parallel

The two algorithms can work with memory in SRAM,compared to DRAM which is currently used. The rela-tive errors of both algorithms scale with 1

M (M being theamount of SRAM), which is a big improvement com-pared to 1√

Moffered by previous algorithms. In addi-

tion, both algorithms can operate in real-time.Both algorithms provide cutting-edge methods for

ASs to identify DoS traffic reliably and efficiently.

4.3 Inferring Denial of Service ActivityAnother detection goal is for there to be some dedicated“DoS Observatories” that detect and report on DoS ac-tivity across the Internet.

In [25] researchers offer the first ever technique ofmeasuring, extrapolating and analyzing DoS attacks onthe Internet. They does so by monitoring a /8 addressrange for backscatter, i.e. responses from victims thatreceived illegitimate packets from spoofed addresses.Since the /8 is 1

256 of the Internet, samples can be eas-ily extrapolated to obtain representative figures for theentire Internet. The researchers used a series of filtersto classify their attacks and obtained statistically signif-icant results for variation of DoS attacks by time, pro-tocol, traffic rate, duration, victim name, TLD and AS,providing useful insights into DoS attacks.

Their analysis reveals very interesting observations onthese attacks. For example, as figure 8 shows, the PDF

Kind Trace-1 Trace-2 Trace-3Attacks Packets (k) Attacks Packets (k) Attacks Packets (k)

Multiple Ports 2,740 (66) 24,996 (49) 2,546 (66) 45,660 (58) 2,803 (59) 26,202 (42)Uniformly Random 655 (16) 1,584 (3.1) 721 (19) 5,586 (7.1) 1,076 (23) 15,004 (24)Other 267 (6.4) 994 (2.0) 204 (5.3) 1,080 (1.4) 266 (5.6) 410 (0.66)Port Unknown 91 (2.2) 44 (0.09) 114 (2.9) 47 (0.06) 155 (3.3) 150 (0.24)HTTP (80) 94 (2.3) 334 (0.66) 79 (2.0) 857 (1.1) 175 (3.7) 478 (0.77)0 78 (1.9) 22,007 (43) 90 (2.3) 23,765 (30) 99 (2.1) 18,227 (29)IRC (6667) 114 (2.7) 526 (1.0) 39 (1.0) 211 (0.27) 57 (1.2) 1,016 (1.6)Authd (113) 34 (0.81) 49 (0.10) 52 (1.3) 161 (0.21) 53 (1.1) 533 (0.86)Telnet (23) 67 (1.6) 252 (0.50) 18 (0.46) 467 (0.60) 27 (0.57) 160 (0.26)DNS (53) 30 (0.72) 39 (0.08) 3 (0.08) 3 (0.00) 25 (0.53) 38 (0.06)SSH (22) 3 (0.07) 2 (0.00) 12 (0.31) 397 (0.51) 18 (0.38) 15 (0.02)

Table 5: Breakdown of attacks by victim port number.

1min

2 5 10 30 1hour

2 6 12 1day

2 7

Attack Duration

0

1

10

100%

Atta

cks

Figure 5: Cumulative distribution of attack durations.

out flooding the victim’s link. Consequently, we leavecorrelation between attack rates and victim connectivityas an open problem.

6.2.4 Attack duration

While attack event rates characterize the intensity of at-tacks, they do not give insight on how long attacks aresustained. For this metric, we characterize the durationof attacks in Figures 5 and 6 across all three weeks oftrace data. In these graphs, we use the flow-based classi-fication described in Section 4 because flows better char-acterize attack durations while remaining insensitive tointensity. We also combine all three weeks of attacksfor clarity; the distributions are nearly dentical for eachweek, and individual weekly curves overlap and obscureeach other.Figure 5 shows the cumulative distribution of attack

durations in units of time; note that both the axes are log-arithmic scale. In this graph we see that most attacks are

1min

2 5 10 30 1hour

2 6 12 1day

2 7

Attack Duration

0

1

2

3

% A

ttack

s

Figure 6: Probability density of attack durations.

relatively short: 50% of attacks are less than 10 minutesin duration, 80% are less than 30 minutes, and 90% lastless than an hour. However, the tail of the distributionis long: 2% of attacks are greater than 5 hours, 1% aregreater than 10 hours, and dozens spanned multiple days.Figure 6 shows the probability density of attack du-

rations as defined using a histogram of 150 buckets inthe log time domain. The x-axis is in logarithmic unitsof time, and the y-axis is the percentage of attacks thatlasted a given amount of time. For example, when thecurve crosses the y-axis, it indicates that approximately0.5% of attacks had a duration of 1 minute. As we sawin the CDF, the bulk of the attacks are relatively short,lasting from 3–20 minutes. From this graph, though, wesee that there are peaks at rounded time durations in thisinterval at durations of 5, 10, and 20 minutes. Immedi-ately before this interval there is a peak at 3 minutes, andimmediately after a peak at 30 minutes. For attacks withlonger durations, we see a local peak at 2 hours in thelong tail.

Figure 8: Probability Density Function of attack dura-tions [25]

of attack durations presents spikes at rounded time du-rations for the attack for 5, 10 and 20 minutes. This isvery strong empirical evidence of the measurements be-ing reliable, as humans are likely to choose rounded timedurations.

Kind Trace-1 Trace-2 Trace-3Attacks Packets (k) Attacks Packets (k) Attacks Packets (k)

TCP 3,902 (94) 28,705 (56) 3,472 (90) 53,999 (69) 4,378 (92) 43,555 (70)UDP 99 (2.4) 66 (0.13) 194 (5.0) 316 (0.40) 131 (2.8) 91 (0.15)ICMP 88 (2.1) 22,020 (43) 102 (2.6) 23,875 (31) 107 (2.3) 18,487 (30)Proto 0 65 (1.6) 25 (0.05) 108 (2.8) 43 (0.06) 104 (2.2) 49 (0.08)Other 19 (0.46) 12 (0.02) 2 (0.05) 1 (0.00) 34 (0.72) 52 (0.08)

Table 4: Breakdown of protocols used in attacks.

0

10

20

30

40

50

60

70

80

90

100

10 100 1000 10000 100000 1e+06

Perc

ent o

f Atta

cks

Estimated Rate (Packets Per Second)

All AttacksUniform Random Attacks

Figure 4: Cumulative distributions of estimated attack rates inpackets per second.

have been used by the attacker to produce the backscat-ter monitored at our network. We see that more than 90%of the attacks use TCP as their protocol of choice, but asmaller number of ICMP-based attacks produce a dispro-portionate number of the backscatter packets seen. Otherprotocols represent a minor number of both attacks andbackscatter packets. This pattern is consistent across allthree traces.In Table 5 we further break down our dataset based

on the service (as revealed in the victim’s port number)being attacked. Most of the attacks focus on multipleports, rather than a single one and most of these are wellspread throughout the address range. Many attack pro-grams select random ports above 1024; this may explainwhy less than 25% of attacks show a completely uniformrandom port distribution according to the A2 test. Of theremaining attacks, the most popular static categories areport 6667 (IRC), port 80 (HTTP), port 23 (Telnet), port113 (Authd). The large number of packets directed atport 0 is an artifact of our ICMP categorization – thereare fewer than ten TCP attacks directed at port 0, com-prising a total of less than 9,000 packets.

6.2.3 Attack rate

Figure 4 shows two cumulative distributions of attackevent rates in packets per second. The lower curve showsthe cumulative distribution of event rates for all attacks,

and the upper curve shows the cumulative distributionof event rates for uniform random attacks, i.e., those at-tacks whose source IP addresses satisfied the A2 uni-form distribution test described in Section 3.2. As de-scribed earlier, we calculate the attack event rate by mul-tiplying the average arrival rate of backscatter packets by256 (assuming that an attack represents a random sam-pling across the entire address space, of which we mon-itor ). Almost all attacks have no dominant mode inthe address distribution, but sometimes small deviationsfrom uniformity prevent the A2 test from being satisfied.For this reason we believe that there is likely some va-lidity in the extrapolation applied to the complete attackdataset. Note that the attack rate (x-axis) is shown usinga logarithmic scale.Comparing the distributions, we see that the uniform

random attacks have a lower rate than the distribution ofall attacks, but track closely. Half of the uniform randomattack events have a packet rate greater than 250, whereashalf of all attack events have a packet rate greater than350. The fastest uniform random event is over 517,000packets per second, whereas the fastest overall event isover 679,000 packets per second.How threatening are the attacks that we see? Recent

experiments with SYN attacks on commercial platformsshow that an attack rate of only 500 SYN packets persecond is enough to overwhelm a server [10]. In ourtrace, 38% of uniform random attack events and 46% ofall attack events had an estimated rate of 500 packets persecond or higher. The same experiments show that evenwith a specialized firewall designed to resist SYN floods,a server can be disabled by a flood of 14,000 packetsper second. In our data, 0.3% of the uniform randomattacks and 2.4% of all attack events would still compro-mise these attack-resistant firewalls. We conclude thatthe majority of the attacks that we have monitored arefast enough to overwhelm commodity solutions, and asmall fraction are fast enough to overwhelm even opti-mized countermeasures.Of course, one significant factor in the question of

threat posed by an attack is the connectivity of the vic-tim. An attack rate that overwhelms a cable modem vic-tim may be trivial a well-connected major server installa-tion. Victim connectivity is a difficult to ascertain with-

Figure 9: Cumulative distribution of estimated attackrates

Figure 9 provides alarming results: 50% of the attackshave a rate greater than 350 packets per second, with 500SYN packets per second being enough to overwhelm aserver. A small percentage of the attacks are even ca-pable of breaking specialized firewalls designed to resistSYN floods.

5 Defense and ResilienceDefending against DoS and DDoS is hard, because with-out any mechanism predeployed, the victim cannot dif-ferentiate between the legitimate flows and the maliciousflows. As a result, evicting a flow or throttling band-

9

Page 10: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

width may even exacerbate the DoS because it will con-strain the resources available to legitimate clients evenfurther.

Resilience is also an important property, so that evenif a host fails to completely defend against the DoS at-tack, its customers can still be given some level of ser-vice.

In this section we will examine a number of defensesfor DoS, some of which are currently implemented, andsome which exist only on the theoretical realm. As wewill see, most real-world deployed defenses are inca-pable of successfully coping with DoS attacks.

5.1 Ingress Filtering

In ingress filtering each network drops all packets forwhich the source IP address does not belong to the net-work. This guarantees that a host in the network cannotspoof an IP address outside the network (note that theingress filtering is only performed to the network-levelgranularity). If every network in the Internet performedingress filtering, then a DoS victim would simply detectthe network from which the packets are coming and fil-ter it. This would allow the victim to maintain servicefor all non-attack packets and effectively defend againstDoS.

The problem is that unless every network performsit, ingress filtering is of no value. This is because itis a restriction imposed on a network that benefits net-works other than the network it is deployed on. Sincenetworks exist in different legal frameworks and coun-tries, it is impossible to convince every network or ISP toperform ingress filtering, so the measure has limited suc-cess. However, it can be leveraged as part of peering re-lationships and contractual agreements, or legally man-dated. In addition, ingress filtering is useless in DDoSattacks since the zombies do not spoof their addresses.

5.2 Pushback

Pushback [21, 18] works by proactively rate-limiting se-lected links with packets that would be dropped by thevictim anyway.

Each router monitors incoming flows and tries to clus-ter traffic together to establish aggregates. The paperassumes that DoS packets can be clustered into aggre-gates by some feature of the packet (destination IP, pro-tocol, network prefix). Upon a DoS attack or a flashcrowd, the router identifies the aggregates that are re-sponsible for the excess bandwidth and rate-limits themusing Random Early Drop. The rate limit L is computedby formula (5.1) where Aggregate[k].arrival is the ar-rival rate of the k-th aggregate:

i∑K=1

(Aggregate[k].arrival − L) = Rexcess (5.1)

Next, the router classifies each of its links as eithercontributing or non-contributing, depending on whetherthey consume a large or small fraction of the aggre-gate traffic. For all contributing links, the router sendsa pushback message upstream to the neighboring routerin which it instructs it to adopt a specific rate limit for allpackets matching that aggregate description. The neigh-boring router will perform the rate-limiting and if it de-termines that there are aggregates of its own that are con-gesting the links, it will further propagate a pushbackmessage upstream to its own neighbor. This creates apropagation of rate-limits upstream, greatly mitigatingDoS from the predetermined aggregates.

Each of the routers that have received the pushbackwill periodically send their status to the victim. This in-cludes the traffic that the victim would have received ifthe rate-limit was not in place. The victim will then pe-riodically re-examine the status of each link and updateor expire the rate-limit.

Good

R0

R1

Good Poor

R3R2100 Mbps

Bad

100 Mbps10 Mbps

Figure 6: Simple topology. Link - is congested.

10 20 30 40Number of bad flows

020406080

100

% o

f bw

default

badgoodpoor

10 20 30 40Number of bad flows

020406080

100

% o

f bw

local ACCbadgoodpoor

10 20 30 40Number of bad flows

020406080

100

% o

f bw

pushbackbadgoodpoor

Figure 7: The throughput of different aggregates, in de-fault (top), local ACC (middle), and pushback (bottom)scenarios.

tern with equal on and off times, chosen randomly be-tween 0 and 40 seconds. Each bad flow sends at 1 Mbpsduring the on periods. A collection of these flows givesvariable-rate non-congestion-controlled traffic, harder totackle because of its unpredictable sending rate. Thenumber of bad flows is varied to model different levelsof aggressiveness of the bad aggregate.Figure 7 shows the results of simulations without ACC(default), with only local ACC, and with pushback. Inthe default case, the bad aggregate consumes most of thebandwidth, and the good and the poor traffic suffer asa result. Local ACC controls the throughput of the badaggregate to protect the good traffic, but fails to protectthe poor traffic. Because local ACC cannot differenti-ate between the two, it penalizes the poor traffic along

2 Mpbs

R0.0

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

R1.0

R2.0 R2.2 R2.3

R3.0 R3.3 R3.4 R3.7 R3.8 R3.11 R3.12 R3.15

R2.1

..... .... .... ....

S0 S63S32S31

. . . . . . . . . . .various destinations

20 Mbps

2 Mbps

20 Mbps

Figure 8: The topology for sparse and diffuse attacks.Link is congested.

default local pushback020406080

100

% o

f bw

sparse

default local pushback020406080

100

% o

f bw

diffuse

badpoorgood

Figure 9: Bandwidth allocation at the congested linkduring sparse (left) and diffuse (right) DDoS attacks.

with the bad traffic. In contrast, by pushing rate-limitingupstream where the bad and the poor sources can be dif-ferentiated, pushback protects the poor traffic as well asthe good traffic.

5.2 DDoS AttacksThe simulations in this section illustrate Local ACC andpushback with both sparsely-spread and highly diffuseDDoS attacks. These simulations use the topology inFigure 8, with four levels of routers. Except for therouter at the lowest level, each router has a fan-in of four.The top-most routers are each attached to four sources.The link bandwidths have been allocated such that con-gestion is limited to the access links from the sourcehosts and to the destination router.Ten good sources and four poor sources are picked atrandom in the topology, each of which spawn Web-liketraffic (using the Web-traffic generator in ns). The num-ber of bad sources depends on the simulation scenario.The sparse-attack scenario contains four randomly cho-sen bad sources, each sending on-off UDP traffic (asabove) but with an on-period sending rate of 2 Mbps.The diffuse attack scenario contains 32 UDP sources

10

Figure 10: Bandwidth allocation for a link under DDoSwith local rate-limiting and pushback [21]

Figure 10 shows how Pushback achieves a better allo-cation of bandwidth between good and bad flows. Notethat poor flows, i.e. flows that are mistakenly attributedas attack traffic and rate-limited, also increase. The ef-fects are stronger in sparse attacks than attacks where theattackers are more diffused across the network. This isexpected, because in a diffused DDoS attack it would behard to correctly classify the aggregates.

If DDoS attacks could be easily grouped into aggre-gates by some defining characteristic they share, push-back would be a great DDoS defense, because it couldprohibit attack traffic from even reaching the victim.By essentially installing a distributed system for filter-ing, pushback can effectively prevent against aggregates.The problem, however, is that, as we saw in section 4,detecting and isolating aggregates is extremely hard, be-cause DDoS traffic is so uniformly distributed. As a re-sult, Pushback’s effectiveness is significantly hampered.

10

Page 11: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

5.3 Over ProvisioningCompanies such as Akamai, Limelight and AmazonWeb Services offer their customers services for whichbandwidth is elastic, i.e. the service will automaticallyprovide more bandwidth if a customer is experiencinghigh loads (and potentially DoS). These Content Deliv-ery Networks (CDNs) have such huge resources avail-able that it is impossible for them to be victims of DDoS.For example in recent DDoS attacks, Amazon’s serverswere not affected by a DDoS launched against it [26].In addition, since CDNs offer points of presence closeon the ISPs, attacks that may bring down some points ofpresence only impact the reachability from that ISP orISP cluster.

Typically, the customer will pay for 95th percentilepricing, meaning he only pays for the bandwidth usedby 95% of traffic during a billing cycle. He may alsorequest an over provisioning margin above the averagebandwidth in case of attacks or flash crowds. However,after prolonged DDoS attacks, the DDoS traffic quicklybleeds into the 95th percentile, and the cost becomesprohibitive. For this reason, a company has to decideon how much it is willing to pay to keep its website liveduring a DDoS.

In [9, 22] it is shown how organized crime performedracketeering by launching DDoS attacks on online bet-ting websites right before the results were due, request-ing a fee for the attack to stop. The extortion feewas carefully calculated to be less than what wouldbe required for hiring a CDN to over provision (about$100,000 in [22]). The fact that most of these websiteschose to pay off the attackers denotes how expensive aover provisioning can be.

5.4 IPSIntrusion Prevention Systems (IPS) are IDS (section 4.1)that are placed inline and have the ability to thwart at-tacks in real-time, either on their own or by installingfilters at the firewall. Since IPS use the IDS functional-ity to detect the attacks before they act upon them, theyare limited by all the problems mentioned in section 4.1.

While an IDS crashing will simply not be able to de-tect attacks, an IPS failing will create downtime for theentire network, because IDSs are fail-closed. For thisreason, IPS make attractive targets for DoS attacks. Theattacker can simply find a way to exploit a bug or per-form DoS on the IDS itself, thereby causing it to crashor reboot. While it the IPS is down, the clients of thevictim network experience DoS.

In addition, since DDoS packets may not be classifiedas attack traffic (in fact, it can be totally legitimate traf-fic that does not map to an attack signature) the IPS willnot detect the attack and is therefore useless in prevent-ing it. IPSs are not only limited in their effectiveness to

defend against DDoS attacks, but may be attack vectorsin themselves. Algorithmic Complexity Attacks (section3.3) may be particularly effective on bringing down anIDS.

5.5 CAPTCHAsA CAPTCHA [33] is a Completely Automated PublicTuring test to tell Computers and Humans Apart. Theuser is presented with a challenge that only a humancan interpret and is asked to enter a response. If the re-sponse validates, the website has verified that he is hu-man. CAPTCHAs prevent DoS on the basis that a com-puter program cannot itself solve the CAPTCHA andthus the attack cannot be scripted, but has to rely on ahuman. Since each human takes an average of 10 sec-onds to solve a CAPTCHA, requests cannot be sent ata fast enough rate to cause DoS. CAPTCHAs are par-ticularly effective in websites where a few requests pro-vide large computational load on the server (section 3.2),since they require each client to perform a small compu-tation in exchange for the server investing computationtime.

CAPTCHAs are relatively effective against DDoS be-cause they cannot be pre-computed or solved in real-time in large numbers. However, attackers that have ac-cess to webservers with high throughput are able to per-form a relay attack, whereby they relay the captcha totheir visitor and play back the CAPTCHA to the vic-tim. Typically attackers have access to high-demandporn sites where they can request each visitor to solvea CAPTCHA before playing a video or image. If theyhave sufficiently high number of customers per second,they may be able to solve CAPTCHAs in real-time, fastenough to cause a DDoS.

5.6 CapabilitiesCapabilities are a way to provide resilience during at-tacks through flow differentiation, so that when DDoStakes place, the victim can police each flow differently.In the simplest case, there exist two flows, privilegedtraffic and unprivileged traffic. Privileged traffic is guar-anteed bandwidth through QoS reservations, while un-privileged traffic remains best effort. Upon a DDoS,privileged traffic is unaffected and experiences no degra-dation, while unprivileged traffic experiences DoS as itcompetes with each other.

The victim periodically makes sure that each of theprivileged flows is behaving properly. If not, it does notget its capability renewed and its flow is downgradedto unprivileged. Since the victim can identify all thehosts that have received a capability, it can easily blockthose hosts from accessing its network altogether, thusalso protecting the unprivileged users from performancedegradation.

11

Page 12: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

Client Authentication Username-password authenti-cation is the simplest way to establish a privileged flowthrough the application layer. Upon a DDoS, websiteslike Amazon may choose to give their authenticatedusers privileged flows, so that the DDoS is only experi-enced by unauthenticated users. In order for the attackerto perform DDoS at a high enough rate, he would needaccess to hundreds of thousands of accounts, which wemay safely assume he does not have. As a result, the at-tacker is unable to affect the privileged flow. The site’scustomers are therefore not affected and the DDoS is lessexpensive for the victim.

A victim can also provide its customers with priv-ileged flows through the use of client certificates, amethod very popular in SSH. Upon connection request,the client sends its certificate to the server, who then pro-motes the client to a privileged flow. If the client misbe-haves he is blacklisted for a short time and cannot per-form DoS through that account.

SIFF SIFF [38] is a backwards-compatible incremen-tal protocol that can be deployed over the current Inter-net to mitigate DoS attacks, by introducing a new classof privileged packets (vs unprivileged packets). Its suc-cess is phenomenal, managing to filter 97% of attacktraffic on real data. It does so by adding to each priv-ileged packet a capability field to the TCP header inwhich each hop adds its own (time-sensitive) markingfor the upward and downward path. The protocol re-quires a two-way handshake to bootstrap, through theuse of EXPLORER packets, while key rollovers can beperformed at no additional overhead, by having each hopmaintain a window of markings.

Privileged packets carry capabilities that are verified piece-meal (and statelessly) by the routers in the network, and aredropped when the verification fails. Routers implementingSIFF are programmed to give preferential treatment to priv-ileged packets, so that privileged packets are never droppedin favor of unprivileged ones (legacy packets not conform-ing to our scheme are treated as unprivileged packets). Priv-ileged channel capabilities are time limited and require up-dating by the server to remain valid. Because server coop-eration is required for capability updates, a server can haltthe packets of a privileged channel by simply quenching itscapability update messages.At a high level, the system works as follows: clients

and servers participate in a handshake (similar to the TCPhandshake, which can be carried on top of this handshake)using a specific type of unprivileged packet known as anEXPLORER (or EXP) packet. Routers insert path specificinformation into EXP packets, who’s aggregate among allthe routers in the path is used as a capability token for aprivileged channel between the client and the server. Af-ter the handshake, clients and servers communicate us-ing privileged packets called DATA (or DTA) packets, intowhich they insert the capabilities carried in the EXP pack-ets. When routers forward a DTA packet, they first checkto see if part of its capability equals that information whichwould have been inserted into the packet had it been an EXPpacket. If the markingsmatch, then the packet is forwarded.If not, then the packet is immediately dropped.Our discussion assumes a new format for the IP header.

The following fields are assumed to be present:

• Flags field (3-bits). This field contains the follow-ing 1-bit flags: the signalling flag (SF), used to in-dicate that the packet is a non-legacy (either EXP orDTA) packet; the packet type flag (PT), used to indi-cate that the packet is either a DTA (set) or EXP (unset)packet; and the capability update (CU) flag, set to indi-cate that the optional capability reply field is present inthe header.

• Capability field. This field is used by routers to addtheir marks to the packet en route to its destination.

• (Optional) Capability reply field. This field is usedby packet recipients to signal to the packet sender anew (or updated) capability, and is only present whenthe capability update flag is set.

We do not assume an exact length for the capability or ca-pability reply fields, as their lengths will depend upon otherparameters (such as the bits marked per router and maxi-mum path length). We assume the presence of a source anddestination address in the header, but not their exact length.No other fields of the packet header are used in our scheme.

In the following subsections, we describe in detail thehandshake protocol, as well as the potential issues in its im-plementation.

3.1 Handshake Protocol

Any client wishing to contact a server over a privilegedchannelmust first complete a handshake protocol to obtain acapability to insert into its privileged packets, and vice versafor server communication with the client. A single hand-shake is sufficient to provide both sides of a communicationwith their capabilities. Furthermore, handshake packets cancarry upper layer protocol data. The protocol is shown inFigure 1.

Figure 1. Handshake establishing a privilegedchannel. A client sendsanEXPLORERpacketto the server, which gets marked with mark-ing !. The server responds with its own EX-PLORER packet, with ! enclosed in the ca-pability reply field. The client sends its firstDATA packet with ! in its capability field andwith ", from the server’s EXPLORER packet,enclosed in the capability reply field.

The initiator of the handshake (the client) first sends outan EXP packet with its Capability field initialized to0. A packet is marked as an EXP packet by setting thesignalling (SF) flag and leaving the packet type (PT) flagunset. All routers along the path left shift z bits into theCapability field the EXP packet (see Section 3.2 for adescription of how these markings are computed). The ex-ception to this rule is that the first router in the path that seesa marking field of all 0 bits inserts a 1 bit before its marking(so that the actual capability in the field will consist of allbits up to, but not including, the most significant 1 bit.). Re-call from Section 2 that we assume that the marking field islarge enough to accommodate the markings from all of therouters in the path plus the 1 bit inserted by the first router.EXP packet marking is shown in Figure 2(a).

5

Figure 11: The SIFF handshake [38]

Figure 11 demonstrates the three-way handshake. TheEXPLORER packet obtains marking α as it traverseseach router. The sender then returns the marking thoughthe Capability Update field if he wishes to establish aprivileged connection with the client. The client thenappends his returned capability to the data packet, alongwith the server’s capability β for the downstream direc-

tion. As a result, a bidirectional privileged flow has beenestablished.

What makes SIFF particularly attractive in the realworld is its ability to be offered “as a service” runningover the legacy Internet, by differentiating traffic intolegacy/unprivileged and privileged. Even when a sin-gle routing path contains SIFF compatible and non-SIFFcompatible routers, we can still achieve good attack fil-tering, provided the legacy routers do not reset any ofSIFF’s headers.

Another great feature is SIFF’s ability to select andfilter individual traffic flows without requiring per-flownetwork state, thereby providing higher reliability to le-gitimate connections than other solutions like proba-bilistic filtering.

Another huge DoS issue is that incoming traffic is of-ten blocked even before it reaches the victim’s network,thus leaving the victim helpless to combat the DoS. Be-cause each hop on the network decreases the probabilitythat a legitimate packet with forged markings reachesthe server, by the time the packets reach the victim, theyhave already been significantly filtered, therefore allevi-ating a single router from having to bear the burden ofthe increased packet volume.

During a DoS, privileged SIFF flows are protected,since their bandwidth is reserved. The victim can choosewhether to renew or not an existing privileged flow, bychoosing whether it sends back the updated marking.For flows that attempt to be established after the DoS hasstarted, the client may need to resend its EXPLORERpacket until it hears back from the server. For example,even with a packet loss rate of 90%, the legitimate clientonly needs to send 10 packets to get one through. Af-ter the privileged flow is established, he is free from anycongestion.

For a router with x markings in its window, each of xbits, the probability that an attacker correctly generatesa random marking is:

P (x, z) = 1−(

1− 12z

)x

(5.2)

For a path of d routers, the probability of an attackercorrectly generating a random marking is

P (x, z, d) =[1−

(1− 1

2z

)x]d

(5.3)

For a typical path length of 15 routers, the probabil-ity that an attacker correctly guesses a marking becomesP (x = 2, z = 4, d = 15) = 10−13 which is negligible.

Figure 12 shows how effective SIFF is in defendingagainst DoS for different combinations of z and x. SIFFachieves this without significant per-flow state, and isthus a great way of providing capabilities to defend fromDDoS.

12

Page 13: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

0 10 20 30Hops from Victim

0

0.2

0.4

0.6

0.8

1R

atio

of T

otal

Atta

ck P

acke

tsNo FilteringFiltering (z=1, x=2)Filtering (z=2, x=2)Filtering (z=3, x=2)Filtering (z=4, x=2)Ideal Filtering

(a) Performance for various values of z, (x = 2).

0 10 20 30 40Hops from Victim

0

0.2

0.4

0.6

0.8

1

Rat

io o

f Tot

al A

ttack

Pac

kets

No FilteringFiltering (z=2, x=5)Filtering (z=2, x=4)Filtering (z=2, x=3)Filtering (z=2, x=2)Ideal Filtering

(b) Performance for various values of x, (z = 3).

Figure 4. Packet filtering performance for varying bits per router (z) and window sizes (x).

attackers at random from our map, and have them send anumber of packets with randomly forged capabilities. Thenumber of attackers and the number of packets per attackerdo not affect the outcome of our experiment, in which wesimply drop a certain percentage of attack packets at eachhop. We show the results of our experiments only from thef-root skitter monitor; which are the most pessimistic be-cause of the high concentration of paths close to the victimrelative to the other two monitors’ path distributions.Figure 4(a) shows the ratio of total attack traffic at each

hop from the victim for varying values of z. As expected,without filtering of any kind, eventually all attack pack-ets arrive at the victim. Furthermore, with ideal filtering(routers automatically drop all attack packets) we see acurve that matches the path distribution, since the attackers’packets are immediately dropped after one hop. The SIFFscheme performs excellently, filtering out 97.14% of the at-tack traffic using only a one bit marking per router (z=1),and filtering out 100% of the attack traffic (six nines) witha marking scheme of four bits per router (z=4).Figure 4(b) shows the same experiment with a constant

z and a varying x. As expected (and suggested by Table 1)the effect on filtering performance caused by varying z isfar greater than that caused by varying x. Furthermore, al-though not shown in the figure, it is intuitive that the effectof varying x decreases as z increases.

4.2 Privileged Channel Establishment Under Un-privileged Packet Flooding

As shown in the previous section, attacker flooding ofprivileged packets has little effect on the victim, because so

few of the forged packets reach destinations even close tothe victim’s network. In this section, we analyze a differentattack approach, which is to flood with unprivileged pack-ets for an extended period of time with the goal of stop-ping all new connection establishments. However, unlikethe current Internet infrastructure, in which established TCPflows can still be affected by IP packet floods, SIFF’s priv-ileged flows are unaffected by unprivileged traffic conges-tion. Thus, a client and server only need to exchange twopackets within minm time, the least amount of time that acapability is valid (defined in the previous section), beforethe privileged channel between them is established and theycan communicate from then on, unaffected by the ongoingattack.We assume that unprivileged traffic is causing conges-

tion at the last i hops of the network, and that the proba-bility of getting dropped at any one of those routers is !i.We ignore the probability of the server getting its outboundpackets dropped, because congestion in the routers duringflooding attacks is typically experienced by inbound packetsonly. Because the drop probabilities at routers are indepen-dent Bernoulli trials, the probability that a client and serverwill be able to establish a privileged channel after one try(by exchanging two packets is): P (connect after 1 try) =(1! !i)i.The probability that the client can connect after k tries

is:

P ( connect after k tries)= 1! (1! P (connect after 1 try))k

= 1! (1! (1 ! !i)i)k

For a given desired connection probability, P (connect)

9

Figure 12: Packet filtering performance for varying bits per router (z) and window sizes (x) [38]

5.7 PuzzlesPuzzles [34] are an excellent way for protecting againstresource starvation attacks (section 3.2) on servers forwhich a single request induces significant load. Theserver can require the client to perform a small com-putation (solve a puzzle) before the server performs theexpensive operation. While for the average client thisis a minor unnoticeable cost, for an attacker performingDoS, sending packets of adequate throughput is impos-sible unless the client has access to atypically huge re-sources, such as a mainframe.

The typical puzzle works as follows: the server calcu-lates the value T which is based on a MAC whose key isonly known to the victim.

T = MACK(ClientIP,ClientPort,T imestamp, ISN) (5.4)

The server conceals the first r bits of the value T andtransmits it to the server along with the hash of T. Theserver then computes on average 2r−1 hash operationsto find the r-bit value x needed to complete T.

S → C : [T ]m−r, r,H(T ) (5.5)C : H(x||[T ]m−r) = H(T ) (5.6)

C → S : T, r, x (5.7)

Puzzles allow the server to almost statelessly separatelegitimate flows from attack flows. Some care needs tobe taken so puzzles are not vulnerable to replay attacksand that the puzzle generation/verification does not turninto an attack vector in itself.

Without knowing the amount of processing poweravailable to the attacker, it is hard to set up the puzzle.Puzzle Auctions are a way to fix this problem, by havingeach host auction off the amount of computation they arewilling to make. This can be done by having each hostauction the number of bits r it is willing to search, withan average of 2r−1 ∗ 1µs computation required. Upona DoS, the server allocates connections or bandwidth tothe highest bidders, thereby maintaining fairness. This isdone by mapping the each bid to a priority in a priorityqueue or a weight in Weighted Fair Queuing.

Because in most botnets the resources of the zombiesare not significant, so as not to alert the user, zombieswill be unable to commit to high values of r and the le-gitimate hosts will win. If legitimate hosts see that thereis a DoS and they can’t reach the victim, they will sim-ply raise their bid until they outweigh the attack packets.Even if the attack packets attempt to compete with theuser, DoS will not be possible because the high compu-tational load will render the attacker unable to providethe high throughput necessary.

Puzzle Auctions have been shown to be fully compat-ible with TCP clients and even interoperable with un-modified TCP stacks. They can therefore be presentlydeployed without breaking network functionality.

5.8 SYN Cache and CookiesWhen machines receive a SYN packet, they need to storesome data in order to identify the connection and its op-tions (window size, ISN, TCP header options). Even ifthey ignore those details, they still have to be able toretransmit their SYN-ACK if an ACK is not receivedwithin a timeout period specified by the TCP standard.Unfortunately, this gives rise to SYN flooding attacks,explained in section 3.6.

13

Page 14: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

SYN Cache and Cookies [19] attempts to preventDDoS during a SYN flood. During average load times,the server maintains a SYN cache, where each SYNpacket is stored in a global hash table with fast lookuptimes O(1). The hash value used is of the form

H = MACK(SrcIP, DstIP, SrcPrt, DstPrt) (5.8)

where the value K is know only to the server. This isdone to prevent Algorithmic Complexity Attacks (sec-tion 3.3). The hash table has an upper bound on the num-ber of entries in the table and hence amount of memoryused. If new entries overflow the per-bucket limit, theoldest packets are dropped first.

When SYN flooding is experienced, the serverswitches from using a SYN cache to SYN cookies. InSYN cookies the server statelessly encodes necessaryinformation in the ISN of its SYN-ACK, so that whenit receives it back with an ACK it can pick up the con-nection from where it was left. This requires no storageon the server during the connection establishment phase.

In [19], the client’s MSS size is downsampled to4 predefined values, denoted by M ′. Kt is a time-dependent secret know only to the server. The 32-bitcookie C is then generated as follows:

H = H(SrcIP,DstIP, SrcPrt,DstPrt,Kt) (5.9)c1 = ([H]24..0 || [WindowIndex]7..0) (5.10)c2 = SrcISN (5.11)c3 = M ′ << 5 (5.12)C = c1 ⊕ c2 ⊕ c3 (5.13)

When the server receives back the ACK from theserver, he first subtracts 1 from the ISN and XORs itwith the cookie. He then XORs the TCP options to ob-tain c1. He verifies the hash and that the key Kt has notexpired and grants the connection to the client.

Because the DDoS attacker cannot hear the repliesfrom the server directed to the spoofed address he sentthe SYN from, the attacker will not be able to initiate aconnection using an ACK. Since SYN cookies are state-less and the server never receives an ACK back, the SYNflooding attack is completely averted. In order to guessthe correct cookie and cause the server to reserve a con-nection, the attacker needs to perform an average of 223

hashes, each of 1µs, which will take about 8 seconds todo. Since 8 seconds is longer than the timeout period forKt, the attacker simply cannot perform a DDoS using aSYN flood.

Figure 13 shows the performance of SYN cache andSYN cookies during an attack. Most connections arecompleted within 300µs. A simple comparison with

0

10

20

30

40

50

60

70

80

90

100

0 200 400 600 800 1000

% o

f con

nect

ions

com

plet

ed

microseconds

Time needed to connect() to remote system

syncache & syncookies, idle boxsyncache, SYN flood

syncache & syncookies, SYN flood

Figure 4: Performance comparison of a system with syncache and syncookies over one using only syncache.

6.2 Round trip performancePrior measurements were taken by timing how long ittakes for a connect() call to complete on the client ma-chine. This corresponds to the time required to complete2 stages of a TCP handshake, since the client machineenters the ESTABLISHED state as soon as it receives aSYN,ACK. An unanswered question is how long it takesthe server to enter the ESTABLISHED state, from thetime the initial SYN is sent from the client. This timemay be affected by the different processing requirementsto verify the ACK, and may fail if the original syncacherecord no longer exists.To verify failure was not a concern, the experimental

setup was modified to include the time required to read()a byte from the server, which can be viewed as a 4 wayhandshake: transmit SYN, receive SYN,ACK, transmitACK, receive data. The results for this test are presentedin Figure 6.On an unloaded box, there is no measurable differ-

ence in performance between the syncache and syncook-ies approaches. However, when the box is loaded, thecombination of syncache and syncookies outperforms apure syncache configuration. Again, as there are no TCPretransmits occurring, the performance difference is notdue to entries getting dropped from the syncache hashbuckets. This also indicates that the bucket depth of 30entries that is used in these tests is sufficient to handlethe RTT across the local LAN; connections are gettingestablished before they are dropped.

The difference between the two algorithms could beexplained by the difference in ISS generation, or by thefact that the standalone syncache needs to perform FIFOdrop for a bucket, which is bypassed when syncookiesare in use. However, it is not expected that the list man-agement requirements, which consist of few TAILQ *calls, would be significant. The investigation into theperformance difference is still ongoing.In comparison to the unmodified system presented in

Figure 2, there is a dramatic improvement. In this ex-periment, clients were able to connect to the server andperform useful work (reading one byte), with all attemptscompleting within 1 second. In the unmodified system,90% of the connections still had not completed the TCPhandshake after 1 second. Even with reduced queuedepths, the performance of the unmodified system doesnot match the new code.

7 Previous Work

David Borman wrote a patch for BSDi which imple-mented a SYN cache in October 1996, which was re-leased as an official BSDi patch [2]. This implementa-tion used the cache only as a fallback mechanism in casethe listen queue overflowed, and did not retransmit theSYN,ACK to the peer. The justification given was thatsince the host was under attack, performing retransmitswould be a waste of CPU time [3].This code was incorporated into NetBSD[5] in May

Figure 13: SYNCache and SYNCookies performanceduring SYN Flood. Compare with figure 3. [19]

Figure 3 in section 3.6 (which shows us the vast deterio-ration experienced by a system under SYN flood) provesthe effectiveness of this measure against SYN flooding.

6 DeterrenceThe reason DoS is still a debilitating attack after so manyyears is that the current Internet has failed to provideDeterrents to attackers. This is not because of unwill-ingness or lack of incentives, but because of its design,which permits anyone on the Internet to send packets toanyone, with any source IP and without being detected.

One way countries can provide deterrence to DoS isthrough legal deterrents, i.e. imposing large fines on at-tackers that perform DoS. In the US for instance, theComputer Fraud and Abuse Act includes penalties thatinclude years of imprisonment. In the UK the Police andJustice Act of 2006 specifically outlaws DoS attacks andoffers 10-year maximum imprisonment. Other countrieshave similar legislation.

There are three primary problems with legal deter-rents. First, because of the global trust-all structure ofthe Internet, attacks can always be launched by coun-tries that do not provide a legal framework against DoSattackers. Second, because attackers typically use zom-bies that are spread across the world and are infectedwithout the knowledge of the owners, the host perform-ing the DoS is usually not the attacker himself. Thirdand most importantly, the Internet does not provide Ac-countability; since any host can spoof its IP address andsend packets, there is no way of holding someone ac-countable for a packet or tracing an attack back to itsorigin. Even when DoS attackers are caught, they arenever caught by tracing the packets back to the attacker,but through other forensic residue that the attacker leftbehind.

In order to provide credible deterrents to DoS attacks,we have to provide accountability, specifically Host Ac-

14

Page 15: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

countability. This allows us to hold a host accountablefor its actions and reliably trace back a packet to its orig-inal sender. By providing accountability, we allow vic-tims of DoS attacks to legally prosecute their attackersand demand compensation for the attack.

In the following sections we outline accountabilityprotocols for both the current Internet and a clean-slateInternet.

6.1 Accountability for the Current Inter-net

Because the current Internet does not offer accountabil-ity by design, adding accountability incrementally is adirty procedure that introduces significant overhead. Be-low, we present two protocols that attempt to do so.

Accountability as a Service Accountability as a Ser-vice (AAaS) [8] offers accountability as a first-classnetwork service, separate from routing and addressing.This creates a new economic and legal model, wherethe client subscribes to an accountability service that re-quires an escrowed deposit.

The accountability service provides authenticatedclients with identifiers that can be used to mark pack-ets as accountable. A victim that is experiencing DoScan report that abuse is taking place from one of the ac-countability provider’s clients by sending it a messagewith the traffic identifier of the unwanted data. The ac-countability provider can then choose to filter the portionof its client’s traffic that matches that specific identifier.If the abuse is serious enough that it requires legal ac-tion, the accountability service may release the client’sidentity and use some his escrowed deposit to pay backthe victim.

The authors propose two simple approaches for pro-viding AAaS in the current Internet.

In the strawman protocol, the accountability providergives its customers signed client certificates. Eachsender then needs to sign each packet with its privatekey. The receiver retrieves the accountability provider’scertificate to verify that the client’s certificate is signedby a provider it trusts, and then verifies that the signatureof the packet is correct. The strawman protocol providesnon-repudiation, because only the host that signed thepacket has the private key that corresponds to it. How-ever, the use of public key cryptography for every packetis a significant performance impediment.

For this reason, the authors propose a more efficientprotocol, in which each ISP and client has a certificatewith a Diffie-Hellman public key. This allows the senderto generate pairwise keys between him and every routeralong the path, and only use symmetric cryptography forauthenticating subsequent transmissions along the samepath. To do this, the sender S first determines the ISPs

along the path. It then uses the Diffie-Hellman calcula-tion to derive pairwise keys. It uses those keys for com-puting MACs of the packet for each of the ISPs, andthen appends its certificate as well as each MAC to thepacket.

S :{S, gs}K−1CA

(6.1)

ISPi :{i, gi}K−1CA∀i (6.2)

S :Ki = gsi ∀i (6.3)S → ISP1 :pkt, {S, gs}K−1

CA,MACK1(pkt),

MACKi(pkt) ∀i (6.4)

ISPi :check cert, MACKi(6.5)

Each router along the way verifies the certificate andMAC and forwards the packet. We have therefore estab-lished an accountability chain in a hop-by-hop fashion.In case of a DoS, the receiver can know exactly who sentthe packet and request filtering or even legal action.

In terms of deployment, it is important to mention thatAAaS does not require adoption by a significant portionof clients to provide a benefit to the users. Instead, it pro-vides instant gratification. During a DoS, the victim canremain resilient by guaranteeing bandwidth for clientsusing AAaS and providing best-effort service for otherflows.

Bootstrapping Accountability In [20] the authorspropose IP made Accountable (IPA). The design usesDNSSEC to securely bind an IP Prefix to an ASs pub-lic key. At the AS level, certificates are distributed in-band, through BGP. In order to decrease the overheadfrom sending the public key to each of the neighbors ev-ery time a BGP update is made, IPA uses three caches.The cached certificates are stored in a tree, where theroot is ICANN. Every new certificate or revocation hasto validate in that tree, or it is discarded. With thesesecure prefix-to-key bindings, we can now have senderaccountability for whoever is sending the packets.

IPA uses reverse DNSSEC queries to obtain publickeys for each source network address range on the Inter-net. The hierarchical structure of DNS works really wellin this case, as the parent network provides the child withsub-delegation certificates. In case network prefixes donot fall under CIDR bounds, a small modification toDNS queries is proposed that includes the subnet maskfor the network, e.g. 2/24.10.10.in-addr.arpa.

When a router in a network sends a packet, he signs itwith its private key. The receiving router performs a re-verse DNSSEC query and obtains the public key, signedby the sender’s AS. The receiver retrieves the AS’s keyand verifies the signature. It then verifies the signatureon the data. If the signatures verify, the packet is valid.

15

Page 16: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

During a DoS, the victim can ask all senders to signtheir packets if they want to connect. In order for theattacker to connect, he would have to reveal his identityand therefore face legal consequences.

In terms of deployment, IPA does not replace currentfunctionality and is affordable to deploy, since a big per-centage of the infrastructure is already in place. In addi-tion, because IPA can be incrementally deployed, earlyadopters get a lot more value for adopting. However, aswith any approach to add accountability to the currentInternet, it increases overhead. There are also some racecondition timing issues that can lead to a deadlock dur-ing revocation/key rollover.

6.2 Deterrence in the Next Generation In-ternet

We have seen how the structure of the current Internetdoes not permit accountability to be established in anattractive manner. For this reason, researchers have de-signed protocols to offer accountability using a clean-slate approach. Accountability has to be an integral partof the Next Generation Internet. Below, we present twoprotocols that offer accountability and deterrence in aclean-slate approach.

AIP The Accountable Internet Protocol (AIP) [3] pro-vides a means of replacing the IP protocol to ensure ac-countability in a network, enabling us to detect and tracethe source of spoofing and DoS, bootstrap identity intonew protocols and provide additional security to exist-ing protocols. AIP introduces two main security fixesfor attacks that the current structure of the IP protocol isunable to not only prevent, but even detect: first, a pro-cedure for verifying identities of packets, and second,a shut-off protocol that relies on trusted network cards’ability to halt packet transmissions and establish filtersat the source of DoS attacks.

In AIP, each host and AS has a public key certificate.The address of each end host is the hash of its publickey, or the Endpoint Identifier (EID). Likewise, the ad-dress of each AS is the hash of its public key, or theAutonomous Domain Identifier (AID) (note that AID isused here to avoid confusion, the actual paper calls thisthe AD: Autonomous Domain). By using the public keysthemselves as the network addresses, AIP avoids the useof a PKI to bind an address to a public key and tremen-dously simplifies the task of providing accountability.

Verification of a packet’s source address is done on aper-hop basis by using unicast reverse path forwarding.Each router along a path sends a verification packet tothe source. The source is required to respond by provid-ing a signature of the packet:

Ri → S :V = HMACKt(SAID : SEID,

DAID : DEID, interface) (6.6)S → Ri : {KEID, V }K−1

EID(6.7)

Because only the legitimate source has the public key,the identity of the source is established and cached bythe router. Spoofing is not possible.

AIP also provides a shutoff protocol, through whicha victim V of a DDoS attack can request the zombieZ’s NIC to filter the packets directed to it. The shutoffprotocol is of the form:

Z → V : Packet P (6.8)V → Z : {KV , TTL,Hash(P )}K−1

V(6.9)

Since the victim signs it with its private key, it cannotbe spoofed. Since the victim has to show proof that itreceived a packet from that host (through H(P )), it pre-vents abuse by attackers that would use the shutoff itselfto launch a DoS. In order for this to work, the zombie’sNIC has to be untamperable, so that the zombie cannotignore the shutoff packets without facing legal conse-quences.

AIP is a great way to provide accountability in aclean-slate approach without the use of a complicatedPKI. A victim can reliably detect an attacker. In ad-dition, through the shutoff protocol he can choose in-struct the attacker to stop sending attack packets duringa DDoS. As a result, AIP provides a credible deterrentfor DDoS attacks.

SCION SCION [39] is a proposal for a new Internetarchitecture designed to provide, among others, explicittrust and accountability. SCION separates the Internetinto a set of independent trust domains (TDs). A TDwill most probably be a country or an alliance of coun-tries. Each TD is completely independent from the otherTDs, and actions in one TD cannot affect in another TD.Each TD has its own legal framework, which it strictlyenforces. Therefore, SCION reduces the number of trustchecks that a network needs to perform to a small num-ber of trusted domains, instead of each node having topairwise evaluate the other.

As shown in Figure 14, each TD is comprised of mul-tiple Autonomous Domains (ADs) which are the equiva-lent of today’s ASs. ADs that have peerings with ADs inother TDs are called Core ADs. The set of all Core ADsfor a TD makes up its TD Core. Inter-TD routing hap-pens using established manually-configured paths be-tween the TD Cores, since there is only a handful ofTDs. Intra-TD routing happens using construction bea-cons, where the TD Core regularly polls each AD for a

16

Page 17: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

3

AS PATH {C, M, A, E} is active to reach destination E. Sup-pose that A withdraws the path {A, E}, but the malicious ASM intentionally suppresses this withdrawal message from C.Consequently, the same AS PATH {C, M, A, E} still remainsactive, because in path vector routing B only withdraws a path{B, A, E} instead of a specific link, which does not invalidatethe path through M .

III. SCION OVERVIEW

SCION has three grounding principles: domain-based iso-lation, mutually controllable path selection by both the end-points and intermediate ISPs, and explicit trust for end-to-end communication, as Section III-A details. These principlesprovide a framework within which SCION achieves resilienceto routing attacks. The rest of the section provides an overviewof the SCION architecture.

A. Design Principles

Principle 1: Domain-based isolation – Dividing the routingcontrol plane into independent domains. Isolation amongindependent domains protects routing in one domain frommalicious activities and routing churn in other domains. Thisbenefits both security and scalability while retaining reacha-bility and path diversity across domains. For example, SCIONenables frequent routing updates to periodically refresh pathstate, so that each AD always maintains a fresh (and accurate)network topology for efficient routing decisions.Principle 2: Mutually controllable path selection – Jointpath selection between source and destination. SCIONgreatly increases both the source and destination’s ability toaffect, select and control the construction of the routes toand from themselves, while still respecting intermediate ISPs’routing policies.Principle 3: Explicit trust and small TCB for end-to-endcommunication. By segregating mutually distrustful entitiesinto different trust domains, each trust domain can choose acoherent root of trust (e.g., a few tier-1 ISPs) for bootstrappingtrust among ADs in the same trust domain. As a result, anendpoint E knows and is able to choose explicitly whom totrust for achieving reliable end-to-end communication, whileuntrusted ADs in other trust domains cannot affect the pathdiscovery and route computation of E. Consequently, an entityonly has to trust a small subset of the network thus achievinga small TCB for end-to-end communication.

B. Hierarchical Decomposition

Our architecture defines the Autonomous Domain (AD) asthe atomic failure unit, representing both ISPs (or transit ADs)and endpoint ADs. Large ISPs would be split into multipleADs, based on their topology of separately administered do-mains. SCION divides the ADs in the Internet into a hierarchyof trust domains, or TDs, as shown in Figure 2, used toprovide the domain-based isolation property. A TD is a setof ADs that agree on a coherent root of trust and have mutualaccountability and enforceability for route computation undera common regulatory framework.

AD 1

Shortcutfrom 1 to 2

TD(e.g., EU)

Sub TD(e.g., PA)

TD(e.g., US)

TD Core(e.g., EU Core)

Inter TDRoute

from 3 to 2

TD Core(e.g., US Core)

AD 2

AD 3(e.g., CMU)

AD 4(e.g., PSC)

Fig. 2. Trust domain architecture. Black nodes are ADs in the TDCore. Arrows indicate customer-provider relationships. Dashed lines indicatepeering relationships.

Each TD has a TD Core, a set of designated ADs forming amutually reachable clique that interfaces with other TDs. ADsin the TD core naturally serve as the egress/ingress ADs ofthe corresponding TD. In the current Internet, the top-tier ISPswould constitute the TD Core.

We envision the effort to establish a TD to closely mirrorthat of starting a certification authority, and the number of top-level TDs to be limited (e.g., up to a few hundred) which mapto real-world political or cultural groups. Section IV presentsa detailed description of a TD.

C. Routing, Lookup, and Forwarding

All ADs in SCION know a set of paths to reach the TDCore in their trust domain for establishing communication withother endpoint ADs. Specifically, for an AD N that is not inthe TD Core, we call the paths for sending packets from N tothe TD Core up-paths of N , and the paths for sending packetsfrom the TD Core to N the down-paths of N , which are notnecessarily different from the up-paths.

The down-paths of each AD N are available to other ADsvia a lookup service, and are used by other ADs to reach N .To communicate with a destination AD D in another TD, thesource AD S selects a subset of its up-paths to reach the top-level TD containing S for sending data to D, and can pick anindependent subset of the down-paths for receiving data fromD. In this way, ADs retain control over the paths for bothoutgoing and incoming data within their own TDs.

In the following, we first sketch routing, name lookup,and forwarding between two endpoint ADs in the same TD,and then briefly explain how cross-domain communication isenabled.

Path construction. In SCION, ADs use a set of up-paths/down-paths to send/receive packets to/from the TDCores. We generally refer to these up-/down-paths as paths,which are constructed similar to path vector as follows. TheADs in the TD Core first transmit one-hop paths starting from

Figure 14: SCION trust domain architecture. Dashedlines indicate peering relationships while arrows indi-cate customer-provider relationships [39]

set of k up-paths it would like to use to connect to theTD Core. Since these construction beacons are authen-ticated, SCION prevents the control plane DoS attacksthat were demonstrated in section 3.9, such as Prefix Hi-jacking, Routing Path Falsification and Blackholing. Inaddition, since each destination announces its up-paths,it gets to choose what paths incoming traffic will take,and thus route around failures and congestion, prevent-ing DoS. As opposed to the current Internet where thesender has all the control, in SCION the control is on thereceiver, as it ought to be.

Isolation between TDs allows enforcement of filtersand restrictions on inter-TD traffic that prevents DDoS.An attacker can only attack nodes in his own TD andonly at the data plane. But because each TD is a legallycoherent domain, it can credibly enforce legal deterrentsand pursue attackers. In addition, SCION uses AIP, soevery attack can be traced back to its sender.

SCION provides not only an accountability systemthrough its use of AIP, but also transforms the Internetarchitecture in such a way that a host is not vulnerableto attack by a host in a legal system it cannot control.For these reasons, attackers will be reluctant to performDDoS, solving the problem once and for all.

7 ConclusionWe have started this discussion by looking at the typesof DoS attacks currently available. DoS attacks are ofvarying sophistication and may have devastating conse-quences on the victims, such as monetary loss. For DataPlane DoS, we looked at Bandwidth, Resource Starva-tion, Algorithmic Complexity, TCP and ICMP as vectorsfor this attack. We also provided Reflector and Smurf as

methods for performing the DoS. We also looked at Con-trol Plane attacks as ways to target the Control Plane andthe Data Plane for DoS.

As a result of poor design, the current Internet doesnot offer DoS victims Detection, Defense, Resilienceor Deterrence when attacks take place. For this rea-son, we looked at a number of commercial and researchapproaches for each. In Detection, we looked at IDS,Traffic Measurement and Accounting, and Inference asmethods of detecting DoS. In Defense and Resilience,we looked at Ingress Filtering, Pushback, Over Provi-sioning, IPS, CAPTCHAs, Capabilities, Puzzles, andSYN Cache and Cookies. In Deterrence, we looked attwo incremental mechanisms for accountability in thecurrent Internet, Accountability As a Service and Boot-strapping Accountability in the Internet We Have. Wealso looked at two clean-slate approaches for account-ability, AIP and SCION.

We conclude by emphasizing that the only way tocompletely eliminate DoS in an efficient and effectivemanner is by adopting a clean-slate approach for the fu-ture Internet.

References[1] Z. Ahmad, J.L. Ab Manan, and S. Sulaiman,

Trusted Computing based open environment userauthentication model, Advanced Computer Theoryand Engineering (ICACTE), 2010 3rd InternationalConference on, vol. 6, IEEE, pp. V6–487.

[2] D. Andersen, H. Balakrishnan, N. Feamster, T. Ko-ponen, D. Moon, and S. Shenker, Holding the In-ternet accountable, ACM HotNets-VI (2007).

[3] David G. Andersen, Hari Balakrishnan, NickFeamster, Teemu Koponen, Daekyeong Moon,and Scott Shenker, Accountable Internet Protocol(AIP), Proc. ACM SIGCOMM (Seattle, WA), Au-gust 2008.

[4] S. Axelsson, The base-rate fallacy and the dif-ficulty of intrusion detection, ACM Transactionson Information and System Security (TISSEC) 3(2000), no. 3, 186–205.

[5] BBC, Anonymous wikileaks supporters explainweb attacks, http://www.bbc.co.uk/news/technology-11971259.

[6] S.M. Bellovin, Security problems in the TCP/IPprotocol suite, ACM SIGCOMM Computer Com-munication Review 19 (1989), no. 2, 32–48.

[7] S.M. Bellovin, D.D. Clark, A. Perrig, and D. Song,A clean-slate design for the next-generation secureinternet, Relatorio tecnico, Pittsburgh, PA: Reportfor NSF Global Environment for Network Innova-tions (GENI) Workshop, 2005.

17

Page 18: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

[8] A. Bender, N. Spring, D. Levin, and B. Bhattachar-jee, Accountability as a service, Proceedings of the3rd USENIX workshop on Steps to reducing un-wanted traffic on the internet, USENIX Associa-tion, 2007, p. 5.

[9] R. Chen, T. Giedgowd, and L. Greaves, Under-standing the Threat of Denial-of-Service Attacks inSociety Today.

[10] CNET, How pakistan knocked youtube of-fline, http://news.cnet.com/8301-10784_3-9878655-7.html.

[11] , News sites swamped following michaeljackson’s death, http://news.cnet.com/8301-1023_3-10273325-93.html.

[12] S.A. Crosby and D.S. Wallach, Denial of ser-vice via algorithmic complexity attacks, Proceed-ings of the 12th conference on USENIX Secu-rity Symposium-Volume 12, USENIX Associa-tion, 2003, pp. 3–3.

[13] C. Estan and G. Varghese, New directions in trafficmeasurement and accounting, ACM SIGCOMMComputer Communication Review, vol. 32, ACM,2002, pp. 323–336.

[14] C. Farkas, G. Ziegler, A. Meretei, and A. L”orincz, Anonymity and accountability in self-organizing electronic communities, Proceedings ofthe 2002 ACM workshop on Privacy in the Elec-tronic Society, ACM, 2002, pp. 81–90.

[15] L. Garber, Denial-of-service attacks rip the Inter-net, IEEE Computer 33 (2000), no. 4, 12–17.

[16] M. Handley, V. Paxson, and C. Kreibich, Net-work intrusion detection: Evasion, traffic normal-ization, and end-to-end protocol semantics, Pro-ceedings of the 10th conference on USENIX Se-curity Symposium-Volume 10, USENIX Associa-tion, 2001, pp. 9–9.

[17] The Tech Herald, Themis: Looking at theaftermath of the hbgary federal scandal,http://www.thetechherald.com/article.php/201112/6951/Themis-Looking-at-the-aftermath-of-the-HBGary-Federal-scandal.

[18] J. Ioannidis and S.M. Bellovin, Implementingpushback: Router-based defense against DDoSattacks, Proceedings of Network and DistributedSystem Security Symposium, vol. 2, Citeseer,2002.

[19] J. Lemon et al., Resisting SYN flood DoS attackswith a SYN cache, Proceedings of the BSDCon,2002, pp. 89–97.

[20] A. Li, X. Liu, and X. Yang, Bootstrapping Ac-countability in the Internet We Have.

[21] R. Mahajan, S.M. Bellovin, S. Floyd, J. Ioan-nidis, V. Paxson, and S. Shenker, Controllinghigh bandwidth aggregates in the network, ACMSIGCOMM Computer Communication Review 32(2002), no. 3, 62–73.

[22] A. McCue, Bookie reveals $100,000 cost ofdenial-of-service extortion attacks, http://www.silicon.com/technology/security/2004/06/11/bookie-reveals-100000-cost-of-denial-of-service-extortion-attacks-39121278/.

[23] J. McHugh, Intrusion and intrusion detection,International Journal of Information Security 1(2001), no. 1, 14–35.

[24] J. Mirkovic and P. Reiher, Building accountabilityinto the future Internet, Secure Network Protocols,2008. NPSec 2008. 4th Workshop on, IEEE, 2008,pp. 45–51.

[25] D. Moore, G.M. Voelker, and S. Savage, Infer-ring internet denial-of-service activity, Proceed-ings of the 10th conference on USENIX Secu-rity Symposium-Volume 10, USENIX Associa-tion, 2001, pp. 2–2.

[26] Netcraft, Operation payback aborts at-tack against amazon.com, http://news.netcraft.com/archives/2010/12/09/operation-payback-aborts-attack-against-amazon-com.html.

[27] O. Nordstrom and C. Dovrolis, Beware of BGP at-tacks, ACM SIGCOMM Computer Communica-tion Review 34 (2004), no. 2, 1–8.

[28] T.H. Ptacek, Insertion, evasion, and denial of ser-vice: Eluding network intrusion detection, Tech.report, SECURE NETWORKS INC CALGARYALBERTA, 1998.

[29] S. Savage, N. Cardwell, D. Wetherall, and T. An-derson, TCP congestion control with a misbehav-ing receiver, ACM SIGCOMM Computer Com-munication Review 29 (1999), no. 5, 71–78.

[30] M. Schuchard, A. Mohaisen, D. Foo Kune,N. Hopper, Y. Kim, and E.Y. Vasserman, Losingcontrol of the internet: using the data plane toattack the control plane, Proceedings of the 17thACM conference on Computer and communica-tions security, ACM, 2010, pp. 726–728.

[31] A. Studer and A. Perrig, The coremelt attack, Com-puter Security–ESORICS 2009 (2010), 37–52.

18

Page 19: DoS and DDoS Detection, Defense and Deterrence · DDoS When DoS happens in a distributed fashion, i.e. multiple hosts sending attack packets at the same time, it is called a DDoS

[32] Beardsley T. and J. Qian, The tcp split handshake:Practical effects on modern network equip-ment, http://nmap.org/misc/split-handshake.pdf.

[33] L. Von Ahn, M. Blum, N. Hopper, and J. Lang-ford, CAPTCHA: Using hard AI problems for secu-rity, Advances in CryptologyEUROCRYPT 2003(2003), 646–646.

[34] X.F. Wang and M.K. Reiter, Defending againstdenial-of-service attacks with puzzle auctions,(2003).

[35] P. Watson, Slipping in the Window: TCP Reset at-tacks, 2004.

[36] WSJ, What does it cost amazon.com to go dark?,http://blogs.wsj.com/numbersguy/what-does-it-cost-amazoncom-to-go-dark-374/.

[37] Y. Xiao, Accountability for wireless LANs, ad hocnetworks, and wireless mesh networks, Communi-cations Magazine, IEEE 46 (2008), no. 4, 116–126.

[38] A. Yaar, A. Perrig, and D. Song, SIFF: A state-less Internet flow filter to mitigate DDoS floodingattacks, (2004).

[39] X. Zhang, H.C. Hsiao, G. Hasker, H. Chan, A. Per-rig, and D.G. Andersen, SCION: Scalability, Con-trol, and Isolation On Next-Generation Networks.

19