13
A Distributed Throttling Approach for Handling High Bandwidth Aggregates Chee Wei Tan, Student Member, IEEE, Dah-Ming Chiu, Senior Member, IEEE, John C.S. Lui, Senior Member, IEEE, and David K.Y. Yau, Member, IEEE Abstract—Public-access networks need to handle persistent congestion and overload caused by high bandwidth aggregates that may occur during times of flooding-based DDoS attacks or flash crowds. The often unpredictable nature of these two activities can severely degrade server performance. Legitimate user requests also suffer considerably when traffic from many different sources aggregates inside the network and causes congestion. This paper studies a family of algorithms that “proactively” protect a server from overload by installing rate throttles in a set of upstream routers. Based on an optimal control setting, we propose algorithms that achieve throttling in a distributed and fair manner by taking important performance metrics into consideration, such as minimizing overall load variations. Using ns-2 simulations, we show that our proposed algorithms 1) are highly adaptive by avoiding unnecessary parameter configuration, 2) provide max-min fairness for any number of throttling routers, 3) respond very quickly to network changes, 4) are extremely robust against extrinsic factors beyond the system control, and 5) are stable under given delay bounds. Index Terms—Resource management, DDoS attacks, network security. Ç 1 INTRODUCTION T HIS paper proposes a family of distributed throttling algorithms for handling high bandwidth aggregates in the Internet. High bandwidth aggregates, as well as persistent congestion, can occur in the current Internet even when links are appropriately provisioned, and regardless of whether flows use conformant end-to-end congestion control or not. Two prime examples of high bandwidth aggregates that have received a lot of recent attention are distributed denial-of-service (DDoS) attacks and flash crowds [7]. In a DDoS attack, attackers send a large volume of traffic from different end hosts and direct the traffic to overwhelm the resources at a particular link or server. This high volume of traffic will significantly degrade the performance of legitimate users who wish to gain access to the congested resource. Flash crowds, on the other hand, occur when a large number of users concurrently try to obtain information from the same server. In this case, the heavy traffic may overwhelm the server and overload nearby network links. An example of flash crowds occurred after the 9/11 incident [7], when many users tried to obtain the latest news from the CNN Web site. Note that network congestion caused by high bandwidth aggregates cannot be controlled by conventional flow-based protection mechanisms. This is because these traffic aggregates may originate from many different end hosts, each traffic source possibly of low bandwidth. One approach to handle high bandwidth aggregates is to treat it as a resource management problem and protect the resource proactively (i.e., ahead of the congestion points) from overload. Consider a network server, say S, experiencing a DDoS attack. To protect itself from overload, S can install a router throttle at a set of upstream routers that are several hops away. The throttle specifies the maximum rate (in bits/s) at which packets destined for S can be forwarded by each router. An installed throttle will change the load experi- enced at S, which then adjusts the throttle rate in multiple rounds of feedback control until its load target is achieved. In [22], the authors proposed a throttle algorithm known as the additive-increase-multiplicative-decrease (AIMD) algo- rithm, and showed how such server-based adaptation can effectively handle DDoS attacks, while guaranteeing fair- ness between the deployment routers. Although the AIMD throttle algorithm in [22] is shown to work under various attack scenarios, it also raises several important issues that need to be addressed, for example, the proper parameter setting so as to guarantee stability and fast convergence to the steady state. Addressing these issues, this paper leverages control theory to design a class of fair throttle algorithms. Our goal is to derive a throttle algorithm that is 1. highly adaptive (by avoiding unnecessary parameters that would require configuration), 2. able to converge quickly to the fair allocation, 3. highly robust against extrinsic factors beyond the system’s control (e.g., dynamic input traffic and number/locations of current attackers), and 4. stable under given delay bounds. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 18, NO. 7, JULY 2007 983 . C.W. Tan is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544. E-mail: [email protected]. . D.-M. Chiu is with the Department of Information Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong. E-mail: [email protected]. . J.C.S. Lui is with the Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong. E-mail: [email protected]. . D.K.Y. Yau is with the Department of Computer Science, Purdue University, 250 N. University St., West Lafayette, IN 47907-2066. E-mail: [email protected]. Manuscript received 1 May 2006; revised 5 Aug. 2006; accepted 10 Aug. 2006; published online 8 Jan. 2007. Recommended for acceptance by Y. Pan. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TPDS-0106-0506. Digital Object Identifier no. 10.1109/TPDS.2007.1034. 1045-9219/07/$25.00 ß 2007 IEEE Published by the IEEE Computer Society

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

A Distributed Throttling Approach forHandling High Bandwidth AggregatesChee Wei Tan, Student Member, IEEE, Dah-Ming Chiu, Senior Member, IEEE,

John C.S. Lui, Senior Member, IEEE, and David K.Y. Yau, Member, IEEE

Abstract—Public-access networks need to handle persistent congestion and overload caused by high bandwidth aggregates that may

occur during times of flooding-based DDoS attacks or flash crowds. The often unpredictable nature of these two activities can severely

degrade server performance. Legitimate user requests also suffer considerably when traffic from many different sources aggregates

inside the network and causes congestion. This paper studies a family of algorithms that “proactively” protect a server from overload by

installing rate throttles in a set of upstream routers. Based on an optimal control setting, we propose algorithms that achieve throttling in a

distributed and fair manner by taking important performance metrics into consideration, such as minimizing overall load variations. Using

ns-2 simulations, we show that our proposed algorithms 1) are highly adaptive by avoiding unnecessary parameter configuration,

2) provide max-min fairness for any number of throttling routers, 3) respond very quickly to network changes, 4) are extremely robust

against extrinsic factors beyond the system control, and 5) are stable under given delay bounds.

Index Terms—Resource management, DDoS attacks, network security.

Ç

1 INTRODUCTION

THIS paper proposes a family of distributed throttling

algorithms for handling high bandwidth aggregates in

the Internet. High bandwidth aggregates, as well as persistent

congestion, can occur in the current Internet even when links

are appropriately provisioned, and regardless of whether

flows use conformant end-to-end congestion control or not.Two prime examples of high bandwidth aggregates that have

received a lot of recent attention are distributed denial-of-service

(DDoS) attacks and flash crowds [7].In a DDoS attack, attackers send a large volume of traffic

from different end hosts and direct the traffic to overwhelm

the resources at a particular link or server. This high volume

of traffic will significantly degrade the performance of

legitimate users who wish to gain access to the congested

resource. Flash crowds, on the other hand, occur when a large

number of users concurrently try to obtain information from

the same server. In this case, the heavy traffic may overwhelm

the server and overload nearby network links. An example of

flash crowds occurred after the 9/11 incident [7], when many

users tried to obtain the latest news from the CNN Web site.

Note that network congestion caused by high bandwidthaggregates cannot be controlled by conventional flow-basedprotection mechanisms. This is because these trafficaggregates may originate from many different end hosts,each traffic source possibly of low bandwidth. Oneapproach to handle high bandwidth aggregates is to treatit as a resource management problem and protect theresource proactively (i.e., ahead of the congestion points)from overload.

Consider a network server, say S, experiencing a DDoSattack. To protect itself from overload, S can install a routerthrottle at a set of upstream routers that are several hopsaway. The throttle specifies the maximum rate (in bits/s) atwhich packets destined for S can be forwarded by eachrouter. An installed throttle will change the load experi-enced at S, which then adjusts the throttle rate in multiplerounds of feedback control until its load target is achieved.In [22], the authors proposed a throttle algorithm known asthe additive-increase-multiplicative-decrease (AIMD) algo-rithm, and showed how such server-based adaptation caneffectively handle DDoS attacks, while guaranteeing fair-ness between the deployment routers.

Although the AIMD throttle algorithm in [22] is shownto work under various attack scenarios, it also raises severalimportant issues that need to be addressed, for example, theproper parameter setting so as to guarantee stability andfast convergence to the steady state. Addressing theseissues, this paper leverages control theory to design a classof fair throttle algorithms. Our goal is to derive a throttlealgorithm that is

1. highly adaptive (by avoiding unnecessary parametersthat would require configuration),

2. able to converge quickly to the fair allocation,3. highly robust against extrinsic factors beyond the

system’s control (e.g., dynamic input traffic andnumber/locations of current attackers), and

4. stable under given delay bounds.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 18, NO. 7, JULY 2007 983

. C.W. Tan is with the Department of Electrical Engineering, PrincetonUniversity, Princeton, NJ 08544. E-mail: [email protected].

. D.-M. Chiu is with the Department of Information Engineering, TheChinese University of Hong Kong, Shatin, New Territories, Hong Kong.E-mail: [email protected].

. J.C.S. Lui is with the Department of Computer Science and Engineering,The Chinese University of Hong Kong, Shatin, New Territories, HongKong. E-mail: [email protected].

. D.K.Y. Yau is with the Department of Computer Science, PurdueUniversity, 250 N. University St., West Lafayette, IN 47907-2066.E-mail: [email protected].

Manuscript received 1 May 2006; revised 5 Aug. 2006; accepted 10 Aug.2006; published online 8 Jan. 2007.Recommended for acceptance by Y. Pan.For information on obtaining reprints of this article, please send e-mail to:[email protected], and reference IEEECS Log Number TPDS-0106-0506.Digital Object Identifier no. 10.1109/TPDS.2007.1034.

1045-9219/07/$25.00 � 2007 IEEE Published by the IEEE Computer Society

Page 2: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

While our discussion uses flooding DDoS attacks forcontext, our results are also applicable for congestioncontrol involving a large number of distributed contributingsources. DDoS attacks complicate the network congestioncontrol issue because additional detection mechanism suchas statistical filtering [10], [14], [15], [20], [21] has to workjointly with overload control. In general, a robust defensemechanism has to minimize the disruption in terms of theamount of attackers’ traffic getting through to the serverand the time to bring down the congestion level withoutany knowledge of the number of compromised routers andthe adopted attack distribution.

The balance of the paper is organized as follows: InSection 2, following the discussion in [22], we introduce thebackground and basic terminology of adaptive routerthrottling. We also comment on the limitations of theprevious approach, and motivate the need for a formalapproach in the design of a stable and adaptive system. InSection 3, we present a class of distributed throttlealgorithms. In Section 4, we analyze the properties of ouralgorithms. Section 5 reports experimental results thatillustrate various desirable properties of the algorithms.Related work is discussed in Section 6. Finally, theconclusions are drawn in Section 7.

2 BACKGROUND OF ROUTER THROTTLING

In [22], the authors formulated the problem of defendingagainst flooding-based DDoS attacks (i.e., one form of highbandwidth aggregates) as a resource management problemof protecting a server from having to deal with excessiverequests arriving over a global network. The approach isproactive: Before aggressive traffic flows can converge tooverwhelm a server, a subset of routers along the forward-ing paths is activated to regulate the incoming traffic rates1

to more moderate levels. The basic mechanism is for aserver under attack, say S, to install a router throttle at a setof upstream routers several hops away. The throttle limitsthe rate at which packets destined for S can be forwardedby a router. Traffic that exceeds the throttle rate will bedropped at the router.

An important element in the proposed defense system in[22] is to determine appropriate throttling rates at thedefense routers such that, globally, S exports its full servicecapacity US (in kb/s) to the network, but no more. Theseappropriate throttle rates should depend on the currentdemand distributions and, thus, must be negotiateddynamically between the server and the defense routers.The negotiation approach is server-initiated. A serveroperating below the designed load limit needs no protec-tion, and so does not install any router throttles. As serverload increases and crosses the designed load limit US , theserver may start to protect itself by installing a rate throttleat the defense routers. If a current throttle rate fails to bringdown the load at S to below US , then the throttle rate can befurther reduced. On the other hand, if the server load fallsbelow a low-water mark LS (where LS < US), then thethrottle rate is increased to allow more packets to be

forwarded to S. If an increase does not cause the load tosignificantly increase over some observation period, thenthe throttle is removed. The goal of our throttle algorithmsis to keep the server load within ½LS; US�whenever a throttleis in effect.

Router throttling does not have to be deployed at everyrouter in the network. Rather, the deployment points areparameterized by a positive integer k, and are within theadministrative domain ofS. We denote this set of deploymentpoints by RðkÞ. RðkÞ contains all the routers that are k hopsaway from S and also routers that are less than k hops awayfromS, but are directly connected to an end host. To illustratethis concept, Fig. 1 shows an example network topology inwhich a square node represents a host and a round noderepresents a router. The host on the far left is the target serverS. The deployment routers in Rð3Þ are shaded. Note that thebottom-most router inRð3Þ is only two hops away fromS, butis included because it is directly connected to a host. Thenumber above each host (or router) denotes the rate at whichthe host (or router) sends to S. For this example, we use US ¼22 and LS ¼ 18. Since the total offered rate of traffic to S is� ¼ 59:9, S will be overloaded without protection.

In fairly allocating the server capacity among the routers inRðkÞ, a notion of level-kmax-min fairness is introduced in [22]:

Definition 1 (level-k max-min fairness). A resource controlalgorithm achieves level-k max-min fairness among therouters RðkÞ, if the allowed forwarding rate of traffic for Sat each router is the router’s max-min fair share of some rate rsatisfying LS � r � US .

The AIMD algorithm in [22] (see Fig. 2) achieves level-kmax-min fairness. It installs at each router in RðkÞ a throttlerate rs, which is the “leaky bucket” rate at which each routercan forward traffic to S. Traffic that exceeds rs will bedropped by each router. An installed throttle has very lowruntime overhead as only a leaky-bucket throttle is requiredat each server that needs to handle high bandwidthaggregates.

In Fig. 2, rs is the current throttle rate used by S. It isinitialized to ðLS þ USÞ=fðkÞwhere fðkÞ is either some smallconstant, say 2, or an estimate of the number of throttlepoints needed in RðkÞ. The monitoring window W is usedto measure the incoming aggregate rate. It is set to be

984 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 18, NO. 7, JULY 2007

1. All traffic rate and server load quantities are in unit of Mbps, unlessstated otherwise.

Fig. 1. Network topology illustrating Rð3Þ deployment points of routerthrottle. A square node represents a host and a round node represents arouter. The host on the far left is the target server S. The deploymentrouters in Rð3Þ are shaded. The number above each host (or router)denotes the rate at which the host (or router) sends to S.

Page 3: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

somewhat larger than the maximum round trip timebetween S and a router in RðkÞ so as to smooth out thevariation of incoming aggregate rate. In Section 3.4, weexamine the impact of the window size W in mitigatingDDoS attack traffic. A constant additive step, �, is used toramp up rs if a throttle is in effect and the current serverload is below LS . Each time S computes a new rs that willbe installed in RðkÞ, which causes the traffic load at S tochange. The algorithm then iteratively adjusts rs so that theserver load converges to and stays within a point in ½LS; US �whenever a throttle is in effect.

To illustrate the AIMD algorithm, consider the exampleshown in Fig. 1, with US ¼ 22, LS ¼ 18, and a step size� ¼ 1. We initialize rs to ðLS þ USÞ=4 ¼ 10. Table 1a,reproduced from [22], shows the AIMD algorithm in action.When the algorithm is first invoked with throttle rate 10, theaggregate load at S drops to 31.78. Since the server load stillexceeds US , the throttle rate is halved to 5, and the serverload drops below LS , to 16.78. As a result, rs is increased to6, and the server load becomes 19.78. Since 19.78 is withinthe target range [18, 22], the algorithm terminates. Whenthat happens, the forwarding rates of traffic for S at thedeployment routers (from top to bottom in the Fig. 1) are 6,0.22, 6, 6, 0.61, and 0.95, respectively. This is the max-minfair allocation of a rate of 19.78 among the deploymentrouters, showing that level-k max-min fairness is achieved(in the sense of Definition 1).

Although the AIMD algorithm is shown to be effectiveunder various attack scenarios, there remain some unre-solved issues. In particular, the algorithm requires a carefulchoice of several control parameters, including the monitor-ing windowW , step size �, an initial estimate of the number ofthrottles fðkÞ, and the load limits US and LS . The choice ofthese parameters will affect system stability and the speed ofconvergence to the stable load. For example, had we picked asmaller � in the above example, it would have taken longer forthe system to converge. This is shown in Table 1b.Furthermore, the parameters can be affected by a number ofextrinsic factors that are beyond the algorithm’s control, e.g.,

the number of routers inRðkÞ, the delay of messages fromS tothe routers inRðkÞ, and changes in the offered traffic to S. Aspointed out in [22], if the parameters are not set properly, thealgorithm may not converge.

In the following section, we will address these robust-ness and stability issues by developing a family ofdistributed throttling algorithms that can effectively andautomatically handle this problem.

3 A FAMILY OF ADAPTIVE FAIR THROTTLE

ALGORITHMS

In this section, we describe several algorithms withincreasing sophistication. There are multiple objectives forthese throttle algorithms:

1. Fairness: achieve level-k max-min fairness amongthe RðkÞ routers.

2. Adaptiveness: use as few configuration parametersas possible.

3. Fast convergence: converge as fast as possible to theoperating point for stationary or dynamic load.

4. Stability: algorithm converges regardless of theinput rate distribution, and is robust against limitedor unknown information of network parameters.

The throttle algorithm terminates if the aggregate loadfalls within a given region ½Ls; Us�, which contains C0 whereC0 is a desired operating point. By default, we assume C0 tobe the midpoint between Ls and Us. Each throttle algorithmis basically a probing algorithm that tries to determine acommon rs, to be sent by the server to all the RðkÞ routersusing negative feedback [4], [13]. Negative feedback is theprocess of feeding back the input a part of the systemoutput. The principle of negative feedback thus implies thatwhen the server asks the routers to decrease, we shouldensure that the total load at the server will not increase, andwhen the server asks the routers to increase, the total load atthe server will not decrease [4]. They are all designed toconverge to the level-k max-min fairness criterion. All thealgorithms differ in the way the server computes rs, whichwill affect stability, i.e., whether the load converges to apoint in ½Ls; Us� and the convergence speed. We summarizeour algorithms as follows:

1. Binary search algorithm: A simple algorithm thatsatisfies the first three objectives of a fair throttlealgorithm for stationary load only.

2. Proportional aggregate control (PAC) with anestimate of the number of throttles: An algorithm

TAN ET AL.: A DISTRIBUTED THROTTLING APPROACH FOR HANDLING HIGH BANDWIDTH AGGREGATES 985

Fig. 2. Pseudocode for the AIMD fair throttle algorithm.

TABLE 1Example of Using the AIMD Algorithm

(a) � ¼ 1 and (b) � ¼ 0:15.

Page 4: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

that satisfies all the objectives of a fair throttlealgorithm.

3. Proportional-aggregate-fluctuation-reduction(PAFR) algorithm: Improves the PAC algorithm byconsidering stability issues and tracking ability.

3.1 Binary Search (BS) Algorithm

The possible range for rs is ½0; Us�, the two extremescorresponding to an infinite number of throttles and onethrottle, respectively. A binary search algorithm can beapplied: Reduce the search range by half at every iterationof the algorithm (see Fig. 3). Like the AIMD algorithm, thisalgorithm uses the aggregate rate � to determine thedirection of adjustment. When the aggregate rate is morethan Us, rs is reduced by half (same as before). However,when the aggregate rate is less than Ls, rs is increased to themidpoint between the current rs and Us. This changeremoves the need to configure �. The increase and decreaserules are symmetrical, each time reducing the search rangeby half. The pseudocode of the binary search algorithm isshown in Fig. 3. In the algorithm, we initialize the throttlerate rs to ðUs þ LsÞ=jRðkÞj, which corresponds to the guessthat half of the RðkÞ routers will be dropping traffic due tothrottling.

The problem of optimal bandwidth probing assuming aknown upper bound has been considered earlier by [9]. Intheir slightly different problem formulation of searching foravailable bandwidth by a single flow, they conclude that (ageneralized version of) the binary search algorithm is near-optimal for certain reasonable cost functions of overshoot-ing and undershooting.

The performance of binary search can be illustratedusing the same example that we considered in Section 2.The results are shown in Table 2a. Note that the binarysearch algorithm converges quickly for a stationary load.

The basic binary search algorithm assumes that thetraffic loading is stationary. As noted by Karp et al. [9],when the traffic conditions are dynamic, the situation ismore complicated. They propose and analyze alternative

algorithms for bounded variations in the traffic conditions.In our case, traffic can vary drastically since attackers mayoften vary their attack rates to reduce the chance of beingrevealed. The binary search algorithm must therefore detectthe situation that the shrunken search range for rs can nolonger deal with the changed traffic conditions, and re-initialize the search range accordingly. Such a modificationis included in the algorithm given in Fig. 3.

However, it takes time to realize the need for re-initializingthe search, and it often takes more time than necessary for thenew search to converge. For example, Table 2b shows that, ifthe last two sources in Fig. 1 with offered rates of 0.61 and 0.95both change to 3.5 after round 4 in Table 2a, both sources arestill within the rate throttle limit of rs, but rs cannot change to avalue lower than low ¼ 5:0 without reinitialization, and thesystem will be overloaded indefinitely. Table 2c shows thatreinitialization corrects the algorithm by resetting the vari-able low. Although the binary search algorithm is fair, fullyadaptive, and fast, it is not entirely satisfactory for dynamictraffic and packet delay.

3.2 Proportional Aggregate Control (PAC) with anEstimate on the Number of Throttles

The binary search algorithms directly adjust rs, without usingthe magnitude of the overshoot (or undershoot) of theaggregate rate � beyond the target range. We now explorealgorithms that do make use of this information, as well as anestimate of the number of routers in RðkÞ that actually droptraffic. For clarity of exposition, we will call a deploymentrouter whose offered rate exceeds rs (and, thus, will droptraffic because of throttling) an effective throttling router. Wecall the throttle at such a router an effective throttle. If thenumber of effective throttles is n, then it is reasonable to setthe change of rs to

�rs ¼j�� Cjn

; ð1Þ

where C represents the target rate (a value within ½LS; US �).Since the traffic at the routers changes dynamically, it is

not possible to know the exact number of effective throttling

986 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 18, NO. 7, JULY 2007

Fig. 3. Pseudocode for the binary search fair throttle algorithm.

TABLE 2Trace of rs and Server Load Using the Binary Search Algorithm

(a) Binary search under stationary load. (b) Binary search becomesunstable without reinitialization. (c) The algorithm corrects itself using� ¼ 0:05.

Page 5: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

routers. Furthermore, as rs is adjusted, the number of these

routers changes as well. By monitoring the past traffic and

rs, however, we can estimate n. The idea is to keep track of

the last throttle rate rlast and last aggregate rate �last. This

allows us to estimate n as

n ¼ j�� �lastjjrs � rlastj:

A pseudocode of the estimator is shown in Fig. 4.

Exponential averaging is added to minimize inaccuracies

due to fast changing components of the traffic load (see

Line 3 of Fig. 4). Using this estimator, we can compute a

suitable change to rs using (1). This is the essence of the

new algorithm, as shown in Fig. 5.With the new algorithm, we also wish to address our

fourth objective, namely, system stability in the face of delay

between a throttle request sent by the server and the throttle

taking effect at a deployment router. This objective can be

achieved by correcting only a fraction KP of the discrepancy

between the aggregate rate and the rate target (see Line 19

of Fig. 5). The change �rs in (1) is thus proportional to the

discrepancy; hence, we refer to such control as Proportional

Aggregate Control (PAC). We will show in Section 4 how one

can choose automatically (i.e., without the extra burden of

tricky parameter configuration) an appropriate value forKP based on the maximum delay between the server and adeployment router.

Another detail in the pseudocode is how to pick thevalue of C in (1). The value should be in the target range of½LS; US�. Our current implementation sets the targetcapacity C to LS when we have overload (see Line 7 ofFig. 5) and to US when we have underload (see Line 13 ofFig. 5). By advertising a larger aggregate rate discrepancy,we can maximize the likelihood that the resulting target ratewill end up in the target range.

The PAC algorithm is applied to the same example usedbefore. The results are shown in Table 3. An estimate of thenumber of effective throttles is given by the change inserver load over the change in throttle rate, i.e.,4�=4rs. Forstationary traffic demand, the rate of convergence ismonotonic and relatively fast. The speed of convergencedepends on the setting of KP . However, the setting of KP

can affect system stability. This is discussed next.

3.3 Proportional-Aggregate-Fluctuation-Reduction(PAFR) Algorithm

We now introduce one more optimization to obtain analgorithm that is better than the PAC algorithm presented inthe last section. Whenever there is a change in the offered loadat the routers in RðkÞ, a higher proportional gain can adaptfaster to the change. However, such “strong” control mayresult in large fluctuations in rs, which may be annoying to theend users and also cause instability. On the other hand, a“weak” control causes prolonged congestion at the server.The goal is thus to reduce fluctuations while still achievingfast convergence. The stability problem that arises when ahigh proportional gain is used can be mitigated by consider-ing a derivative control parameter, KD. As we will show inSection 4, it turns out that KD can be optimized to a constantvalue, independent of the operating conditions, for a givendesign cost function. Hence, KD does not introduce any extraburden of parameter configuration. This Proportional-aggre-gate-fluctuation-reduction (PAFR) fair throttle algorithm isillustrated in Fig. 6.

As shown at Line 18 in Fig. 6, whenever we haveunderload or overload, the total expected change in the totalload in RðkÞ at the next interval is given by

4�0iþ1 ¼ �KP ð�i � CÞ �KD4�i; 4�01 ¼ 0; ð2Þ

where 4�i is the actual change and 4�0i is the expectedchange in the interval i.

The first term in (2) is necessary for the mismatchregulation, and the second term tracks the rate of change in

TAN ET AL.: A DISTRIBUTED THROTTLING APPROACH FOR HANDLING HIGH BANDWIDTH AGGREGATES 987

Fig. 4. Pseudocode for the algorithm to estimate the number of effective

throttles n in RðkÞ that assumes an upper bound on the size of RðkÞ,jRðkÞjmax.

Fig. 5. Pseudocode for the proportional aggregate control fair throttle

algorithm.

TABLE 3Trace of the Throttle Rate and Server Load Using

the PAC Algorithm

Target capacity used is Ls ¼ 18. Here, KP ¼ 1=2 so it reduces thesearch interval by half at every round.

Page 6: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

the feedback. The first term has the largest impact on theaggressiveness of the algorithm. A smaller KP will haveslower convergence rate. On the other hand, a largerKP mayresult in larger overshoots, sometimes even causing thesystem to become unstable, i.e., the load oscillates indefinitelyaround the band ½Ls; Us� without convergence. The secondterm is more responsive to changes in the mismatch than thefirst term, and also has a damping effect on the fluctuation ofthe throttle rate. A larger KD implies more damping, whichimproves the stability of the system. As shown in thesimulation results in Section 5, the PAFR algorithm canreduce the amplitude of overshoot significantly duringtransient, while maintaining fast convergence.

Table 4 shows the PAFR algorithm for KP ¼ 0:73 andKD ¼ 0:48 using the previous example. An estimate of thenumber of effective throttles is given by the change inserver load over the change in throttle rate, i.e., 4�=4rs.Although this simple example does not show anyimprovement in convergence time, the simulation resultsin Section 5 will clearly illustrate such improvements. Inaddition, the PAFR algorithm tracks the target rate rangebetter. The tracking performance metric will be madeprecise in the next section.

3.4 Setting the Window Size W

The criterion for the right window size is twofold: obtain amore accurate estimate of the effective number of throttledrouters n, which improves the adaptive throttle algorithms,and enforce rate limiting. For example, the filter at eachthrottle may employ some form of deterministic orprobabilistic policy that drops packet according to specificfeatures in the packet. Furthermore, it is recommended in[15] that a filter policy cannot be completely memorylesswhile enforcing rate limiting, i.e., in order for such kind of afilter to operate effectively based on its past measurement ofthe number of packets in a window length W , rs cannotchange too drastically in consecutive windows. Similarly, toobtain an accurate estimate of n, W cannot be too small. Onthe other hand, a large W affects the convergence time ofthe overload control system.

To cope with this problem, we use a moving windowwith a parameter, which is a function of the maximumround trip time in the system to obtain a smoothmeasurement of the incoming aggregate rate �. Specifically,we define � ¼ 2D=W , where D is the average round tripdelay from each router in RðkÞ to S. Averaging betweenconsecutive windows, we have �i ¼ ��i�1 þ ð1� �Þ�i,where the subscript denotes the ith window. We verify bysimulations in Section 5 that the above approximationworks very well in measuring the aggregate traffic rate,which is used as an input in our throttle algorithms.

4 AN ANALYSIS OF THE FAIR THROTTLE

ALGORITHMS

In this section, we use a control-theoretic approach to studyrouter throttling. Particularly, a fluid flow model is used toanalyze the stability of the PAC and PAFR fair throttlealgorithms. We model S as a single bottleneck node, anddenote the total traffic load from RðkÞ to S by yðtÞ, whichserves the same role as �i in our algorithms. In both thePAC and PAFR algorithms, the expected change inaggregate traffic load is computed using (2) where KD ¼ 0in the PAC algorithm.

Since the offered traffic at each router changes dynami-cally, yðtÞ will be larger than the expected change in (2) if1) the load at some previously unthrottled routers increases,but is still less than the throttle limit, or 2) additional routersare added to RðkÞ while the throttle rates are changing. Onthe other hand, yðtÞ is less than the expected change if 1) theoffered load at some previously throttled routers fallsignificantly below the throttle limit, or 2) some routersare removed from RðkÞ while the throttle rates arechanging. These can be viewed as uncertainties in thecontrol system. Here, stability is defined in the bounded-input-bounded-output sense (see [13], p. 319). Besides, theestimation of parameters such as the number of throttlesintroduces additional uncertainty to the system. The use offeedback is thus to combat uncertainty, and to determinehow to adjust the throttle rate while maintaining a stablesystem. A properly designed negative feedback controller isour focus in the following.

First, we introduce a fluid model that governs the systemwe are analyzing. In our fluid model, we treat the targetcapacity to be a constant value,C, which can be the midpointbetween LS and US . The rate of change in yðtÞ is given as

dyðtÞ=dt ¼ �KP ðyðt�DÞ � CÞ �KDðdyðt�DÞ=dtÞ; ð3Þ

where D is the delay, which is upper bounded by a constantD0, and KP and KD are the proportional and differentialgains, respectively. Selecting appropriate KP and KD is part

988 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 18, NO. 7, JULY 2007

Fig. 6. Pseudocode for the proportional-aggregate-fluctuation-reduction

fair throttle algorithm.

TABLE 4Trace of the Throttle Rate and Server Load Using

the PAFR Algorithm

Target capacity used is Ls ¼ 18.

Page 7: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

of the design of the throttle algorithms. We first consider thePAC algorithm where the change in the search rangeconverges to zero if the traffic is stationary. Our first resultshows that the amount of feedback is inversely proportionalto the amount of delay that the system can tolerate.

Lemma 1. The system in (3) is asymptotically stable if and only if0 < KP < minð�=ð2D0Þ; 1Þ, where D0 is the largest delay inthe system.

Proof. Please refer to the Appendix. tuRemark 1. When KP ¼ 1, the exact mismatch between yðtÞ

and C is used as a feedback. Intuitively, the system ismost responsive. However, the system is also the leasttolerant to time delay. Hence, there is a tradeoff betweenthe speed of response and robustness to delay (see proofof Theorem 1).

Remark 2. Recall that overestimating the number ofthrottles in RðkÞ has the effect of reducing the propor-tional gain. For example, if the estimator always assumesthat the number of effective throttles is the maximumnumber of routers in RðkÞ, we have a form of doublyconservative PAC algorithm.

Next, we introduce a cost function, which is a perfor-mance metric that measures the effectiveness of a fairthrottle algorithm, and is given as

J ¼XTsi¼1

ðxiÞ2; ð4Þ

where xi is the corresponding mismatch between C and � atinterval i, and Ts is a given terminating interval of thealgorithm. Note that xi is positive if the system isoverloaded, or negative if the system is underloaded, hencethe use of the squared function. Note that (4) is similar tothe proposed cost function (gentle cost function) in [9]where both overloading and underloading are penalized. Inthis paper, we treat the cost of overloading and under-loading to be the same.

In (4), Ts plays the same role as settling time in controltheory (see [13], p. 272), which is defined as the timerequired for the system response to decrease and staywithin a specified percentage of its final value (e.g., afrequently used value is 5 percent). The criterion in (4) istolerant to small but punitive to large errors at every interval.In other words, a smaller J is preferred. For a given ½Ls; Us�that is relatively narrow, a sufficiently long Ts and a giventraffic demand, J is useful for differentiating the perfor-mance of various algorithms. An overconservative algo-rithm tends to have a larger J . Besides, an algorithm thatcauses large variation in rs at most of the intervals oftenresults in a larger J .

The design of the PAC and PAFR algorithms is based onan optimization approach where KP and KD are selected byminimizing J in (4) subject to the constraint that the integralis finite for a given input sequence of xi. To this end, weconsider infinitesimally small intervals, and let Ts !1,thus we obtain the following integral of squared error:

J ¼Z 1

0

x2ðtÞ dt; ð5Þ

which belongs to a family of integral cost functionals incontrol theory for determining fixed order control para-meters for a given input [13]. For the PAC fair throttlealgorithm, we have the following main result.

Theorem 1. The PAC algorithm that minimizes (5) has KP ¼ðffiffiffi3p� 1Þ=D0 for a delay D0. Furthermore, it stabilizes the

system for any positive delay D � D0.

Proof. Please refer to the Appendix. tu

For the PAFR fair throttle algorithm, we have thefollowing main result.

Theorem 2. The PAFR algorithm that minimizes (5) has KP ¼0:80=D0 and KD ¼ 0:48 for a delay D0. Furthermore, it

stabilizes the system for any positive delay D � D0.

Proof. Please refer to the Appendix. tuRemark 3. In both PAC and PAFR algorithms, only a single

optimized parameter KP needs to be set with respect tothe maximum delay in the system.

Remark 4. Theorems 1 and 2 are valid for any number ofrouters in RðkÞ. If KP stabilizes the system for a givenD0, the RðkÞ router deployment using the same KP isfeasible anywhere in the network as long as the delayis less than D0.

Corollary 1. The PAFR algorithm that minimizes (5) always has

a smaller J than the PAC algorithm for all D > 0.

Proof. Please refer to the Appendix. tu

Corollary 1 implies that the PAFR algorithm outperformsthe PAC algorithm in minimizing (4) given that thealgorithms run for a sufficiently long time, and is validatedby the simulation results in the next section.

5 SIMULATION RESULTS

In this section, we study the performance of the various fairthrottle algorithms in terms of performance metrics such asconvergence time, throttle rate fluctuation, and adaptation todifferent network parameters using ns-2. We denote the

instantaneous offered traffic load at router i asTiðtÞ; 8i. For allthe experiments, we set Ls ¼ 100Mbps and Us ¼ 115Mbps.The initial throttle rate rs is set as 200 with W ¼ 3D, and thelow pass filter time constant of estð�; rsÞ is set as 0.25, unlessnoted otherwise.

5.1 Experiment A (Heterogeneous Network Sourceswith Constant Input)

First, we consider 50 heterogeneous sources. Initially, eachsource has a constant offered traffic load TiðtÞ ¼ 30; 8 i. Next,10 sources change their offered load to 1Mbps and 4Mbps att ¼ 50 and t ¼ 100, respectively. The network delay D

between S and each source is 100ms. The cost metric J andsettling time Ts for each algorithm are reported under thefigures for the input disturbance at t ¼ 50 only if thealgorithm converges. Fig. 7 shows the aggregate server load

when AIMD (� ¼ 0:02Mbps, � ¼ 0:2Mbps, and � ¼ 0:4Mbps),binary search ð� ¼ 1MbpsÞ, PAC ðKP ¼ 0:5Þ and PAFR ðKP ¼0:5; KD ¼ 0:48Þalgorithms are used, respectively. From Fig. 7,

TAN ET AL.: A DISTRIBUTED THROTTLING APPROACH FOR HANDLING HIGH BANDWIDTH AGGREGATES 989

Page 8: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

we observe that the PAFR algorithm is stable, converges, andhas a much less over/undershoot.

5.2 Experiment B (Heterogeneous Network Sourceswith Varying Input)

Next, we repeat the above experiment, but the first 30 sourcesare configured to be constant sources with TiðtÞ ¼ 4Mbps fori ¼ 1; . . . ; 30. The next 10 sources are sinusoidal sourceswhere TiðtÞ ¼ 1:5 sinð0:16tÞ þ 2:5Mbps for i ¼ 31; . . . ; 40. Thenetwork delay for each of these sinusoidal sources is 50ms.The last 10 sources are square pulse sources with TiðtÞ ¼4Mbps for the time interval 20k � t � 10ð2kþ 1Þ and TiðtÞ ¼1Mbps for the time interval 10ð2kþ 1Þ � t � 20ðkþ 1Þ,i ¼ 41; . . . ; 50, and k 2 f0; 1; 2; . . .g. The network delay foreach of these square pulse sources is 50ms. Fig. 8 shows theaggregate server load for the second experiment.

Based on the above experiments, we can make thefollowing observations.

Remark 5. The stability and performance of the AIMDalgorithm is dependent on the step size used. The systemis stable and slow for small �, but is unstable when � islarge. Furthermore, there is a significant overshoot in theinitial transient response for all the three different stepsizes used.

Remark 6. When the binary search algorithm is used, thetransient response becomes unsatisfactorily if the num-ber of sources gets large as in the experiments. Largeovershoots are observed in the transient before thealgorithm converges, and are due to the large variationin rs as shown in Fig. 9. In the dynamic traffic case, slowresponse to reinitialization is the main cause for the largeovershoots between t ¼ 50 and t ¼ 100.

Remark 7. The PAC ðKP ¼ 0:5Þ algorithm is comparativelyfaster and stable, but there are large overshoots whenthere is a disturbance to the system at t ¼ 50 and t ¼ 100.As a result, the convergence time might be longer asshown in Fig. 8.

Remark 8. As compared to the other algorithms, the PAFRalgorithm is more robust to external disturbances withmuch smaller overshoots, and has faster convergencetime. Fig. 9 compares the throttle rate variation betweenthe binary search and the PAFR algorithms. We observethat damping reduces the throttle rate variation consider-ably. From Fig. 10, we note that the estimator algorithm isextremely effective in estimating accurately the number ofthrottles in RðkÞ at t ¼ 50 and t ¼ 100.

5.3 Experiment C (Effect of System Parameters onthe Binary Search Algorithm)

We next evaluate an improved version of the binary searchalgorithm using the estimator algorithm. Reinitializing theparameters in the binary search algorithm that leads to agradual change in rs can be done by taking into account theload mismatch and the total number of effective throttles.The goal of a gradual change is to reduce the large throttlerate variation that we observe in the previous experiments.Particularly, if we have overload, we replace Line 11 inFig. 3 by the following:

low ¼ minðmaxðrs þ ðC � �Þ=estð�; rsÞ; 0Þ; CÞ: ð6Þ

990 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 18, NO. 7, JULY 2007

Fig. 7. (a) AIMD algorithm with step size of � ¼ 0:02M ðTs ¼ 4:0s,J ¼ 74:1MbÞ� ¼ 0:2M ðTs ¼ 2:5s; J ¼ 50MbÞ, and � ¼ 0:4M ðTs ¼ 42:7;664:3MbÞ. (b) Binary search algorithm. � ¼ 1M ðTs ¼ 6:6s; J ¼ 183:9MbÞ.(c) PAC ðKP ¼ 1=2Þ algorithm ðTs ¼ 4:8s; J ¼ 38:7MbÞ. (d) PAFR algo-rithm with KP ¼ 0:5 and KD ¼ 0:48 ðTs ¼ 1:5s; J ¼ 18:6MbÞ.

Fig. 8. (a) AIMD algorithm with step size of � ¼ 0:02M ðTs ¼ 9:3s;

J ¼ 114MbÞ, � ¼ 0:2M ðTs ¼ 2:0s; J ¼ 28:1MbÞ, and � ¼ 0:4M ðTs ¼16:1s; J ¼ 236:4MbÞ. (b) Binary search algorithm. � ¼ 1M ðTs ¼ 9:8s;

J ¼ 229:2MbÞ. (c) PAC ðKP ¼ 1=2Þ algorithm ðTs ¼ 2:4s; J ¼ 12:9MbÞ.(d) PAFR algorithm withKP ¼ 0:5 andKD ¼ 0:48 ðTs ¼ 1:3s; J ¼ 7:1MbÞ.

Fig. 9. (a) Binary search algorithm exhibits large throttle rate variation.(b) PAFR algorithm with KP ¼ 0:5 and KD ¼ 0:48. Damping providesbetter transient response while maintaining excellent response time.

Page 9: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

Likewise, if we have underload, we execute (6) with thevariable “low” replaced by the variable “high” at Line 16 inFig. 3. We term this the improved binary search algorithm.

We repeat the second experiment using the improvedbinary search, and observe the system response when thestep disturbance occurs at t ¼ 50. Fig. 11 shows the effect ofdifferent � on Ts and J of the improved binary searchalgorithm. We observe that if � is too small, the convergencetime is much longer. A longer time is also needed to detectthat reinitialization is required, which results in a transientresponse characterized by large overshoots. If � is too large,the algorithm becomes more sensitive to small perturbationin the offered load. As compared to the previous experi-ment, we observe that using (6) tends to stabilize the binarysearch algorithm. In fact, by using information from theload mismatch, the gradual reinitialization mechanism in(6) brings the improved binary search a step closer andcomparable to a PAC algorithm. However, the systemresponse is still sensitive to the parameter setting of �, whichis hard to tune. Hence, the performance of the improvedbinary search algorithm is unsatisfactorily as compared tothe PAC or PAFR algorithm.

5.4 Experiment D (Effect of Parameters Us and Ls)

We repeat the second experiment, but increase the range½Ls; Us� to [100Mbps, 125Mbps]. Fig. 12 shows the aggregateserver load for binary search, PAC ðKP ¼ 0:5Þ and PAFRalgorithms, respectively. From Experiments B and D, we notethat a narrow ½Ls; Us� makes the binary search algorithm

unstable, again due to the need for repeated reinitializationsand delays. The PAC algorithm is stable, but overshootsremain in the transient response, while the PAFR algorithmmaintains its excellent performance with fast convergencetime. In general, a narrow ½Ls; Us� is more demanding for thethrottle algorithms.

5.5 Experiment E (Heterogeneous Network Sourceswith Randomly Varying Input)

In this experiment, we introduce stochastic effects in the loadlevel of theRðkÞ routers. The presence of stochastic elementsrepresents traffic with background flows in a network, whichmay lead to a larger variation in the input with a lowerstability margin. The delay in each link is uniformlydistributed from 1s to 3:5s, andW ¼ 2s. There are altogether50 routers in RðkÞ. Fifteen routers have sinusoidal varyingsources—five with a rate of 0:5 sinð0:16tÞ þ 3Mbps, five with arate of 0:5 sinð0:16tþ 0:5�Þ þ 1:5Mbps, and five with sinð0:16tþ1:5�Þ þ 3:5Mbps. Ten routers have square pulses that varybetween 1.5Mbps and 2.5Mbps with a period of 80s. Twentyfive routers have Gaussian distributed varying source rates—10 with mean 3Mbps and variance 1Mbps, 10 with mean3.5Mbps and variance 0.5Mbps, and five with mean 4Mbpsand variance 1Mbps.

We perform the experiment for two scenarios: In the firstscenario, S assumes jRðkÞjmax as 50. As shown in Figs. 13a,13b, and 13c, the response of the control system is relativelysmooth for smallKp in the presence of large delays and largevariation in the sources. A large Kp makes the systemsensitive to noise in the load level. Fig. 13d shows thatestð�; rsÞpredicts that there are close to 50 effective throttles inthe system. In the second scenario, we assume a smallerjRðkÞjmax ¼ 10. As shown in Figs. 13a, 13b, and 13c andFigs. 14a, 14b, and 14c, the responses forKP ¼ 0:05 andKP ¼0:1 are relatively smoother than that for KP ¼ 0:2. However,as shown in Fig. 14d, even though the number of routers canbe grossly underestimated, the PAFR algorithm can stillfunction properly and meet the target load. We repeat theexperiment for the case of jRðkÞjmax ¼ 50, and increase the low

TAN ET AL.: A DISTRIBUTED THROTTLING APPROACH FOR HANDLING HIGH BANDWIDTH AGGREGATES 991

Fig. 10. Estimating the number of throttles in RðkÞ using the estð�; rsÞalgorithm.

Fig. 11. Improved binary search algorithm with (a) � ¼ 0:2M ðTs ¼ 3:7s;J ¼ 27:0MbÞ and (b) � ¼ 1M ðTs ¼ 1:7s; J ¼ 13:2MbÞ.

Fig. 12. (a) Binary search with � ¼ 1M ðTs ¼ 3:9; J ¼ 47:0MbÞ.(b) Improved binary search with � ¼ 1M ðTs ¼ 1:5s; J ¼ 12:9MbÞ.(c) PAC algorithm with Kp ¼ 1=2 ðTs ¼ 3:2; J ¼ 19:4MbÞ (d) PAFRalgorithm with Kp ¼ 0:5;KD ¼ 0:48 ðTs ¼ 1:0s; J ¼ 7:2MbÞ. As com-pared to Experiment B, a larger ½Ls; Us� reduces the cost J.

Page 10: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

pass filter time constant to ¼ 0:95. As shown in Fig. 15d, a

larger tracks a larger variation in the number of effective

throttle routers, but the equilibrium points are not affected

much as shown in Figs. 15a, 15b, and 15c.

Remark 9. The PAFR algorithm is more effective in

reducing the fluctuation when there are many unknown

parameters, e.g., attack distribution and the number of

effective throttles, and is also more robust than the other

algorithms in minimizing load variation regardless of the

window size used, thus allowing statistical filtering to be

done with a higher degree of confidence.

6 RELATED WORK

Handling high bandwidth aggregates using router throt-tling can be viewed as a congestion control problem. A largebody of work on congestion control assumes that thefeedback is in the form of a binary signal. In our context, wehave the luxury of sending an explicit throttling rate to thetraffic regulators (routers in this case). So, it is simpler for usto achieve max-min fairness without using AIMD-styledistributed algorithms. In this respect, our approach issimilar to XCP [8], in being able to separate the problem offair allocation with the problem of controlling the totalaggregate traffic.

In searching for the simplest control (for the totalaggregate traffic), we consider binary search, which isinspired by the work in [9]. But, this approach quicklybecomes more complex and less effective when consideringdynamic traffic and delay.

Our work is similar to those efforts in analyzing Internetand ATM congestion control, for example, the ATM Avail-able Bit Rate (ABR) model in [3], [11] and the TCP model in [6],[12]. The departure is in the detailed model where we useexplicit rate feedback, and introduce a hysteresis for thedesired capacity. In addition, unlike congestion control in theInternet, an effective estimator that estimates the number ofeffective throttles and the role of the window size is crucial fordistinguishing legitimate users and attackers.

Defending against DDoS flooding-based attack or flashcrowds using different rate throttling mechanism at up-stream routers can be found in [16]. In [16], a pushback max-min fairness mechanism used for controlling large band-width aggregates is proposed, which is initiated at thecongested source and applied recursively to the upstreamrouters. However, this approach may cause upstreamrouters with low traffic demand to be excessively penalized.

Other router-assisted network congestion controlschemes that handle high bandwidth aggregates in a fairmanner include detection and dropping using Active

992 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 18, NO. 7, JULY 2007

Fig. 13. Performance of the PAFR throttle algorithm with anestimate that jRðkÞjmax ¼ 50, KD ¼ 0:48 as KP takes the value of(a ) KP ¼ 0:05 ðTs ¼ 13:4s; J ¼ 92:0MbÞ, ( b ) KP ¼ 0:1 ðTs ¼ 8:6s;J ¼ 50:6MbÞ, and (c) KP ¼ 0:2 ðTs ¼ 13:5s; J ¼ 71:0MbÞ. (d) Esti-mated number of effective throttles using the estð�; rsÞ algorithm forthe case KP ¼ 0:1.

Fig. 14. Performance of the PAFR throttle algorithm with completeinformation that jRðkÞjmax ¼ 10, KD ¼ 0:48 as KP takes the value of(a) KP ¼ 0:05 ðTs ¼ 35:4s; J ¼ 286:8MbÞ, (b) KP ¼ 0:1 ðTs ¼ 19:5s;J ¼ 145:0MbÞ, and (c) KP ¼ 0:2 ðTs ¼ 60:0s; J ¼ 790:1Þ. (d) Esti-mated number of effective throttles using the estð�; rsÞ algorithmfor the case KP ¼ 0:1.

Fig. 15. Performance of the PAFR throttle algorithm using low pass filtertime constant ¼ 0:95 for estð�; rsÞ with an estimate of upper boundjRðkÞjmax ¼ 50 for (a) KP ¼ 0:05 ðTs ¼ 16:1s; J ¼ 84:6MbÞ, (b) KP ¼ 0:1ðTs ¼ 12:9s; J ¼ 73:7MbÞ, and (c) KP ¼ 0:2 ðTs ¼ 18:6s; J ¼ 180:0MbÞ.(d) Estimated number of effective router for the case of KP ¼ 0:1.

Page 11: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

Queue Management (AQM) mechanism in congestedrouters such as balanced RED [1], RED with PreferentialDropping [17], Stochastic Fair Blue [5], and CHOKe [19].The Network Border Patrol (NBP) with enhanced core-stateless fair queueing [2] is a framework used to controlflow aggregates. However, some of these schemes requireeither full or partial state information and some schemes donot guarantee max-min fairness in the shortest possibletime. Besides, these schemes are more complex than ratethrottling, which has low runtime overhead.

7 CONCLUSION

We analyze how a class of distributed server-centric routerthrottling algorithms can be used to achieve max-minfairness in a number of routers that are k hops away whenthe server is overloaded by large bandwidth aggregates,which may occur under DDoS attack or flash crowdscenarios. Our proposed throttle algorithms, which arebased on a control-theoretic approach, are shown to behighly adaptive to dynamic traffic situations. The stabilityand convergence issues of these algorithms are also studied.In particular, we show that the PAFR fair throttle algorithmis robust against a wide range of settings—not only does itguarantee convergence, but, in addition, it has the least loadfluctuation. Level-k max-min fairness is guaranteed for allthe throttling routers in RðkÞ.

Designing throttle algorithms to counter DDoS attackswith limited information is challenging, thus it is notsurprising that many issues remain open in this topic. First,it will be desirable to have more in-depth analysis of theproposed throttle algorithms in terms of convergence rateand transient performance. Second, we can investigateother cost functions for optimal setting, which may lead toother optimal fair throttle algorithms. The effectiveness ofvarious throttle algorithms can thus be examined under avariety of cost functions that may suit different networkscenarios under DDoS attacks. Third, we have taken delayinto account in analyzing our throttle algorithms. The nextstep would be to test our algorithms in network scenariosunder which the deployment depth varies. This hasimplication on the practical deployment of throttle algo-rithms in a real network as well as providing different linesof defense against DDoS attacks. Related issues on theimpact of deployment depth on throttle algorithms can alsobe found in [22].

Future directions of our work include testing ouralgorithms in networks with diverse delays. We also planto implement our proposed throttle algorithms in OPERA,which is an open-source programmable router platform thatprovides flexibility for researchers to experiment withsecurity services [18]. Particularly, OPERA allows dynami-cally loadable software modules to be installed on the flywithout shutting down the routing functionalities in therouter, which facilitates testing different throttle andstatistical filtering algorithms on a security testbed.

APPENDIX A

Proof of Lemma 1

We first let KD ¼ 0 in (3) to obtain dyðtÞ=dt ¼ KP ðC � yðt�DÞÞ, which is rewritten as dxðtÞ=dt ¼ �KPxðt�DÞ,

where xðtÞ ¼ yðtÞ � C. Next, a necessary and sufficient proofthat follows the frequency domain method in [23] is given asfollows: For stability, the roots of any characteristic equationmust lie on the left half complex plane (see [13], p. 321). Wewish to determine KP such that the characteristic equationsþKPe

�sD ¼ 0 has stable poles with delay from 0 up to D0,and this set ofKP is denoted as SR (following the conventionin [23]). First, by the Routh-Hurwitz criterion (see [13], p. 322),the set ofKP s that stabilize the delay-free plant (given asS0 in[23]) is S0 ¼ fKP > 0g. Second, there exists KP such that thecharacteristic equation is stable if D > 0. Third, we computethe set SL ¼ fKP jKP 62 SN and 9 D 2 ½0; D0�; w 2 R; s:t: jwþKPe

�jwD ¼ 0g, which is given by SL ¼ fKP � �=ð2D0Þg.Hence, SR ¼ S0 n SL and, thus, SR ¼ f0 < KP < �=ð2D0Þg.But,KP must necessarily be not more than 1 for convergenceas the system dynamics is a search for a value between thecurrent server load and C. Hence, 0 < KP < minð�=2D0; 1Þ.

Next, we denote the open loop transfer function asLðs;KP ;DÞ ¼ KPe

�sD=s. For a fixed w and for any positiveD < D0, ffLðjw;KP ;DÞ > ffLðjw;KP ;D0Þ, i.e., we alwayshave positive phase margins. Hence, the PAC algorithmcan stabilize the system for D � D0. To prove convergenceof xðtÞ to zero, note that jKPe

�jsD=sj ! 1 as s! 0. Hence,zero steady state error is ensured in the face of a stepdisturbance. The tradeoff between system response anddelay tolerance can be verified by noting that the closedloop system bandwidth wB satisfies KP < wB < 2KP , andwB is roughly inversely proportional to the response time ofthe system in response to a step disturbance.

APPENDIX B

Proof of Theorem 1

For J in (5) to exist, xðtÞ must converge to zero in the limitas t!1. From Lemma 1, this is true if and only if0 < KP < minð�=2D0; 1Þ. Hence, any optimal KP , if it exists,must satisfy Lemma 1. Next, we use Parseval’s theorem tocompute J , which is rewritten as J ¼ ð1=j2�Þ

R j1�j1XðsÞ

Xð�sÞ ds where XðsÞ denotes the Laplace transform of xðtÞ(cf. [13], p. 583). Ignoring initial conditions, we let XðsÞ ¼1=ðsþKPe

�sDÞ for the PAC algorithm. Next, we obtain

J ¼ ðtanKPDþ secKPDÞ=ð2KP Þ¼ ðtanðKPD=2þ �=4ÞÞ=ð2KP Þ:

To compute the optimal KP , we let dJ=dKP ¼ 0, whichresults in cosKPD ¼ KPD. Using Taylor’s series approxima-tion, this results in 1� ðKPDÞ2=2 � KPD, which givesKP;opt ¼ ð

ffiffiffi3p� 1Þ=D. Since

d2J=dK2P jKP¼KP;opt

¼ 1þffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1� ðKP;optDÞ2

q� �=ð2K2

P;optDÞ > 0;

J is indeed the minimum. Moreover, for all D, KP;opt

satisfies Lemma 1, thus KP;opt indeed stabilizes the systemfor all delay D < D0.

APPENDIX C

Proof of Theorem 2

Continuing along the same line of proof as Theorem 1, we

let XðsÞ ¼ 1=ðsþ ðKP þKDsÞe�sDÞ. Now, J in (5) exists

TAN ET AL.: A DISTRIBUTED THROTTLING APPROACH FOR HANDLING HIGH BANDWIDTH AGGREGATES 993

Page 12: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

since jðKP þKDsÞe�jsD=sj ! 1 as s! 0. The parameters

KP and KD that satisfy the sufficient and necessary

condition for asymptotic stability can again be obtained

by the approach in [23] and the Nyquist criterion to yield

SPD ¼ f0 < KP < 1; and KPD=ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1�K2

D

p< arccosð�KDÞg.

Hence, the optimal set of control parameters KP;opt and

KD;opt must lie in the set, SPD. Next, we have

J ¼1=ð2KP

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1�K2

D

qÞðKP cos�DþKD� sin�DÞ=ð��KP

sin�Dþ �KD cos�DÞ;

where � ¼ KP=ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1�K2

D

p, which can be simplified as

J¼D tanðð�þ�Þ=2Þ=ð2� sin2 �Þ, where �¼KPD=ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1�K2

D

pand � ¼ arccosKD. To yield the optimalKP andKD, we have

@J=@� ¼ 0 and @J=@� ¼ 0 for 0 < � < � and 0 < � < �� �.

Finally, we have � ¼ sinð�þ�Þ and � ¼ 0:5 tan �. Solving

these two equations numerically, we obtain the optimal �opt

a s � ¼ 0:92. S i n c e �opt ¼ 0:5 tan �, w e h a v e KP ¼2�2

opt=ðDffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1þ 4�2

opt

qÞ and KD ¼ 1=

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1þ 4�2

opt

q, which we

obtainKP � 0:8=D andKD � 0:48. We invoke the same steps

as before by fixing w and, for all D � D0, the loop transfer

function always has a positive phase margin.

APPENDIX D

Proof of Corollary 1

For a fixed D, from Theorem 1, the cost function for the

PAC algorithm is JP ¼ ð1þffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1� ðKP;optDÞ2

qÞ=2K2

P;optD,

which is approximately 1:6D, and, for the PAFR algorithm,

from Theorem 2, � ¼ KP;opt=ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1�K2

D;opt

q� 0:91=D, thus we

have JPD � 1:1D. For all D > 0, JP > JPD.

ACKNOWLEDGMENTS

The authors thank the editor for coordinating the reviewprocess and the anonymous reviewers for their insightfulcomments and constructive suggestions. They also thankWang Yue for his help in the ns-2 simulation. The researchof D.M. Chiu was supported in part by the RGC EarmarkGrant 4232/04Grant. The research of John C.S. Lui wassupported in part by the RGC Grant 4218/04.

REFERENCES

[1] F.M. Anjum and L. Tassiulas, “Fair Bandwidth Sharing amongAdaptive and Non-Adaptive Flows in the Internet,” Proc. IEEEINFOCOM ’99, Mar. 1999.

[2] C. Albuquerque, B.J. Vickers, and T. Suda, “Network BorderPatrol: Preventing Congestion Collapse and Promoting Fairness inthe Internet,” IEEE/ACM Trans. Networking, vol. 12, no. 1, pp. 173-186, Feb. 2004.

[3] L. Benmohamed and S.M. Meerkov, “Feedback Control ofCongestion in Packet Switching Networks: The Case of a SingleCongested Node,” IEEE/ACM Trans. Networking, vol. 1, no. 6,pp. 693-708, Dec. 1993.

[4] D.M. Chiu and R. Jain, “Analysis of the Increase/DecreaseAlgorithms for Congestion Avoidance in Computer Networks,”J. Computer Networks and ISDN, vol. 17, no. 1, pp. 1-14, June 1989.

[5] W. Feng, K. Shin, D. Kandlur, and D. Saha, “Stochastic Fair Blue:A Queue Management Algorithm for Enforcing Fairness,” Proc.IEEE INFOCOM ’01, Apr. 2001.

[6] C.V. Hollot, V. Misra, D. Towsley, and W. Gong, “Analysis andDesign of Controllers for AQM Routers Supporting TCP Flows,”IEEE Trans. Automatic Control, vol. 47, no. 6, pp. 945-959, June2002.

[7] K.Y.K. Hui, J.C.S. Lui, and D.K.Y. Yau, “Small-World Overlay P2PNetworks,” Computer Networks, vol. 50, no. 15, Oct. 2006.

[8] D. Katabi, M. Handley, and C. Rohrs, “Congestion Control forHigh Bandwidth-Delay Product Networks,” Proc. ACM SIG-COMM, pp. 89-102, Aug. 2002.

[9] R. Karp, E. Koutsoupias, C. Papadimitriou, and S. Shenker,“Optimization Problems in Congestion Control,” Proc. 41st Ann.IEEE CS Conf. Foundations of Computer Science, Nov. 2000.

[10] Y. Kim, W.C. Lau, M.C. Chuah, and J.H. Chao, “PacketScore:Statistics-Based Overload Control against Distributed Denial-of-Service Attacks,” Proc. IEEE INFOCOM ’04, Mar. 2004.

[11] A. Kolarov and G. Ramamurthy, “A Control-Theoretic Approachto the Design of an Explicit Rate Controller for ABR Service,”IEEE/ACM Trans. Networking, vol. 7, no. 5, pp. 741-753, Oct. 1999.

[12] S. Kunniyur and R. Srikant, “Analysis and Design of an AdaptiveVirtual Queue Algorithm for Active Queue Management,” Proc.ACM SIGCOMM, pp. 123-134, Aug. 2001.

[13] B.C. Kuo, Automatic Control Systems. Prentice Hall, 1975.[14] K.T. Law, J.C.S. Lui, and D.K.Y. Yau, “You Can Run, But You

Can’t Hide: An Effective Statistical Methodology to Trace BackDDoS Attackers,” IEEE Trans. Parallel and Distributed Systems,vol. 16, no. 9, Sept. 2005.

[15] Q. Li, E. Chang, and M.C. Chan, “On the Effectiveness of DDoSAttacks on Statistical Filtering,” Proc. IEEE INFOCOM ’05, Mar.2005.

[16] R. Mahajan, S.M. Bellovin, S. Floyd, J. Ioannidis, V. Paxson, and S.Schenker, “Controlling High Bandwidth Aggregates in the Net-work,” ACM SIGCOMM Computer Comm. Rev., vol. 32, no. 3,pp. 62-73, July 2003.

[17] R. Mahajan, S. Floyd, and D. Wetherall, “Controlling High-Bandwidth Flows at the Congested Router,” Proc. Ninth IEEE Int’lConf. Network Protocols, Nov. 2001.

[18] B.C.B. Chan, J.C.F. Lau, and J.C.S. Lui, “OPERA: An Open-SourceExtensible Router Architecture for Adding New Network Servicesand Protocols,” J. Systems and Software, vol. 78, no. 1, pp. 24-36,Oct. 2005.

[19] R. Pan, B. Prabhakar, and K. Psounis, “CHOKe: A Stateless AQMScheme for Approximating Fair Bandwidth Allocation,” Proc.IEEE INFOCOM ’00, Mar. 2000.

[20] H. Wang, D. Zhang, and K.G. Shin, “Detecting SYN FloodingAttacks,” Proc. IEEE INFOCOM ’02, June 2002.

[21] T.Y. Wong, J.C.S. Lui, and M.H. Wong, “An Efficient DistributedAlgorithm to Identify and Traceback DDoS Traffic,” Computer J.,vol. 49, no. 4, 2006.

[22] D.K.Y. Yau, J.C.S. Lui, F. Liang, and Y. Yeung, “Defending againstDistributed Denial-of-Service Attacks with Max-Min Fair Server-Centric Router Throttles,” IEEE/ACM Trans. Networking, vol. 13,no. 1, Feb. 2005.

[23] H. Xu, A. Datta, and S.P. Bhattacharyya, “PID Stabilization of LTIPlants with Time-Delay,” Proc. 42nd IEEE Conf. Decision andControl, Dec. 2003.

994 IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 18, NO. 7, JULY 2007

Page 13: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED … · against flooding-based DDoS attacks (i.e., one form of high bandwidth aggregates) as a resource management problem of protecting

Chee Wei Tan received the MA degree inelectrical engineering from Princeton University,Princeton, New Jersey, in 2006. He is currentlyworking toward the PhD degree in the Depart-ment of Electrical Engineering at PrincetonUniversity. He was a research associate withthe Advanced Networking and System ResearchGroup at the Chinese University of Hong Kong in2004. He spent the summer of 2005 at FraserResearch Lab, Princeton, New Jersey. His

current research interests include queueing theory, nonlinear optimiza-tion, communication networks, and wireless and broadband commu-nications. He is a student member of the IEEE.

Dah-Ming Chiu (SM’03) received the BScdegree from Imperial College London in 1975,and the PhD degree from Harvard University in1980. He worked for Bell Labs, Digital EquipmentCorporation, and Sun Microsystems Labora-tories. Currently, he is a professor in the Depart-ment of Information Engineering at the ChineseUniversity of Hong Kong. He is on the editorialboard of the International Journal of Communica-tion Systems. He is a senior member of the IEEE.

John C.S. Lui (SM’02) received the PhD degreein computer science from the University ofCalifornia at Los Angeles (UCLA). He workedat the IBM T.J. Watson Research Laboratoryand the IBM Almaden Research Laboratory/SanJose Laboratory after his graduation. He parti-cipated in various research and developmentprojects on file systems and parallel I/O archi-tectures. He later joined the Department ofComputer Science and Engineering, Chinese

University of Hong Kong. His research interests are in systems and intheory/mathematics. His current research interests are in theoretic/applied topics in data networks, distributed multimedia systems, networksecurity, OS design issues, and mathematical optimization andperformance evaluation theory. Dr. Lui has received various depart-mental teaching awards and the CUHK Vice-Chancellor’s ExemplaryTeaching Award in 2001. He is an associate editor of the PerformanceEvaluation Journal, a member of the ACM, and an elected member ofthe IFIPWG7.3. He was the TPC cochair of ACM Sigmetrics 2005, andis currently on the Board of Directors for ACM SIGMETRICS. He is asenior member of the IEEE.

David K.Y. Yau (M’97) received the BSc (firstclass honors) degree from the Chinese Univer-sity of Hong Kong, and the MS and PhD degreesfrom the University of Texas at Austin, all incomputer sciences. From 1989 to 1990, he waswith the Systems and Technology group ofCitibank, NA. He is currently an associateprofessor of computer sciences at PurdueUniversity, West Lafayette, Indiana. Dr. Yauwas the recipient of an IBM graduate fellowship.

He received a US National Science Foundation (NSF) CAREER awardin 1999, for research on network and operating system architectures andalgorithms for quality of service provisioning. His other researchinterests are in network security, value-added services routers, andmobile wireless networking. He is a member of the IEEE and the ACM,and serves on the editorial board of the IEEE/ACM Transactions onNetworking.

. For more information on this or any other computing topic,please visit our Digital Library at www.computer.org/publications/dlib.

TAN ET AL.: A DISTRIBUTED THROTTLING APPROACH FOR HANDLING HIGH BANDWIDTH AGGREGATES 995