Fluctuation Bounds for the Max-Weight Policy, with ...jnt/Papers/P-18-MW+SSC-fin.pdf · The SSC results then follow from the property that the uid so-lutions are attracted to a low-dimensional

FLUCTUATION BOUNDS FOR THE MAX-WEIGHTPOLICY, WITH APPLICATIONS TO STATE SPACE

COLLAPSE

By Arsalan Sharifnassab∗,‡, John N. Tsitsiklis†, S. JamaloddinGolestani∗,

June 12, 2019

We consider a multi-hop switched network operating under aMax-Weight (MW) scheduling policy, and show that the distance be-tween the queue length process and a fluid solution remains boundedby a constant multiple of the deviation of the cumulative arrival pro-cess from its average. We then exploit this result to prove matchingupper and lower bounds for the time scale over which additive statespace collapse (SSC) takes place. This implies, as two special cases,an additive SSC result in diffusion scaling under non-Markovian ar-rivals and, for the case of i.i.d. arrivals, an additive SSC result overan exponential time scale.

CONTENTS

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.1 General Background . . . . . . . . . . . . . . . . . . . . . . . 31.2 State Space Collapse Literature . . . . . . . . . . . . . . . . . 31.3 Preview of Results . . . . . . . . . . . . . . . . . . . . . . . . 41.4 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2 System Model and Preliminaries . . . . . . . . . . . . . . . . 62.1 Notation and Conventions . . . . . . . . . . . . . . . . . . . . 62.2 The Network Model and the MW Policy . . . . . . . . . . . . 72.3 The Fluid Model . . . . . . . . . . . . . . . . . . . . . . . . . 92.4 State Space Collapse . . . . . . . . . . . . . . . . . . . . . . . 11

3 Main Result: Sensitivity . . . . . . . . . . . . . . . . . . . . . . 133.1 Convergence to Fluid Model Solutions . . . . . . . . . . . . . 14

∗A. Sharifnassab and S. J. Golestani are with the Department of ElectricalEngineering, Sharif University of Technology, Tehran, Iran; email: [email protected],[email protected]†J. N. Tsitsiklis is with the Laboratory for Information and Decision Systems, Electrical

Engineering and Computer Science Department, MIT, Cambridge MA, 02139, USA; email:[email protected]‡This work was partially done while A. Sharifnassab was a visiting student at the

Laboratory for Information and Decision Systems, MIT, Cambridge MA, 02139, USA.

1

http://www.i-journals.org/ssy/

2

4 State Space Collapse . . . . . . . . . . . . . . . . . . . . . . . . 144.1 Definitions and Preliminaries . . . . . . . . . . . . . . . . . . 144.2 Main SSC Result . . . . . . . . . . . . . . . . . . . . . . . . . 164.3 Special Cases of SSC . . . . . . . . . . . . . . . . . . . . . . . 18

5 Proof of Theorem 2 . . . . . . . . . . . . . . . . . . . . . . . . . 195.1 From WMW to MW . . . . . . . . . . . . . . . . . . . . . . . 205.2 FPCS Dynamical Systems . . . . . . . . . . . . . . . . . . . . 215.3 Reduction of the MW Dynamics to an FPCS System . . . . . 225.4 Proof of Theorem 2 . . . . . . . . . . . . . . . . . . . . . . . . 30

6 Proof of Theorem 3 . . . . . . . . . . . . . . . . . . . . . . . . . 317 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

7.1 Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 407.2 Other Applications and Extensions . . . . . . . . . . . . . . . 417.3 Open Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 42

A Proof of Lemma 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . 42B Proof of Lemma 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . 45References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

1. Introduction. The subject of this paper is a new line of analysis ofthe Maximum Weight (MW) scheduling policy for single-hop and multi-hopnetworks. The main ingredient is a purely deterministic qualitative propertyof the queue dynamics: the trajectory followed by the queue vector under aMW policy tracks the trajectory of an associated deterministic fluid model,within a constant multiple of the cumulative fluctuation of the arrival pro-cesses. With this property at hand, it is then a conceptually simple matter totranslate concentration properties of the arrival processes to concentrationproperties for the queue vector. As a consequence, we can obtain:

(a) New, simple derivations of existing results on the convergence to afluid solution/trajectory and on state space collapse (SSC).

(b) Stronger versions of existing SSC results, involving more general arrivalprocesses, and tighter concentration bounds.

(c) An approach to obtaining new results that would seem rather difficultto establish with existing methods.

The core of our approach is the trajectory tracking result mentioned above.The latter is in turn an adaptation of a similar result established in [29],for a general class of continuous-time hybrid systems that move along thesudifferential of a piecewise linear convex potential function with finitelymany pieces; other than an additional restriction to the positive orthant,a continuous-time variant of the MW dynamics turns out to be exactly ofthis type. However, a fair amount of additional work is needed to translate

3

the general result to the standard, discrete-time, MW setting; cf. Theorem 2and its proof.

1.1. General Background. We consider a multi-hop switched networkwith fixed routing, such as those arising in wireless networks [13] or switchfabrics [10]. The network operates in discrete time, and is driven by jobs(or packets) that arrive according to a stochastic, deterministic, or adver-sarial process. There is a scheduler which, at each time step, selects one offinitely many possible service vectors. These service vectors can be fairlyarbitrary, reflecting interdependence constraints between different servers,e.g., interference constraints in the context of wireless networks.

We focus on the popular MW scheduling policy [34], which operates asfollows. At any time step, a MW policy associates to each queue a weightproportional to its length, and selects a service vector that maximizes thetotal weighted service. MW policies are known to have a number of attractiveproperties such as maximal throughput [34, 7, 8]. In addition, under certainconditions, e.g., a resource pooling assumption, they minimize the workloadin the heavy traffic regime [31]. On the other hand, the queue size dynamics,under MW policies, are quite complex, and a detailed analysis is difficult.

A common way of reducing the complexity of the analysis involves a fluidapproximation, also known as a fluid model. The fluid model relies on twosimplifications that lead to a description in terms of a set of differentialequations (cf. Subsection 2.3): (a) the dynamics evolve in continuous —rather than discrete — time, and (b) the arrival process is replaced by aconstant flow with the same average. The fluid model underlies a generaltechnique for dealing with discrete-time networks: approximate the queuelengths by fluid solutions and then analyze the fluid model. This approachhas proved useful in the study of the MW dynamics, leading to results onstability ([5, 1]), SSC ([31, 26, 27, 4]) and delay stability under heavy tailedarrivals ([19, 20]). A key ingredient behind such results is an understandingof the accuracy with which fluid solutions approximate the original queuelength processes; this paper contributes to this understanding.

1.2. State Space Collapse Literature. A prominent application of fluidmodels is in establishing state space collapse (SSC), i.e., that in the heavytraffic regime, the queue length process stays close to a low-dimensional set,for a long time, and with high probability.1

1We note here the important distinction between multiplicative and additive (or strong)state space collapse, which is discussed further in Section 2.4. The literature review hereis mostly about multiplicative state space collapse.

4

Seminal SSC results for communication networks were given in the worksof Reiman [22], Bramson [2], and Williams [36]. Subsequently, several works[31, 26, 27, 11] followed the general framework of Bramson [2] to prove SSCunder different scheduling policies, including for the case of MW policies.The general approach involves splitting an O(r2)-long interval into inter-vals of length O(r), and then showing that the fluid-scaled processes (i.e.,q(t) = 1

rQ(brtc)) stay close to the fluid solutions in each one of these smallerintervals. The SSC results then follow from the property that the fluid so-lutions are attracted to a low-dimensional set, called the set of invariantpoints.

For single-hop networks with Markovian arrivals operating under a gen-eralization of the MW policy, SSC was proved in [31]. It was also shown,in [31], as a consequence of SSC, that the workload process converges to areflected Brownian motion, and that every MW-α policy2 with α > 0 min-imizes this workload among all scheduling algorithms. The results of [31]were extended to multi-hop networks in [3], and to another generalizationof MW policies in [28]. For multi-hop networks with non-Markovian arrivalsoperating under MW-α, SSC under diffusion scaling was studied in [27].Several works [12, 24] then used the results of [27] to provide diffusion ap-proximations for the MW dynamics. Finally, SSC has also facilitated thestudy of the steady-state expectation of the number of jobs in a network[6, 16, 17, 14, 37, 35].

1.3. Preview of Results. Our approach to the analysis of MW policiesrelies on a bound on the distance of the queue length processes from the fluidsolutions, in terms of the fluctuations of the cumulative arrival processes. Inmore detail, we consider a queue length process Q(·), driven by an arrivalprocess A(·) with average rate λ, and compare Q(·) with a fluid solutionq(·) driven by a steady arrival stream with the same rate λ, under the sameinitial conditions q(0) = Q(0).

We already know that, under suitable scaling, the trajectories of the orig-inal discrete-time process remain close to the fluid solutions. Furthermore,the fluid model is well-known to be non-expansive3 [33]. By combining thesefacts, it is quite plausible that one should be able to derive bounds of the

2For any given α > 0, the MW-α policy is an extension of the MW policy in which the“weight” of queue i is proportional to Qαi , where Qi is the length of the queue at node i.

3A dynamical system is called non-expansive if for any two trajectories, x(·) and y(·),we have d

dt

∥∥x(t)− y(t)∥∥ ≤ 0.

5

form

(1)∥∥Q(t)− q(t)

∥∥ ≤ c +t−1∑τ=0

∥∥A(τ)− λ∥∥,

where A(t) is the vector of arrivals at each one of the queues at time t, andc is a constant which is independent of A(·). However, our goal is to derivea stronger bound, of the form

(2)∥∥Q(t)− q(t)

∥∥ ≤ c + C maxk<t

∥∥∥ k∑τ=0

(A(τ)− λ

)∥∥∥,for some constants c and C, independent of A(·) and λ. The bounds inEqs. (1) and (2) are qualitatively different. Under common probabilisticassumptions, and with high probability,

∑t−1τ=0

∥∥A(τ) − λ∥∥ grows at a rate

of t, whereas maxk<t∥∥∑k

τ=0

(A(τ)− λ

)∥∥ only grows as (roughly)√t.

The sensitivity bound (2) allows us to make several contributions to thestudy of the MW policy.

(a) We obtain a very simple proof of the convergence of fluid-scaled pro-cesses to fluid solutions; cf. Corollary 1.

(b) We establish a strong SSC result for the MW policy. In particular, wederive an upper bound and a matching lower bound on the time scaleover which additive SSC takes place; cf. Theorem 3. As a corollary,when the arrivals are i.i.d, we establish SSC for the process q(t) =Q(beαrtc)/r, for some constant α, i.e., over an exponentially long timescale; cf. Corollary 2.

(c) In another corollary, we establish an additive SSC result in diffusionscaling and under non-Markovian arrivals, which strengthens the cur-rently available diffusion scaling results under the MW policy in severalrespects; see Section 2.4 for more details.

(d) As will be reported elsewhere, the sensitivity bound (2) provides toolsthat allow us to resolve an open problem from [20], on the delay sta-bility in the presence of heavy-tailed traffic.

On the technical side, the proof of the sensitivity bound (2) exploits asimilar bound from our earlier work [29] on the sensitivity of a class ofhybrid subgradient dynamical systems to fluctuations of external inputs ordisturbances. The main challenges here concern the transition from discreteto continuous time, as well as the presence of boundary conditions, as queuesizes are naturally constrained to be non-negative. For the proof of ourSSC results, we follow the general framework of Bramson [2], while also

6

taking advantage of the sensitivity bound (2). We believe that our tightcharacterization of the time scale over which SSC holds would have beenvery difficult without the strong sensitivity bound (2).

1.4. Outline. The rest of the paper is organized as follows. In the nextsection, we describe the network model and our conventions, along withsome background on fluid models and SSC. In Section 3, we present ourcentral result, which is an inequality of the form (2); cf. Theorem 2. Then,in Section 4, we present our SSC results. We provide the proofs of our resultsin Sections 5 and 6, while relegating some of the details to appendices, forimproved readability. Finally, in Section 7, we offer some concluding remarksand discuss possible extensions.

2. System Model and Preliminaries. In this section, we list ournotational conventions, define the network model that we will study, andgo over the necessary background on fluid models and State Space Collapse(SSC).

2.1. Notation and Conventions. We denote by R, R+, R++, Z, Z+, andN the sets of real numbers, non-negative reals, positive reals, integers, non-negative integers, and positive integers, respectively.

A vector v ∈ Rn, will always be treated as a column vector, with com-ponents vi, for i = 1, . . . , n. We use vT and ‖v‖2, to denote the transposeand the Euclidean norm of v, respectively. For any two vectors v and u inRn, the relation v � u indicates that vi ≤ ui, for all i. Furthermore, weuse min(v, u) to denote the componentwise minimum, i.e., the vector withcomponents min(vi, ui). For a vector µ ∈ Rn and a set J of indices, we useσ−J(µ) to denote the vector whose ith entry is equal to the ith entry of µ ifi 6∈ J , and is equal to zero if i ∈ J . Finally, we let 1n be the n-dimensionalvector with all components equal to 1.

The notation Conv(·) stands for the convex hull of a set of vectors in Rn.Given a vector v ∈ Rn and a set A ⊆ Rn, we let v + A =

{v + x : x ∈

A}

. We use d(v , A) to denote the Euclidean distance of v from the set A.Furthermore, if W is an n × n matrix, we let WA =

{Wx

∣∣x ∈ A} be theimage of the set A under the linear transformation associated with W . Givena vector v ∈ Rn, diag(v) denotes the n× n diagonal matrix with the entriesof v on its main diagonal.

Finally, for a function f : R→ R, and with a slight departure from stan-dard conventions, we use either f(t) or d+

dt f(t) to denote the right derivativeof f at t, assuming that it exists.

7

2.2. The Network Model and the MW Policy. A discrete-time multi-hopnetwork with fixed deterministic routing is specified by n queues, a non-negative n×n routing matrix R, and a finite set S ⊂ Rn+ of actions (or servicevectors) that correspond to the different schedules that can be applied atany time.

The input to a network is a collection of n discrete-time, non-negativearrival processes, described by functions Ai : Z+ → R+, where Ai(t) standsfor the workload that arrives to queue i during the tth time slot. Wheneverthe arrival processes are ergodic stochastic processes, we define the arrivalrate vector λ ∈ Rn+ as the vector whose ith component is the average ofthe process Ai(·). We will use Qi(t) to denote the (always non-negative)workload at queue i at time t, and Q(t) to denote the corresponding work-load vector. In the sequel, we will use the terms workload, queue size, andqueue length, interchangeably. The evolution of Q(t) is determined by theparticular policy used to operate the network.

Given a network and an arrival process A(·), the evolution of the queuelengths is given by:

(3) Q(t+ 1) = Q(t) +A(t) + (R− I) min(µ(t), Q(t)

), ∀ t ∈ Z+,

where µ(t) is the service vector chosen by the policy at time t, and as men-tioned earlier, min

(µ(t), Q(t)

)is to be interpreted componentwise. Equa-

tion (3) corresponds to the situation where a time slot begins with a queuevector Q(t), and then a service vector µ(t) is chosen and applied. Finally,the new arrivals A(t) are recorded at the end of the time slot and contributeto the new queue vector Q(t+ 1).

Note that the routing matrix R is deterministic, pre-specified, and is notaffected by the queue sizes or the scheduling policy. Single-hop networkscorrespond to the special case where R is the zero matrix. More generally,the most common case (single-path routing) is one where the routing matrixhas entries in {0, 1}, with at most one nonzero entry in each column, andwhere the ijth entry being one indicates that any work completed at queuej is transferred to queue i for further processing. However, we allow for moregeneral non-negative matrices R because this additional freedom does notaffect the main proofs, and also allows for a simpler treatment of weightedMW policies; see Lemma 4, in the proof of Theorem 2.

The following assumption will be in effect throughout the paper, and isnaturally valid in typical application contexts.

Assumption 1. For any µ ∈ S, and any i ∈ {1, . . . , n}, the set Salso contains the vector σ−{i}(µ), i.e., the vector obtained by setting the ithcomponent of µ to zero.

8

According to Assumption 1, if a certain service vector µ is allowed, it isalso possible to follow µ at all queues other than queue i, while providingno service to queue i. In particular, the zero vector is always an element ofS. On the technical side, Assumption 1 appears innocuous; however, it isindispensable for the proof technique used in this paper, and has also beenmade in earlier work (cf. Section 4.1 of [32] and Assumption 2.3 of [27]).

We now proceed to define weighted Max-Weight (WMW) policies, whichcan be viewed as either a generalization of MW policies or as a special caseof the broader class of MW-f policies4 considered in [7], and backpressure-based utility maximization algorithms5 considered in [32, 8, 21]. We aregiven a multihop network with n queues, as described above, along with apositive vector w ∈ Rn++, and the associated diagonal matrix W = diag(w).For any Q ∈ Rn+, we let Sw(Q) be the set of maximizers of QT W

(I −R

)µ:

(4) Sw(Q) , argmaxµ∈S

QTW(I −R

)µ.

A WMW policy associated with w (or w-WMW, for short) chooses, at eachtime t, an arbitrary service vector µ(t) ∈ Sw

(Q(t)

).6 A Max-Weight (MW)

policy is a special case of a WMW policy, in which w = 1n. When dealingwith MW policies, we drop the subscript w, and write S(Q) instead ofS1n(Q).

Consider an ergodic and Markovian arrival process with arrival rate vectorλ ∈ Rn+, for which there exists some scheduling policy that stabilizes thenetwork, i.e., results in a positive recurrent process. The closure of the setof all such vectors λ is called the capacity region and is denoted by C.

We now record a fact that will be used later, in the proofs of Lemma 5 andClaim 4. Fix some λ ∈ C in the capacity region and consider a stabilizingpolicy. We define fi as the averrage departure rate from queue i. Then, theflow conservation property f = λ+Rf implies that λ =

(I−R

)f . Moreover,

following an argument similar to the one in Section 3.C of [34], there existsa vector c such that f � c ∈ Conv(S). Assumption 1 then implies thatf ∈ Conv(S), and as a result λ ∈

(I −R

)Conv(S). In conclusion,

(5) C ⊆(I −R

)Conv(S).

4A MW-f policy is obtained by replacing QTW in (4) by f(Q), where f : Rn → Rn isa function in an appropriate class.

5Max-Weight is a special case, with the utility function equal to zero.6For a concrete example, if µ corresponds to serving only queue j, with unit service

rate, and if work completed at queue j is routed to queue i, the term QTW (I −R)µ is ofthe form wjQj − wiQi.

9

A remarkable property of MW and WMW policies is that they are through-put optimal in the sense that for any λ in the interior of C, and any ergodicMarkovian arrival process with average arrival rate vector λ, the resultingprocess is positive recurrent [34]. Similar throughput optimality results areavailable for extensions of MW, e.g., for the so-called f -MW policies [7].

2.3. The Fluid Model. The fluid model associated with the MW policyis a deterministic dynamical system that runs in continuous time, and inwhich the arrival stream is replaced by a steady “fluid” arrival stream withrate vector λ. We will be working with the following definition of the fluidmodel; somewhat different but equivalent definitions can be found in [27]and [20].

Definition 1 (Fluid Solutions). We are given an arrival rate vector λand an initial queue length vector q(0) ∈ Rn+. A fluid model solution (or,simply, fluid solution) is an absolutely continuous function q : R+ → Rn+that together with a collection of functions sµ : R+ → [0, 1], for µ ∈ S,and another function y : R+ → Rn+, satisfies the following relations, almosteverywhere:

(6) q(t) = λ+(R− I

)∑µ∈S

sµ(t)µ − y(t)

,

(7)∑µ∈S

sµ(t) = 1,

(8) yi(t) ≤∑µ∈S

sµ(t)µi, i=1, . . . , n,

(9) if qi(t) > 0, then yi(t) = 0, i=1, . . . , n,

(10) if µ 6∈ Sw(q(t)

), then sµ(t) = 0, ∀ µ ∈ S.

It is known that for any multi-hop network and any initial condition, afluid solution always exists (cf. Appendix A of [4] and Lemma 9 of [20]), andis unique (cf. Lemma 10 of [20]), even though the corresponding sµ(·) andy(·) need not be unique. Moreover, for q(0) � 0, (6)–(10) imply that q(t)remains non-negative for all subsequent times t. Later on, in Proposition 2,

10

we will show that fluid solutions admit an alternative description, as thetrajectories of a related subgradient dynamical system. 7

We will be particularly interested in the set of invariant states of the fluidmodel, which, for any λ in the capacity region, is defined by (cf. Theorem5.4(iv) of [27])

(11) I(λ) ,{q0 ∈ Rn+

∣∣ q(t) = q0, ∀t, is a fluid solution}.

Our notation is chosen to emphasize the dependence on λ of the set ofinvariant states. We note that if λ belongs to the interior of the capacityregion, then I(λ) is a singleton, equal to {0}. Thus, I(λ) can be non-trivialonly if λ lies on the boundary of C.

We now record a scaling property of the set of fluid solutions.

Lemma 1. Consider a fluid solution q(·) and a constant r > 0. Letq(t) = q(rt)/r, for all t ≥ 0. Then, q(·) is also a fluid solution.

Proof. Note that the set of maximizing schedules in Eq. (4) does notchange when we scale the queue vector by a positive constant. Therefore,for any t ≥ 0,

(12) S(q(t)

)= S

(q(rt)/r

)= S

(q(rt)

).

Consider the functions y(·) and sµ(·) that together with q(·) satisfy the fluidmodel relations (6)–(10). Let y(t) = y(rt) and sµ(t) = sµ(rt), for all t ≥ 0and all µ ∈ S. Then, it is easy to verify that q(·), y(·), and sµ(·) also satisfy(6)–(10). Therefore, q(·) is also a fluid solution.

Suppose that λ is in the capacity region and that q0 ∈ I(λ), so thatq(t) = q0 is a fluid solution. Then, Lemma 1 implies that for any scalarr > 0, q(t) = q0/r is also a fluid solution, and therefore q0/r ∈ I(λ).Furthermore, it is not hard to see that the identically zero function is alsoa fluid solution, so that 0 ∈ I(λ). We conclude that I(λ) is a cone, i.e.,

(13) αI(λ) = I(λ), ∀ α > 0.

The interest in fluid solutions stems from the fact that they provide ap-proximations to suitably scaled versions (i.e., under “fluid scaling”) of theoriginal process. We summarize here one such result, which is a special caseof Theorem 4.3 in [27]; similar results are given in [4] (Lemmas 4 and 5).

7This alternative description also explains why uniqueness holds, in contrast to thecase of more general multiclass queueing networks; cf. the last paragraph of the proof ofProposition 2.

11

Proposition 1. Fix some λ ∈ Rn+, T > 0, and q0 ∈ Rn+. Letting r rangeover the positive integers, consider a sequence of arrival processes Ar(·) thatsatisfies

(14)1

rmaxt≤rT

∥∥ t∑τ=0

(Ar(τ)− λ

)∥∥ −−−−→r→∞

0,

almost surely. Let Qr(·) be the process generated according to Eq. (3) whenthe arrival process is Ar(·) and the initial condition is Qr(0) = rq0. Wedefine the continuous-time scaled processes qr(t) = Qr(brtc)/r, and notethat qr(0) = q0, for all r. Finally, let q(·) be a fluid solution, under thatparticular vector λ, initialized with q(0) = q0. Then,

(15) supt≤T

∥∥qr(t)− q(t)∥∥ −−−−→r→∞

0,

almost surely.

Condition (14) is typically satisfied under common probabilistic assump-tions, e.g., when Ar(·) is an i.i.d. process with mean λ and bounded domain,or more generally of exponential type. Thus, loosely speaking, convergenceof the arrival processes leads to convergence of the queue processes.

As we shall see in Section 3, our results will allow for stronger statements;namely, we will show that the rate of convergence in Eq. (14) provides boundson the rate of convergence to the fluid solution, in Eq. (15); cf. Corollary 1.

2.4. State Space Collapse. In this section, we discuss known results aboutState Space Collapse (SSC) under a MW policy, thus setting the stage fora comparison with the results we will present in Section 4.

We consider the heavy traffic regime, where the arrival rate vector getsarbitrarily close to some point λ on the outer boundary of the capacityregion. In this regime, the average queue lengths typically tend to infinity,yet it is often the case that the queue length vector stays close to the set ofinvariant states, I(λ). This phenomenon is called SSC, and has been studiedextensively, mostly under the so-called diffusion scaling. In this scaling, westart with a sequence Qr(·) of stochastic processes, indexed by r ∈ N, andthen proceed to study a sequence of scaled processes q r(·), referred to asdiffusion-scaled processes, defined by

(16) q r(t) =1

rQr(br2tc

), t ≥ 0.

The extent to which the queue length process stays close to the set of in-variant states is in general determined by the magnitude of the fluctuations

12

of the arrival process. It is therefore natural to start the analysis with someassumptions on these fluctuations. General SSC results, under the MW pol-icy and some of its extensions, were provided in [27], under the followingassumption.8

Assumption 2 (Assumption 2.5 of [27]). Let Ar(·) be a sequence ofarrival processes indexed by r ∈ N. We assume that for each r, Ar(·) isstationary,9 with mean λr, and that λr → λ as r → ∞. We furthermoreassume that there exists a sequence δr ∈ R+ converging to 0 as r →∞, suchthat(17)

r ·log2 r ·P

(maxt≤r

1

r

∥∥ t∑τ=0

(Az(τ) − λz

)∥∥ ≥ δr) −−−−→r→∞

0, uniformly in z.

Note that Assumption 2 is quite general, not requiring the arrival pro-cesses to be i.i.d. or Markovian. Theorem 7.1 of [27], slightly rephrased,10

establishes that for a network operating under a MW-f policy (a generaliza-tion of WMW policies, and under certain conditions on f), for any T > 0,and under Assumption 2, the diffusion-scaled queue length processes qr(·)satisfy, for any δ > 0,

(18) P

supt∈[0,T ] d(qr(t) , I(λ)

)max

(1, supt∈[0,T ] q

r(t)) > δ

−−−−→r→∞

0,

when limr→∞ qr(0) = q0, for some q0 ∈ I(λ).

The bound in (18) is referred to as multiplicative SSC. Yet, there is astronger notion, called additive SSC, which involves a bound similar to (18),but with the term max

(qr(t), 1

)absent from the denumerator, and which

is known to hold under i.i.d. arrivals.

Theorem 1 ([24] Theorem 7.7). Consider a network operating under aMW-α policy, with α ≥ 1, with i.i.d. and uniformly bounded arrivals withrate λr → λ, for some λ ∈ C, and the associated diffusion-scaled queue

8In our statement of the assumption, we modify the notation of [27], interchanging theroles of z and r, to preserve consistency with the rest of this paper.

9“Stationary” means that the Ar(t) have the same distribution for all t, but withoutnecessarily being independent.

10Our rephrasing consists of replacing the term denoted by ∆W(qr(t)

)in [27] by I(λ).

This is legitimate, because ∆W(qr(t)

)∈ I(λ) (cf. Theorem 5.4 (iv) in [27]) and therefore

d(qr(t) , I(λ)

)≤ d(qr(t) , ∆W

(qr(t)

)).

13

length processes qr(·). Assume that limr→∞ qr(0) = q0, for some q0 ∈ I(λ).

Then,11 for any δ > 0,

(19) P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)−−−−→r→∞

0.

Compared to the above literature, our results only apply to the case whereα = 1 (i.e., the MW policy), but allow for queue-dependent weights, so thatthe weight of queue i is wiQi. More crucially, our results (cf. Section 4 andTheorem 3, in particular):

(a) remain valid as long as limr→∞ d(qr(0), I(λ)

)= 0, which is a weaker

condition than limr→∞ qr(0) = q0, for a fixed q0 ∈ I(λ);

(b) unlike [24], we do not require the arrival process to be i.i.d. or bounded,as long as the arrival process has certain concentration properties. Fur-thermore, the concentration properties that we require (cf. Definition2) are weaker than Assumption 2, for the case of diffusion scaling (cf.Corollary 3);

(c) apply to scalings other than diffusion scaling, and include a converseresult that characterizes the possible scalings for which additive SSCholds.

We finally note another related line of work which studies a propertysimilar to SSC, namely, the extent to which the steady-state distribution isconcentrated in a neighbourhood of the set of invariant points. In particular,[15] and [14] have characterized the tail of the steady-state distribution ofthe distance from the set of invariant points for the case of an input-queuedswitch.

3. Main Result: Sensitivity. The backbone behind all of the resultsis the following main theorem.

Theorem 2 (Sensitivity of WMW policy). For a network operating un-der a WMW policy, there exists a constant C, to be referred to as the sensi-tivity constant, that satisfies the following. Consider an arrival process A(·)and the corresponding queue length process Q(·). Let q(·) be a fluid solutioncorresponding to some λ � 0, and initialized with q(0) = Q(0). Then, forany k ∈ Z+,

(20)∥∥Q(k)− q(k)

∥∥ ≤ C

(1 + ‖λ‖ + max

t<k

∥∥ t∑τ=0

(A(τ) − λ

)∥∥) ,11The result in [24] assumed that q0 = 0; however, the proof extends to the case of

general q0.

14

Note that the result holds without having to assume that λ lies insidethe capacity region. The proof is given in Section 5, and the key steps areas follows. We show that the study of WMW policies can be reduced to thestudy of MW policies. Furthermore, given a network operating in discretetime under the MW policy, we introduce an associated continuous-time dy-namical system, which we call the induced dynamical system. Next, we showthat the fluid solutions and the queue length processes of the network can beviewed as unperturbed and perturbed trajectories of the induced dynamicalsystem, respectively. We finally argue that the induced dynamical systemfalls within the class of subgradient systems that were studied in [29], andapply the main result in that reference to prove (20). The reductions thatare developed in the course of the proof, may be of independent interest.

3.1. Convergence to Fluid Model Solutions. An immediate consequenceof Theorem 2, together with Lemma 1, is a bound on the distance of thefluid-scaled process qr(t) = Q(brtc)/r from a fluid solution q(·).

Corollary 1. Consider a network operating under the WMW policyand let C be the constant in Theorem 2. Fix an arrival function A(·) andsome q0 ∈ Rn+. Let Qr(·) be the process generated according to Eq. (3) whenthe arrival process is A(·) and the initial condition is Qr(0) = rq0. Letqr(t) = Qr(brtc)/r. Let q(·) be a fluid solution corresponding to some λ ∈ Rn+and initialized at q(0) = q0. Then, for any T > 0,

(21) supt≤T

∥∥qr(t)− q(t)∥∥ ≤ C

rmaxt<rT

∥∥ t∑τ=0

(A(τ) − λ

)∥∥ + O(1/r).

Corollary 1 strengthens (15) significantly. Any statistical assumptionson the fluctuations of the arrival process A(·) readily yield concrete upperbounds on the distance of the original process from its fluid counterpart.

4. State Space Collapse. In this section, we apply Theorem 2 to es-tablish a general additive SSC result; cf. Theorem 3. We then continue withsome corollaries on exponential scaling or diffusion scaling. Our approachcan also be used to obtain results that apply in steady-state. However, wedo not go into that latter topic because such results can also be proved usingsimpler, more direct methods, as in [15] and [14].

4.1. Definitions and Preliminaries. At the core of our proofs lies thefollowing lemma, which asserts that fluid solutions are attracted to the setI(λ) of invariant states, which was defined in Eq. (11). The proof of thelemma is given in Appendix A.

15

Lemma 2 (Attraction to the Set of Invariant States). Consider a net-work operating under the MW policy and a vector λ in its capacity region.There exists a constant α(λ) > 0 such that for any fluid solution q(·) asso-ciated with λ, and any time t,

q(t) 6∈ I(λ) =⇒ d+

dtd(q(t) , I(λ)

)≤ −α(λ),

with this right-derivative being guaranteed to exist.

We continue with a definition that quantifies the rate at which a familyof processes concentrates on its mean.

Definition 2 (f -Tailed Sequence of Random Processes). Consider afunction f : N × R+ → R+ and a vector λ ∈ Rn. Let Ar(·) be a sequenceof random processes indexed by r ∈ N. Assume that for each r, Ar(·) isstationary, has expected value λr, and that limr→∞ λ

r = λ. Suppose that forevery δ > 0,

(22) f(r, δ) P

(1

rsupt≤r

∥∥ t∑τ=0

(Ar(τ)− λr

)∥∥ > δ

)−−−−→r→∞

0.

Then, Ar(·) is said to be an f -tailed sequence of random processes with limitmean λ, and we refer to f as the concentration rate function.

Later, we will show that the time scale over which SSC holds is almostproportional to the best possible concentration rate function f . We observethat any sequence of random processes that satisfies Assumption 2 is asequence of f -tailed processes, with f(r, δ) = r · log2 r. However, the reverseis not true: Assumption 2 involves an additional requirement of uniformconvergence over all values of an additional indexing parameter z, whereasDefinition 2 essentially only considers the case z = r. Thus, Definition 2 isless restrictive, easier to check, and also seems more natural.

There are many processes whose concentration properties are well under-stood, and which translate to the requirements in Definition 2, for a suitableconcentration rate function f . We record one such fact in Lemma 3 below,which deals with bounded i.i.d. arrival processes, and which is proved inAppendix B.

Lemma 3 (Bounded I.I.D. Processes are Exponential-Tailed). Fix a vec-tor λ ∈ Rn+ and a constant a > 0. Consider a sequence of random processesAr(·) indexed by r ∈ N. Suppose that for every r, the random variables Ar(t)

16

are i.i.d., and that Ar(t) ∈ [0, a]n, for all t. Denote the mean of Ar(t) byλr, and suppose that limr→∞ λ

r = λ. Take any constant β ∈ (0, 2), andlet f

(r, δ)

= exp(βrδ/na2

). Then, Ar(·) is an f -tailed sequence of random

processes with limit mean λ.

Similar results are possible for arrival processes that are modulated by afinite and ergodic Markov chain. The boundedness assumption can also beremoved under standard conditions on the moment generating function ofAr(t).

We now define processes involving a more general scaling of time, as ageneralization of the fluid and diffusion-scaled processes.

Definition 3 (g-Time-Scaled Processes). Consider an increasing func-tion g : N → R+ and a sequence Qr(·) of random processes. Then, thecorresponding sequence of g-time-scaled processes qr(·) is defined as

(23) qr(t) =1

rQr(⌊g(r)t

⌋),

for all r ∈ N and all t ∈ R+.

The fluid scaling and the diffusion scaling of a random process are par-ticular g-time-scaled, processes corresponding to g(r) = r and g(r) = r2,respectively. Definition 3 allows for a more general scaling of time.

4.2. Main SSC Result. We now present our main SSC result.

Theorem 3 (Strong State Space Collapse). Consider a network operat-ing under a WMW policy, and a vector λ in its capacity region, with a cor-responding set of invariant states I(λ). Fix some T ∈ R+, and let {λr} be asequence that converges to λ. Consider two functions f : N×R+ → R+ andg : N→ R+, with lim infr→∞ g(r)/r > 0. Let Ar(·) be an f -tailed sequence ofarrival processes with limit mean λ, and let qr(·) be a corresponding sequence

of g-time-scaled queue length processes. Suppose that d(qr(0) , I(λ)

)→ 0,

as r →∞.

(a) Suppose that for every ε > 0, we have lim infr→∞ rf(r, ε)/g(r) > 0.

Then, for any δ > 0,

(24) P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)−−−−→r→∞

0.

17

(b) Under the same assumptions as in Part (a), we can also bound therate of convergence in (24): for any δ > 0, there exists an ε > 0 suchthat

(25)rf(r, ε)

g(r)P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)−−−−→r→∞

0.

Moreover, for the case of a MW policy, (25) holds for every ε ≤min

(δ, α)/2C, where C is the sensitivity constant of the network (cf. The-

orem 2) and α = α(λ) is the constant in Lemma 2.(c) Conversely, suppose that f : N× R+ → R+ and g : N→ R+, are such

that limr→∞ rf(r, ε)/g(r) = 0, for every ε > 0, and limr→∞ g(r)/r =

∞. Then, for any network operating under a MW policy, any arrivalrate λ in its capacity region (excluding its extreme points), and anyq0 ∈ I(λ), there exists an f -tailed sequence of arrival processes sat-isfying (22) and a corresponding sequence of g-time-scaled processesqr(·), r ∈ N, initialized at qr(0) = q0, such that

(26) P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)−−−−→r→∞

1,

for all δ > 0.

The proof of Theorem 3 is given in Section 6. Part (b) relies on a reductionof WMW dynamics to MW dynamics together with the facts that the queuelength process stays close to a fluid solution (Theorem 2), and that a fluidsolution is attracted to the invariant set I (Lemma 2). The proof of Part (c)relies on an explicit construction.

We note that Part (a) is a straightforward corollary of Part (b). Never-theless, we have included the statement of Part (a) because it is in a formcomparable to SSC results in the literature, and also because it facilitates acomparison with the converse result in Part (c).

Theorem 3 ties together the time scaling g over which SSC occurs andthe concentration rate function, f , of the arrival processes. The underlyingintuition is that if the queue length process is initialized sufficiently close toI, then it will stay in an rδ-neighborhood of I, with high probability, for aperiod of time proportional to g(r). This enables us to prove additive SSCover time scales much longer than those underlying the diffusion scaling, asin the next subsection.

18

4.3. Special Cases of SSC. In this section, we apply Theorem 3 to ob-tain more concrete SSC results. The first result concerns SSC over an ex-ponentially large time scale. While it refers to bounded i.i.d. processes, itadmits straightforward extensions to arrival processes with a concentrationrate function f that grows exponentially with r, as is the case whenever asuitable Large Deviations Principle holds.

Corollary 2 (Bounded I.I.D. Arrivals: SSC over an Exponential TimeScale). Consider a network operating under a MW policy, a vector λ inits capacity region, a δ > 0, and a sequence Ar(·) of arrival processes thatsatisfy the assumptions of Lemma 3. Consider a γ < min

(δ, α)/(2Cna2

),

where C is the input sensitivity constant of the network, α = α(λ) is theconstant in Lemma 2, and a is an upper bound on the size of arriving jobs(cf. Lemma 3). Consider the eγr-time-scaling of the queue length processes,

(27) qr(t) =1

rQr(⌊eγrt

⌋),

and suppose that d(qr(0) , I(λ)

)→ 0, as r →∞. Then, for any T ∈ R+,

(28) eγr P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)−−−−→r→∞

0.

Proof. Let β = γ/[

min(δ, α)/(2Cna2

)]. Then, β < 1, and Lemma 3

implies thatAr(·) is an f -tailed sequence of processes for f(r, ε)

= exp(2βrε/na2

).

Let ε = min(δ, α)/2C. Then, γ = βε/na2. Let g(r) = exp

(γr)

= exp(βrε/na2

)be the time scaling in the definition (27) of qr(t). Then,

(29)rf(r, ε)

g(r)=

r exp(2βrε/na2

)exp

(βrε/na2

) = r exp(βrε/na2

)> exp(γr).

Therefore, the assumptions in Part (b) of Theorem 3(b) are satisfied, and

lim supr→∞

eγr P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)

≤ lim supr→∞

rf(r, ε)

g(r)P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)= 0.

(30)

Thus, (28) holds, which is the desired result.

19

We note that Part (c) of Theorem 3 provides a partial converse to Corol-lary 2: under i.i.d. arrivals with nonzero variance, additive SSC does nothold over a super-exponential time scale.

The next corollary of Theorem 3(a) concerns additive SSC under diffusionscaling.

Corollary 3 (State Space Collapse in Diffusion Scaling). Consider anetwork operating under a WMW policy, and a function f : N × R+ →R+ such that lim infr→∞ f(r, δ)/r > 0, for all δ > 0. Consider a λ in thecapacity region, an f -tailed sequence of arrivals Ar(·) with limit mean λ,and a corresponding diffusion-scaled queue length processes qr(·) (cf. (16)).

Suppose that d(qr(0) , I(λ)

)→ 0, as r →∞. Then, for any T ∈ R+,

(31) P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)−−−−→r→∞

0.

Corollary 3 strengthens Theorem 1, for the case of WMW policies, in thatthe assumption of i.i.d. arrivals is removed. We only require a concentrationproperty for the arrival process, such as

(32) rP

(supt≤r

1

r

∥∥ t∑τ=0

(Ar(τ) − λ

)∥∥ ≥ δ) −−−−→r→∞

0, ∀δ > 0,

which is even weaker than Assumption 2. Moreover, under a MW policy andi.i.d. arrivals, our Corollary 2 extends Theorem 1 by establishing SSC overan exponential time scale (as opposed to the diffusion scaling). For furtherperspective with respect to existing results, please refer to the discussionfollowing the statement of Theorem 1, in Section 2.4.

5. Proof of Theorem 2. In this section, we present the proof of The-orem 2, organized in a sequence of subsections. We first show in Subsection5.1 that for any network operating under a WMW policy, there is anothernetwork operating under a MW policy whose queue length process is a lineartransformation of the queue length process of the original network. Thus,we can just focus on the MW policy. In Section 5.2 we review a general sen-sitivity result on a class of dynamical systems with piecewise constant drift.Next, in Subsection 5.3 we introduce an induced continuous-time dynamicalsystem that provides the bridge between the original discrete-time processunder a MW policy and the fluid model. The proof concludes in Subsection5.4 by applying the general sensitivity result to the induced system.

20

5.1. From WMW to MW. In order to leverage the tools that we willdevelop for MW policies and apply them to the more general WMW policies,we start with a reduction from WMW policies to a MW policy. This isaccomplished through the following lemma, which shows that the queuelengths and fluid solutions under a WMW policy are linear transformationsof queue lengths and fluid solutions under a MW policy, in a transformednetwork.

Lemma 4 (Reduction of WMW Dynamics to MW Dynamics). Considera network N with action set S and a routing matrix R. Fix a weight vectorw, an arrival function A(·), and an arrival rate vector λ. Let Q(·) be aqueue length process of N corresponding to the arrival A(·), under a w-WMW policy. Let W = diag(w), λ = W 1/2λ, and A(t) = W 1/2A(t), for allt ∈ Z+. Let N be a network with action set S = W 1/2S and routing matrixR = W 1/2RW−1/2. Then,

(a) Q(t) = W 1/2Q(t) is a queue length process of N corresponding to thearrival A(·), under a MW policy.

(b) q(t) = W 1/2q(t) is a fluid solution of N corresponding to arrival rateλ and unit weights (as in MW) if and only if q(t) is a fluid solutionof N corresponding to arrival rate λ and WMW weights w.

Proof. Given some µ ∈ S and Q ∈ Rn+, we let µ = W 1/2µ and Q =W 1/2Q. Then,

QT(I − R

)µ =

(W 1/2Q

)T (I −W 1/2RW−1/2

)W 1/2µ

= QT W 1/2W 1/2(I −R

)W−1/2W 1/2 µ

= QTW(I −R

)µ.

Therefore, µ ∈ S is a maximizer of Q(I − R

)µ if and only if µ ∈ S is a

maximizer of QTW(I −R

)µ, i.e., S(Q) = W 1/2Sw(Q).

For Part (a), for any t ∈ Z+,

Q(t+ 1) = W 1/2Q(t+ 1)

= W 1/2Q(t) + W 1/2A(t) + W 1/2(R− I

)min

(µ(t), Q(t)

)= Q(t) + A(t) + W 1/2

(R− I

)W−1/2W 1/2 min

(µ(t), Q(t)

)= Q(t) + A(t) +

(R− I

)min

(µ(t), Q(t)

).

Therefore, Q(·) satisfies the evolution rule (3) of N , and is a queue lengthprocess corresponding to the arrival function A(·). Since Q(t) evolves ac-cording to a w-WMW policy, we have µ(t) ∈ Sw

(Q(t)

). As shown earlier,

this implies that µ(t) ∈ S(Q), and thus Q(t) indeed follows a MW policy.

21

For Part (b), consider a set of functions y(·) and sµ(·) for µ ∈ S, thattogether with q(·) satisfy (6)–(10). It is not difficult to see that all equationsremain valid when q(·), λ, S, sµ(·), y(·), and w are replaced with q(·), λ, S,sµ(·) = sµ(·), W 1/2y(·), and 1n, respectively. The reverse direction is also

true. Therefore, q(·) is a fluid solution of N corresponding to the arrivalrate vector λ, with unit weights, if and only if q(·) is a fluid solution of Ncorresponding to the arrival rate vector λ, with weight vector w.

5.2. FPCS Dynamical Systems. In this subsection, we review some def-initions and results from [29]. A dynamical system is identified with a set-valued function F : Rn → 2R

nand the associated differential inclusion

x(t) ∈ F (x(t)). We start with a formal definition, which allows for the pres-ence of perturbations.

Definition 4 (Trajectories of a Dynamical System). Consider a dy-namical system F : Rn → 2R

n, and let U : R → Rn be a right-continuous

function, which we refer to as the perturbation. Suppose that X(·) and ζ(·)are measurable functions of time that satisfy

(33) X(t) =

∫ t

0ζ(τ) dτ + U(t), ∀ t ≥ 0,

(34) ζ(t) ∈ F(X(t)

), ∀ t ≥ 0.

We then call X a perturbed trajectory corresponding to U . In the specialcase where U is identically zero, we also refer to X as an unperturbedtrajectory.

For a convex function Φ : Rn → R, we denote its subdifferential by∂Φ(x). We say that F is a subgradient dynamical system if there exists aconvex function Φ : Rn → R, such that for any x ∈ Rn, F (x) = −∂Φ(x).Furthermore, if Φ is of the form

Φ(x) = maxi

(− µTi x+ bi

),

for some µi ∈ Rn, bi ∈ R, and with i ranging over a finite set, we say thatF is a Finitely Piecewise Constant Subgradient (FPCS, for short) system.Note that for such systems, F (x) is always equal to the convex hull of thevectors µi that maximize −µTi x+ bi.

FPCS systems admit a very special sensitivity bound.

22

Theorem 4 ([29] Theorem 1). Consider an FPCS system F . Then,there exists a constant C such that for any unperturbed trajectory x(·), andfor any perturbed trajectory X(·) with corresponding perturbation U(·) andthe same initial conditions X(0) = x(0), we have

(35)∥∥X(t)− x(t)

∥∥ ≤ C supτ≤t

∥∥U(τ)∥∥, ∀ t ∈ R+.

Moreover, for any λ ∈ Rn, the bound (35) applies to the (necessarily FPCS)system F (·) + λ with the same constant C.

5.3. Reduction of the MW Dynamics to an FPCS System. Throughoutthis subsection, we restrict attention to a network operated under an (un-weighted) MW policy. In order to take advantage of Theorem 4, we showthat a discrete-time network can also be represented as an associated (“in-duced”) FPCS dynamical system.

Definition 5 (Induced FPCS system). For a network with action set Sand routing matrix R, the induced FPCS system is the subgradient dynamicalsystem F associated with the convex function

(36) Φ(x) = maxµ∈S

((I −R)µ

)Tx.

In particular, F (x) is the convex hull of the image of S(x) under the lineartransformation R− I, where S(x) is the set of vectors µ ∈ S that maximize((I −R)µ

)Tx.

We start with the observation that fluid solutions of a network are trajec-tories of the induced FPCS system. Roughly speaking, this is because theservice vectors chosen by the MW policy in (4) are maximizers of the set

of linear functions((I − R)µ

)TQ over µ ∈ S, and the fluid solution moves

along the negative of a convex combination of such maximizing service vec-tors. Thus, fluid solutions move along the subgradients of Φ (defined in (36)),and are therefore trajectories of the induced FPCS system.

Proposition 2 (Fluid Model Solutions as Trajectories of the InducedFPCS System). Consider a network and its induced FPCS system F . Letq(·) be a fluid solution of the network corresponding to arrival rate λ. Then,q(·) is an unperturbed trajectory of the dynamical system q ∈ F (q) + λ.Conversely, any unperturbed trajectory x(·) of F (·) + λ, with x(0) ∈ Rn+, isa fluid solution corresponding to λ.

23

Proof. For a vector µ ∈ Rn+ and a set J ⊆{

1, . . . , n}

of indices, we let(37)

DJ(µ) ,{ξ ∈ Rn+

∣∣∣ ξi = µi, for all i 6∈ J, and 0 ≤ ξj ≤ µj , for all j ∈ J}.

Equivalently,

(38) DJ(µ) = Conv({σ−K(µ)

∣∣K ⊆ J}),where σ−K(µ) is a vector whose ith entry is equal to the ith entry of µ ifi 6∈ K, and equal to zero if i ∈ K. Recall that S(q) is defined as the set of

all µ ∈ S that maximize((I −R)µ

)Tq; cf. (4).

Claim 1. Fix a q ∈ Rn+ and a µ ∈ S(q). Let J ={j∣∣ qj = 0

}. Then,

(39)(R− I

)DJ(µ) ⊆ F (q).

Proof of Claim. Note that for any µ ∈ S and any set K of indices, thevector σ−K(µ) also belongs to S, because of Assumption 1. We now fix someq and the set J , as in the statement of the claim. For any K ⊆ J , we haveσ−K(µ) � µ. Furthermore, since the entries of R and µ are non-negative, wehave qTRσ−K(µ) ≤ qTRµ. Therefore,

qT(I −R

)σ−K(µ) = qTσ−K(µ)− qTRσ−K(µ)

= qTµ− qTRσ−K(µ)

≥ qTµ− qTRµ= qT

(I −R

)µ,

where the second equality holds because qj = 0 whenever the jth entry ofσ−K(µ) is not equal to µj . We have therefore established that if µ ∈ S(q),then σ−K(µ) ∈ S(q). Since F (q) is the image under R− I of the convex hullof S(q), we obtain

(40)(R− I

)σ−K(µ) ∈ F (q).

Therefore,(R− I

)DJ(µ) =

(R− I

)Conv

({σ−K(µ)

∣∣K ⊆ J})= Conv

({(R− I

)σ−K(µ)

∣∣K ⊆ J})⊆ F (q),

where the first equality is due to (38), and the last relation is due to (40) andthe convexity of F (q). This establishes the validity of the claim (39).

24

We now return to the proof of the proposition. Consider a function y(·)and a set of functions sµ(·), µ ∈ S, that together with q(·) satisfy the fluidmodel equations (6)–(10). We will show that q(·) satisfies the differentialinclusion q ∈ F (q) + λ. Fix some t ≥ 0 and let y = y(t). For any µ ∈ S, letsµ = sµ(t) and consider an n-dimensional vector yµ with entries

yµi =

{yiµi/

∑ν∈S sννi, if

∑ν∈S sννi 6= 0,

0, otherwise.

It follows that for i = 1, . . . , n,

(41)∑µ∈S

sµyµi =

∑µ∈S

sµyiµi∑ν∈S sννi

= yi.

Then,

(42)∑µ∈S

sµyµ = y.

On the other hand, for any µ ∈ S and for i = 1, . . . , n, either yµi = 0 or(8) implies that yµi = µi

(yi/∑

ν∈S sννi)≤ µi. Moreover, for any µ ∈ S and

any i ≤ n, if qi(t) > 0, then from (9), yµi = yi(µi/∑

ν∈S sννi)

= 0. LettingJ =

{j∣∣ qj(t) = 0

}, it then follows from the definition of DJ(µ) that for any

µ ∈ S, µ−yµ ∈ DJ(µ). Claim 1 then implies that(R−I

)(µ−yµ) ∈ F

(q(t)

).

Therefore, in light of (7) and the convexity of F(q(t)

), we have

(43)∑µ∈S

sµ(R− I

)(µ− yµ) ∈ F

(q(t)

).

Finally, from (6),

q(t) = λ+(R− I

)∑µ∈S

sµµ− y

= λ+

(R− I

)∑µ∈S

sµµ−∑µ∈S

sµyµ

= λ +

∑µ∈S

sµ(R− I

)(µ− yµ)

∈ F(q(t)

)+ λ,

(44)

where the second equality is due to (42).

25

We now prove the converse part of the proposition, that every unper-turbed trajectory x(·) of F (·) + λ, initialized in the positive orthant, is afluid solution. Consider a fluid solution q(·) corresponding to the arrival rateλ, initialized with q(0) = x(0) (for proofs of existence, see Appendix A of[4] and Lemma 9 of [20]).

It then follows from the first part of this proposition that q(·) is also anunperturbed trajectory of F (·)+λ. On the other hand, it is shown in [23] thatany subgradient dynamical system is a maximal monotone map12. Then,Corollary 4.6 of [30] implies that there is a unique unperturbed trajectorywith initial point x(0). Therefore, x(t) = q(t), for all t ≥ 0, and the desiredresult follows.

Proposition 2 has established that a fluid solution is a trajectory of theinduced FPCS system. We now show, in the next proposition, that theactual discrete-time queue length process is close to a perturbed trajectoryof the induced FPCS system. Note that even if the discrete-time system hascompletely deterministic and steady arrivals (no stochastic fluctuations) itcan still “chatter” around the boundary separating two regions with differentdrifts. The idea behind the proof is that this chattering can also be viewed asa perturbation of a straight trajectory. This is conceptually straightforward,but some of the details of the behavior in the vicinity of such boundariesare tedious.

Proposition 3 (Queue Length Processes as Trajectories of the InducedFPCS System). For any network, there exists a constant β that satisfiesthe following statement. Fix a λ ∈ Rn+, and let A(·) be an arrival functionand Q(·) be a corresponding queue length process. Then, there exists a right-continuous (perturbation) function U(·) : R+ → Rn, satisfying

(45) supτ≤t

∥∥U(τ)∥∥ ≤ ‖λ‖+ β + sup

τ≤t

∥∥ ∑k<τk∈Z+

(A(k)− λ

)∥∥, ∀ t ∈ R+,

and a corresponding perturbed trajectory X(·) of F (·) + λ such that

(46)∥∥X(k)−Q(k)

∥∥ ≤ β, ∀k ∈ Z+.

12A set-valued function F : Rn → 2Rn

is a monotone map if for any x1, x2 ∈ Rn and

any v1 ∈ F (x1) and v2 ∈ F (x2), we have(v1 − v2

)T (x1 − x2

)≤ 0. It is called a maximal

monotone map if it is monotone, and for any monotone map F , that satisfies F (x) ⊆ F (x)

for all x, we have F = F .

26

It is possible to strengthen Proposition 3 and ensure that we actuallyhave X(k) = Q(k) for every k ∈ Z+, thus strengthening (46). However,this stronger result is not needed for our future development, and wouldrequire a much more tedious construction of U(·). It is also worth pointingout here that the perturbed trajectory X(·) in our construction is alwaysnon-negative.

Before presenting the detailed proof, we provide some intuition on theissues that arise. Recall the network evolution rule Q(k + 1) = Q(k) +A(k) +

(R− I

)min

(µ(k), Q(k)

)in (3). Consider a time k at which there is

a unique maximizer µ(k), so that F (Q(k)) is a singleton, consisting of thesingle element (R − I)µ(k). Suppose furthermore that µ(k) � Q(k). In thiscase, (3) becomes Q(k + 1) = Q(k) +A(k) +

(R− I

)µ(k). Let

U(t) =

{−(t− k)

[(R− I

)µ(k) + λ

], for t ∈ (k, k + 1),

A(k)− λ, for t = k + 1.

In the interval t ∈ (k, k + 1), we have U(t) = −[(R − I

)µ(k) + λ

]∈

−F(Q(k)

)−λ. Suppose now that X(k) = Q(k). From the dynamics of X(·)

(cf. Definition 4), we have

X(t) = F(X(t)

)+ λ+ U(t) = F

(Q(k)

)+ λ+ U(t) = 0,

and the perturbed trajectory remains constant: X(t) = Q(k), for t ∈ (k, k+1). Finally, a discontinuity in U(·), at t = k + 1, forces X(t) to jump to thenew value Q(k + 1). Thus, in this example, we have a perturbed trajectorythat agrees with the queue process at integer times.

The above argument will however fail when min(µ(k), Q(k)

)6= µ(k),

because the received service in time slot k, i.e.,(R − I

)min

(µ(k), Q(k)

),

need not belong to F(Q(k)

). To circumvent this problem, we find a nearby

point y(k) such that(R−I

)min

(µ(k), Q(k)

)∈ F

(y(k)

). We then construct

U(·) so that it forces X(t) to jump to y(k) at time t = k, and stay there fort ∈ [k, k + 1).

Proof. We now provide the detailed proof of Proposition 3. We will bemaking use of the following known result:

Lemma 5 ([18] Lemma 5.1). Given a finite collection of half-spaces Hi ⊂Rn with non-empty intersection, there exists a constant c > 0 such that

(47) d(x ,⋂i

Hi

)≤ c ·max

id(x , Hi

), ∀ x ∈ Rn.

27

We begin with a claim.

Claim 2. There exists a constant κ that satisfies the following. Considerany x ∈ Rn+ and any µ ∈ S(x). Let J =

{j∣∣xj ≤ µj

}. Then, there exists a

y ∈ Rn+, such that∥∥y − x∥∥ ≤ κ and

(48)(R− I

)DJ(µ) ⊆ F (y),

where DJ(µ) is defined in (37).

Proof of Claim. We will leverage Lemma 5 to find y, and then useClaim 1 to prove (48). To every µ ∈ S, we associate an effective region Rµ:

(49) Rµ ={z∣∣µ ∈ S(z)

}.

Fix some x ∈ Rn+ and some µ ∈ S(x). The effective region Rµ is then theintersection of half-spaces of the form

(50) Hπ ={z ∈ Rn

∣∣ zT (I −R)µ ≥ zT (I −R)π} , π ∈ S.

Since µ ∈ S(x), we have x ∈ Hπ and d(x,Hπ

)= 0, for all π ∈ S. Let

(51) b , maxπ∈S

maxi≤n

πi

be the maximum service capacity of any queue over all service vectors.We also define, for every i ≤ n, two half spaces Hj+ =

{z ∈ Rn

∣∣ zj ≥ 0}

and Hj− ={z ∈ Rn

∣∣ zj ≤ 0}

. It follows from the definition of J that for anyj ∈ J , d

(x,Hj−

)≤ b, and from x ∈ Rn+ that d

(x,Hi+

)= 0, for all i ≤ n.

We define a set B, which is determined by the chosen x ∈ Rn+ and µ ∈ S(x),as follows:

B ={z ∈ Rn+ ∩Rµ | zj = 0 whenever xj ≤ µj

}.

Note that the set B is the intersection of finitely many half-spaces of theform Hπ, Hj+ , and Hj− . Note furthermore that B contains the origin andis therefore non-empty. Finally note that x has a distance of at most b fromeach of the half-spaces defining B. Therefore, Lemma 5 implies that

(52) d(x,B

)≤ c b,

for some constant c. In general, the constant c will depend on the particularx and µ under consideration. Note, however, that the set B is completelydetermined by µ ∈ S and the set of indices J =

{j∣∣xj ≤ µj

}. There are

28

finitely many choices for µ and for J , hence finitely many possible sets B.By taking the largest of the constants c associated with different sets B, wesee that c in (52) can be taken to be an absolute constant, independent ofx and µ.

Let y be the closest point to x in the set B. Letting κ = cb, (52) impliesthat

(53)∥∥y − x∥∥ ≤ κ,

where κ is an absolute constant.Since y ∈ B, we have y ∈ Rµ, so that µ ∈ S(y). Moreover, for any j ∈ J

i.e., if xj ≤ µj , we have yj = 0. Let J ′ = {j | yj = 0}. We then have J ⊆ J ′.We now apply this inclusion together with Claim 1, with y and J ′ playingthe role of q and J in the statement of the claim, to obtain(

R− I)DJ(µ) ⊆

(R− I

)DJ ′(µ) ⊆ F (y).

This completes the proof of the Claim.

For every t ∈ Z+, let µ(t) ∈ S(Q(t)

)be the action taken by the scheduler

at time t, J(t) ={j∣∣Qj(t) ≤ µj(t)}, and y(t)∈ Rn+ be a vector that satisfies∥∥y(t)−Q(t)

∥∥ ≤ κ and

(54)(R− I

)DJ

(µ(t)

)⊆ F

(y(t)

),

as in Claim 2.We now proceed to the main part of the proof of the proposition. With

a slight abuse of notation, we will write expressions such as∑

k<t even if tis non-integer, which we will interpret as

∑{k∈Z+|k<t}. We define the right-

continuous perturbation function U(·) as

U(t) =∑k≤t−1

(A(k)− λ

)+ y(btc)−Q(btc)

−(t− btc

)(λ+

(R− I

)min

(µ(btc), Q(btc)

)),

(55)

for all t ∈ R+.Let

(56) b , maxµ∈S

supQ∈Rn+

∥∥(R− I)min(µ,Q

)∥∥,

29

which is a finite constant. For any t ∈ R+,

supτ≤t

∥∥U(τ)∥∥ ≤ sup

τ≤t

∥∥∑k<τ

(A(k)− λ

)∥∥ + supk≤τ

∥∥y(k)−Q(k)∥∥ + ‖λ‖

+ supk≤τ

∥∥(R− I)min(µ(k), Q(k)

)∥∥≤ sup

τ≤t

∥∥∑k<τ

(A(k)− λ

)∥∥ + κ + ‖λ‖ + b.

(57)

Hence, (45) follows for β = κ+ b.For every t ∈ R+, let

(58) X(t) = y(btc).

Since∥∥y(t)−Q(t)

∥∥ ≤ κ, by construction, the desired equality (46) at integertimes is trivially true, with κ playing the role of β. It remains to show thatX(·) is a perturbed trajectory.

Let

(59) ξ(t) =(R− I

)min

(µ(btc), Q(btc)

)+ λ, ∀ t ∈ R+.

Since min(µ(btc), Q(btc)

)∈ DJ(btc)

(µ(btc)

), it follows from (54) and (58)

that ξ(t) ∈ F(X(t)

)+ λ, for all t ∈ R+.

We now show that

(60) X(t) =

∫ t

0ξ(τ) dτ + U(t), ∀t ≥ 0.

For any k ∈ Z+,

∫ k+1

kξ(τ) dτ + U(k + 1)− U(k) =

[(R− I

)min

(µ(k), Q(k)

)+ λ]

+[A(k)− λ+ y(k + 1)−Q(k + 1)

−(y(k)−Q(k)

)]=[(R− I

)min

(µ(k), Q(k)

)+A(k)

]−(Q(k + 1)−Q(k)

)+(y(k + 1)− y(k)

)= y(k + 1)− y(k)

= X(k + 1)−X(k),

(61)

30

where the third equality follows from the evolution rule (3). Moreover, forany t 6∈ Z,∫ t

btcξ(τ) dτ + U(t)− U(btc)

=(t− btc

)(λ+

(R− I

)min

(µ(btc), Q(btc)

))−(t− btc

)(λ+

(R− I

)min

(µ(btc), Q(btc)

))= 0

= X(t)−X(btc).

(62)

Then, a simple induction based on (61) and (62) implies (60), and thereforeX(·) is a perturbed trajectory of F (·) + λ corresponding to U(·), which isthe desired result.

5.4. Proof of Theorem 2. Having established a reduction from WMWpolicies to a MW policy (in Lemma 4), and a reduction from a network,operating under MW policy, to its induced FPCS system (cf. Propositions2 and 3), we can now leverage Theorem 4 to prove Theorem 2.

Proof of Theorem 2. Consider a network N that operates under a w-WMW policy. Let W = diag(w). Consider a queue length process Q(·) of Ncorresponding to an arrival A(·), and a fluid solution q(·) of N correspondingto an arrival rate vector λ, initialized at q(0) = Q(0). For each time t, letQ(t) = W 1/2Q(t), q(t) = W 1/2q(t), A(t) = W 1/2A(t), and λ = W 1/2λ. Let

θmin , miniw1/2i and θmax , maxiw

1/2i . Then, for any time t ∈ Z+,

(63)∥∥Q(t)− q(t)

∥∥ =∥∥W 1/2

(Q(t)− q(t)

)∥∥ ≥ θmin

∥∥Q(t)− q(t)∥∥,

and

(64)∥∥∑k≤t

(A(k)−λ

)∥∥ =∥∥W 1/2

∑k≤t

(A(k)−λ

)∥∥ ≤ θmax

∥∥∑k≤t

(A(k)−λ

)∥∥.Let N be, as in Lemma 4, a network that operates under a MW policy,

and for which Q(·) and q(·) are a queue length process and a fluid solution,respectively. Consider the induced FPCS system F of N . It follows fromProposition 3 that there exists a right-continuous perturbation function U(·)satisfying for any t ∈ R+,

(65) supτ≤t

∥∥U(τ)∥∥ ≤ ‖λ‖+ β + sup

τ≤t

∥∥∑k<τ

(A(k)− λ

)∥∥,

31

and a corresponding perturbed trajectory X(·) of F (·) + λ such that

(66)∥∥X(k)− Q(k)

∥∥ ≤ β, ∀k ∈ Z+,

where β is a constant independent of λ. Moreover, from Proposition 2, q(·)is an unperturbed trajectory of F (·) + λ. Then, applying Theorem 4 for theFPCS sytem F , we obtain for any t ∈ R+,

(67)∥∥X(t)− q(t)

∥∥ ≤ C supτ≤t

∥∥U(τ)∥∥,

for some constant C ≥ 1 that is independent of λ. Let C = C max(θmax , 2β) /θmin .Then, for any t ∈ Z+,∥∥Q(t)− q(t)

∥∥ ≤ 1

θmin

∥∥Q(t)− q(t)∥∥

≤ 1

θmin

(∥∥X(t)− q(t)∥∥ + β

)≤ 1

θmin

(C sup

τ≤t

∥∥U(τ)∥∥ + β

)≤ C

θmin

(supτ≤t

∥∥U(τ)∥∥ + β

)≤ C

θmin

(‖λ‖+ 2β + sup

τ≤t

∥∥∑k<τ

(A(k)− λ

)∥∥)

≤ C

θmin

(θmax ‖λ‖+ 2β + θmax sup

τ≤t

∥∥∑k<τ

(A(k)− λ

)∥∥)≤ C

(1 + ‖λ‖+ sup

τ≤t

∥∥∑k<τ

(A(k)− λ

)∥∥),where the relations are due to (63), (66), (67), C ≥ 1, (65), (64), and thedefinition of C, respectively.

6. Proof of Theorem 3. In this section we present the proof of The-orem 3. Part (a) is a corollary of Part (b). In the following, we first provepart (b), and then Part (c).

Proof of Part (b). We first consider a MW policy. We then use Lemma4 to extend the result to the case of WMW policies. The proof for a MWpolicy goes along the following lines. We break down a g(r)-long interval intosubintervals of length r. We define a “good” event Er, that the aggregatearrival in each r-long interval does not deviate much from its average, and

32

show that this event happens with high probability. We then use Theorem 2to show that Er implies that the queue length process stays close to a fluidsolution, in every subinterval. These fluid solutions are attracted to I(λ) (cf.Lemma 2), and hence also keep the queue length process near I(λ).

We now present a detailed proof. We fix some δ > 0 and some ε > 0 suchthat

(68) ε ≤ min(δ, α)/4C,

where C is the sensitivity constant provided by Theorem 2, and α = α(λ)is the constant in Lemma 2, associated with λ. For r ∈ N, we define a goodevent Er:

(69) Er ={ 1

r· supt∈[ir,(i+1)r)

∥∥ t∑τ=ir

(Ar(τ)− λr

)∥∥ ≤ ε, ∀ i ∈[0, g(r)T/r

]}.

We denote the complement of an event E by Ec. Then, for any r,

P(Ecr)

= P

(∃ i ∈ [0, g(r)T/r], s.t.

1

rsup

t∈[ir,(i+1)r)

∥∥ t∑τ=ir

(Ar(τ)− λr

)∥∥ > ε

)

≤bg(r)T/rc∑

i=0

P

(1

rsup

t∈[ir,(i+1)r)

∥∥ t∑τ=ir

(Ar(τ)− λr

)∥∥ > ε

)

=(⌊g(r)T/r

⌋+ 1)

P

(1

rsupt∈[0,r)

∥∥ t∑τ=0

(Ar(τ)− λr

)∥∥ > ε

)=(⌊g(r)T/r

⌋+ 1)o(

1/f(r, ε)),

(70)

where the inequality is due to the union bound, the second equality holdsbecause Ar(·) is a stationary process, and the last equality is because Ar(·),r ∈ N, is an f -tailed sequence of processes (cf. Definition 2). Also note thatin the last line, ε is a fixed constant, and the o(·) notation is with respect tor, as r goes to infinity.

From now on, and since λ is fixed, we use the simpler notation I, insteadof I(λ). Consider an r0 ∈ N such that for every r ≥ r0,∥∥λr − λ∥∥ ≤ Cε(71)

d(qr(0) , I

)≤ 2Cε,(72)

‖λr‖+ 1 ≤ rε.(73)

33

Such an r0 exists because of the convergence assumptions in the statementof the theorem, which also imply that λr is a bounded sequence. For everyr, i ∈ Z+, we define two events, Er,i and E′r,i:

Er,i , the event that d(Qr(ir), I)≤ 2Crε,

E′r,i , the event that d(Qr(t) , I

)≤ rδ, ∀ t ∈ [ir, (i+ 1)r),

(74)

where Qr(·) is the queue length process corresponding to the arrival Ar(·).Using Theorem 2, we will now show that for any r ≥ r0, Er implies Er,i andEr,i, for all i < g(r)T/r.

Claim 3. Fix some r ≥ r0. The occurrence of the event Er implies theoccurrence of the events Er,i and E′r,i, for all i < g(r)T/r.

Proof. The proof is by induction on i. For the base case, Er,0 followsfrom (72), because of Qr(0) = rqr(0) and the conic property of I, in (13).For the induction step, we will show that for any i < g(r)T/r, the events Erand Er,i imply Er,i+1 and E′r,i.

For any r ∈ Z+, let qri (t) be the fluid solution corresponding to arrivalrate λr, and initialized with qri (ir) = Qr(ir) at time ir. Fix an arbitraryt0 ≥ ir, and let q(·) be a fluid solution corresponding to arrival rate λ,initialized at q(t0) = qri (t0). From Proposition 2, q(·) and qri (·) are solutionsof q ∈ F (q) + λ and qri ∈ F (qri ) + λr, respectively, where F is the inducedFPCS system of the network. It then follows from Lemma 4.5 of [30] thatfor any t ≥ t0,

∥∥qri (t)− q(t)∥∥ ≤ (t− t0)∥∥λr − λ∥∥. As a result,

(75)d+

dt

∥∥qri (t)− q(t)∥∥ ∣∣∣t=t0

≤∥∥λr − λ∥∥.

Suppose that qri (t0) 6∈ I. Then, we also have q(t0) 6∈ I and Lemma 2 impliesthat

d+

dtd(qri (t), I

)∣∣∣t0≤ d+

dt

∥∥qri (t)− q(t)∥∥∣∣∣t=t0

+d+

dtd(q(t), I

)∣∣∣t=t0

≤∥∥λr − λ∥∥− α

≤ Cε− α≤ Cε− 4Cε

< −2Cε,

(76)

where the second inequality is from (75), and the fourth inequality is due to(68). Moreover, under Er,i, we have

d(qri (ir), I

)= d(Qr(ir), I

)≤ 2Crε.

34

Therefore, under Er,i,

d(qri (t), I

)≤ 2Crε, ∀ t ∈

[ir, (i+ 1)r

),(77)

d(qri((i+ 1)r

), I)

= 0.(78)

Then, for any r, i ∈ Z+, and under Er,i,

d(Qr((i+ 1)r

), I)≤∥∥Qr((i+ 1)r

)− qri

((i+ 1)r

)∥∥ + d(qri((i+ 1)r

), I)

=∥∥Qr((i+ 1)r

)− qri

((i+ 1)r

)∥∥≤ C

(1 + ‖λr‖ + sup

t∈[ir,(i+1)r)

∥∥ t∑τ=ir

(Ar(τ) − λr

)∥∥)≤ C

(rε+ rε

)= 2Crε,

where the second inequality is due to Theorem 2, and the last inequalityfollows from (73) and Er. This implies Er,i+1. Moreover,

supt∈[ir,(i+1)r)

d(Qr(t), I

)≤ sup

t∈[ir,(i+1)r)

(∥∥Qr(t)− qri (t)∥∥+ d(qri (t), I

))≤ sup

t∈[ir,(i+1)r)

∥∥Qr(t)− qri (t)∥∥ + 2Crε

≤ C

(1 + ‖λr‖ + sup

t∈[ir,(i+1)r)

∥∥ t−1∑τ=0

(Ar(τ)− λr

)∥∥) + 2Crε

≤ C(rε+ rε

)+ 2Crε

≤ rδ,

where the second inequality is due to (77), the third inequality follows fromTheorem 2, the fourth inequality is from (73) and Er, and the last inequalityis due to the definition of ε in (68). This implies E′r,i and completes the proofof the claim.

35

Back to the proof of the theorem, let us again fix some r ≥ r0. We have

P

(supt∈[0,T ]

d(qr(t) , I

)> δ

)= P

(sup

t≤g(r)Td(Qr(t)/r , I

)> δ

)

= P

(sup

t≤g(r)Td(Qr(t) , I

)> rδ

)

≤ P

⋃i≤g(r)T/r

E′cr,i

≤ P

(Ecr),

(79)

where the second equality is due to (13), the first inequality is from thedefinition of E′r,i, and the last inequality is due to Claim 3. Thus,

rf(r, ε)

g(r)P

(supt∈[0,T ]

d(qr(t) , I

)> δ

)≤

rf(r, ε)

g(r)P(Ec)

≤rf(r, ε)

g(r)

(⌊g(r)T/r

⌋+ 1)o(

1/f(r, ε))

≤ o

(⌊g(r)T/r

⌋+ 1

g(r)/r

)= o(1) −−−−→

r→∞0,

where the first two inequalities are due to (79) and (70), respectively; and theequality follows from the assumption lim infr→∞ g(r)/r > 0. This completesthe proof of Part (b) for the case of a MW policy.

We now present the proof of Part (b) for WMW policies. Suppose thata network N operates under a w-WMW policy, and consider an associatednetwork N as in Lemma 4, along with the variables and processes therein.It follows from Lemma 4 that if the constant function q(t) = q0 is a fluidsolution for network N , then the constant function q(t) = W 1/2q0 is a fluidsolution for N under a MW policy, and vice versa. Therefore, I = W 1/2I isthe set of invariant states for N , corresponding to arrival rate λ = W 1/2λ.

Let θmin , miniw1/2i and θmax , maxiw

1/2i . Let q(·) = W 1/2q(·), which is

the scaled version of the MW-driven process Q(·). Then, for any r ∈ N andany time t,

(80) d(qr(t) , I

)= d

(W−1/2qr(t) , W−1/2I

)≤ 1

θmind(qr(t) , I

).

36

In the same vein,

(81)∥∥Ar(t)− λr∥∥ =

∥∥W 1/2(Ar(t)− λr

)∥∥ ≤ θmax

∥∥Ar(t)− λr∥∥.As a result, Ar(·), r ∈ N, is a

(θmax f

)-tailed sequence of processes.

As in (68), fix some δ > 0 and some ε > 0 such that ε ≤ min(δθmin , α

)/4C,

where C is the sensitivity constant of the network operating under a MWpolicy (cf. Theorem 2) and α = α(λ) is the constant in Lemma 2, associatedwith λ. Using what we have already established for MW policies, it followsthat

(82)rθmax f

(r, ε)

g(r)P

(supt∈[0,T ]

d(qr(t) , I

)> δθmin

)−−−−→r→∞

0.

This together with (80) implies that

rf(r, ε)

g(r)P

(supt∈[0,T ]

d(qr(t) , I

)> δ

)

≤rf(r, ε)

g(r)P

(supt∈[0,T ]

1

θmind(qr(t) , I

)> δ

)−−−−→r→∞

0,

(83)

and Part (b) of the theorem follows.

Proof of Part (c). Throughout this proof we assume that we have fixed anetwork operated under a MW policy, as well as functions f(·) and g(·) withthe properties in the statement of the result, namely, limr→∞ rf

(r, ε)/g(r) =

0, for all ε > 0, and limr→∞ g(r)/r = ∞. It is not hard to see that theseproperties guarantee that there exists a function h : N→ R+ such that

limr→∞

rf(r, δ)

h(r)= 0, ∀ δ > 0,(84)

limr→∞

h(r)

g(r)= 0,(85)

limr→∞

h2(r)

rg(r)= ∞.(86)

Let h(r) = rg(r)/h(r). Then,

limr→∞

h(r)

r= lim

r→∞

g(r)

h(r)= ∞,(87)

limr→∞

h(r)

h(r)= lim

r→∞

rg(r)

h2(r)= 0.(88)

37

Before continuing with the main part of the proof, we establish that I(λ)is contained in a low-dimensional subspace. The intuition behind this fact isthat I(λ) is contained in the intersection of different effective regions, eachof which is a polyhedron. Recall that C stands for the capacity region of thenetwork.

Claim 4. Suppose that λ ∈ C but λ is not an extreme point of C. Then,there exists a nonzero vector v ∈ Rn such that vTI(λ) = {0}.

Proof. Since λ ∈ C, it follows from (5) that λ ∈(I−R

)Conv(S). There-

fore,

(89) λ =(I −R

)∑µ∈S

αµµ,

for some non-negative coefficients αµ that sum to one. Let us assume thatwe have fixed one particular set of such coefficients.

Consider some x ∈ I(λ), and let F be the induced dynamical system ofthe network. It follows from Proposition 2 and the definition of I(λ) that0 ∈ F (x) + λ. Therefore, λ ∈ −F (x). Since F (x) is the convex hull of thevectors µ that maximize xT (I −R)µ, we have

(90) λ =(I −R

) ∑ν∈S(x)

βνν,

for some non-negative coefficients βν that sum to one. This together with(89) implies that

(91)(I −R

)∑µ∈S

αµµ =(I −R

) ∑ν∈S(x)

βνν,

and as a result,

(92)∑µ∈S

αµ xT(I −R

)µ =

∑ν∈S(x)

βν xT(I −R

)ν.

Since S(x) is the set of maximizers of xT(I−R

)ν over ν ∈ S, it follows from

(92) that if αµ > 0, then

(93) µ ∈ S(x),

and this relation is true for all x ∈ I(λ). This is because otherwise, theleft-hand side of (92) would be strictly smaller than the right-hand side.

38

On the other hand, since λ is not an extreme point of C, it follows from (5)that λ is not an extreme point of

(I −R

)Conv(S). Then, there are at least

two service vectors µ, ν ∈ S, for which αµ, αν > 0 and(I−R

)µ 6=

(I−R

)ν.

Let v =(I − R

)(µ − ν

), which is a nonzero vector. As already shown in

(93), for any x ∈ I(λ), we have µ, ν ∈ S(x). Therefore, for any x ∈ I(λ), wehave xT

(I −R

)µ = xT

(I −R

)ν, i.e., vTx = 0, and the claim follows.

Using the above claim, consider an (n− 1)-dimensional subspace Z con-taining I(λ) and let w be a vector in Rn+\Z. By suitably scaling w, we canassume that d

(w,Z

)= 1. Then, w can be decomposed as w = z + v for

some z ∈ Z and some v 6= 0 which is orthogonal to Z and has unit norm,‖v‖ = 1. We also let

(94) b , maxµ∈S

supQ∈Rn+

∥∥(I −R)min(µ,Q

)∥∥,which is the maximum instantaneous change in the queue lengths due toservice, where R and S are the routing matrix and the set of service vectors,respectively. It follows from (87) and (88) that there exists some r0 ∈ N suchthat for any r ≥ r0,

(95) 2rδ + ‖λ‖+ b < h(r).

For every r ∈ N, let Ar(·) be an i.i.d. process with values

(96) Ar(t) =

{λ+ h(r)w, w.p. 1

h(r) ,

λ, w.p. 1− 1h(r) .

Since w ∈ Rn+, Ar(t) is non-negative for all r and t, and from (88),

(97) E {Ar(t)} = λ+h(r)

h(r)w −−−−→

r→∞λ.

For any r ∈ N and any T ∈ Z+, consider an event ErT :

ErT : the event that Ar(t) = λ+ h(r)w, for at least one t ∈[0, T ).

Consider some r ≥ r0. If Err does not occur, then for any t < r and anyδ > 0, ∥∥1

r

t∑τ=0

(Ar(τ)− λ

)∥∥ = 0 < δ.

39

Therefore,

P

(1

rsupt<r

∥∥ t∑τ=0

(Ar(τ)− λ

)∥∥ > δ

)≤ P

(Err)≤ r

h(r),

where the second inequality follows from (96) and the union bound. Thus,for any δ > 0,

limr→∞

f(r, δ) P

(1

rsupt<r

∥∥ t∑τ=0

(Ar(τ)− λr

)∥∥ > δ

)≤ lim

r→∞f(r, δ)

r

h(r)= 0,

where the equality is due to (84). Hence, Ar(·), r ∈ N, is an f -tailed sequenceof processes.

According to our definition of v, we have v = w − z, v is orthogonal tothe subspace Z containing I(λ), and ‖v‖ = d

(w,Z

)= 1. Therefore, for any

r ≥ r0, if Ar(t) = λ+ h(r)w for some t, then

vT(Qr(t+ 1)−Qr(t)

)= vT Ar(t) + vT

(R− I

)min

(µ(t), Q(t)

)≥ vT Ar(t) − b ‖v‖= h(r) + vTλ − b

≥ h(r) − ‖λ‖ − b

> 2rδ,

where the inequalities are due to (94), ‖v‖ = 1, and (95), respectively. Nowrecall that v is orthogonal to the subspace Z containing I(λ). Whenever wehave Ar(t) = λ + h(r) v, we have a jump of size at least 2rδ in a directionorthogonal to I(λ), from which it is not hard to see that

(98) max

(d(Qr(t+ 1), I(λ)

), d(Qr(t), I(λ)

))> rδ.

This implies that for any r ≥ r0, if the event Erg(r)T occurs, then

supt∈[0,g(r)T ]

d(Qr(t) , I(λ)

)> rδ.

40

Therefore,

limr→∞

P

(supt∈[0,T ]

d(qr(t) , I(λ)

)> δ

)= lim

r→∞P

(sup

t∈[0,g(r)T ]d(Qr(t) , I(λ)

)> rδ

)≥ lim

r→∞P(Erg(r)T

)≥ lim

r→∞

(1 −

(1− 1/h(r)

)g(r)T)≥ lim

r→∞

(1 − e−g(r)T/h(r)

)= 1,

where the last equality is due to (85). This completes the proof of Part (c).

Remark. The requirement, in Part (c) of the theorem, that λ is not anextreme point of C cannot be removed. For a trivial example, consider asingle queue with a single (one-dimensional) service vector µ = 1. Then,C = [0, 1]. If λ = 1, then every q ≥ 0 is an invariant state: I(1) = [0,∞).The conclusion of Claim 4 fails to hold and it is certainly impossible for thestate to be outside I(1).

Remark. The process Ar(t), used in the proof of Part (c) is not uniformlybounded, as it can have bursts of size h(r). With some additional effort,and a slightly more complicated proof, it is possible to carry out a construc-tion under which each component of Ar(t) is bounded by some constant,independent of r or t. The basic idea is that having excess arrivals (but ofbounded size) over a time period of length O(h(r)) has an effect comparableto a single burst of size O(h(r)).

7. Discussion. In this section we review our main results, their impli-cations, and directions for future research.

7.1. Main Results. We have established a deterministic bound on thesensitivity of queue length processes with respect to arrivals, under a Max-Weight policy. In particular, we showed that the distance between a queuelength process and a fluid solution remains bounded by a constant multipleof the deviation of the aggregate arrival process from its average. The boundallows for tight approximations of the queue lengths in terms of fluid solu-tions, which are much easier to analyse, and leads to a simple derivation ofa fluid limit result; cf. Corollary 1. We then exploited this sensitivity resultto prove matching upper and lower bounds for the time scale over which ad-ditive SSC occurs under a MW policy. As a corollary, we established strong

41

(additive) SSC of MW dynamics in diffusion scaling under conditions moregeneral than previously available.

For the case of i.i.d. arrivals, we established additive SSC over time in-tervals whose length scales exponentially with r. Such a result could also beproved with a more elementary argument, by viewing the distance from theinvariant set as a Lyapunov function and using the drift properties that weestablished; cf. Lemma 2. Nonetheless, such Lyapunov-based approaches arehard to generalize to broader classes of arrival processes. In contrast, oursensitivity bounds in Theorem 2 allow for the arrival processes to be arbi-trary and yield strong approximation results as long as the driving processhas some reasonable concentration properties.

7.2. Other Applications and Extensions. A similar sensitivity bound canalso be proved, using the same line of argument, for continuous time net-works e.g., with Poisson arrivals, operating under a MW policy. Similarly,for a more general class of stochastic processing networks, and under As-sumption 1 of [4], it is not hard to see that the fluid dynamics will again bea subgradient flow, and that our results can be extended to such systems.

In the same spirit, we believe that the results, including additive SSC, canbe extended to the case of backpressure policies13 [34, 8, 21], for networksin which the routing is no longer fixed, and where the different vectors µdetermine the set of links to be activated.

Another direction concerns SSC results in steady-state, that is, the extentto which the steady-state distribution will be concentrated in a neighbour-hood of the invariant set. Following a Lyapunov-based approach, severalworks [15, 14, 25] have proved exponential tail bounds for the steady-statedistribution, for the case of i.i.d. arrivals. We believe that Theorem 2 pro-vides an approach for establishing similar bounds for the case of non-i.i.d.and non-Markovian arrivals.

Finally, another problem where the fluid model turns out to be analyti-cally beneficial concerns delay stability under a MW policy in the presence ofheavy-tailed traffic. The references [19] and [20] studied the question whethera certain queue has finite expected delay (“delay stability”) in the presenceof other queues that are faced with heavy-tailed arrivals. They provided anecessary condition for this to be the case, in terms of certain propertiesof the associated fluid model, and raised the question whether under some

13 Backpressure policies are extensions of the MW policy, in which routing is no longerfixed. In particular, there is a fixed set of service vectors, where each service vector µassociates a rate µij to each link ij. A backpressure policy then chooses at each time aservice vector µ that maximizes

∑ij µij(Qi − Qj), where the sum is taken over all links

ij.

42

assumptions, this condition is also sufficient. Using the sensitivity results ofthe current paper, we are able to resolve a variant of this question as willbe reported in a forthcoming publication.

7.3. Open Problems. The sensitivity bound in Theorem 2 applies onlyto MW and WMW policies. It is not clear whether a similar bound holds forthe more general classes of MW-α and MW-f Policies. A similar questionalso arises about SSC: does Theorem 3 hold under a MW-α policy?

APPENDIX A: PROOF OF LEMMA 2

In this appendix, we present the proof of Lemma 2. The high level ideais that q(·) is a trajectory of a subgradient dynamical system correspondingto a convex function that has I(λ) as its set of minimizers. We then showthat in such a system, all trajectories are attracted to the set of minimizers,at a uniform rate.

Consider the convex function Φ in (36),

Φ(x) = maxµ∈S

((I −R)µ

)Tx,

and let F = −∂Φ be its subgradient field. Then, F is the induced FPCSsystem of the network (cf. Definition 5), and Proposition 2 states that fluidsolutions are the same as the non-negative unperturbed trajectories of q ∈F (q) + λ.

We define another convex function Φλ by Φλ(x) = Φ(x)−λTx, for all x ∈Rn. Since λ is in the capacity region, (5) implies that λ ∈

(I −R

)Conv(S).

Then, for any x ∈ Rn,

Φλ(x) = Φ(x)− λTx

= maxµ∈S

((I −R)µ

)Tx− λTx

≥ λTx− λTx= 0.

(99)

On the other hand, we have Φλ(0) = 0. It follows that the set of minimizersof Φλ, denoted by Γ, is

(100) Γ ={x ∈ Rn

∣∣Φλ(x) = 0}.

We use the shorthand notation I for I(λ). We now develop a characteri-zation and a simple polyhedral description of the set I. Recall that I is the

43

set of all x0 ∈ Rn+ for which the constant trajectory q(t) = x0 is a fluid solu-tion. As pointed out earlier, fluid solutions are the same as the non-negativeunperturbed trajectories of the system F (·) + λ, where F + λ = −∂Φλ. Inparticular, q(t) = x0 is a constant trajectory of this system if and only if∂Φλ(x0) contains the zero vector, which is the case if and only if x0 is aminimizer of Φλ, i.e., x0 ∈ Γ. Therefore,

(101) I = Γ ∩ Rn+.

For any µ ∈ S, we define the half-space Hµ ={x ∈ Rn |

((I − R)µ

)Tx ≤

λTx}

. We also let, for i = 1, . . . , n, Hi ={x ∈ Rn |xi ≥ 0

}. We will now

show that

(102) I =( ⋂µ∈S

Hµ

) ⋂ ( n⋂i=1

Hi

).

Suppose that x0 ∈ I. Then, x0 ∈ Rn+, i.e., x0 ∈ Hi, for all i. Further-more, x0 ∈ Γ and, from (100), Φλ(x0) = 0. From (99), this implies that

maxµ∈S((I − R)µ

)Tx0 = λTx0, so that

((I − R)µ

)Tx0 ≤ λTx0, or equiva-

lently x0 ∈ Hµ, for all µ. This argument can be reversed. If x0 belongs toall of the half-spaces Hµ, we have Φλ(x0) ≤ 0, which in light of (99) impliesthat Φλ(x0) = 0, or x0 ∈ Γ. If in addition x0 ∈ Rn+, then, from (101), weobtain x0 ∈ I. This concludes the proof of (102).

Having characterized the set I, we now turn our attention to the dynamicsthat drive trajectories towards I. Fix some µ ∈ S. For any x 6∈ Hµ, theclosest point to x in Hµ lies on the boundary of Hµ, i.e., on the subspaceWµ =

{z ∈ Rn

∣∣ gTµ z = 0}

, where gµ = (I − R)µ − λ. For any x ∈ Rn andµ ∈ S, either d(x,Hµ) = 0 (which includes the case where gµ = 0, so thatWµ = Hµ = Rn), or gµ 6= 0 and x 6∈ Hµ, in which case

(103) d(x, Hµ

)= d

(x, Wµ

)=

gTµ x∥∥gµ∥∥ ,which is the length of the projection of x on the normal vector to Wµ. Thus,in both cases, we have

(104) ‖gµ‖ · d(x, Hµ

)= max

(gTµ x, 0

).

44

Then, for any x ∈ Rn,

Φλ(x) = maxµ∈S

((I −R)µ− λ

)Tx

= maxµ∈S

gTµ x

= maxµ∈S

max(gTµ x, 0

)= max

µ∈S

∥∥gµ∥∥ · d(x, Hµ

).

Let ε = min{‖gµ‖ : µ ∈ S, gµ 6= 0

}. Without loss of generality, we assume

that ε > 0; otherwise, we would be dealing with a trivial system where everygµ is zero, and Φλ is identically zero. Note that for any x ∈ Rn+, we have

Φλ(x) ≥ εmaxµ∈S

d(x , Hµ

)= εmax

(maxµ∈S

d(x , Hµ

), maxi=1,...,n

d(x , Hi

)),

(105)

where the equality is because when x ∈ Rn+, we have d(x , Hi

)= 0, for all i.

Consider a fluid solution x(·), and fix a time t such that x(t) 6∈ I. Letx0 = x(t) and let z be the element of I that is closest to x0. Then, at timet,

d

dtd(x(t), I

)= lim

h↓0

d(x(t+ h), I

)− d

(x(t), I

)h

= limh↓0

d(x(t+ h), I

)− d

(x(t), z

)h

≤ limh↓0

d(x(t+ h), z

)− d

(x(t), z

)h

=d+

dtd(x(t), z

)=

(x(t)− z

)Tx(t)∥∥x(t)− z∥∥

=

(x0 − z

)Tx(t)

d(x0, I

) .

45

Putting everything together, we obtain

d

dtd(x(t), I

)≤

(x0 − z

)Tx(t)

d(x0, I

)≤ Φλ(z)− Φλ(x0)

d(x0, I

)= − Φλ(x0)

d(x0 , I

)≤ −

ε max(

maxµ∈S d(x0 , Hµ

), maxi=1,...,n d

(x0 , Hi

))d(x0 ,

(⋂µ∈S Hµ

) ⋂ (⋂ni=1Hi

))≤ −c ε;

where the second inequality above is because −x(t) is a subgradient of Φλ atx0; the equality is because z ∈ I and Φλ vanishes on I, by (101), (100); thethird inequality is due to (105) and (102); and the last inequality follows fromLemma 5, for the constant c therein. This completes the proof of Lemma 2,with α(λ) = c ε.

APPENDIX B: PROOF OF LEMMA 3

Let X1, . . . , Xt be i.i.d. random variables, taking values in [0, a], and letX =

(X1 + · · ·+Xt

)/t. Then, for any δ > 0, Hoeffding’s inequality [9] yields

(106) P( ∣∣X − E

{X}∣∣ > δ

)≤ 2 exp

(−2tδ2

a2

).

For any fixed r ∈ N and t ≤ r,

P

(1

r

∥∥ t∑τ=0

(Ar(τ)− λr

)∥∥ > δ

)≤

n∑i=1

P

(1

r

∣∣ t∑τ=0

(Ari (τ)− λri

)∣∣ > δ√n

)

=n∑i=1

P

(1

t+ 1

∣∣ t∑τ=0

(Ari (τ)− λri

)∣∣ > rδ

(t+ 1)√n

)

≤ 2n exp

−2(t+ 1

)a2

(rδ(

t+ 1)√n

)2

≤ 2n exp

(−(

2r

r + 1

)rδ

na2

),

(107)

46

where the first inequality holds because if the Euclidean norm is above δ,then at least one of the components must be above δ/

√n, together with the

union bound, and the third inequality is due to (106). As in the statementof the lemma, let β ∈ (0, 2) and f(r, δ) = exp

(βrδ/na2

). We then have

f(r, δ)

P

(1

rsupt≤r

∥∥ t∑τ=0

(Ar(τ)− λr

)∥∥ > δ

)

≤ f(r, δ) ∑t≤r

P

(1

r

∥∥ t∑τ=0

(Ar(τ)− λr

)∥∥ > δ

)

≤ f(r, δ)

2nr exp

(−(

2r

r + 1

)rδ

na2

)= 2nr exp

(βrδ

na2−(

2r

r + 1

)rδ

na2

)−−−−→r→∞

0,

(108)

where the second inequality is due to (107), and the last implication isbecause β < 2.

REFERENCES

[1] Andrews, M., Kumaran, K., Ramanan, K., Stolyar, A., Vijayakumar, R. andWhiting, P. (2004). Scheduling in a queuing system with asynchronously varyingservice rates. Probability in the Engineering and Informational Sciences 18 191-217.

[2] Bramson, M. (1998). State space collapse with application to heavy traffic limits formulticlass queueing networks. Queueing Systems 30 89–140.

[3] Dai, J. and Lin, W. (2008). Asymptotic optimality of maximum pressure policies instochastic processing networks. The Annals of Applied Probability 18 2239–2299.

[4] Dai, J. G. and Lin, W. (2005). Maximum pressure policies in stochastic processingnetworks. Operations Research 53 197–218.

[5] Dai, J. G. and Prabhakar, B. (2000). The throughput of data switches with andwithout speedup. In Proceedings IEEE INFOCOM 2000. Conference on ComputerCommunications. Nineteenth Annual Joint Conference of the IEEE Computer andCommunications Societies 2 556-564.

[6] Eryilmaz, A. and Srikant, R. (2012). Asymptotically tight steady-state queuelength bounds implied by drift conditions. Queueing Systems 72 311–359.

[7] Eryilmaz, A., Srikant, R. and Perkins, J. R. (2005). Stable scheduling policiesfor fading wireless channels. IEEE/ACM Transactions on Networking 13 411–424.

[8] Georgiadis, L., Neely, M. J. and Tassiulas, L. (2006). Resource allocation andcross-layer control in wireless networks. Now Publishers Inc.

[9] Hoeffding, W. (1963). Probability inequalities for sums of bounded random vari-ables. Journal of the American Statistical Association 58 13–30.

[10] Ji, T., Athanasopoulou, E. and Srikant, R. (2009). Optimal scheduling policiesin small generalized switches. In Proceedings of INFOCOM 2009 2921–2925. IEEE.

47

[11] Kang, W., Kelly, F., Lee, N. and Williams, R. (2009). State space collapseand diffusion approximation for a network operating under a fair bandwidth sharingpolicy. The Annals of Applied Probability 19 1719–1780.

[12] Kang, W. and Williams, R. (2012). Diffusion approximation for an input-queuedswitch operating under a maximum weight matching policy. Stochastic Systems 2277–321.

[13] Lin, X., Shroff, N. B. and Srikant, R. (2006). A tutorial on cross-layer optimiza-tion in wireless networks. IEEE Journal on Selected Areas in Communications 241452–1463.

[14] Maguluri, S. T., Burle, S. K. and Srikant, R. (2016). Optimal Heavy-TrafficQueue Length Scaling in an Incompletely Saturated Switch. ACM SIGMETRICSPerformance Evaluation Review 44 13–24.

[15] Maguluri, S. T. and Srikant, R. (2015). Heavy-traffic behavior of the MaxWeightalgorithm in a switch with uniform traffic. ACM SIGMETRICS Performance Evalu-ation Review 43 72–74.

[16] Maguluri, S. T. and Srikant, R. (2016). Heavy traffic queue length behavior in aswitch under the MaxWeight algorithm. Stochastic Systems 6 211–250.

[17] Maguluri, S. T., Srikant, R. and Ying, L. (2014). Heavy traffic optimal resourceallocation algorithms for cloud computing clusters. Performance Evaluation 81 20–39.

[18] Mannor, S. and Tsitsiklis, J. N. (2005). On the empirical state-action frequen-cies in Markov decision processes under general policies. Mathematics of OperationsResearch 30 545–561.

[19] Markakis, M. G., Modiano, E. and Tsitsiklis, J. N. (2016). Delay Stability ofBack-Pressure Policies in the Presence of Heavy-Tailed Traffic. IEEE/ACM Trans-actions on Networking 24 2046–2059.

[20] Markakis, M. G., Modiano, E. and Tsitsiklis, J. N. (2018). Delay Analysisof the Max-Weight Policy Under Heavy-Tailed Traffic via Fluid Approximations.Mathematics of Operations Research 43 460-493.

[21] Neely, M. J. (2010). Stochastic network optimization with application to commu-nication and queueing systems. Synthesis Lectures on Communication Networks 31–211.

[22] Reiman, M. I. (1984). Some diffusion approximations with state space collapse. InModelling and performance evaluation methodology 207–240. Springer.

[23] Rockafellar, R. (1970). On the maximal monotonicity of subdifferential mappings.Pacific Journal of Mathematics 33 209–216.

[24] Shah, D., Tsitsiklis, J. N. and Zhong, Y. (2010). Qualitative properties of α-weighted scheduling policies. In ACM SIGMETRICS Performance Evaluation Review38 239–250.

[25] Shah, D., Tsitsiklis, J. N. and Zhong, Y. (2016). On queue-size scaling for input-queued switches. Stochastic Systems 6 1–25.

[26] Shah, D. and Wischik, D. (2006). Optimal scheduling algorithms for input-queuedswitches. In Proceedings of INFOCOM 2006. IEEE Computer Society.

[27] Shah, D. and Wischik, D. (2012). Switched networks with maximum weight policies:Fluid approximation and multiplicative state space collapse. The Annals of AppliedProbability 22 70–127.

48

[28] Shakkottai, S., Srikant, R. and Stolyar, A. L. (2004). Pathwise optimality ofthe exponential scheduling rule for wireless channels. Advances in Applied Probability1021–1045.

[29] Sharifnassab, A., Tsitsiklis, J. N. and Golestani, J. (2019). Sensitivity to Cu-mulative Perturbations for a Class of Piecewise Constant Hybrid Systems. IEEETransactions on Automatic Control, to appear.

[30] Stewart, D. E. (2011). Dynamics with Inequalities: Impacts and Hard Constraints.SIAM.

[31] Stolyar, A. L. (2004). MaxWeight scheduling in a generalized switch: State spacecollapse and workload minimization in heavy traffic. Ann. Appl. Probab. 14 1–53.

[32] Stolyar, A. L. (2005). Maximizing queueing network utility subject to stability:Greedy primal-dual algorithm. Queueing Systems 50 401–457.

[33] Subramanian, V. G. (2010). Large deviations of max-weight scheduling policies onconvex rate regions. Mathematics of Operations Research 35 881–910.

[34] Tassiulas, L. and Ephremides, A. (1992). Stability properties of constrained queue-ing systems and scheduling policies for maximum throughput in multihop radio net-works. IEEE Transactions on Automatic Control 37 1936–1948.

[35] Wang, W., Maguluri, S. T., Srikant, R. and Ying, L. (2018). Heavy-trafficdelay insensitivity in connection-level models of data transfer with proportionally fairbandwidth sharing. ACM SIGMETRICS Performance Evaluation Review 45 232–245.

[36] Williams, R. J. (1998). Diffusion approximations for open multiclass queueing net-works: sufficient conditions involving state space collapse. Queueing systems 30 27–88.

[37] Xie, Q. and Lu, Y. (2015). Priority algorithm for near-data scheduling: Throughputand heavy-traffic optimality. In Computer Communications (INFOCOM), 2015 IEEEConference on 963–972. IEEE.

Documents

Fluctuation Bounds for the Max-Weight Policy, with ...jnt/Papers/P-18-MW+SSC-fin.pdf · The SSC results then follow from the property that the uid so-lutions are attracted to a low-dimensional