Queueing systems manuscript No. (will be inserted by the editor)

Balancing Admission Control, Speedup, and Waiting in Service Systems

Galit B. Yom-Tov · Carri W. Chan

Revised: January/2019

Abstract Admission control and speedup of service rates are used during periods of congestion in order to minimize customer waiting in different service settings. In a health care setting, this can mean sending patients to alternative care facilities which may take more time and/or provide less ideal treatment. Thus whilst congestion is detrimental to patient outcomes, controlling congestion also comes at a cost. In this work, we examine a multi-server queueing system which considers both admission control and speedup. We use dynamic programming to characterize properties of the optimal control and find that in some instances, the optimal policy has a simple form of a threshold policy. Leveraging this insight, we examine a queueing system where speedup is used when the number in the system exceeds some threshold and admission control is used when the number in the system exceeds some (potentially different) threshold. Using fluid analysis and a stochastic loss model, we establish approximations for the probability of speedup, the probability of admission control, and the expected queue length. We use the approximation analysis to characterize the region of the optimal solution and develop a greedy heuristic to derive a near optimal solution to the original optimization problem. We use simulation to demonstrate the quality of these approximations and find they can be quite accurate and robust. This analysis can provide insight to managers in service settings deciding how to balance admission and speedup controls: when and to what extent to use each.

Keywords Queues and service · Applications · Admission control · Service rate control · Dynamic programming · State-dependent queues · Healthcare operations

Mathematics Subject Classification (2000) 90B22 · 60K30 · 90B50 · 90C39

Galit B. Yom-Tov
Technion – Israel Institute of Technology, Haifa, Israel
E-mail: [email protected]

Carri W. Chan
Columbia Business School, New York, NY
E-mail: [email protected]


1 Introduction

Congestion creates a challenge in service systems, especially since managing congestion often has conflicting requirements to the provision of high quality service. Admission control and increasing service rates are approaches which have been used to overcome periods of congestion in different service systems. Admission control (through denial of service whereby customers find the service elsewhere or return later) has been used in hospital settings (e.g. Kim et al (2015), Allon et al (2013), and Song et al (2019)) and call-centers (e.g. Ormeci (2004)). Increasing service rate (by sacrificing service quality) has been used in production lines (Powell and Schultz 2004), email contact centers (Hasija et al 2010), and Intensive Care Units (ICUs) (Kc and Terwiesch 2012). Both decreasing admission rate and increasing service rate may come at a cost. Empirical and analytic studies on speedup, a form of service rate control (Kc and Terwiesch 2012, Chan et al 2012), and admission control (Kim et al 2015, Shmueli et al 2003) have demonstrated that each of these approaches can be detrimental to service quality. Consider, for example, congestion in an ICU. Denying ICU admissions to critical patients (so they are sent to a regular ward) can have a direct negative effect on patient outcomes since the patients will generally wait longer to receive treatment in the regular medical units. Admission denial also results in a loss of financial compensation to the hospital. On the other hand, speedup shortens the period of intensive care which can also result in poorer patient outcomes (Kc and Terwiesch 2012).

In this work, we consider how to balance admission control and service rate speedup and provide high-quality service to as many customers as possible. We ask: under what circumstances is speedup beneficial? When should customers' service rate be accelerated or, on the flip side, when should admission control be used to manage customer demand? What is the trade-off between ensuring quality-of-care for admitted customers versus providing access-to-service to new customers? The primary objective of this work is to develop an understanding of the pros and cons of joint speedup and admission control in a service setting. While our initial motivation is the healthcare setting, which is one interesting example of an environment where both policies can be applied simultaneously, we emphasize that these controls can be utilized in many different kinds of service settings, from call centers to restaurants and even production lines.

1.1 Related Work

Previous works have considered admission and service rate control simultaneously (see for example, Adusumilli and Hasenbein (2010), Ata and Shneorson (2006), Lee and Kulkarni (2014)). Ata and Shneorson (2006), Adusumilli and Hasenbein (2010) and Dimitrakopoulos and Burnetas (2017) all examined joint arrival and service rate control in a single server setting of an M/M/1 queue. In contrast, we consider a multi-server setting for which the queue dynamic is different. As such, we can consider the possibility of using those policies when the system is not fully occupied. Note that Ata and Shneorson (2006) consider how to set prices for service, when customers are both price and delay sensitive. However, we do not consider price-setting, as it is not a main driver in the primary healthcare application of interest for us. Perhaps most closely related to our work is that of Lee and Kulkarni (2014), which examines arrival and service rate control for a multi-server system and characterizes properties of the optimal policy. In contrast to this work we consider concave rather than convex cost functions and characterize the optimal policy as having a threshold property. We find that it is optimal to only use the maximum or minimum arrival and service rates.

Our analysis is also related to a line of papers that evaluated performance measures of service systems with state-dependent admission and/or service rate variation. For example, Bekker and Boxma (2007) developed exact analysis for a single-server queue with workload-dependent service rates and derived its steady-state distribution. Bekker et al (2004), Boxma and Vlasiou (2007) and Choi and Lim (2016) consider both arrival and service rates which depend on the workload in the single-server system. In the first two papers the rate functions are assumed to be continuous, which fits a number of reasonable service environments. However, as we shall later see, we find that in our setting the optimal rate functions should be discontinuous. Workload-dependent service rates were shown to create system instability even when capacity is sufficient to serve the nominal workload (i.e. ρ < 1) (see Chan et al (2014), Dong et al (2015)). Therefore, state-dependent systems should be analyzed carefully. Similar to Chan et al (2014), we utilize fluid models with discontinuous differential equations. However, here we consider a system in which admission control can also be applied, but without feedback. In Section 7, we consider an extension that includes customer returns.

Our work is primarily motivated by the healthcare setting where arrival rates and service rates can be altered via various controls such as admission control (rerouting), ambulance diversion, and speedup. To understand this, we develop and analyze a model of patient flow. There is a rich and growing body of literature which examines stochastic models of patient flow (e.g. Mandelbaum et al (2012), Huang et al (2015), Shi et al (2015), Dong and Perry (2019) among many others). To the best of our knowledge, none have considered the optimization of arrival rates and service rates.

1.2 Main Results and Discussion

We start with a stochastic optimization framework and characterize properties of the optimal policy (§3). Some of these properties are similar to those established in Ata and Shneorson (2006) for a single-server system and Lee and Kulkarni (2014) for a multi-server system. We further identify properties of the optimal policy under characterizations of the system's cost functions, which were not considered in these prior works. Specifically, we are able to demonstrate the optimality of a threshold policy. This policy can be defined by two thresholds, N_s and N_a, such that if the number of customers in the system is larger than N_s (N_a) it is optimal to use service rate (admission) control. While we are able to identify settings in which threshold policies are optimal, studying such policies is of broader interest as they are simple to implement and are often used in practice (e.g. Allon et al (2013)). As such, we use fluid models in Section 4 to derive high quality approximations for various steady-state performance metrics. Specifically, we consider three quality measures: the probability of using a reduced arrival rate, the probability customers will be sped up, and the expected waiting times. The threshold policy renders a discontinuous ODE, which requires the methods of di Bernardo et al (2008) and Filippov (1988) to analyze its equilibrium. We then develop a heuristic to approximate the expected queue length via a loss model whose parameters are based on the equilibrium analysis of the fluid model. These approximations allow for performance analysis given speedup and admission control thresholds. Moreover, they also enable one to solve a constraint satisfaction problem. In particular, a hospital manager can utilize our performance approximations to determine threshold values which will satisfy constraints on the amount of admission control and/or speedup used as well as the expected queue length.

We then utilize our performance measure approximations to solve a cost minimization problem, where costs are appropriately defined based on quality and financial considerations (§6). We find a set of solutions which appear to be 'zero cost'. This set of solutions suggests that, for all system parameters, admission control and/or speedup should begin before a queue builds. We find that it is important to carefully select thresholds, N_a and N_s, within this set and develop a greedy heuristic to do so that prioritizes admission control and speedup based on relative costs. In a case-study, we compare the performance of our heuristic to the optimal solution found via exhaustive search. We find that our heuristic is quite accurate and provides a near optimal solution to our problem. That is, the cost incurred by our heuristic policy is very close, and sometimes equal, to that achieved via exhaustive search over all possible policies. Moreover, it is quite robust to misspecifications in system cost parameters. We find that the optimal thresholds can be highly dependent on system dynamics (which are likely possible to estimate from empirical data) and cost functions (which may be possible to estimate coarsely, but are difficult to compare across the different sources of costs, i.e. admission rate versus service rate versus queue length). The potential challenges of precisely quantifying the relationship across these different costs reinforce the need for the performance evaluation we provide.

2 Stochastic Queueing Model

We now formally introduce our Markovian queueing model. This stochastic model captures the ability to utilize admission and speedup control, if desired. We consider a queueing system as depicted in Figure 1. This is a multi-server station with N servers. The queue captures the time a customer spends waiting from when an admission request is made until the customer is finally admitted to the server pool.

When the system becomes congested, the system administrator can utilize admission control to reduce the arrival rate to the system. This is undesirable as this can leave some customers without service. On the other hand, customer speedup can be utilized to discharge customers faster. This is also undesirable as it can degrade the quality of service. Unfortunately, it is sometimes necessary to use admission control and/or speedup and our goal is to understand how and when each mechanism should be utilized.

[Figure 1: arrivals at rate λ(X) enter a pool of N servers with service rate µ(X), followed by patient discharge.]

Fig. 1 Admission and Speedup control model: N-server system with arrival rate, λ(X), and service rate, µ(X), which can depend on the number of customers in the system, X.

Remark 1 In a healthcare setting, the server pool would correspond to a Medical Unit (MU) such as an ICU with N beds. Admission control could correspond to moving patients to another unit that is less preferable for their condition or cancelling surgeries (e.g. Kim et al (2015), Song et al (2019)) and patient speedup could correspond to discharging patients from the ICU earlier than normally desired (e.g. Kc and Terwiesch (2012)). Neither of these options is desirable as they can impact quality of care and revenues.

Similar to Lee and Kulkarni (2014), we consider a continuous time, infinite horizon, discounted cost formulation. Customers arrive according to a Poisson process whose rate depends on the selected λ ∈ [λ_L, λ_H]. Hence, we consider the situation where admission control is possible. The nominal arrival rate is λ_H; if admission control is in place, the arrival rate is reduced. Note that even with admission control, in many settings, the arrival rate is likely to be non-zero as there may be certain types of customers that cannot be turned away (e.g. very severe patients who cannot be rerouted). If admission control is employed, a cost rate of φ(λ), which is non-increasing in λ, is incurred. This cost captures the disutility of denied service.

Customer service times are exponentially distributed with nominal service rate µ_L. The system administrator can employ speedup, which increases the service rate to µ ∈ [µ_L, µ_H]. When speedup is utilized, a cost rate of ξ(µ), which is non-decreasing in µ, is incurred. This captures the undesirability of speedup, e.g. due to its impact on quality of service.

Let X denote the stochastic number of customers in our system. If there are X customers in the system, a holding cost rate of h(X), which is non-decreasing in X, is incurred. Without loss of generality, let h(0) = 0. The functional form of h can capture different types of systems. For example, if a waiting cost c_w is incurred for each customer waiting for service, then h(X) = c_w (X − N)^+. Alternatively, if there is simply a cost for having a queue, h(X) = c_w 1{X > N}. Our goal is to minimize the expected discounted cost incurred over an infinite horizon. Let X(t) be the state at time t, i.e. the number of customers in the system at time t. Our goal is to determine a policy u(t), which may depend on X(t), such that:

$$\lim_{t \to \infty} \mathbb{E}\left[ \int_0^t e^{-\beta \tau} g(X(\tau), u(\tau)) \, d\tau \right] \tag{1}$$

is minimized, where β > 0 is the discount rate and the cost rate is given as:

$$g(X, u) = h(X) + \phi(\lambda(u)) + \xi(\mu(u)).$$
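As a concrete reading of the cost rate g, the following minimal Python sketch evaluates it for the two holding-cost forms mentioned above. The function names and the linear choices of φ and ξ in the usage note are illustrative assumptions, not part of the model specification.

```python
def holding_linear(x, N, c_w):
    """h(X) = c_w * (X - N)^+ : cost per waiting customer."""
    return c_w * max(x - N, 0)


def holding_indicator(x, N, c_w):
    """h(X) = c_w * 1{X > N} : flat cost whenever a queue exists."""
    return c_w if x > N else 0.0


def cost_rate(x, lam, mu, h, phi, xi):
    """g(X, u) = h(X) + phi(lambda) + xi(mu) for the action u = (lam, mu)."""
    return h(x) + phi(lam) + xi(mu)
```

For instance, with N = 400, c_w = 1, and the hypothetical linear costs φ(λ) = 2(78 − λ) and ξ(µ) = 10(µ − 0.2), the state X = 405 under action (λ, µ) = (50, 0.2) incurs a cost rate of 5 + 56 + 0 = 61.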


These cost functions have been partially estimated in a number of different healthcare settings (see, for instance, Kim et al (2015), Kc and Terwiesch (2012), Chan et al (2017)). As such, in this work, we assume that they are given and are characterized by the following properties:

Assumption 1 We make the following assumptions on the cost functions:

1. φ(λ) ≥ 0 is non-increasing in λ.
2. ξ(µ) ≥ 0 is non-decreasing in µ.
3. h(X) ≥ 0 is non-decreasing in X.

Throughout our analysis we also assume that there are enough servers to satisfy all demand, irrespective of what control is employed. Thus, our control is about ensuring service quality, when stability is guaranteed.

Assumption 2 We make the following assumption about the number of servers in the system:

$$N > \lambda_H / \mu_L.$$

3 Characterizing the Optimal Policy: Arrival and Service Rates

We now turn our attention to characterizing the optimal policy which minimizes the cost given in (1). Some of these results are similar to those derived in Ata and Shneorson (2006) and Lee and Kulkarni (2014), which we include here for completeness. However, we also establish new properties of the optimal control, which are not included in the prior works. These properties, which admit a simple, easily implementable policy, are vital for the performance evaluation and optimization discussed in Section 4.

Using the uniformization technique (Bertsekas 2001), we transform our continuous time problem into an equivalent discrete time model. In particular, we can see that for any action u = (λ, µ), the rate to the next state transition in state X is given as:

$$v_X(u) = \begin{cases} \lambda(u), & X = 0; \\ \lambda(u) + (X \wedge N)\mu(u), & X \ge 1. \end{cases}$$

Hence, the maximum possible transition rate is v = λ_H + Nµ_H. We can write the Bellman equation for this optimization problem. The minimum discounted cost-to-go is:

$$\begin{aligned}
J(0) &= \frac{1}{\beta + v} \min_{\lambda \in [\lambda_L, \lambda_H]} \big\{ \phi(\lambda) + (v - \lambda) J(0) + \lambda J(1) \big\}, \\
J(X) &= \frac{1}{\beta + v} \min_{\lambda \in [\lambda_L, \lambda_H],\, \mu \in [\mu_L, \mu_H]} \big\{ h(X) + \phi(\lambda) + \xi(\mu) + \lambda J(X+1) \\
&\qquad\qquad + (X \wedge N)\mu J(X-1) + (v - \lambda - (X \wedge N)\mu) J(X) \big\}.
\end{aligned}$$

We define the following differential of the optimal discounted cost:

$$\Delta(X) = J(X) - J(X-1),$$

where by convention we define Δ(0) = 0. Hence, the Bellman equation can be rewritten as:

$$J(X) = \frac{1}{\beta + v} \Big[ h(X) + v J(X) + \min_{\lambda} \big\{ \phi(\lambda) + \lambda \Delta(X+1) \big\} + \min_{\mu} \big\{ \xi(\mu) - (X \wedge N)\mu \Delta(X) \big\} \Big].$$

The optimal policy is then

$$u^*(X) = (\lambda^*(X), \mu^*(X)) = \arg\min_{\lambda, \mu} \big\{ \phi(\lambda) + \lambda \Delta(X+1) + \xi(\mu) - (X \wedge N)\mu \Delta(X) \big\}.$$
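The rewritten Bellman equation can be explored numerically. The sketch below is one possible value-iteration scheme, not the method used in the paper: it assumes the state space is truncated at a hypothetical level `x_max` (arrivals in that state are simply ignored), that the action intervals are replaced by finite grids, and that the cost functions `phi`, `xi`, and `h` are supplied by the user.

```python
import numpy as np


def solve_discounted(lam_grid, mu_grid, phi, xi, h, N, beta, x_max,
                     tol=1e-10, max_iter=100_000):
    """Value iteration for the uniformized problem on states 0..x_max."""
    lam = np.asarray(lam_grid, dtype=float)
    mu = np.asarray(mu_grid, dtype=float)
    v = lam.max() + N * mu.max()                  # v = lam_H + N * mu_H
    states = np.arange(x_max + 1)
    busy = np.minimum(states, N).astype(float)    # X ∧ N, busy servers
    h_vals = np.array([h(x) for x in states], dtype=float)
    phi_vals = np.array([phi(l) for l in lam], dtype=float)
    xi_vals = np.array([xi(m) for m in mu], dtype=float)

    J = np.zeros(x_max + 1)
    for _ in range(max_iter):
        diff = np.diff(J)                         # diff[x] = J(x+1) - J(x)
        delta_up = np.append(diff, 0.0)           # Delta(x+1); 0 at the truncation level
        delta = np.insert(diff, 0, 0.0)           # Delta(x), with Delta(0) = 0
        # Rows index states, columns index candidate rates.
        arr = phi_vals[None, :] + lam[None, :] * delta_up[:, None]
        srv = xi_vals[None, :] - busy[:, None] * mu[None, :] * delta[:, None]
        srv_min = srv.min(axis=1)
        srv_min[0] = 0.0                          # no service-rate term in the J(0) equation
        J_new = (h_vals + v * J + arr.min(axis=1) + srv_min) / (beta + v)
        converged = np.max(np.abs(J_new - J)) < tol
        J = J_new
        if converged:
            break
    # Greedy rates from the last sweep (the service rate at state 0 is irrelevant).
    return J, lam[arr.argmin(axis=1)], mu[srv.argmin(axis=1)]


# Small illustrative instance (all numbers are made up for this sketch).
J, lam_star, mu_star = solve_discounted(
    lam_grid=np.linspace(1.0, 3.0, 5), mu_grid=np.linspace(1.0, 1.5, 5),
    phi=lambda l: 2.0 * (3.0 - l), xi=lambda m: 4.0 * (m - 1.0),
    h=lambda x: 1.0 * max(x - 5, 0), N=5, beta=0.5, x_max=40)
```

Because the iteration contracts with factor v/(β + v), convergence slows as v grows relative to β; near the truncation level the computed policy reflects the truncation assumption rather than the original infinite-state model.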

Our goal is to understand properties of the optimal policy. In particular, we will show that the optimal policy is monotonic in the number of customers in the system. That is, the optimal service rate µ*(X) is increasing in X and the optimal arrival rate λ*(X) is decreasing in X. This result is similar to that in Lee and Kulkarni (2014). The proof is provided in the Appendix for completeness.

Theorem 1 The optimal policy is monotonic in X. That is, if it is optimal to use speedup (admission control) in state X, it is also optimal to use speedup (admission control) in state X + 1. We have the following two results:

1. The optimal service rate, µ*(X), is non-decreasing in X.
2. The optimal admission rate, λ*(X), is non-increasing in X.

We now consider a special case of the cost functions φ(λ) and ξ(µ). In this case, we can further characterize the optimal policy as having binary notions of speedup and admission control. Note that this characterization of the cost functions was not considered in Ata and Shneorson (2006) or Lee and Kulkarni (2014); thus, the corresponding results are new.

Assumption 3 We make the following concavity assumptions about our cost functions.

1. The cost function φ(λ) ≥ 0 is concave in λ.
2. The cost function ξ(µ) ≥ 0 is linear in µ.

We first consider the arrival rate cost function, φ(λ). One could consider a linear function φ which would capture the clinical (or financial) cost associated with each denied admission. Generalizing to a concave cost function would imply that the differential cost of reducing the arrival rate is highest when starting to use admission control. This may hold when considering financial or operational costs. Reducing the arrival rate in a healthcare setting can be done in a number of ways; for instance, via ambulance diversion or canceling surgeries. If one considers there is administrative overhead to start canceling surgeries, it may be reasonable to assume that once the initial set up cost is incurred, further cancellations come at a lower cost.

A linear service rate cost function, ξ(µ), implies that the cost is convex in the expected service times (1/µ). For instance, in a healthcare setting, staying 1 less day for a patient who is expected to stay for 10 days is likely to be much less traumatic than it is for a patient who is expected to stay for only 2 days.

Under these assumptions, we can establish the following property of the optimal policy, which makes it highly desirable for implementation.


Theorem 2 Given Assumptions 2 and 3, the optimal admission control and speedup policy will only use the maximum and minimum arrival and service rates. That is:

1. λ*(X) ∈ {λ_L, λ_H}.
2. µ*(X) ∈ {µ_L, µ_H}.

Any µ ∈ (µ_L, µ_H) or λ ∈ (λ_L, λ_H) is sub-optimal.

The proof is provided in the Appendix. Note that Theorem 2 also holds if ξ(µ) is concave and non-decreasing; however, it is not clear when such a cost function might arise in practice. Of course, the result also holds for linear φ(λ).
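The intuition behind Theorem 2 can be checked numerically: for a fixed state, the arrival-rate term φ(λ) + λΔ(X + 1) is a concave function of λ plus a linear one, and a concave function on an interval attains its minimum at an endpoint. The sketch below illustrates this on a grid; the square-root form of φ, the constant `c`, and the sample values of Δ are hypothetical choices made only for this illustration.

```python
import numpy as np

lam_L, lam_H = 50.0, 78.0
lam_grid = np.linspace(lam_L, lam_H, 281)   # candidate arrival rates


def phi(lam, c=3.0):
    """Hypothetical concave, non-increasing admission cost with phi(lam_H) = 0."""
    return c * np.sqrt(np.maximum(lam_H - lam, 0.0))


def best_arrival_rate(delta_next):
    """Grid minimizer of phi(lam) + lam * Delta(X + 1) for a given Delta(X + 1)."""
    objective = phi(lam_grid) + lam_grid * delta_next
    return float(lam_grid[np.argmin(objective)])


for d in (0.0, 0.1, 0.5, 0.6, 2.0, 10.0):
    # Each printed minimizer is lam_L or lam_H, never an interior grid point.
    print(d, best_arrival_rate(d))
```

With these numbers the minimizer switches from λ_H to λ_L once Δ(X + 1) is large enough that the marginal cost of an additional customer outweighs the admission-control cost.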

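Together with the monotonicity established in Theorem 1, Theorem 2 implies that the optimal policy can be defined by two parameters, N_a and N_s, which represent thresholds at which to begin admission control and speedup, respectively. That is, the optimal policy is such that:

– Admission Control:
$$\lambda^*(X) = \begin{cases} \lambda_H, & \text{if } X < N_a; \\ \lambda_L, & \text{if } X \ge N_a. \end{cases}$$

– Speedup Control:
$$\mu^*(X) = \begin{cases} \mu_L, & \text{if } X < N_s; \\ \mu_H, & \text{if } X \ge N_s. \end{cases}$$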
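A threshold policy of this kind is straightforward to encode. The helper below is a minimal sketch (the function name and argument order are assumptions); consistent with Theorem 1 and with Equations (2) and (3), it uses the nominal rates λ_H and µ_L below the thresholds and switches to λ_L and µ_H at or above them.

```python
def threshold_rates(x, lam_L, lam_H, mu_L, mu_H, N_a, N_s):
    """Return (arrival rate, per-server service rate) in state x."""
    lam = lam_H if x < N_a else lam_L   # admission control once x >= N_a
    mu = mu_L if x < N_s else mu_H      # speedup once x >= N_s
    return lam, mu
```

For example, with N_a = 200 and N_s = 300, state 250 has admission control active but no speedup, so the rates are (λ_L, µ_L).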

Remark 2 We have identified conditions under which threshold policies are optimal. However, understanding the behavior of threshold policies is of broader interest, even in situations where Assumption 3 does not hold, as there is evidence that such policies are often used in practice. For instance, hospitals often go on ambulance diversion (altering the arrival rate between λ_H and λ_L) once the number of patients waiting exceeds some predefined threshold. Additionally, speedup in the ICU has been shown to take place once the number of available beds goes below some value (Kc and Terwiesch 2012).

3.1 Average Cost Problem

Thus far, we have considered the infinite-horizon, discounted cost problem. It turns out that our structural results for the discounted cost problem also extend to the average cost problem. In this case, the objective is to minimize the average cost per stage:

$$\lim_{t \to \infty} \frac{1}{t} \mathbb{E}\left[ \int_0^t g(X(\tau), u(\tau)) \, d\tau \right],$$

where the cost rate is the same as before.

Proposition 1 Given Assumptions 2 and 3, the optimal admission control and speedup policy which minimizes the average cost will only use the maximum and minimum arrival and service rates. That is:

1. λ*(X) ∈ {λ_L, λ_H}.
2. µ*(X) ∈ {µ_L, µ_H}.

Any µ ∈ (µ_L, µ_H) or λ ∈ (λ_L, λ_H) is sub-optimal.

The proof is provided in the Appendix.


3.2 Dynamics Under Threshold Policies

Since we established that the optimal policy is of threshold type, and in light of Remark 2, we restrict our analysis from now on to policies of this form. Our next step is to connect the thresholds, N_a and N_s, which specify when admission control and speedup are used, to the system performance measures. Establishing such a connection is also helpful in case the cost functions (φ, ξ, h) are not explicitly defined or are hard to estimate.

Under a threshold policy, our system becomes an Erlang-C queueing model with state-dependent arrival and service rates. Its dynamics satisfy the following stochastic equation:

$$X(t) = X(0) + A\left( \int_0^t \big[ \mathbf{1}_{\{X(s) < N_a\}} \lambda_H + \mathbf{1}_{\{X(s) \ge N_a\}} \lambda_L \big] \, ds \right) - D\left( \int_0^t \big[ \mathbf{1}_{\{X(s) < N_s\}} \mu_L + \mathbf{1}_{\{X(s) \ge N_s\}} \mu_H \big] (X(s) \wedge N) \, ds \right),$$

where A and D are independent Poisson processes with rate 1.
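Since the controlled system is a birth-death Markov chain, its sample paths can be generated event by event. The sketch below is one way to do so (a standard Gillespie-style simulation, not necessarily the simulator used for the figures in this paper); the function name, the returned dictionary keys, and the short horizon in the usage example are assumptions of this sketch. It assumes the total event rate is positive in every state, which holds whenever λ_L > 0.

```python
import random


def simulate_threshold_queue(lam_L, lam_H, mu_L, mu_H, N, N_a, N_s,
                             horizon, warmup=0.0, x0=0, seed=0):
    """Simulate X(t) under thresholds (N_a, N_s); return time averages after warm-up."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    area = queue_area = time_adm = time_speed = 0.0
    while t < horizon:
        lam = lam_H if x < N_a else lam_L          # state-dependent arrival rate
        mu = mu_L if x < N_s else mu_H             # state-dependent service rate
        dep = mu * min(x, N)                       # total departure rate
        total = lam + dep
        t_next = min(t + rng.expovariate(total), horizon)
        w = t_next - max(t, warmup)                # post-warm-up time spent in state x
        if w > 0:
            area += w * x
            queue_area += w * max(x - N, 0)
            time_adm += w * (x >= N_a)
            time_speed += w * (x >= N_s)
        t = t_next
        if t < horizon:
            # Next event is an arrival with probability lam / total.
            x += 1 if rng.random() * total < lam else -1
    span = horizon - warmup
    return {"mean_in_system": area / span,
            "mean_queue": queue_area / span,
            "p_admission": time_adm / span,
            "p_speedup": time_speed / span}


# Example with the rates used in this section and N_a = N_s = 200.
res = simulate_threshold_queue(50.0, 78.0, 0.2, 0.2857, N=400, N_a=200, N_s=200,
                               horizon=200.0, warmup=20.0, seed=1)
```

The time averages returned here estimate E[X], E[(X − N)^+], P(X ≥ N_a), and P(X ≥ N_s); a much longer horizon than in this example would be needed for precise estimates.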

Figure 2 presents examples of sample paths and distributions of the number of customers in the system (X) for different speedup (N_s) and admission control (N_a) thresholds. These examples were simulated with the following parameters: µ_L = 0.2; µ_H = 0.2857; λ_L = 50; λ_H = 78; N = 400; N_a, N_s ∈ [100, 500] (simulation initiated at X(0) = 0; warm-up and total length were 100000 and 500000 units of time, respectively). We observe two very different types of behaviour. The first type is characterized by a very concentrated distribution (similar to a Laplace distribution): the corresponding sample path changes very fast around a specific value (e.g. N_a = 200; N_s = 200). The second type is characterized by a sample path that moves “slower” in a wider range of values, like a Brownian motion (e.g. N_a = 200; N_s = 300); in such cases, the distribution looks Normal/Poisson (or a truncated version of them, e.g. when N_a = 450; N_s = 450). Interestingly, which dynamic appears depends not just on the system load, but also on the specific thresholds chosen. Given these different types of distributions for the number of customers in the system (X), we aim to first understand the behavior of the expected number of customers in the system using a fluid model. We will show that its steady-state analysis can characterize the conditions in which each dynamic arises.

4 Performance Evaluation: Fluid Analysis

In order to get a better understanding of the impact of the thresholds on the steady-state performance metrics of interest (e.g., the probability of speedup, P(X ≥ N_s), the probability of admission control, P(X ≥ N_a), and the expected queue length, E[(X − N)^+]) we will use a fluid approximation and a heuristic. The fluid approximation is used for the probability of speedup and admission control, and the heuristic for the queue length (since queue length is 0 in the fluid model). The fluid also helps in identifying the conditions in which a queue accumulates. We will start with deriving these approximations in this section and then use simulation to examine the quality of the approximations in Section 5.

[Figure 2: panel (a) Sample path; panel (b) Distribution, showing frequency against number in system. Curves are shown for (N_a, N_s) = (100, 100), (200, 200), (200, 300), (300, 200), (250, 250), (260, 260), (300, 390), (450, 450), and (500, 500).]

Fig. 2 Sample path and distribution of the number of people in the system (X) for different speedup (N_s) and admission control (N_a) thresholds

We denote the fluid function of our queueing network by X = {X(t), t ≥ 0}. Here X(t) is the fluid content of customers in the system at time t. We assume that arrivals and departures occur deterministically at the specified rates and also regard the number of customers and servers as continuous quantities. Thus, the fluid arrives deterministically and continuously at a state dependent rate λ(X). Fluid is served deterministically at rate µ(X)(X ∧ N), where (X ∧ N) is the number of occupied servers (beds). The arrival rate function (λ(·)) and the service rate function (µ(·)) are discontinuous. These functions are given by Equations (2) and (3), respectively:

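$$\lambda(X) = \begin{cases} \lambda_H, & \text{if } X < N_a, \\ \lambda_L, & \text{if } X \ge N_a, \end{cases} \tag{2}$$

$$\mu(X) = \begin{cases} \mu_L, & \text{if } X < N_s, \\ \mu_H, & \text{if } X \ge N_s. \end{cases} \tag{3}$$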

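The dynamics of our model can be captured by the following Ordinary Differential Equation (ODE) with discontinuous Right Hand Side (RHS):

$$F(X(t)) \triangleq \dot{X}(t) = \mathbf{1}_{\{X(t) < N_a\}} \lambda_H + \mathbf{1}_{\{X(t) \ge N_a\}} \lambda_L - \mathbf{1}_{\{X(t) < N_s\}} \mu_L (X(t) \wedge N) - \mathbf{1}_{\{X(t) \ge N_s\}} \mu_H (X(t) \wedge N). \tag{4}$$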

This ODE is discontinuous in X, but continuous in t. From (4), it is easy to see that the derivative values, Ẋ(t), which specify the flow dynamics, are discontinuous at X(t) = N_a and X(t) = N_s. We first establish that there exists a solution to our ODE.

Theorem 3 There exists a solution of the problem defined by the ODE (4) for any initial condition x_0 ∈ [0, X_max], where X_max < ∞ is an arbitrary finite constant.
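To get a feel for the fluid trajectories, ODE (4) can be integrated numerically. The sketch below uses a plain fixed-step Euler scheme, which is an assumption of this illustration rather than the analysis method of the paper: it does not compute a Filippov sliding solution exactly, and when the trajectory reaches a threshold at which the drift changes sign it chatters within a band whose width is of the order of the step size times the drift. Function and parameter names are hypothetical.

```python
def fluid_rhs(x, lam_L, lam_H, mu_L, mu_H, N, N_a, N_s):
    """Right-hand side F(x) of ODE (4)."""
    lam = lam_H if x < N_a else lam_L
    mu = mu_L if x < N_s else mu_H
    return lam - mu * min(x, N)


def integrate_fluid(x0, horizon, dt=0.01, **params):
    """Fixed-step Euler path of the fluid model, returned as a list of states."""
    path = [float(x0)]
    for _ in range(int(round(horizon / dt))):
        path.append(path[-1] + dt * fluid_rhs(path[-1], **params))
    return path


rates = dict(lam_L=50.0, lam_H=78.0, mu_L=0.2, mu_H=0.2857, N=400)

# Thresholds at 200: the drift is positive just below 200 and negative just above,
# so the Euler path settles in a narrow band around the threshold.
x_low = integrate_fluid(0.0, 100.0, N_a=200, N_s=200, **rates)[-1]

# Thresholds at 500: the controls are never reached and the path approaches
# lam_H / mu_L = 390.
x_high = integrate_fluid(0.0, 100.0, N_a=500, N_s=500, **rates)[-1]
```

The two cases mirror the two behaviours seen in the simulated sample paths: a path pinned at a threshold versus a path that relaxes to an interior equilibrium.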


[Figure 3: six panels, one for each of (N_a, N_s) = (100, 100), (200, 200), (300, 200), (260, 260), (300, 390), and (500, 500). Each panel shows a sample path, the average, and the fluid model (X).]

Fig. 3 A sample path, average dynamics of 50 replications and fluid model for different speedup (N_s) and admission control (N_a) thresholds

This is a direct result of Theorem 1 on page 77, Chapter 2, Section 7 of Filippov (1988) (see Appendix).
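To make the dynamics in (4) concrete, the following is a minimal sketch that integrates the fluid ODE with a plain forward-Euler scheme. The function names, the step size, and the choice of Euler integration are illustrative assumptions and not part of the original analysis; because the right-hand side is discontinuous, the numerical path chatters within roughly one step of a threshold when the equilibrium sits on that threshold.

```python
def fluid_rhs(x, lam_l, lam_h, mu_l, mu_h, n, n_a, n_s):
    """Right-hand side F(x) of the fluid ODE (4)."""
    lam = lam_h if x < n_a else lam_l   # admission control at or above N_a
    mu = mu_l if x < n_s else mu_h      # speedup at or above N_s
    return lam - mu * min(x, n)         # inflow minus outflow from busy servers


def simulate_fluid(x0, horizon, dt, **params):
    """Forward-Euler path of X(t) on [0, horizon]; returns the list of values."""
    steps = int(round(horizon / dt))
    path = [x0]
    for _ in range(steps):
        x = path[-1]
        # Fluid content cannot become negative.
        path.append(max(0.0, x + dt * fluid_rhs(x, **params)))
    return path


if __name__ == "__main__":
    # Large-system parameters quoted in the text.
    large = dict(lam_l=50.0, lam_h=78.0, mu_l=0.2, mu_h=0.2857, n=400)
    for n_a, n_s in [(500, 500), (200, 300), (300, 390)]:
        x_end = simulate_fluid(0.0, 200.0, 0.01, n_a=n_a, n_s=n_s, **large)[-1]
        print(n_a, n_s, round(x_end, 1))
```

With these parameters the end points should settle near λH/µL, near λL/µL, and near the admission threshold, respectively.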

We derived the fluid approximation directly. Proving convergence is beyond the scope of this paper, but we demonstrate in Figure 3 that the fluid model we suggest appears to coincide with the system average sample path in various conditions. To create the average sample path we repeated the simulation 50 times. (For this simulation we used the same parameters as in Figure 2.) The average of these 50 sample paths is marked in red and exhibits similar dynamics and values to the fluid approximation (in yellow). Additionally, Figure 4 demonstrates that the stochastic process seems to converge to the fluid model as the system size grows. For this simulation we used three system sizes: a small system with λL = 4, λH = 7.5, N = 40, and Na, Ns ∈ [18, 40]; a medium system with λL = 20, λH = 30, N = 155, and Na, Ns ∈ [90, 150]; and a large system with λL = 50, λH = 78, N = 400, and Na, Ns ∈ [200, 400]. Service rates were identical in all system sizes (µL = 0.2, µH = 0.2857). For every scenario we simulated three threshold values. (All simulations were initiated at X(0) = 0; the warm-up period and total length of simulation were 100000 and 500000 units of time, respectively.)

Our next goal is to understand the long-term behavior of this fluid system, i.e. the behavior as t → ∞. Let x_{x0} be the limiting value such that:

$$
\lim_{t \to \infty} X(t) = x_{x_0}, \quad \text{for } X(0) = x_0,
$$

where its potential dependence on the initial condition x0 is denoted by the subscript in x_{x0}. In theory, this limit may be infinite, may not be unique, or may not even exist. As we will see later, the limit is finite and unique under Assumption 2.

We provide a couple of definitions which will be helpful in describing the system behavior. The system is considered to be unstable if the fluid content of customers in the system grows without bound over time. Formally,


(a) Small system (b) Medium system (c) Large system

Fig. 4 Sample path (X; color) and fluid (X; black) of the number of people in the system for different system sizes (each graph presents three combinations of speedup (Ns) and admission control (Na) thresholds)

Definition 1 (Unstable System) We say that a system defined by the ODE Ẋ = F(X) is unstable if for any initial condition x0:

$$
\lim_{t \to \infty} X(t) = \infty.
$$

As our system is an Erlang-C queue with bounded arrival and service rates, it is easy to see that the system is unstable if ρ = λ/(µN) > 1, where λ and µ capture the average arrival and service rates, respectively. Since λ ≤ λH and µ ≥ µL, if λH/(µL N) < 1, the system will be stable. Therefore, under Assumption 2 the system is always stable.

Following di Bernardo et al (2008), we consider the following definitions for equilibria:

Definition 2 (Equilibrium, or fixed point) A point x0 is an equilibrium of the ODE Ẋ = F(X) if x_{x0} = x0. That is,

$$
X(t) = x_0 \quad \text{for all } t > 0 \text{ if } X(0) = x_0.
$$

We refer to an equilibrium as a Globally Stable equilibrium if the trajectory converges to the same equilibrium point, irrespective of the initial condition.

Definition 3 (Globally Stable Equilibrium) x is said to be globally stable if the following two conditions hold:

1. For any ε > 0, there exists a δ > 0 such that ‖x0 − x‖ < δ ⇒ ‖X(t) − x0‖ < ε, ∀t > 0;

2. For all initial conditions x0: lim_{t→∞} X(t) = x_{x0} = x.

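We define the following parameters, which will be useful in describing the system dynamics:

$$
x_{LL} = \frac{\lambda_L}{\mu_L}, \quad x_{HL} = \frac{\lambda_H}{\mu_L}, \quad x_{LH} = \frac{\lambda_L}{\mu_H}, \quad x_{HH} = \frac{\lambda_H}{\mu_H}. \tag{5}
$$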


One can think of these parameters as the offered load on the system under different arrival and service rate dynamics, i.e. when admission and/or speedup control is always/never used. Note that by assumption, the following relationship holds:

xLH < xLL, xHH < xHL.
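As a quick numerical illustration (our own arithmetic, using the large-system parameters quoted above: λL = 50, λH = 78, µL = 0.2, µH = 0.2857, N = 400), the four offered loads in (5) evaluate to:

$$
x_{LL} = \frac{50}{0.2} = 250, \quad x_{HL} = \frac{78}{0.2} = 390, \quad x_{LH} = \frac{50}{0.2857} \approx 175, \quad x_{HH} = \frac{78}{0.2857} \approx 273.
$$

All four values are below N = 400, and the ordering x_{LH} < x_{LL} and x_{HH} < x_{HL} can be read off directly.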

Theorem 4 Under Assumption 2, the long-term behavior of the fluid queueing system in (4) is broken into the following cases:

1. Case 1: Admission Control First (ACF) (Na < Ns):
   1.1 xHL is a globally stable equilibrium if xHL ≤ Na.
   1.2 Na is a globally stable equilibrium if xLL ≤ Na ≤ xHL.
   1.3 xLL is a globally stable equilibrium if Na ≤ xLL ≤ Ns.
   1.4 Ns is a globally stable equilibrium if xLH ≤ Ns ≤ xLL.
   1.5 xLH is a globally stable equilibrium if Ns ≤ xLH.

2. Case 2: Speedup Control First (SCF) (Ns < Na):
   2.1 xHL is a globally stable equilibrium if xHL ≤ Ns.
   2.2 Ns is a globally stable equilibrium if xHH ≤ Ns ≤ xHL.
   2.3 xHH is a globally stable equilibrium if Ns ≤ xHH ≤ Na.
   2.4 Na is a globally stable equilibrium if xLH ≤ Na ≤ xHH.
   2.5 xLH is a globally stable equilibrium if Na ≤ xLH.

3. Case 3: Simultaneous Admission and Speedup Control (SASC) (Ns = Na):
   3.1 xHL is a globally stable equilibrium if xHL ≤ Ns = Na.
   3.2 Na = Ns is a globally stable equilibrium if xLH ≤ Na = Ns ≤ xHL.
   3.3 xLH is a globally stable equilibrium if Ns = Na ≤ xLH.

Theorem 4 summarizes the long-term behavior of the fluid model presented in Equation (4). As expected, the long-term behavior is highly dependent on system parameters for arrival and service times, as well as the control variables for when to begin speedup and admission control, (Na, Ns). Speedup and admission control may begin before a queue forms, if the thresholds are less than N, or after, if they are greater than N. The proof of the theorem can be found in the Appendix and utilizes Lyapunov techniques under the Filippov (1988) and di Bernardo et al (2008) approach for differential equations with discontinuous RHS. This approach uses a smoothing technique for the ODE around the points of discontinuity, which results in a probabilistic version of the fluid model.
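The case analysis of Theorem 4 can be read as a simple lookup. The sketch below is one possible encoding, written under Assumption 2; the function name and the returned case labels are our own illustrative choices. Where two adjacent subcases overlap on a boundary, the equilibria coincide, so the order of the checks does not change the returned value.

```python
def fluid_equilibrium(n_a, n_s, lam_l, lam_h, mu_l, mu_h):
    """Return (case_label, equilibrium) following the case list of Theorem 4.

    Assumes the stability condition (Assumption 2), so every offered load
    is below N and the min(X, N) term never binds at equilibrium.
    """
    x_ll, x_hl = lam_l / mu_l, lam_h / mu_l
    x_lh, x_hh = lam_l / mu_h, lam_h / mu_h

    if n_a < n_s:  # Case 1: admission control first (ACF)
        if n_a >= x_hl:
            return "1.1", x_hl
        if n_a >= x_ll:
            return "1.2", n_a
        if n_s >= x_ll:
            return "1.3", x_ll
        if n_s >= x_lh:
            return "1.4", n_s
        return "1.5", x_lh

    if n_s < n_a:  # Case 2: speedup control first (SCF)
        if n_s >= x_hl:
            return "2.1", x_hl
        if n_s >= x_hh:
            return "2.2", n_s
        if n_a >= x_hh:
            return "2.3", x_hh
        if n_a >= x_lh:
            return "2.4", n_a
        return "2.5", x_lh

    # Case 3: both controls share one threshold (SASC)
    if n_a >= x_hl:
        return "3.1", x_hl
    if n_a >= x_lh:
        return "3.2", n_a
    return "3.3", x_lh


if __name__ == "__main__":
    rates = dict(lam_l=50.0, lam_h=78.0, mu_l=0.2, mu_h=0.2857)
    print(fluid_equilibrium(300, 390, **rates))
    print(fluid_equilibrium(300, 200, **rates))
```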

Figure 5 graphically summarizes the equilibria of Theorem 4, demonstrating their behavior as a function of the thresholds Na and Ns. An important observation is that there is a clear connection between the cases defined in Theorem 4 and the two dynamics identified in Section 3.2 using Figure 2. The cases in which the equilibrium is on one of the thresholds (Cases 1.2, 1.4, 2.2, 2.4, and 3.2) are characterized by the narrow (Laplace type) distributions observed in Figure 2(b); the cases in which the equilibrium is on xLL, xHL, xLH or xHH (all the other cases) are characterized by the wider (Normal/Poisson type) distributions observed in Figure 2(b). The characterization of these equilibria is an important first step to understanding the long-term behavior of our system so that we can specify thresholds, Na and Ns, in order to minimize costs.


Fig. 5 Equilibrium for various admission control and speedup threshold values, Na and Ns.

Corollary 1 Under Assumption 2, the fluid queue length (X − N)+ = 0 for all Na and Ns chosen.

Corollary 1 is a straightforward result of Theorem 4, since all equilibrium values found are smaller than N by our assumptions on system parameters and staffing. Of course, even under these parameters and staffing levels, in the original stochastic model, random fluctuations mean that a queue will sometimes form. As such, fluid analysis appears to provide limited insight into the behavior of the queue in our original stochastic system. In Section 4.2, we will propose a way to leverage our analysis from the fluid model to understand the behavior of the queue.

4.1 Admission Control and Speedup Approximations

While the equilibrium values of the fluid model are interesting in their own right and provide some insight into the behavior of the stochastic model, this does not yet provide insight into the steady-state performance metrics of interest or, ultimately, the cost function we are interested in minimizing. Fortunately, as a byproduct of our fluid analysis via Filippov (1988) techniques, we can derive approximations for the probability of speedup and admission control of the original stochastic model.

As a corollary to Theorem 4, we establish the proportion of time the fluid content is above the speedup threshold, and hence using speedup:

$$
\lim_{T \to \infty} \frac{1}{T} \int_0^T 1_{\{X(t) \ge N_s\}}\,dt, \tag{6}
$$

and above the admission control threshold, and hence using admission control:

$$
\lim_{T \to \infty} \frac{1}{T} \int_0^T 1_{\{X(t) \ge N_a\}}\,dt. \tag{7}
$$

We formally provide the statement for the ACF case (Na < Ns) and note that the other two cases follow similarly and are summarized in Table 1.


Table 1 Performance level approximations for the probability of speedup and admission control in each subcase. The approximations come from the derived proportion of time the fluid content is above the speedup and/or admission control thresholds.

Case     | P(Admission Control)                               | P(Speedup)
(1) ACF  |                                                    |
1.1      | 0                                                  | 0
1.2      | (λH − µL(N ∧ Na)) / (λH − λL)                      | 0
1.3      | 1                                                  | 0
1.4      | 1                                                  | (λL − µL(N ∧ Ns)) / ((µH − µL)(N ∧ Ns))
1.5      | 1                                                  | 1
(2) SCF  |                                                    |
2.1      | 0                                                  | 0
2.2      | 0                                                  | (λH − µL(N ∧ Ns)) / ((µH − µL)(N ∧ Ns))
2.3      | 0                                                  | 1
2.4      | (λH − µH(N ∧ Na)) / (λH − λL)                      | 1
2.5      | 1                                                  | 1
(3) SASC |                                                    |
3.1      | 0                                                  | 0
3.2      | (λH − µL(N ∧ Na)) / (λH − λL − (µL − µH)(N ∧ Na))  | (λH − µL(N ∧ Na)) / (λH − λL − (µL − µH)(N ∧ Na))
3.3      | 1                                                  | 1

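Corollary 2 Under Assumption 2 and the ACF case (Na < Ns), the proportion of time the fluid process is above the admission control threshold is given by:

$$
\lim_{T \to \infty} \frac{1}{T} \int_0^T 1_{\{X(t) \ge N_a\}}\,dt =
\begin{cases}
0, & x_{HL} \le N_a; \\
\dfrac{\lambda_H - \mu_L (N \wedge N_a)}{\lambda_H - \lambda_L}, & x_{LL} \le N_a \le x_{HL}; \\
1, & N_a \le x_{LL} \le N_s; \\
1, & x_{LH} \le N_s \le x_{LL}; \\
1, & N_s \le x_{LH}.
\end{cases} \tag{8}
$$

Similarly, the proportion of time the fluid process is above the speedup control threshold is given by:

$$
\lim_{T \to \infty} \frac{1}{T} \int_0^T 1_{\{X(t) \ge N_s\}}\,dt =
\begin{cases}
0, & x_{HL} \le N_a; \\
0, & x_{LL} \le N_a \le x_{HL}; \\
0, & N_a \le x_{LL} \le N_s; \\
\dfrac{\lambda_L - \mu_L (N \wedge N_s)}{(\mu_H - \mu_L)(N \wedge N_s)}, & x_{LH} \le N_s \le x_{LL}; \\
1, & N_s \le x_{LH}.
\end{cases} \tag{9}
$$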

Consequently, these values can provide approximations for the probability that speedup and admission control are used in our original stochastic system from Section 2. In particular, we approximate the following probabilities of our original stochastic model as:

$$
P(\text{Speedup}) = P(X \ge N_s) \approx \lim_{T \to \infty} \frac{1}{T} \int_0^T 1_{\{X(t) \ge N_s\}}\,dt \tag{10}
$$

$$
P(\text{Admission Control}) = P(X \ge N_a) \approx \lim_{T \to \infty} \frac{1}{T} \int_0^T 1_{\{X(t) \ge N_a\}}\,dt \tag{11}
$$

Table 1 summarizes the approximations for the steady-state performance metrics P(Admission Control) and P(Speedup) in each subcase. While these seem like arbitrary functions, a closer examination reveals that they are monotone in the thresholds (non-increasing as the corresponding threshold grows), and in practice change only over a small range of threshold values. This monotonicity is clearly observed in Figure 8 of Section 5, where we use simulation to examine the accuracy of these approximations and see that they are quite accurate.
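The entries of Table 1 can be evaluated mechanically once the subcase is known. The following sketch does this for all three cases; the function name and argument order are illustrative assumptions, and the subcase tests mirror the conditions of Theorem 4 under Assumption 2.

```python
def control_probabilities(n_a, n_s, n, lam_l, lam_h, mu_l, mu_h):
    """Fluid approximations (P(Admission Control), P(Speedup)) from Table 1."""
    x_ll, x_hl = lam_l / mu_l, lam_h / mu_l
    x_lh, x_hh = lam_l / mu_h, lam_h / mu_h
    a, s = min(n, n_a), min(n, n_s)  # the (N ∧ N_a) and (N ∧ N_s) terms

    if n_a < n_s:  # ACF, cases 1.1 to 1.5
        if n_a >= x_hl:
            return 0.0, 0.0
        if n_a >= x_ll:
            return (lam_h - mu_l * a) / (lam_h - lam_l), 0.0
        if n_s >= x_ll:
            return 1.0, 0.0
        if n_s >= x_lh:
            return 1.0, (lam_l - mu_l * s) / ((mu_h - mu_l) * s)
        return 1.0, 1.0

    if n_s < n_a:  # SCF, cases 2.1 to 2.5
        if n_s >= x_hl:
            return 0.0, 0.0
        if n_s >= x_hh:
            return 0.0, (lam_h - mu_l * s) / ((mu_h - mu_l) * s)
        if n_a >= x_hh:
            return 0.0, 1.0
        if n_a >= x_lh:
            return (lam_h - mu_h * a) / (lam_h - lam_l), 1.0
        return 1.0, 1.0

    # SASC, cases 3.1 to 3.3: both controls switch at the same threshold
    if n_a >= x_hl:
        return 0.0, 0.0
    if n_a >= x_lh:
        p = (lam_h - mu_l * a) / (lam_h - lam_l - (mu_l - mu_h) * a)
        return p, p
    return 1.0, 1.0


if __name__ == "__main__":
    large = dict(n=400, lam_l=50.0, lam_h=78.0, mu_l=0.2, mu_h=0.2857)
    for thresholds in [(300, 390), (100, 200), (200, 200)]:
        print(thresholds, control_probabilities(*thresholds, **large))
```

Each interior expression is the fraction of time that makes the time-averaged inflow equal the time-averaged outflow at the threshold, which is why the values move continuously from 1 to 0 across a subcase.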

As a direct result of these approximations, we can see that:

Corollary 3 Under Assumption 2, if Na, Ns ≥ xHL, then P(Speedup) = 0, P(Admission Control) = 0, and (X − N)+ = 0. Hence, the system total cost is approximately 0.

Accordingly, the fluid approximation suggests there is a set of policies which incur no costs. Of course, this is not the case in our original stochastic model. This is primarily due to the inherent problem of using fluid models to approximate queue length in a stable Erlang-C queue. In such situations, the fluid queue level is always zero, while in practice the queue length explodes as ρ → 1. Therefore, in the next subsection we strive to establish a simple heuristic that connects the thresholds, Na and Ns, to the queue length of our system.

4.2 Queue Length Approximation

In this section we develop an estimate for the queue length.

As stated in Corollary 1, the fluid model is too crude to capture queue length since the fluid expected queue length is zero for all threshold values and all cases. However, when we simulate the expected queue length, we observe a more nuanced picture. Figure 2(b) suggested that the number of customers in the system (X), and therefore also the queue length ((X − N)+), depends on the threshold values we choose. Even though the system is heavily loaded (ρ = 0.975 in that example), the queue may be zero when the thresholds are small enough that the whole distribution of X is well below N. When the thresholds are larger, we observe a substantial queue accumulating. This phenomenon occurs because small thresholds reduce the arrival rate or increase the service rate, so that the effective workload is smaller. We also saw in Figure 2(b) three types of system steady-state distributions: (a) a Normal distribution (e.g. when Na = Ns = 100); (b) a Laplace distribution (e.g. when Na = Ns = 200); and (c) a truncated Poisson distribution (e.g. when Na = Ns = 450). We showed that those distributions fit different cases in Theorem 4. Hence, we suggest that the queue length analysis should also be based on the cases we identified in that theorem and the effective load in each of these cases. (We denote by λ the effective arrival rate and by µ the effective service rate.) We propose a heuristic that is based on that understanding and utilizes the results of our fluid analysis.

We start by considering the extreme cases where the fluid analysis suggests that speedup and/or admission control is always used. In such scenarios, the steady-state distribution looks Normal, and it is conceivable that very limited information is lost by ignoring the change in dynamics due to the thresholds. As an example, consider the ACF case where Na, Ns ≤ xLH (case 1.5). In this case, fluid analysis suggests that both speedup and admission control are always used. If this were truly the case, the number of people in the system would evolve as an M/M/N queue with arrival rate λ = λL and service rate µ = µH. Using standard approaches, we can then get an approximation for the queue length given by the analysis of this M/M/N queue. Via a similar argument, and reading the always-on controls off Table 1, we could do the same in cases 2.5 and 3.3 with λ = λL and µ = µH, in case 1.3 with λ = λL and µ = µL, and in case 2.3 with λ = λH and µ = µH.

When the equilibrium is on a control threshold (either Na or Ns), it is certain that the dynamics are changing due to the threshold. In fact, they are changing very rapidly, so that the fluid content remains on the threshold boundary (see the example of Na = Ns = 200 in Figure 2). In these cases (1.2, 1.4, 2.2, 2.4, and 3.2), speedup/admission control is used for some proportion of the time (< 1), while at other times it is not. Hence, we still use an M/M/N queue to approximate the dynamics. However, the state-independent arrival and departure rates will be given by the weighted average arrival and departure rates as approximated by the fluid analysis. Hence, for all cases we will consider the following arrival and departure rates: λ = P(Admission Control) × λL + (1 − P(Admission Control)) × λH and µ = P(Speedup) × µH + (1 − P(Speedup)) × µL, where P(Admission Control) and P(Speedup) are given by our fluid analysis as summarized in Table 1.
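The weighted-average rates and the resulting M/M/N queue length can be sketched as follows. The two probabilities are passed in as plain numbers, and the Erlang-C delay probability is obtained from the standard Erlang-B recursion, which is our implementation choice for numerical stability rather than something prescribed here; the function names are likewise assumptions.

```python
def effective_rates(p_ac, p_su, lam_l, lam_h, mu_l, mu_h):
    """Weighted-average arrival and service rates used by the heuristic."""
    lam = p_ac * lam_l + (1.0 - p_ac) * lam_h
    mu = p_su * mu_h + (1.0 - p_su) * mu_l
    return lam, mu


def erlang_c_queue_length(lam, mu, n):
    """Expected queue length E[Q] of a stable M/M/N queue."""
    a = lam / mu          # offered load
    rho = a / n           # utilization per server
    if rho >= 1.0:
        raise ValueError("M/M/N queue is unstable: lam >= n * mu")
    # Erlang-B blocking probability via the standard recursion.
    b = 1.0
    for k in range(1, n + 1):
        b = a * b / (k + a * b)
    # Convert Erlang B to the Erlang-C probability of waiting.
    p_wait = b / (1.0 - rho * (1.0 - b))
    return p_wait * rho / (1.0 - rho)


if __name__ == "__main__":
    # Case 1.2 example: large system with Na = 300, so P(AC) = 18/28.
    lam, mu = effective_rates(18.0 / 28.0, 0.0, 50.0, 78.0, 0.2, 0.2857)
    print(lam, mu, erlang_c_queue_length(lam, mu, 400))
```

In this example the effective arrival rate equals µL × Na = 60, so the effective offered load coincides with the threshold, as the fluid equilibrium suggests.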

Finally, in cases 1.1, 2.1, and 3.1 speedup and admission control are never used. We observe that a significant queue starts to accumulate once both thresholds exceed N. A closer examination suggests that the distribution of X, in these cases, exhibits a truncation effect at the threshold. See, for example, the simulated distributions in Figure 6(a). This phenomenon is further demonstrated by the sample path in Figure 6(b), showing how the process bounces back down when it hits the threshold value. That is because the change in arrival and service rates (from λH to λL or from µL to µH) at that point is large enough to change the direction of the derivative (Ẋ), push X back, and decrease the queue length. To capture this truncation, we use an M/M/N/K queue (a loss model), where K = min{Na, Ns, N}. The loss model queue length, E[Q], is derived by solving the local balance equations:

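$$
\begin{aligned}
\pi_i &= \frac{1}{i!}\left(\frac{\lambda}{\mu}\right)^{i} \pi_0, && 0 \le i \le N, \\
\pi_i &= \frac{N^N}{N!}\left(\frac{\lambda}{N\mu}\right)^{i} \pi_0, && N < i \le K, \\
\pi_0 &= \left[\sum_{i=0}^{N} \frac{1}{i!}\left(\frac{\lambda}{\mu}\right)^{i} + \sum_{i=N+1}^{K} \frac{N^N}{N!}\left(\frac{\lambda}{N\mu}\right)^{i}\right]^{-1}.
\end{aligned} \tag{12}
$$

Finally, we have that

$$
E\left[(X - N)^+\right] \approx E[Q] = \sum_{i=N}^{K} (i - N)\,\pi_i.
$$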
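As a sketch of how the loss-model quantities in (12) might be computed in practice, the code below builds the stationary distribution of an M/M/N/K queue from the ratio of consecutive probabilities and works in log space to avoid overflow for large N. The buffer size K is simply taken as an input, and the function names are our own; this is one possible implementation, not necessarily the computation used for the reported results.

```python
import math


def mmnk_distribution(lam, mu, n, k):
    """Stationary probabilities pi_0..pi_K of an M/M/N/K queue."""
    # log of the unnormalized weights: w_i = w_{i-1} * lam / (mu * min(i, N))
    log_w = [0.0]
    for i in range(1, k + 1):
        log_w.append(log_w[-1] + math.log(lam) - math.log(mu * min(i, n)))
    shift = max(log_w)                      # rescale before exponentiating
    weights = [math.exp(v - shift) for v in log_w]
    total = sum(weights)
    return [w / total for w in weights]


def mmnk_queue_length(lam, mu, n, k):
    """E[Q] = sum over i = N..K of (i - N) * pi_i (zero when K <= N)."""
    pi = mmnk_distribution(lam, mu, n, k)
    return sum((i - n) * pi[i] for i in range(n, k + 1))


if __name__ == "__main__":
    # Large system without controls (lam_H, mu_L) and an assumed buffer of 450.
    print(mmnk_queue_length(78.0, 0.2, 400, 450))
```

The recursion reproduces both branches of (12): for i ≤ N the weights equal (λ/µ)^i / i!, and for i > N each further step multiplies by λ/(Nµ).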

In the next section, §5, we will see that this heuristic works very well. To provide some intuition as to why this seems to provide a very accurate approximation, we consider the impact of the finite buffer in the loss model. When there are K jobs in an M/M/N/K queue, the loss of any new job ‘forces’ the system away from this boundary. The speedup and/or admission control thresholds have a similar effect. When the number of patients in the system crosses one of these thresholds, the change in dynamics due to increased service rate and/or decreased arrival rate also ‘forces’ the system down and away from that threshold.

[Figure 6: (a) X distribution in cases 1.1, 2.1 and 3.1 (horizontal axis: Number in system, 300 to 550; vertical axis: Frequency, ×10^6; series: na=380, ns=380; na=390, ns=390; na=390, ns=400; na=400, ns=390; na=400, ns=400; na=450, ns=450; na=500, ns=500; N marked). (b) X sample path examples.]

Fig. 6 Distribution and sample path of the number of people in the system (X) in cases 1.1, 2.1, and 3.1

Remark 3 One can utilize the derived approximations for the mean queue length and the probability of admission control and speedup to do performance analysis given thresholds Na and Ns. Moreover, it is possible to determine feasible sets of Na and Ns such that various constraints on these performance measures are satisfied. For example, if the hospital management sets a limit on the proportion of time that admission control is utilized, our approximations would provide a set of thresholds to satisfy such a constraint.

Remark 4 One can actually analyze the steady-state distribution by solving the balance equations of the state-dependent Markov model, but this would result in a formula for each case, and will not provide intuition on the way the thresholds impact the system. We also note that the queue length heuristic approximation should be accurate when the change in arrival and service rates is significant.
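For readers who want a numerical benchmark, the balance equations mentioned in Remark 4 can be solved directly for a truncated state space. The sketch below assumes that the stochastic model is a birth-death chain whose rates follow (2) and (3) state by state; that reading, the truncation level `x_max`, and the function name are assumptions made for illustration.

```python
import math


def threshold_chain_metrics(n_a, n_s, n, lam_l, lam_h, mu_l, mu_h, x_max=None):
    """Stationary P(X >= N_a), P(X >= N_s) and E[(X - N)^+] of the
    birth-death chain with threshold-dependent rates, truncated at x_max."""
    if x_max is None:
        # Hypothetical default: 200 states beyond the largest landmark.
        x_max = int(max(n, n_a, n_s)) + 200
    log_w = [0.0]
    for x in range(1, x_max + 1):
        birth = lam_h if x - 1 < n_a else lam_l            # arrivals in state x-1
        death = (mu_l if x < n_s else mu_h) * min(x, n)    # departures in state x
        # Local balance: pi_x * death = pi_{x-1} * birth
        log_w.append(log_w[-1] + math.log(birth) - math.log(death))
    shift = max(log_w)
    weights = [math.exp(v - shift) for v in log_w]
    total = sum(weights)
    pi = [w / total for w in weights]
    p_ac = sum(p for x, p in enumerate(pi) if x >= n_a)
    p_su = sum(p for x, p in enumerate(pi) if x >= n_s)
    exp_queue = sum((x - n) * p for x, p in enumerate(pi) if x > n)
    return p_ac, p_su, exp_queue


if __name__ == "__main__":
    # Small (ICU) system with Na = 30 and Ns = 35.
    print(threshold_chain_metrics(30, 35, 40, 4.0, 7.5, 0.2, 0.286))
```

The truncation is harmless when the tail beyond the thresholds decays quickly, but `x_max` should be increased if the controls barely reduce the load.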

5 Accuracy of Performance Measure: Numerical Comparisons

In this section, we examine the accuracy of our approximations. We will look at accuracy as a function of system size and as a function of the thresholds that define the multiple operational regimes specified in Theorem 4. We consider two system sizes: a small system which may be representative of an ICU and a large system which may be representative of a hospital’s general medical/surgical beds and/or a call-center. Since our approximations are based on fluid analysis, we expect them to be more accurate for the larger system, but hopefully still reasonably accurate for small systems.

We calibrate the parameters of our model according to typical healthcare environments. We used publicly available data from State of California Office of Statewide Health Planning & Development (2010-2011), which keeps track of all hospitals in California. We only considered short-term, acute hospitals with 24 hour emergency care coverage, trauma designation, and ICUs. Figure 7 shows the empirical distribution of the number of licensed (a) hospital beds and (b) adult ICU beds in these hospitals. The median number of licensed hospital beds is 377 and the median number of licensed adult ICU beds is 38.5; as such, we will consider a hospital with N = 400 beds and an ICU with N = 40 beds for our simulations.

[Figure 7: two density plots. (a) Number of licensed hospital beds (horizontal axis: Licensed Beds, 0 to 1000; vertical axis: Density). (b) Number of licensed Adult ICU beds (horizontal axis: Adult ICU beds, 0 to 150; vertical axis: Density).]

Fig. 7 Empirical distribution of the number of licensed beds in California hospitals, as reported in State of California Office of Statewide Health Planning & Development (2010-2011).

In calibrating the remaining parameters of our model, we start by considering the Length-of-stay (LOS) of a patient. A typical average LOS varies between 2 and 9 days, depending on the hospital unit considered (see, for example, Table 3 in de Bruin et al (2010)). Hence, we chose a lower value of roughly 3.5 hospital days (1/µH) as the LOS under speedup and 5 days (1/µL) as the LOS under ‘unstressed’, nominal conditions. The arrival rates are chosen in order to have approximately 20% bed turnover per day under high arrival rates and 10% under low arrival rates. For our simulations, we use the following parameters for an ‘ICU’ (i.e. small system): λL = 4, λH = 7.5, µL = 0.2, µH = 0.286, N = 40. For an average sized hospital (i.e. large system), we use: λL = 50, λH = 78, µL = 0.2, µH = 0.286, N = 400. Note that the parameters chosen here satisfy Assumption 2, so that the system is stable irrespective of whether or not speedup and/or admission control are used. All simulations were initiated at X(0) = 0.

Figure 8 presents the steady-state performance metrics given by our approximations and simulation results for the large system (left column) and small system (right column) as we vary the thresholds Na and Ns (see footnote 1). Simulated steady-state values were calculated from a single very long run of 500000 units of time (after ignoring a transient warm-up period of 100000 units of time). For the small system we simulated thresholds Na and Ns in the range [1, 100]; for the large system we simulated thresholds Na and Ns in the range [5, 700] in jumps of 5. These figures are meant to illustrate the typical behavior and effect of the control thresholds. The results are very similar across all combinations of the thresholds, Na and Ns.

As expected, the approximations are more accurate for the large system. Still, they can be quite accurate for the small system as well. This phenomenon is consistent with results presented in other research papers (e.g. Yom-Tov and Mandelbaum (2014)) and has some theoretical reinforcement (e.g. Janssen et al (2011)). We observe that the approximations are very good in most cases. Some gaps can be observed when speedup and/or admission control is used for some (but small) proportion of the time (e.g. 5–10%) or when it is used most (but not all) of the time (e.g. 90–95%), i.e., when Na is close to xLH, xHH, or xHL, or Ns is close to xLH, xLL, or xHL (the boundaries of cases 1.2, 1.4, 2.2, 2.4, and 3.2 in Theorem 4). This is more pronounced in the small system. In such situations, the fluid approximations for P(Speedup) and/or P(Admission Control) are not very accurate, presenting a gap of up to 20%. For example, in Figure 8(d), the approximation for the probability of admission control is 0, while the simulation suggests the true probability is around 10%.

1 Note that, due to numeric issues, in order to calculate the expected queue length for large systems we used Stirling’s formula, i! ∼ (i/e)^i √(2πi) (Hazewinkel 2001), to calculate E[Q].

[Figure 8: six panels, each plotting E[Q], P(Admission) and P(Speedup) against Ns, with curves for Simulation and Approximation. Left column, large (hospital) system with N = 400: (a) Na = 550, (c) Na = 400, (e) Na = 225. Right column, small (ICU) system with N = 40: (b) Na = 65, (d) Na = 40, (f) Na = 30.]

Fig. 8 Approximation vs. simulation results as a function of the Speedup threshold (Ns) for some fixed Admission threshold (Na) values

Since the queue length approximation is based on a heuristic, we need to carefully examine it both as a function of system size and as a function of the threshold regime we chose. The inaccuracy observed under some situations for the probability of admission control and speedup could result in degradation of the queue length approximation, because our queue length approximation uses these probabilities to derive ‘effective’ arrival and service rates (λ and µ). Therefore, poor estimates for P(Admission Control) and P(Speedup) could result in poor estimates for E[(X − N)+]. Surprisingly, this is not the case. For example, in Figure 8(f) we found situations where the approximation of P(Speedup) underestimates the simulated value; in this case, the expected queue length is effectively 0 both in the simulation and via our approximation. Figures 9 and 10 compare results of the simulated queue length with the heuristically approximated values, as a function of the chosen admission and speedup control thresholds, for large and small systems respectively. (For this demonstration we used the same parameters as before, and it covers all operating regimes identified in Theorem 4.) Note that Figures 9(d) and 10(d) show the distribution of the difference between the simulation and the heuristic values. We see that a difference of more than one customer is very rare, even in the small system. Hence, we conclude that the heuristic we developed is able to capture the limiting behaviour of the systems in our case-study very well for all operating regimes investigated in this paper. We believe that this results from the fact that queues accumulate mostly in cases 1.1, 2.1 and 3.1, in which speedup and admission control are not used. In these cases, the approximations for P(Admission) and P(Speedup) are very accurate. Therefore, accuracy will naturally depend on the magnitude of the difference between the arrival rates, λH − λL, and the service rates, µH − µL. If these differences are very small, the proposed control policies would not be very effective in reducing load, and it would be difficult to distinguish this system from a standard M/M/N with a single arrival and service rate.

As we have found that the approximations for P(Speedup), P(Admission Control) and E[(X − N)+] appear to be reasonably accurate, the next step is to consider the performance of decisions which are optimized over these approximations.

6 Cost Minimization

(a) Expected queue length - simulation (b) Expected queue length - approximation (c) Difference between simulation and approximation (d) Difference distribution

Fig. 9 Expected queue length (E[X]) - simulation vs. approximation - in a large system (N = 400)

Now that we have derived approximations for the different steady-state performance metrics, we are in a position to consider our original optimization problem from (1). Because we have established the optimality of threshold policies in Section 3, our optimization problem can be reduced to:

$$
\min_{N_a, N_s} \Big\{ h\big(E[(X - N)^+]\big) + P(X \ge N_a)\,\phi(\lambda_L) + P(X < N_a)\,\phi(\lambda_H) + P(X \ge N_s)\,\xi(\mu_H) + P(X < N_s)\,\xi(\mu_L) \Big\}. \tag{13}
$$

From Sections 4.1 and 4.2, we now have closed-form expressions for approximations to the optimization problem in (13). Without loss of generality, we set φ(λH) = 0 and ξ(µL) = 0, so there is no cost associated with the nominal system arrival and service rates. We also consider a linear function for the queue length costs. Hence, our optimization problem is:

$$
\min_{N_a, N_s} \Big\{ c_w E\big[(X - N)^+\big] + c_a P(X \ge N_a) + c_s P(X \ge N_s) \Big\} \tag{14}
$$

where cw is the per-patient waiting cost rate, ca = φ(λL) is the cost rate for admission control, and cs = ξ(µH) is the cost rate for speedup.

(a) Expected queue length - simulation (b) Expected queue length - approximation (c) Difference between simulation and approximation (d) Difference distribution

Fig. 10 Expected queue length (E[X]) - simulation vs. approximation - in a small system (N = 40)

Observation 1 Using the approximations in Sections 4.1 and 4.2, we find regimes in which our approximations suggest the cost in (14) to be zero due to the approximated queue length and the approximated probabilities of speedup and admission control all being 0. In particular, this occurs in cases 1.1, 2.1 and 3.1 of Theorem 4. Thus, the cost (14) is zero when:

$$
x_{HL} \le N_a \wedge N_s \le N.
$$

This implies that it is always ‘optimal’ to use at least one form of congestion control before a queue builds regardless of the exact system parameters; a zero cost solution will never have both Na, Ns > N.

There has been some evidence that hospitals do use speedup and/or admission control before reaching full capacity (e.g. Kim et al (2015) and Kc and Terwiesch (2012)). However, in other cases, such as ambulance diversion, admission control is not used until a queue builds (Allon et al 2013). As we will see in Section 6.2, one must be prudent in selecting between these seemingly zero-cost solutions.

6.1 A Greedy Heuristic

We suggest the following Greedy heuristic to select the (Na, Ns) amongst the potentially numerous solutions with the minimal approximated cost as indicated in Observation 1. In general, this heuristic prioritizes the use of the speedup and admission control in decreasing order of the cost measure. In order to ensure the costs are comparable, we use normalized costs. While probabilities (for speedup and admission control) are naturally normalized to the range [0, 1], the magnitude of the expected queue length can vary dramatically. Hence, we normalize the waiting costs by dividing them by the maximum expected queue length, denoted by Qmax, which is obtained when speedup and admission control are never used. Such a normalization helps to interpret how the costs relate to the relative disutility of waiting, speedup, and admission control. For instance, consider the case where cw = ca = cs. In such a case, the incremental cost of increasing the probability of speedup by 1% is the same as the incremental cost of increasing the probability of admission control by 1%. By normalizing the queue length by Qmax, we find that these costs are also equivalent to an increase in the expected queue length of 1% of the maximum possible queue length. Thus, larger changes in the queue length are necessary to incur the same relative cost when the system is larger and/or more congested. Of course, one could do away with normalizing the queue length, but that would make it more difficult to give the costs an intuitive interpretation.

The Greedy heuristic then selects among the potentially numerous solutions with approximated zero costs. Because our approximations suggest that any solution amongst this set will result in zero costs, we must find a way to prioritize amongst them. To do this, the Greedy heuristic is most aggressive in reducing the costs (that will be incurred due to stochastic fluctuations) by limiting the use of the control which incurs the most cost. For example, if speedup has the highest cost (cs is maximal), then Ns should be as large as possible. This implies that only admission control is used; additionally, in light of Observation 1, we know that xHL ≤ Na ≤ N. The value of Na will then be selected based on the relative costs between admission control and waiting. Under a similar argument, if admission control is most expensive (ca is maximal), then Na should be as large as possible and xHL ≤ Ns ≤ N. Finally, if waiting is most costly (cw/Qmax is maximal), then Na and Ns should be as small as possible. The Greedy heuristic is defined more formally by the pseudo-code in Algorithm 1.

In Section 6.2, we will use simulation to examine how well such an approach performs.
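The selection rule described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `greedy_thresholds`, the representation of the minimal-fluid-cost set as a list of (Na, Ns) tuples, and the tie-breaking (ties go first to cs, then to ca) are assumptions made here. The waiting cost is passed already normalized (cw/Qmax), and the last branch takes the smallest Ns, following the statement above that both thresholds should be as small as possible when waiting is most costly.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[int, int]  # (Na, Ns)


def greedy_thresholds(min_fluid: Iterable[Pair], cs: float, ca: float,
                      cw_norm: float) -> Pair:
    """Pick one (Na, Ns) pair from the set with minimal fluid cost.

    cw_norm is the normalized waiting cost cw / Qmax (hypothetical name).
    """
    pairs: List[Pair] = list(min_fluid)
    if not pairs:
        raise ValueError("the minimal-fluid-cost set must not be empty")
    top = max(cs, ca, cw_norm)
    if top == cs:
        # Speedup is most costly: push Ns as high as possible first.
        ns = max(p[1] for p in pairs)
        na_options = [p[0] for p in pairs if p[1] == ns]
        na = max(na_options) if ca >= cw_norm else min(na_options)
    elif top == ca:
        # Admission control is most costly: push Na as high as possible first.
        na = max(p[0] for p in pairs)
        ns_options = [p[1] for p in pairs if p[0] == na]
        ns = max(ns_options) if cs >= cw_norm else min(ns_options)
    elif cs >= ca:
        # Waiting is most costly: both thresholds small, Na fixed first.
        na = min(p[0] for p in pairs)
        ns = min(p[1] for p in pairs if p[0] == na)
    else:
        # Waiting is most costly: both thresholds small, Ns fixed first.
        ns = min(p[1] for p in pairs)
        na = min(p[0] for p in pairs if p[1] == ns)
    return na, ns


if __name__ == "__main__":
    # Toy candidate set, invented purely for illustration.
    candidates = [(30, 40), (35, 40), (40, 35), (40, 40), (38, 38)]
    print(greedy_thresholds(candidates, cs=100, ca=1, cw_norm=0.51))
```

The nested comparisons mirror the two-stage structure of the heuristic: the most costly measure fixes one threshold, and the second most costly measure decides between the largest and smallest remaining value of the other.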

6.2 Numerical Results: Approximation-Based Cost Minimization

Next we consider the performance of policies resulting from solving an optimization problem based on our approximated performance functions. The normalization factor Qmax in our examples is Qmax = 19.7 in the large system and Qmax = 8.9 in the small system. Here we find the Na and Ns which minimize the approximated costs and use simulation to compare the resulting cost to the minimum cost achieved via exhaustive search over many Na and Ns combinations. We also consider a number of benchmarks for comparison:

– Never use speedup or admission control: Na = Ns = ∞
– Use speedup and admission control as soon as all beds are filled: Na = Ns = N
– Always use speedup or admission control: Na = Ns = 0


We derive the optimal performance via exhaustive search; for the large system we checked integer thresholds between 0 and 700 in jumps of five units (see footnote 2), and for the small system we searched all combinations of integer thresholds between 0 and 100. The fluid approximation is derived by solving the optimization problem in (14) using the approximations for P(Speedup), P(Admission Control), and E[(X − N)^+] as given in Sections 4.1 and 4.2. As noted in Section 6, when considering the fluid approximated costs, there may be many solutions with zero cost. For example, in the large system there are 369 such solutions. Hence, our first question is: are they all “practically” equivalent? As an example, assume that cs = 100, ca = 1, and cw = 10 (cw/Qmax = 0.51). Figure 11 shows the actual costs (via simulation) for each solution that has approximated cost equal to 0. We observe that although many of the solutions have simulated cost very close to 0, for others the cost is very different; nevertheless, about half of the solutions are indeed very close to the optimal performance of 0.09.
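As a small worked example of the normalization, the following sketch evaluates a cost of the form used above. The function name and the linear form cs·P(Speedup) + ca·P(Admission Control) + (cw/Qmax)·E[(X − N)^+] are assumptions made for illustration; equation (14) itself should be consulted for the exact definition.

```python
def normalized_cost(p_speedup: float, p_admission: float, expected_queue: float,
                    cs: float, ca: float, cw: float, q_max: float) -> float:
    """Cost with the waiting term normalized by Qmax (assumed linear form)."""
    return cs * p_speedup + ca * p_admission + (cw / q_max) * expected_queue


if __name__ == "__main__":
    # Scenario with cs = 100, ca = 1, cw = 10 in the large system (Qmax = 19.7).
    print(round(10 / 19.7, 2))  # normalized waiting cost rate
    # "Always" benchmark: both controls used with probability 1, no queue.
    print(normalized_cost(1.0, 1.0, 0.0, cs=100, ca=1, cw=10, q_max=19.7))
```

Under this form, the "Always" benchmark costs exactly cs + ca, which agrees with the values 3.00, 4.00, ..., 110.00 reported for that policy in Tables 3 and 4.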

Observation 2 While the approximated costs in (14) are 0 for a (potentially large) set of Na and Ns pairs, the actual cost associated with these solutions can differ significantly. However, we find that almost half of the solutions are indeed very close to optimal.

Footnote 2: The exhaustive search was computationally intensive, requiring nearly two weeks to complete on an Intel Xeon E5-2470, 2.3GHz, 16-core CPU. Thus, providing more granularity was computationally limiting.

Algorithm 1 (N^g_a, N^g_s) = Greedy Heuristic(cs, ca, cw)

    N_MIN_FLUID ← all (Na, Ns) pairs with fluid cost approximation equal to the minimum value,
                  i.e. C_fluid(N_MIN_FLUID) = min_{Na,Ns} C_fluid(Na, Ns)
    if max(cs, ca, cw) = cs then
        N^g_s ← max(N_MIN_FLUID{Ns})
        N_MAX_NS ← all (Na, Ns = N^g_s) ∈ N_MIN_FLUID
        if max(ca, cw) = ca then
            N^g_a ← max(N_MAX_NS{Na})
        else
            N^g_a ← min(N_MAX_NS{Na})
        end if
    else if max(cs, ca, cw) = ca then
        N^g_a ← max(N_MIN_FLUID{Na})
        N_MAX_NA ← all (Na = N^g_a, Ns) ∈ N_MIN_FLUID
        if max(cs, cw) = cs then
            N^g_s ← max(N_MAX_NA{Ns})
        else
            N^g_s ← min(N_MAX_NA{Ns})
        end if
    else (max(cs, ca, cw) = cw)
        if max(cs, ca) = cs then
            N^g_a ← min(N_MIN_FLUID{Na})
            N_MIN_NA ← all (Na = N^g_a, Ns) ∈ N_MIN_FLUID
            N^g_s ← min(N_MIN_NA{Ns})
        else
            N^g_s ← min(N_MIN_FLUID{Ns})
            N_MIN_NS ← all (Na, Ns = N^g_s) ∈ N_MIN_FLUID
            N^g_a ← min(N_MIN_NS{Na})
        end if
    end if

Fig. 11 Distribution of simulation costs when approximated costs equal 0 (large system, scenario 11).

Table 2 Cost parameters for different cost minimization scenarios.

Cost Scenario   Speedup Control cs   Admission Control ca   Waiting Cost cw (before normalization)
 1              1                    2                      3
 2              1                    3                      2
 3              2                    1                      3
 4              2                    3                      1
 5              3                    1                      2
 6              3                    2                      1
 7              1                    10                     100
 8              1                    100                    10
 9              10                   1                      100
10              10                   100                    1
11              100                  1                      10
12              100                  10                     1

Hence, the question of how to select between the minimal fluid solutions is highly pertinent. Choosing randomly amongst the 369 solutions with zero fluid cost would result in an average cost of 3.53, which is far from optimal. The solution with the minimum cost within the set has a cost of 0.09, which is exactly equal to the optimal value. Yet, finding this solution requires simulating all 369 zero fluid cost solutions. Although this is not as extensive as the run that is required for an exhaustive search, it is still computationally intensive. With this in mind, we consider the Greedy heuristic presented in Section 6.1. We next examine its performance for different cost scenarios.

Table 2 summarizes the cost scenarios we consider. We chose these scenarios in order to examine what happens when the costs have the same order of magnitude but a different order of importance (scenarios 1–6), and when they are of different orders of magnitude, highly emphasizing one measure in particular (scenarios 7–12).

Tables 3 and 4 present the performance of the different strategies for the various cost scenarios. The first thing we note is the robustness of the optimal solution. The left column (Optimal) presents the minimal solution found using exhaustive search for each cost scenario. Analyzing this solution, we note that the minimal cost is robust. For example, for the large system there are between 8–14 solutions within 5% of the minimal one, and 57–105 solutions within 10% of it. (The numbers for the small system are 6–176 solutions within 5% and 15–477 within 10% of the optimal performance.) Note also that the “optimal” value was found by simulation; hence, we provide standard errors in brackets for these values. Under the “via Fluid Approximations” columns we present the minimal simulated cost within the set of zero-value fluid solutions (Min), the average performance of these solutions (Avg), and the performance of the solution selected by the Greedy heuristic (Greedy). We make the following observations:

1. In most cases, the optimal value is within the set of zero fluid cost solutions.
2. The difference between the minimal performance of the zero fluid cost solution set and the optimal one is very small (oftentimes it is 0).
3. The average cost of the zero fluid cost solution set may be quite far from optimal; hence, it is important to choose wisely within this range.
4. The Greedy heuristic is very close to optimal, achieving in most cases the minimal performance. Thus, prioritizing admission control and/or speedup based on relative costs can be very cost-effective.

The last three columns show the performance of the benchmark policies: Never, Always, and All beds filled. We see that the simple benchmarks fail dramatically in all scenarios; the only reasonable one is the “All beds filled” policy. Still, the Greedy heuristic consistently seems to outperform this policy. The “All beds filled” policy performs well only for large systems where costs are close in magnitude (Scenarios 1–6); even then, it never achieves the optimal cost. We surmise that using the combination of the fluid approximation with the Greedy heuristic is a very good way to find a solution with near optimal performance.

6.3 Cost Misspecification: Robustness of Proposed Greedy Policy

We also consider the robustness of our proposed Greedy heuristic to misestimates in the cost parameters cw, ca and cs. Certainly, if the optimization is done over costs which are incorrectly specified, the resulting policy will be suboptimal. The question is how much worse the performance will be. Additionally, since we know the Greedy heuristic is suboptimal, how will its performance be impacted by such misspecification?

To examine this, we consider the thresholds, Na and Ns, selected under the Greedy Heuristic and the Optimal Policy when the costs are misspecified by plus or minus 10%, 20%, 30%, 40%, and 50%. The thresholds of the Optimal Policy under cost misspecification are determined via exhaustive search over the incorrect cost parameters. The performance of these policies is then evaluated using simulation over the correct cost parameters. The worst-case performance over all possible misspecifications is reported.
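The experiment described above can be sketched as follows. This is a hedged illustration, not the authors' code: `worst_case_ratio`, `select`, and `evaluate` are hypothetical names, and we assume that each of the three cost parameters is perturbed independently by +δ or −δ (eight combinations per error level). In the paper the evaluation step is a simulation; here a tiny invented lookup table stands in for it.

```python
from itertools import product
from typing import Callable, Hashable, Sequence, Tuple


def worst_case_ratio(select: Callable[[Tuple[float, ...]], Hashable],
                     evaluate: Callable[[Hashable, Sequence[float]], float],
                     true_costs: Sequence[float],
                     delta: float,
                     optimal_cost: float) -> float:
    """Worst ratio (true cost of chosen policy) / (true optimal cost)."""
    worst = 0.0
    for signs in product((-1.0, 1.0), repeat=len(true_costs)):
        # Policy is chosen using the wrong costs ...
        wrong = tuple(c * (1.0 + s * delta) for c, s in zip(true_costs, signs))
        policy = select(wrong)
        # ... but judged under the correct ones.
        worst = max(worst, evaluate(policy, true_costs) / optimal_cost)
    return worst


# Toy stand-in for simulation output: (P(speedup), P(admission control), queue).
PERF = {"A": (0.1, 0.0, 0.2), "B": (0.0, 0.1, 0.2)}


def toy_cost(policy: str, costs: Sequence[float]) -> float:
    return sum(c * m for c, m in zip(costs, PERF[policy]))


def toy_select(costs: Sequence[float]) -> str:
    return min(PERF, key=lambda p: toy_cost(p, costs))


if __name__ == "__main__":
    true_costs = (1.0, 2.0, 3.0)  # (cs, ca, normalized cw), illustrative only
    best = toy_cost(toy_select(true_costs), true_costs)
    for delta in (0.1, 0.5):
        print(delta, worst_case_ratio(toy_select, toy_cost, true_costs, delta, best))
```

A ratio of 1 means the misspecified policy coincides in performance with the true optimum; larger values quantify the degradation.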

Table 5 summarizes the robustness results for our 12 cost scenarios in terms of the ratio between the average cost of the policies under misspecified costs and the cost of the true optimal solution (without misspecification). We can see that for up to 20% (and often 30+%) errors in cost estimates, the performance of the Greedy Heuristic is very robust. If the initial performance of the Greedy Heuristic, under perfect cost information, was reasonable, then this will still be the case, even if the costs have moderate misspecification. Of course, if the performance is poor under perfect cost information and/or the cost misspecification is very high, then the performance of the Greedy Heuristic can degrade substantially. Interestingly, the Greedy Heuristic is much more robust than the optimal policy. In some instances, the performance of the Greedy Heuristic under cost misspecification is better than that of the Optimal Policy under cost misspecification. For example, under cost scenario 3, the Greedy Heuristic achieves the minimum cost with errors up to 30% and it outperforms the misspecified Optimal Policy for up to 50% errors. This robustness feature, along with the simplicity of the heuristic, is another desirable property of the proposed heuristic.

Table 3 Large system: Performance of different strategies for cost scenarios 1–12.

Cost       Optimal          via Fluid Approximations                          Never:           Always:          All beds filled:
Scenario   (exhaustive      Min              Avg             Greedy           Na = Ns = ∞      Na = Ns = 0      Na = Ns = N
           search)
 1         0.07 (0.005)     0.07∗ (0.005)    0.12 (0.050)    0.07∗ (0.005)    3.06 (0.664)     3.00 (0.000)     0.09 (0.007)
 2         0.07 (0.005)     0.07∗ (0.004)    0.16 (0.093)    0.07∗ (0.004)    2.04 (0.443)     4.00 (0.000)     0.12 (0.009)
 3         0.08 (0.005)     0.08∗ (0.005)    0.12 (0.033)    0.08∗ (0.006)    3.06 (0.664)     3.00 (0.000)     0.09 (0.007)
 4         0.11 (0.007)     0.11∗ (0.007)    0.19 (0.064)    0.11∗ (0.007)    1.02 (0.222)     5.00 (0.000)     0.15 (0.011)
 5         0.07 (0.004)     0.07∗ (0.004)    0.15 (0.068)    0.07∗ (0.005)    2.04 (0.443)     4.00 (0.000)     0.12 (0.009)
 6         0.13 (0.007)     0.13∗ (0.009)    0.19 (0.040)    0.13∗ (0.009)    1.02 (0.222)     5.00 (0.000)     0.15 (0.011)
 7         0.11 (0.012)     0.11∗ (0.012)    0.66 (0.403)    0.11∗ (0.013)    101.84 (22.135)  11.00 (0.000)    0.45 (0.034)
 8         0.08 (0.006)     0.08∗ (0.006)    4.17 (4.193)    0.11 (0.008)     10.18 (2.220)    101.00 (0.000)   3.03 (0.211)
 9         0.11 (0.010)     0.11∗ (0.010)    0.60 (0.369)    0.11∗ (0.008)    101.84 (22.135)  11.00 (0.000)    0.45 (0.034)
10         0.39 (0.033)     0.54 (0.033)     4.46 (3.900)    0.54 (0.031)     1.02 (0.229)     110.00 (0.000)   3.29 (0.229)
11         0.09 (0.005)     0.09∗ (0.005)    3.53 (3.529)    0.12 (0.009)     10.18 (2.220)    101.00 (0.000)   3.03 (0.211)
12         0.43 (0.041)     0.64 (0.042)     3.88 (3.180)    0.64 (0.044)     1.02 (0.229)     110.00 (0.000)   3.29 (0.229)

Bold font indicates the heuristic method that achieved the minimal cost value; ∗ denotes that the heuristic's cost is equivalent to the optimal cost determined via exhaustive search. Standard deviations are given in parentheses after each value, based on 100 replications of the simulation of the proposed thresholds for this policy.

7 Model Extension: A System with Returns to Service

Thus far, we have only accounted for the undesirability of delays, admission control and speedup via a cost function, which can capture quality and/or monetary costs. However, it is known that in healthcare settings these dynamics can reduce quality of care and that the deterioration of a patient’s physiologic state may require a return to service. A common quality measure used in practice is the readmission rate. Using simulation, we examine an extended model that incorporates customer (patient) readmissions explicitly, and use our original model (without readmissions) to determine policies which minimize the readmission rate for the extended model.

Table 4 Small system: Performance of different strategies for cost scenarios 1–12.

Cost       Optimal          via Fluid Approximations                          Never:           Always:          All beds filled:
Scenario   (exhaustive      Min              Avg             Greedy           Na = Ns = ∞      Na = Ns = 0      Na = Ns = N
           search)
 1         0.24 (0.006)     0.24∗ (0.006)    0.32 (0.068)    0.26 (0.008)     2.74 (0.185)     3.00 (0.000)     0.28 (0.005)
 2         0.22 (0.006)     0.22∗ (0.006)    0.40 (0.163)    0.22∗ (0.006)    1.83 (0.124)     4.00 (0.000)     0.37 (0.006)
 3         0.21 (0.004)     0.21∗ (0.004)    0.32 (0.113)    0.21∗ (0.005)    2.74 (0.185)     3.00 (0.000)     0.29 (0.005)
 4         0.29 (0.008)     0.35 (0.007)     0.47 (0.091)    0.35 (0.008)     0.92 (0.063)     5.00 (0.000)     0.45 (0.008)
 5         0.19 (0.004)     0.19∗ (0.004)    0.40 (0.197)    0.19∗ (0.004)    1.83 (0.124)     4.00 (0.000)     0.37 (0.006)
 6         0.28 (0.007)     0.33 (0.006)     0.47 (0.110)    0.33 (0.006)     0.92 (0.063)     5.00 (0.000)     0.45 (0.008)
 7         0.63 (0.018)     0.96 (0.025)     2.73 (0.529)    1.41 (0.025)     91.12 (6.137)    11.00 (0.000)    1.54 (0.033)
 8         0.39 (0.012)     0.40 (0.012)     9.29 (8.673)    0.40 (0.013)     9.20 (0.643)     101.00 (0.000)   9.12 (0.153)
 9         0.38 (0.010)     0.58 (0.021)     2.73 (1.584)    1.41 (0.025)     91.12 (6.137)    11.00 (0.000)    1.54 (0.033)
10         0.71 (0.034)     1.59 (0.034)     9.95 (7.957)    1.60 (0.035)     1.00 (0.094)     110.00 (0.000)   9.87 (0.165)
11         0.26 (0.006)     0.26∗ (0.007)    9.30 (8.968)    0.26∗ (0.007)    9.20 (0.643)     101.00 (0.000)   9.12 (0.153)
12         0.71 (0.035)     1.55 (0.026)     9.96 (8.141)    1.56 (0.028)     1.00 (0.094)     110.00 (0.000)   9.87 (0.165)

Bold font indicates the heuristic method that achieved the minimal cost value; ∗ denotes that the heuristic's cost is equivalent to the optimal cost determined via exhaustive search. Standard deviations are given in parentheses after each value, based on 100 replications of the simulation of the proposed thresholds for this policy.

Table 5 Large system: Robustness of the Greedy policy. Performance of the Optimal Policy and the Greedy Heuristic when cost parameters are misspecified, relative to the true minimum cost.

Cost       ‘Optimal Policy’ with cost misspecification of     Greedy Heuristic with cost misspecification of
Scenario   0%      10%     20%     30%     40%     50%         0%      10%     20%     30%     40%     50%
 1         1.000   1.001   1.001   1.399   1.399   2.055       1.019   1.019   1.019   1.019   2.055   2.055
 2         1.000   1.000   1.000   1.074   1.074   1.074       1.018   1.018   1.018   1.018   1.018   3.147
 3         1.000   1.000   1.225   1.539   1.716   1.716       1.000   1.000   1.000   1.000   1.565   1.565
 4         1.000   1.027   1.027   1.633   1.815   1.815       1.043   1.043   1.043   1.815   1.815   1.815
 5         1.000   1.000   1.000   1.000   1.717   2.282       1.000   1.000   1.000   1.000   1.000   1.000
 6         1.000   1.102   1.180   1.292   1.292   1.346       1.046   1.046   1.046   1.312   1.312   1.312
 7         1.000   1.007   1.048   1.066   1.066   1.402       1.045   1.045   1.045   1.045   5.615   5.615
 8         1.000   1.000   1.000   1.000   1.000   1.150       1.456   1.456   1.456   1.456   1.456   1.456
 9         1.000   1.002   1.007   1.025   1.199   1.297       1.041   1.041   1.041   1.041   5.352   5.352
10         1.000   1.000   1.085   1.140   1.269   1.343       1.479   1.479   1.479   1.479   1.479   1.479
11         1.000   1.000   1.000   1.000   1.003   1.309       1.309   1.309   1.309   1.309   1.309   1.309
12         1.000   1.016   1.045   1.045   1.215   1.257       1.535   1.535   1.535   1.535   1.535   1.535

To incorporate readmissions, we assume that waiting for service and/or using speedup or admission control increases the likelihood of readmission. Without loss of generality, we assume the readmission risk to be 0 if neither speedup nor admission control is used (i.e. µ = µL, λ = λH) and the new customer is able to enter service immediately upon arrival. If a customer would have arrived under the nominal arrival rate, λH, but was blocked due to admission control, this customer may return to service with probability p^R_λ after some time which is exponentially distributed with mean 1/δa. Similarly, if a customer is discharged under speedup, his probability of a return to service increases by p^R_µ; he returns after 1/δs units of time (on average). If the customer arrives to find X customers in the system, his probability of return to service increases by min{p^R_w × (X − N)^+, 1}. Thus, if a customer arrives with X customers in the system and then is discharged under speedup, his probability of return to service is p^R_µ + p^R_w × (X − N)^+. On the other hand, if the same customer is discharged under the nominal service rate, his probability of return to service is p^R_w × (X − N)^+. We simulate such a model for each Na and Ns combination with 40 iterations of 100 days each (see footnote 3). We then use exhaustive search to find the thresholds, Na and Ns, which minimize the readmission rate.
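The return probability of an admitted customer described above can be written as a small helper. This is a minimal sketch: the function and argument names are hypothetical, and capping the total at 1 (rather than only the waiting component) is an assumption made here so that the result is always a valid probability.

```python
def return_probability(x_at_arrival: int, n_servers: int,
                       speedup_discharge: bool,
                       p_mu: float, p_w: float) -> float:
    """Probability that an admitted customer returns to service.

    x_at_arrival: number in system seen on arrival (X)
    n_servers:    number of servers/beds (N)
    p_mu, p_w:    increments p^R_mu (speedup) and p^R_w (per customer in queue)
    """
    waiting_part = min(p_w * max(x_at_arrival - n_servers, 0), 1.0)
    speedup_part = p_mu if speedup_discharge else 0.0
    # Assumption: cap the sum at 1 so it remains a probability.
    return min(waiting_part + speedup_part, 1.0)


if __name__ == "__main__":
    # Customer finds 50 in a 40-bed system and is later discharged under speedup.
    print(return_probability(50, 40, True, p_mu=0.0667, p_w=0.001))
```

A customer who arrives to a system with idle servers and is discharged at the nominal rate has return probability 0 under this sketch, matching the baseline assumed in the text.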

As a comparison, we use the analysis of our original model from Section 2, without readmissions, to determine thresholds Na and Ns. We then use these thresholds in the model with readmissions and simulate the resulting readmission rate of this policy. To do this, we need to appropriately define the cost parameters of our original model to capture the increase in readmission rate due to admission control, speedup and waiting. Doing so results in ca = p^R_λ, cs = p^R_µ, cw = p^R_w.

For illustrative purposes, we let p^R_λ = 0.05, p^R_µ = 0.0667, p^R_w = 0.001 (see, for example, Kim et al (2015), Chan et al (2014)). Table 6 compares the simulated return rates of an exhaustive search over the systems with and without readmissions. We observe very small differences in performance; for the small system, the difference is not even statistically significant. As seen in Figure 12, poor selection of thresholds can result in increases in readmission rates of up to 8%. Additionally, we find that the structure of the readmission rate as a function of Na and Ns is very similar to that of the cost function in our original model without returns. In particular, the optimal regime of (Na, Ns) which minimizes readmission rates for the system with returns is practically identical to the optimal regime which minimizes costs for the original model presented in Section 2. Additionally, we find that the minimum readmission rates are quite robust, as in Section 6.3. Hence, incorporating readmissions into our original model through appropriately defined cost factors seems to work quite effectively and avoids the complexity associated with explicitly including returns to service in the model.

Footnote 3: Given the computational complexity of this simulation (we must keep track of the number of customers in the system upon arrival for each customer), the number of repetitions was limited to 40.


Table 6 A system with returns to service: Comparison of return rates for the solution which ignores returns to service but has appropriately defined cost measures (Original Model) and the solution established via exhaustive search over a model which explicitly incorporates readmissions (Extended Model).

                              Original Model       Extended Model
                              (without returns)    (with returns)
ICU setting (N = 40)          0.23%                0.18%
  95% confidence interval     [0.16, 0.31]         [0.11, 0.24]
Hospital setting (N = 400)    0.138%               0.085%
  95% confidence interval     [0.12, 0.16]         [0.07, 0.1]

Fig. 12 Small System: Readmission rates as a function of Ns and Na. (Axes: Ns, Na, Return Probability.)

8 Conclusion

In this paper, we examined the trade-off between the probability of admission control, the probability of speedup, and the expected queue length when managing a service system via an admission control and speedup policy. We showed that if the cost function is either linear or concave, the optimal policy has the distinct structure of a threshold policy. We then investigated the dynamics of a service system with such a threshold policy using fluid approximations, and derived approximations for the main performance measures of the system. Using simulation, we found that these approximations can be very accurate. We then used these approximations in the original cost minimization problem, identified a set of solutions with seemingly zero costs, and developed a heuristic that achieves near optimal performance. Our results can be utilized in two ways: 1) to estimate the performance of a specific admission control and speedup policy, or 2) to find a reasonable admission control and speedup policy. Our proposed heuristic is based on fluid estimates and seems very robust to cost misspecifications.

One potential future direction is to consider what happens when Assumption 2 is relaxed. We believe that many of the structural results should hold for any N > λH/µL. However, we expect that the performance measure approximations (especially E[(X − N)^+]) and the optimal solution are very different from what we showed here. Different techniques are likely necessary to develop an understanding of such systems. Still, it is highly undesirable to operate a system which is unstable under nominal control; thus, we believe that understanding the behavior of our system under the assumption that N < λH/µL is an important first step.

We observed two different system dynamics as a function of the threshold chosen. As such, a step forward may be to develop diffusion approximations for the system under different conditions. This would eliminate the need for the heuristic we used to approximate the expected queue length. It would probably require using Averaging Principle techniques such as those in Perry and Whitt (2011).

While we did not explicitly consider returns to service in our analytic model, we find that, with appropriately defined cost parameters, our model without readmissions can perform reasonably well. It would be interesting to explore a model which explicitly incorporates readmissions. We note that this is done in Chan et al (2014) without admission control and absent an optimization framework, and it is not obvious how those techniques (which are also used here) can be extended to this more complex flow model.

We note that time-varying arrival rates can arise in many service settings, especially hospitals. As seen in Chan et al (2014) and Yom-Tov and Mandelbaum (2014), when the time scale of variation is short compared to the service time (LOS), ignoring the time-variation can result in very reasonable performance. This is likely to be the case in our settings, where the average LOS in the ICU and hospital are on the order of days, while the time-variation is on the order of hours. Of course, in an Emergency Department setting, where both the service time and time-variation are on the order of hours, accounting for the time-variability may be essential.

Finally, another feature that arises in a number of service settings is that customers can abandon the system if the wait is too long. Our current framework does not incorporate such dynamics and this is an interesting area for future research. Moreover, incorporating abandonment would allow the system to be stable, even without admission or speedup control, when N < λH/µL. This could introduce new dynamics that have not been considered here.

APPENDIX

Appendix A: Miscellaneous Proofs for Stochastic System

PROOF OF THEOREM 1: The proof of this theorem requires an intermediate result on the differential discounted cost, ∆.

Proposition 2 [Differential Monotonicity] The differential discounted cost function, ∆(X), is non-decreasing in the number of jobs, X. That is, let X' > X; then:

$$\Delta(X) \le \Delta(X').$$

PROOF: The proof is via the value iteration method and induction. We generate a sequence of functions J_k starting with J_0(X) = 0 for all X ≥ 0. Then for each k ≥ 0 we have:

$$J_{k+1}(0) = \frac{1}{\beta + v}\min_{\lambda}\left\{\phi(\lambda) + (v - \lambda)J_k(0) + \lambda J_k(1)\right\}.$$

Additionally, for X > 0, we have:

$$J_{k+1}(X) = \frac{1}{\beta + v}\min_{\lambda,\mu}\left\{h(X) + \phi(\lambda) + \xi(\mu) + \lambda J_k(X+1) + (X \wedge N)\mu J_k(X-1) + \big(v - \lambda - (X \wedge N)\mu\big)J_k(X)\right\}.$$


For k ≥ 0 and X > 0, let

$$\Delta_k(X) = J_k(X) - J_k(X-1),$$

with ∆_k(0) = 0. Using value iteration, we have that J(X) = lim_{k→∞} J_k(X). It follows that ∆(X) = lim_{k→∞} ∆_k(X). If we can show that ∆_k(X) is non-decreasing in X for every k, then the proposition is true. To do this, we use induction. The base case is trivially true for k = 0, where ∆_0(X) = 0 for all X.

We will assume that the assertion is true for k and will show that it is also true for k + 1. We denote by u_{k+1}(X + 1) = (λ_{k+1}(X + 1), µ_{k+1}(X + 1)) = arg min_{λ,µ} {φ(λ) + λ∆_k(X + 1) + ξ(µ) − (X ∧ N)µ∆_k(X)} the strategy used in iteration k + 1.

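$$
\begin{aligned}
\Delta_{k+1}(X+1) &= J_{k+1}(X+1) - J_{k+1}(X) \\
&= \frac{1}{\beta+v}\Big[h(X+1) + \phi(\lambda_{k+1}(X+1)) + \xi(\mu_{k+1}(X+1)) + vJ_k(X+1) \\
&\qquad + \lambda_{k+1}(X+1)\big(J_k(X+2) - J_k(X+1)\big) \\
&\qquad - (X \wedge N)\mu_{k+1}(X+1)\big(J_k(X+1) - J_k(X)\big)\Big] - J_{k+1}(X) \\
&\ge \frac{1}{\beta+v}\Big[h(X+1) - h(X) + \lambda_{k+1}(X+1)\Delta_k(X+2) \\
&\qquad + \big(v - \lambda_{k+1}(X+1) - (X \wedge N)\mu_{k+1}(X+1)\big)\Delta_k(X+1) \\
&\qquad + (X \wedge N)\mu_{k+1}(X+1)\Delta_k(X)\Big]
\end{aligned}
\tag{15}
$$

where the last inequality comes from the fact that we can use the policy u_{k+1}(X + 1) at iteration k + 1 in state X and incur a cost which is no less than J_{k+1}(X). Similarly, we can use the policy u_{k+1}(X − 1) in state X at iteration k + 1 and incur a cost no less than J_{k+1}(X):

$$
\begin{aligned}
\Delta_{k+1}(X) &= J_{k+1}(X) - J_{k+1}(X-1) \\
&\le \frac{1}{\beta+v}\Big[h(X) - h(X-1) + \lambda_{k+1}(X-1)\Delta_k(X+1) \\
&\qquad + \big(v - \lambda_{k+1}(X-1) - (X-1)\mu_{k+1}(X-1)\big)\Delta_k(X) \\
&\qquad + (X-1)\mu_{k+1}(X-1)\Delta_k(X-1)\Big].
\end{aligned}
\tag{16}
$$

Combining equations (15) and (16), for X ≥ 1 we have that:

$$
\begin{aligned}
(\beta+v)\big(\Delta_{k+1}(X+1) - \Delta_{k+1}(X)\big) &\ge h(X+1) - h(X-1) \\
&\quad + \big[v - (X \wedge N)\mu_{k+1}(X+1) - \lambda_{k+1}(X+1)\big]\big(\Delta_k(X+1) - \Delta_k(X)\big) \\
&\quad + \lambda_{k+1}(X+1)\big(\Delta_k(X+2) - \Delta_k(X+1)\big) \\
&\quad + (X-1)\mu_{k+1}(X-1)\big(\Delta_k(X) - \Delta_k(X-1)\big) \\
&\ge 0 \quad \text{by the induction hypothesis.}
\end{aligned}
$$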

For the differential function when X = 1, we will use a suboptimal policy in state 0 so that the arrival and service rates are the same as those used in state 1, λ_{k+1}(1) and µ_{k+1}(1), so that:

$$
\begin{aligned}
\Delta_{k+1}(1) &= J_{k+1}(1) - J_{k+1}(0) \\
&\ge \frac{1}{\beta+v}\Big[\xi(\mu_{k+1}(1)) + \big(v - \mu_{k+1}(1)\big)\Delta_k(1) + \lambda_{k+1}(1)\big(\Delta_k(2) - \Delta_k(1)\big)\Big] \\
&\ge 0 = \Delta_{k+1}(0)
\end{aligned}
$$

where the last inequality follows from the induction hypothesis. This completes the proof that for all X and k, ∆_{k+1}(X + 1) ≥ ∆_{k+1}(X), and so this is also true in the limit as k → ∞. □
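The value-iteration recursion used in this proof can also be explored numerically. The sketch below is an illustration under stated assumptions and not part of the formal argument: the state space is truncated at a hypothetical `x_max` (arrivals are blocked there), the uniformization constant is taken as v = max λ + N · max µ, and the two-point action sets and cost functions passed in the example are invented for illustration. The update is the recursion above rewritten with the arrival and service decisions separated.

```python
from typing import Callable, Dict, List, Sequence


def value_iteration(n_servers: int,
                    arrival_rates: Sequence[float],
                    service_rates: Sequence[float],
                    phi: Dict[float, float],
                    xi: Dict[float, float],
                    h: Callable[[int], float],
                    beta: float,
                    x_max: int,
                    tol: float = 1e-9,
                    max_iter: int = 200_000) -> List[float]:
    """Approximate the discounted cost J on the truncated states 0..x_max."""
    v = max(arrival_rates) + n_servers * max(service_rates)  # uniformization
    J = [0.0] * (x_max + 1)                                  # J_0 = 0
    for _ in range(max_iter):
        new_J = []
        for x in range(x_max + 1):
            busy = min(x, n_servers)
            # Delta_k(x + 1); set to 0 at the truncation boundary (blocked arrivals).
            delta_up = J[x + 1] - J[x] if x < x_max else 0.0
            # Delta_k(x), with Delta_k(0) = 0.
            delta_here = J[x] - J[x - 1] if x > 0 else 0.0
            arrival_term = min(phi[lam] + lam * delta_up for lam in arrival_rates)
            service_term = (min(xi[mu] - busy * mu * delta_here
                                for mu in service_rates) if x > 0 else 0.0)
            new_J.append((h(x) + v * J[x] + arrival_term + service_term)
                         / (beta + v))
        gap = max(abs(a - b) for a, b in zip(new_J, J))
        J = new_J
        if gap < tol:
            break
    return J


if __name__ == "__main__":
    # Illustrative parameters only: N = 3, lambda in {2, 4}, mu in {1, 1.5}.
    J = value_iteration(3, (2.0, 4.0), (1.0, 1.5),
                        phi={2.0: 1.0, 4.0: 0.0}, xi={1.0: 0.0, 1.5: 1.0},
                        h=lambda x: float(max(x - 3, 0)), beta=0.5, x_max=30)
    deltas = [J[x] - J[x - 1] for x in range(1, len(J))]
    print(deltas[:10])  # inspect whether the differences are non-decreasing
```

Because of the truncation, states near `x_max` may not follow the monotone pattern of Proposition 2; only states well below the boundary are informative.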

Now, we consider the Bellman equation, where the arrival rate and service rate decisions can be separated:

$$J(X) = \frac{1}{\beta + v}\Big[h(X) + vJ(X) + \min_{\lambda}\{\phi(\lambda) + \lambda\Delta(X+1)\} + \min_{\mu}\{\xi(\mu) - (X \wedge N)\mu\Delta(X)\}\Big].$$

– Admission Control: We first consider the optimization of the arrival rate. Our goal is to find λ∗(X) such that:

$$\lambda^*(X) = \arg\min_{\lambda}\{\phi(\lambda) + \lambda\Delta(X+1)\}.$$

By Proposition 2, we have that ∆(X) is non-decreasing in X. By assumption, φ(λ) is non-increasing in λ. Hence, λ∗ is also non-increasing in X.

– Speedup: We now consider the optimization of the service rate. Our goal is to find µ∗(X) such that:

$$\mu^*(X) = \arg\min_{\mu}\{\xi(\mu) - (X \wedge N)\mu\Delta(X)\}.$$

By Proposition 2, we have that (X ∧ N)∆(X) is non-decreasing in X. By assumption, ξ(µ) is non-decreasing in µ. Hence, µ∗ is also non-decreasing in X.

□

PROOF OF THEOREM 2: We again turn back to Bellman’s equation:

$$J(X) = \frac{1}{\beta + v}\Big[h(X) + vJ(X) + \min_{\lambda}\{\phi(\lambda) + \lambda\Delta(X+1)\} + \min_{\mu}\{\xi(\mu) - (X \wedge N)\mu\Delta(X)\}\Big].$$

– Admission Control: We first consider the optimization of the arrival rate. Our goal is to find λ∗(X) such that:

$$\lambda^*(X) = \arg\min_{\lambda \in [\lambda_L, \lambda_H]}\{\phi(\lambda) + \lambda\Delta(X+1)\}.$$

Consider a fixed X. Then ∆(X + 1) is some non-negative constant. By assumption, φ(λ) is concave; hence, the portion of the cost function associated with the arrival rate:

φ(λ) + λ∆(X + 1) is concave.

Since we are minimizing a concave function over a finite interval, the optimal admission rate must be at the boundary. Therefore, we must have that λ∗(X) = λL or λ∗(X) = λH.

– Speedup: We now consider the optimization of the service rate. Our goal is to find µ∗(X) such that:

$$\mu^*(X) = \arg\min_{\mu}\{\xi(\mu) - (X \wedge N)\mu\Delta(X)\}.$$

Consider a fixed X. Then (X ∧ N)∆(X) is some non-negative constant. By assumption, ξ(µ) is linear; hence, the portion of the cost function associated with the service rate:

ξ(µ) − (X ∧ N)µ∆(X) is linear.

Since we are minimizing a linear function over a finite interval, the optimal service rate must be at the boundary. Therefore, we must have that µ∗(X) = µL or µ∗(X) = µH.

□

PROOF OF PROPOSITION 1: This is a direct result of Proposition 4.3.3 in Bertsekas (2001) on Blackwell optimal policies. Note that the results of Theorem 2 (and Proposition 2) are independent of the exact value of the discount factor, $\beta$. Because there exists a Blackwell optimal policy, it must satisfy the structural properties derived in Theorem 2. Additionally, because a Blackwell optimal policy is also optimal for the average cost problem, the properties hold for the average cost problem. □
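The boundary argument in the proof of Theorem 2 can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the cost functions `phi` and `xi`, the rate bounds, and the values passed for ∆ are hypothetical choices that merely satisfy the stated assumptions (φ concave and non-increasing, ξ linear and non-decreasing). A grid search over each interval should then return one of the endpoints.

```python
import math

# Hypothetical rate bounds, chosen only for illustration.
LAM_L, LAM_H = 2.0, 10.0
MU_L, MU_H = 1.0, 1.5


def phi(lam):
    """Hypothetical admission-control cost: concave and non-increasing in lam."""
    return 5.0 * math.sqrt(max(0.0, LAM_H - lam))


def xi(mu, c=3.0):
    """Hypothetical speedup cost: linear and non-decreasing in mu."""
    return c * (mu - MU_L)


def argmin_on_grid(cost, lo, hi, n=2001):
    """Return the grid point of [lo, hi] (endpoints included) with the smallest cost."""
    grid = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return min(grid, key=cost)


def optimal_rates(delta_next, delta_here, busy):
    """Minimize the two separable terms of the Bellman equation for a fixed state.

    delta_next stands for Delta(X + 1), delta_here for Delta(X),
    and busy for the number of busy servers, min(X, N).
    """
    lam_star = argmin_on_grid(lambda lam: phi(lam) + lam * delta_next, LAM_L, LAM_H)
    mu_star = argmin_on_grid(lambda mu: xi(mu) - busy * mu * delta_here, MU_L, MU_H)
    return lam_star, mu_star


print(optimal_rates(delta_next=0.5, delta_here=0.5, busy=4))
print(optimal_rates(delta_next=3.0, delta_here=3.0, busy=4))
```

With these illustrative numbers, the smaller value of ∆ leaves both controls off (arrival rate at its upper bound, service rate at its lower bound), while the larger value switches both on, which is consistent with the threshold structure discussed above.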

Appendix B: Proofs for Fluid System

Our system is a piecewise-smooth set of ordinary differential equations. We will take a similar approach to that in Chan et al (2014); however, our system has two points of discontinuity, $X = N_a$ and $X = N_s$, and is one-dimensional. Still, we can utilize the generalized Lyapunov techniques for discontinuous differential equations outlined in Filippov (1988) and di Bernardo et al (2008). The main idea behind the Filippov (1988) approach is to use a 'smoothed' version of the ODE at the points of discontinuity, by using a convex combination of the surrounding smooth ODEs. We will demonstrate this for the case where $N_a < N_s$ and note that the remaining cases ($N_a = N_s$ and $N_a > N_s$) will follow similarly.

We first partition the state space into 3 distinct regions and 2 switching boundaries:

$$\begin{aligned}
D_1 &= \{X : X < N_a\}, \\
D_2 &= \{X : N_a < X < N_s\}, \\
D_3 &= \{X : N_s < X\}, \\
\Sigma_{12} &= \{X : X = N_a\}, \\
\Sigma_{23} &= \{X : X = N_s\}.
\end{aligned}$$

We can consider the differential equations under policies which either 1) never use admission control or speedup, 2) always use admission control and speedup, 3) always use admission control, but never use speedup, and, finally, 4) never use admission control, but always use speedup:

1. [Never use admission control or speedup:] $\dot{X}_{HL}(t) \triangleq \lambda_H - \mu_L(X(t) \wedge N)$.
2. [Always use admission control and speedup:] $\dot{X}_{LH}(t) \triangleq \lambda_L - \mu_H(X(t) \wedge N)$.
3. [Always use admission control. Never use speedup:] $\dot{X}_{LL}(t) \triangleq \lambda_L - \mu_L(X(t) \wedge N)$.
4. [Never use admission control. Always use speedup:] $\dot{X}_{HH}(t) \triangleq \lambda_H - \mu_H(X(t) \wedge N)$.

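We can now define the fluid function $F_i(X)$, $X \in D_i$, as the smooth ODE in each region:

$$\begin{aligned}
F_1(X) &= \dot{X}_{HL}(t) \triangleq \lambda_H - \mu_L(X(t) \wedge N), \\
F_2(X) &= \dot{X}_{LL}(t) \triangleq \lambda_L - \mu_L(X(t) \wedge N), \\
F_3(X) &= \dot{X}_{LH}(t) \triangleq \lambda_L - \mu_H(X(t) \wedge N).
\end{aligned}$$

We can now define our discontinuous ODE, $\dot{X} = F(X)$, as a differential inclusion:

$$\dot{X} \in \mathcal{F}(X) =
\begin{cases}
F_1(X), & \text{if } X \in D_1, \\
F_2(X), & \text{if } X \in D_2, \\
F_3(X), & \text{if } X \in D_3, \\
\{(1-\psi)F_1(X) + \psi F_2(X) \mid 0 \leq \psi \leq 1\}, & \text{if } X \in \Sigma_{12}, \\
\{(1-\psi)F_2(X) + \psi F_3(X) \mid 0 \leq \psi \leq 1\}, & \text{if } X \in \Sigma_{23}.
\end{cases} \tag{17}$$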

We now have the primitives necessary to prove the remaining results for the fluid system.
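The differential inclusion (17) can be made concrete with a short sketch. Because the system is one-dimensional, each set of admissible drifts is a closed interval, so the sketch below returns a pair (lowest drift, highest drift). The function name `fluid_inclusion`, the dictionary `EXAMPLE`, and all numerical values are hypothetical and serve only as an illustration for the case $N_a < N_s$.

```python
def fluid_inclusion(x, lam_L, lam_H, mu_L, mu_H, N, N_a, N_s):
    """Return (lo, hi), the closed interval of drifts in (17) at state x, for N_a < N_s."""
    if not N_a < N_s:
        raise ValueError("this sketch only covers the case N_a < N_s")
    busy = min(x, N)              # X ^ N, the number of busy servers
    f1 = lam_H - mu_L * busy      # region D1: no admission control, no speedup
    f2 = lam_L - mu_L * busy      # region D2: admission control only
    f3 = lam_L - mu_H * busy      # region D3: admission control and speedup
    if x < N_a:
        return (f1, f1)
    if x == N_a:                  # switching boundary Sigma_12: hull of f1 and f2
        return (min(f1, f2), max(f1, f2))
    if x < N_s:
        return (f2, f2)
    if x == N_s:                  # switching boundary Sigma_23: hull of f2 and f3
        return (min(f2, f3), max(f2, f3))
    return (f3, f3)


# Hypothetical parameters, chosen so that N > lam_H / mu_L.
EXAMPLE = dict(lam_L=2.0, lam_H=10.0, mu_L=1.0, mu_H=1.5, N=12, N_a=5.0, N_s=8.0)

for state in (3.0, 5.0, 6.0, 8.0, 10.0):
    print(state, fluid_inclusion(state, **EXAMPLE))
```

Away from the thresholds the interval collapses to a single point, while at $X = N_a$ and $X = N_s$ it spans the two neighbouring drifts, mirroring the convex combinations in (17).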

PROOF OF THEOREM 3: This result follows directly from the existence result in Filippov (1988). We restate it here for completeness.

Theorem 5 (Theorem 1, Chapter 2, Section 7 of Filippov (1988)) Let $\mathcal{F}(X)$ be a differential inclusion that satisfies the following conditions in the domain $G$:

1. $\mathcal{F}(X)$ is non-empty for all $X \in G$.
2. $\mathcal{F}(X)$ is bounded and closed for all $X \in G$.
3. $\mathcal{F}(X)$ is convex for all $X \in G$.
4. The function $\mathcal{F}$ is upper semicontinuous in $X$.

Then for any point $x_0 \in G$, there exists a solution of the problem

$$\dot{X} \in \mathcal{F}(X), \quad X(0) = x_0.$$

Thus, what is required to prove existence of a solution to our ODE is to verify that our differential inclusion (17) satisfies the four conditions of the theorem.

It is trivial to see that the four conditions are satisfied when the system state is not on a switching boundary, i.e. when $X \in D_1 \cup D_2 \cup D_3$. This is because $\mathcal{F}(X)$ is a single point which is bounded above by $\lambda_H$ and below by $\lambda_L - \mu_H N$. Because the differential inclusion is defined by a continuous function in these regions, it is also upper semicontinuous in these regions.

What remains is to demonstrate that the four conditions are satisfied in $\Sigma_{12}$ and $\Sigma_{23}$. For any $X \in \Sigma_{ij}$, the differential inclusion is given by the convex combinations of two single, bounded points, $F_i(X)$ and $F_j(X)$. Thus, it is easy to see that conditions 1-3 are satisfied.

We need to prove that the differential inclusion is upper semicontinuous for all $X \in \Sigma_{12} \cup \Sigma_{23}$. We do this by construction and will start with $X \in \Sigma_{12}$. The case for $X \in \Sigma_{23}$ follows similarly. Consider an open set $V$ such that $\mathcal{F}(X) \subset V$. Then, because $\mathcal{F}(X)$ is closed and bounded, there must exist an $\varepsilon > 0$ such that for any $f \in \mathcal{F}(X)$, $(f - \varepsilon, f + \varepsilon) \subset V$. Now consider $X \in \Sigma_{12}$ and this $\varepsilon > 0$. Because $F_1$ and $F_2$ are continuous functions, there exists a $\delta > 0$ such that $|X - X'| < \delta$ implies $|F_1(X) - F_1(X')| < \varepsilon/2$ and $|F_2(X) - F_2(X')| < \varepsilon/2$. Thus,

$$|(1-\psi)F_1(X) + \psi F_2(X) - (1-\psi)F_1(X') - \psi F_2(X')| < \varepsilon, \quad \forall \psi \in [0,1].$$

This implies that $\mathcal{F}(X') \subset V$. Thus, we have constructed the necessary open set, $\{X' : |X - X'| < \delta\}$, $X \in \Sigma_{12}$, to demonstrate that $\mathcal{F}(X)$ is upper semicontinuous on $\Sigma_{12}$.

Since all four conditions hold for all $X \in G = [0, X_{\max}]$, a solution to our ODE exists.

□

PROOF OF THEOREM 4: To show global asymptotic stability, we need to identify a Lyapunov function and prove that for all $X \geq 0$, $X \neq x$, the derivative of the Lyapunov function is strictly negative. We use the following Lyapunov function:

$$V(X) = |X - x| \tag{18}$$

where $x$ is the specified equilibrium. The main challenge here is that the ODE (4) is discontinuous. Hence, we need to use a generalized Lyapunov theory which utilizes Filippov solutions as done in Shevitz and Paden (1994). We use the Filippov methodology, which redefines the ODE at the points of discontinuity, $N_a$ and $N_s$, as the set-valued function which is now equal to the convex combination of the surrounding smooth ODEs in (4) as in (17). In order to establish global asymptotic stability, we need to show that the set-valued map for our generalized Lyapunov derivative is negative for all states not equal to the equilibrium (Shevitz and Paden 1994).

We now define our set-valued map, the generalized Lyapunov derivative. This requires considering a number of cases depending on whether $X$ is on a point of discontinuity.

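1. [$X \neq N_a, N_s$].

$$\dot{V}(X) =
\begin{cases}
\dot{X}, & X > x; \\
-\dot{X}, & X < x.
\end{cases} \tag{19}$$

2. [$X = N_a \neq N_s$]. In this case, the flow is on a point of discontinuity, $N_a$; thus, the set-valued map is defined as the convex combination of the surrounding smooth ODEs.

$$\dot{V}(X) =
\begin{cases}
\psi\dot{X}_{LH} + (1-\psi)\dot{X}_{HH}, & \psi \in [0,1],\ X > x \text{ and } X > N_s; \\
\psi\dot{X}_{LL} + (1-\psi)\dot{X}_{HL}, & \psi \in [0,1],\ X > x \text{ and } X < N_s; \\
-\psi\dot{X}_{LH} - (1-\psi)\dot{X}_{HH}, & \psi \in [0,1],\ X < x \text{ and } X > N_s; \\
-\psi\dot{X}_{LL} - (1-\psi)\dot{X}_{HL}, & \psi \in [0,1],\ X < x \text{ and } X < N_s.
\end{cases} \tag{20}$$

3. [$X = N_s \neq N_a$]. In this case, the flow is on a different point of discontinuity, $N_s$. We take a similar approach to what we did before:

$$\dot{V}(X) =
\begin{cases}
\psi\dot{X}_{LH} + (1-\psi)\dot{X}_{LL}, & \psi \in [0,1],\ X > x \text{ and } X > N_a; \\
\psi\dot{X}_{HH} + (1-\psi)\dot{X}_{HL}, & \psi \in [0,1],\ X > x \text{ and } X < N_a; \\
-\psi\dot{X}_{LH} - (1-\psi)\dot{X}_{LL}, & \psi \in [0,1],\ X < x \text{ and } X > N_a; \\
-\psi\dot{X}_{HH} - (1-\psi)\dot{X}_{HL}, & \psi \in [0,1],\ X < x \text{ and } X < N_a.
\end{cases} \tag{21}$$

4. [$X = N_s = N_a$]. In this case, the flow is on the (only) point of discontinuity, $N_a = N_s$.

$$\dot{V}(X) =
\begin{cases}
\psi\dot{X}_{LH} + (1-\psi)\dot{X}_{HL}, & \psi \in [0,1],\ X > x; \\
-\psi\dot{X}_{LH} - (1-\psi)\dot{X}_{HL}, & \psi \in [0,1],\ X < x.
\end{cases} \tag{22}$$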

$\psi$ is simply a parameter to generate the convex combination of smooth ODEs. In order to prove global asymptotic stability, we must have $\dot{V}(X) < 0$ for all $X \geq 0$, $X \neq x$ and all $\psi \in [0,1]$. Due to the amount of algebra involved in this proof, we only include here the proof for Case 1, ACF ($N_a < N_s$), while noting the proofs for Case 2, SCF, and Case 3, SASC, will follow similarly. In this proof, it will be helpful to recall that by Assumption 2, $N > x_{HL} = \lambda_H/\mu_L$. Also, we do not need to consider the fourth case, $X = N_a = N_s$, because we are currently examining the case where $N_a < N_s$.
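Before working through the algebra, the negativity condition can be spot-checked numerically. The sketch below is only an illustration under the assumption $N_a < N_s$: the function names, the dictionary `EXAMPLE`, and the numerical values are hypothetical. It evaluates the range of the generalized Lyapunov derivative over all $\psi \in [0,1]$ (the range is attained at the endpoint drifts) and checks that its largest value is negative on a grid of states.

```python
def lyapunov_derivative_range(x, x_eq, lam_L, lam_H, mu_L, mu_H, N, N_a, N_s):
    """Range (lo, hi) of the generalized derivative of V(X) = |X - x_eq| at X = x."""
    if not N_a < N_s:
        raise ValueError("this sketch only covers the case N_a < N_s")
    if x == x_eq:
        raise ValueError("the derivative is only needed away from the equilibrium")
    busy = min(x, N)
    # On a switching boundary both neighbouring rates are admissible.
    if x == N_a:
        lams = (lam_L, lam_H)
    else:
        lams = (lam_H,) if x < N_a else (lam_L,)
    if x == N_s:
        mus = (mu_L, mu_H)
    else:
        mus = (mu_L,) if x < N_s else (mu_H,)
    sign = 1.0 if x > x_eq else -1.0
    values = [sign * (lam - mu * busy) for lam in lams for mu in mus]
    return min(values), max(values)


def condition_holds(x_eq, x_max=15.0, step=0.25, **params):
    """Check that the derivative is strictly negative on a grid of states != x_eq."""
    for k in range(int(round(x_max / step)) + 1):
        x = k * step
        if x == x_eq:
            continue
        _, hi = lyapunov_derivative_range(x, x_eq, **params)
        if hi >= 0:
            return False
    return True


# Hypothetical parameters with lam_L/mu_L <= N_a <= lam_H/mu_L, so x = N_a.
EXAMPLE = dict(lam_L=2.0, lam_H=10.0, mu_L=1.0, mu_H=1.5, N=12, N_a=5.0, N_s=8.0)

print(condition_holds(5.0, **EXAMPLE))   # candidate equilibrium x = N_a
print(condition_holds(6.0, **EXAMPLE))   # a deliberately wrong candidate
```

A grid check of this kind is no substitute for the case analysis, but it is a quick way to catch a mis-specified equilibrium or threshold ordering.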

Case 1.1 $x_{HL} \leq N_a$: In this case, the equilibrium is $x = x_{HL}$. We need to examine the three cases i. $X \neq N_a, N_s$, ii. $X = N_a$, and iii. $X = N_s$. There are a number of subcases to consider within each case:

i. [$X \neq N_a, N_s$]

(a) $X > x = x_{HL}$.

$$\begin{aligned}
\dot{V}(X) = \dot{X} &= 1_{\{X < N_a\}}\lambda_H + 1_{\{X \geq N_a\}}\lambda_L - \left(1_{\{X < N_s\}}\mu_L + 1_{\{X \geq N_s\}}\mu_H\right)(X \wedge N) \\
&\leq \lambda_H - \mu_L(X \wedge N) < \lambda_H - \mu_L x = \lambda_H - \mu_L x_{HL} = 0
\end{aligned}$$

(b) $X < x = x_{HL}$.

$$\dot{V}(X) = -\dot{X} = -\lambda_H + \mu_L(X \wedge N) < -\lambda_H + \mu_L x = -\lambda_H + \mu_L x_{HL} = 0$$

ii. [$X = N_a \neq N_s$] We want to show that for all $\psi \in [0,1]$, $\dot{V}(X) < 0$:

(a) $X > x$ and $X > N_s$. This case cannot occur because $X = N_a < N_s$.

(b) $X > x$ and $X < N_s$.

$$\begin{aligned}
\dot{V}(X) &= \psi\dot{X}_{LL} + (1-\psi)\dot{X}_{HL} = \psi[\lambda_L - \mu_L(X \wedge N)] + (1-\psi)[\lambda_H - \mu_L(X \wedge N)] \\
&\leq \lambda_H - \mu_L(X \wedge N) < \lambda_H - \mu_L x = \lambda_H - \mu_L x_{HL} = 0, \quad \forall \psi \in [0,1]
\end{aligned}$$

(c) $X < x$ and $X > N_s$. This case cannot occur because $X = N_a < N_s$.

(d) $X < x$ and $X < N_s$. This case cannot occur because $X < x = x_{HL} \leq N_a$.

iii. [$X = N_s \neq N_a$] We want to show that for all $\psi \in [0,1]$, $\dot{V}(X) < 0$:

(a) $X > x$ and $X > N_a$.

$$\begin{aligned}
\dot{V}(X) &= \psi\dot{X}_{LH} + (1-\psi)\dot{X}_{LL} = \psi[\lambda_L - \mu_H(X \wedge N)] + (1-\psi)[\lambda_L - \mu_L(X \wedge N)] \\
&\leq \lambda_L - \mu_L(X \wedge N) < \lambda_H - \mu_L(X \wedge N) < \lambda_H - \mu_L x_{HL} = 0, \quad \forall \psi \in [0,1]
\end{aligned}$$

(b) $X > x$ and $X < N_a$. This case cannot occur because $X = N_s > N_a$.

(c) $X < x$ and $X > N_a$. This case cannot occur because $X < x = x_{HL} \leq N_a$.

(d) $X < x$ and $X < N_a$. This case cannot occur because $X = N_s > N_a$.

Case 1.2 $x_{LL} \leq N_a \leq x_{HL}$: In this case, the equilibrium is $x = N_a$. We need to examine the two cases i. $X \neq N_a, N_s$ and ii. $X = N_s$. (We do not need to consider the case $X = N_a$ because $N_a$ is our equilibrium and our Lyapunov function is equal to 0 when $X = N_a$.) There are a number of subcases to consider within each of our two cases:

i. [$X \neq N_a, N_s$]

(a) $X > x = N_a$.

$$\begin{aligned}
\dot{V}(X) = \dot{X} &= \lambda_L - \left(1_{\{X < N_s\}}\mu_L + 1_{\{X \geq N_s\}}\mu_H\right)(X \wedge N) \\
&\leq \lambda_L - \mu_L(X \wedge N) < \lambda_L - \mu_L x_{LL} = 0
\end{aligned}$$

(b) $X < x = N_a < N_s$.

$$\dot{V}(X) = -\dot{X} = -\lambda_H + \mu_L(X \wedge N) < -\lambda_H + \mu_L x \leq -\lambda_H + \mu_L x_{HL} = 0$$

ii. [$X = N_s \neq N_a$] We want to show that for all $\psi \in [0,1]$, $\dot{V}(X) < 0$:

(a) $X > x$ and $X > N_a$.

$$\begin{aligned}
\dot{V}(X) &= \psi\dot{X}_{LH} + (1-\psi)\dot{X}_{LL} = \psi[\lambda_L - \mu_H(X \wedge N)] + (1-\psi)[\lambda_L - \mu_L(X \wedge N)] \\
&\leq \lambda_L - \mu_L(X \wedge N) < \lambda_L - \mu_L x \leq \lambda_L - \mu_L x_{LL} = 0, \quad \forall \psi \in [0,1]
\end{aligned}$$

(b) $X > x$ and $X < N_a$. This case cannot occur because $x = N_a$.

(c) $X < x$ and $X > N_a$. This case cannot occur because $x = N_a$.

(d) $X < x$ and $X < N_a$. This case cannot occur because $X = N_s > N_a$.

Case 1.3 $N_a \leq x_{LL} \leq N_s$: In this case, the equilibrium is $x = x_{LL}$. We need to examine the three cases i. $X \neq N_a, N_s$, ii. $X = N_a$, and iii. $X = N_s$. There are a number of subcases to consider within each case:

i. [$X \neq N_a, N_s$]

(a) $X > x = x_{LL} \geq N_a$.

$$\begin{aligned}
\dot{V}(X) = \dot{X} &= \lambda_L - \left(1_{\{X < N_s\}}\mu_L + 1_{\{X \geq N_s\}}\mu_H\right)(X \wedge N) \\
&\leq \lambda_L - \mu_L(X \wedge N) < \lambda_L - \mu_L x_{LL} = 0
\end{aligned}$$

(b) $X < x = x_{LL} \leq N_s$.

$$\begin{aligned}
\dot{V}(X) = -\dot{X} &= -1_{\{X < N_a\}}\lambda_H - 1_{\{X \geq N_a\}}\lambda_L + \mu_L(X \wedge N) \\
&\leq -\lambda_L + \mu_L(X \wedge N) < -\lambda_L + \mu_L x_{LL} = 0
\end{aligned}$$

ii. [$X = N_a \neq N_s$] We want to show that for all $\psi \in [0,1]$, $\dot{V}(X) < 0$:

(a) $X > x$ and $X > N_s$. This case cannot occur because $X = N_a < N_s$.

(b) $X > x$ and $X < N_s$. This case cannot occur because $X = N_a \leq x_{LL} = x$.

(c) $X < x$ and $X > N_s$. This case cannot occur because $X = N_a < N_s$.

(d) $X < x$ and $X < N_s$.

$$\begin{aligned}
\dot{V}(X) &= -\psi\dot{X}_{LL} - (1-\psi)\dot{X}_{HL} = -\psi[\lambda_L - \mu_L(X \wedge N)] - (1-\psi)[\lambda_H - \mu_L(X \wedge N)] \\
&\leq -\lambda_L + \mu_L(X \wedge N) < -\lambda_L + \mu_L x = -\lambda_L + \mu_L x_{LL} = 0, \quad \forall \psi \in [0,1]
\end{aligned}$$

iii. [$X = N_s \neq N_a$] We want to show that for all $\psi \in [0,1]$, $\dot{V}(X) < 0$:

(a) $X > x$ and $X > N_a$.

$$\begin{aligned}
\dot{V}(X) &= \psi\dot{X}_{LH} + (1-\psi)\dot{X}_{LL} = \psi[\lambda_L - \mu_H(X \wedge N)] + (1-\psi)[\lambda_L - \mu_L(X \wedge N)] \\
&\leq \lambda_L - \mu_L(X \wedge N) < \lambda_L - \mu_L x_{LL} = 0, \quad \forall \psi \in [0,1]
\end{aligned}$$

(b) $X > x$ and $X < N_a$. This case cannot occur because $X = N_s > N_a$.

(c) $X < x$ and $X > N_a$. This case cannot occur because $X = N_s \geq x_{LL} = x$.

(d) $X < x$ and $X < N_a$. This case cannot occur because $X = N_s > N_a$.

Case 1.4 $x_{LH} \leq N_s \leq x_{LL}$: In this case, the equilibrium is $x = N_s$. We need to examine the two cases i. $X \neq N_a, N_s$ and ii. $X = N_a$. (We do not need to consider the case $X = N_s$ because $N_s$ is our equilibrium and our Lyapunov function is equal to 0 when $X = N_s$.) There are a number of subcases to consider within each of our two cases:

i. [$X \neq N_a, N_s$]

(a) $X > x = N_s > N_a$.

$$\dot{V}(X) = \dot{X} = \lambda_L - \mu_H(X \wedge N) < \lambda_L - \mu_H x \leq \lambda_L - \mu_H x_{LH} = 0$$

(b) $X < x = N_s$.

$$\begin{aligned}
\dot{V}(X) = -\dot{X} &= -1_{\{X < N_a\}}\lambda_H - 1_{\{X \geq N_a\}}\lambda_L + \mu_L(X \wedge N) \\
&\leq -\lambda_L + \mu_L(X \wedge N) < -\lambda_L + \mu_L x \leq -\lambda_L + \mu_L x_{LL} = 0
\end{aligned}$$

ii. [$X = N_a \neq N_s$] We want to show that for all $\psi \in [0,1]$, $\dot{V}(X) < 0$:

(a) $X > x$ and $X > N_s$. This case cannot occur because $X = N_a < N_s$.

(b) $X > x$ and $X < N_s$. This case cannot occur because $X > x = N_s$.

(c) $X < x$ and $X > N_s$. This case cannot occur because $X = N_a < N_s$.

(d) $X < x$ and $X < N_s$.

$$\begin{aligned}
\dot{V}(X) &= -\psi\dot{X}_{LL} - (1-\psi)\dot{X}_{HL} = -\psi[\lambda_L - \mu_L(X \wedge N)] - (1-\psi)[\lambda_H - \mu_L(X \wedge N)] \\
&\leq -\lambda_L + \mu_L(X \wedge N) < -\lambda_L + \mu_L x \leq -\lambda_L + \mu_L x_{LL} = 0, \quad \forall \psi \in [0,1]
\end{aligned}$$

Case 1.5 $N_s \leq x_{LH}$: In this case, the equilibrium is $x = x_{LH}$. We need to examine the three cases i. $X \neq N_a, N_s$, ii. $X = N_a$, and iii. $X = N_s$. There are a number of subcases to consider within each case:

i. [$X \neq N_a, N_s$]

(a) $X > x \geq N_s > N_a$.

$$\dot{V}(X) = \dot{X} = \lambda_L - \mu_H(X \wedge N) < \lambda_L - \mu_H x = \lambda_L - \mu_H x_{LH} = 0$$

(b) $X < x$.

$$\begin{aligned}
\dot{V}(X) = -\dot{X} &= -1_{\{X < N_a\}}\lambda_H - 1_{\{X \geq N_a\}}\lambda_L + \left(1_{\{X < N_s\}}\mu_L + 1_{\{X \geq N_s\}}\mu_H\right)(X \wedge N) \\
&\leq -\lambda_L + \mu_H(X \wedge N) < -\lambda_L + \mu_H x = -\lambda_L + \mu_H x_{LH} = 0
\end{aligned}$$

ii. [$X = N_a \neq N_s$] We want to show that for all $\psi \in [0,1]$, $\dot{V}(X) < 0$:

(a) $X > x$ and $X > N_s$. This case cannot occur because $X = N_a < N_s$.

(b) $X > x$ and $X < N_s$. This case cannot occur because $X = N_a < N_s \leq x_{LH} = x$.

(c) $X < x$ and $X > N_s$. This case cannot occur because $X = N_a < N_s$.

(d) $X < x$ and $X < N_s$.

$$\begin{aligned}
\dot{V}(X) &= -\psi\dot{X}_{LL} - (1-\psi)\dot{X}_{HL} = -\psi[\lambda_L - \mu_L(X \wedge N)] - (1-\psi)[\lambda_H - \mu_L(X \wedge N)] \\
&\leq -\lambda_L + \mu_L(X \wedge N) < -\lambda_L + \mu_L x_{LH} < -\lambda_L + \mu_L x_{LL} = 0, \quad \forall \psi \in [0,1]
\end{aligned}$$

iii. [$X = N_s \neq N_a$] We want to show that for all $\psi \in [0,1]$, $\dot{V}(X) < 0$:

(a) $X > x$ and $X > N_a$. This case cannot occur because $X = N_s \leq x_{LH} = x$.

(b) $X > x$ and $X < N_a$. This case cannot occur because $X = N_s > N_a$.

(c) $X < x$ and $X > N_a$.

$$\begin{aligned}
\dot{V}(X) &= -\psi\dot{X}_{LH} - (1-\psi)\dot{X}_{LL} = -\psi[\lambda_L - \mu_H(X \wedge N)] - (1-\psi)[\lambda_L - \mu_L(X \wedge N)] \\
&\leq -\lambda_L + \mu_H(X \wedge N) < -\lambda_L + \mu_H x = -\lambda_L + \mu_H x_{LH} = 0, \quad \forall \psi \in [0,1]
\end{aligned}$$

(d) $X < x$ and $X < N_a$. This case cannot occur because $X = N_s > N_a$.

This concludes the proof for the global stability of Case 1. □
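The case analysis above can be summarized and sanity-checked with a small simulation. The following is a rough sketch under stated assumptions, not part of the original analysis: it covers only Case 1 ($N_a < N_s$), assumes all candidate equilibria lie below $N$, uses a plain forward Euler scheme (which chatters within a band of order `dt` around a threshold instead of sliding on it), and all function names and parameter values are hypothetical.

```python
def equilibrium(lam_L, lam_H, mu_L, mu_H, N_a, N_s):
    """Equilibrium of the fluid model for N_a < N_s, following Cases 1.1 to 1.5."""
    x_HL, x_LL, x_LH = lam_H / mu_L, lam_L / mu_L, lam_L / mu_H
    if x_HL <= N_a:
        return x_HL      # Case 1.1
    if x_LL <= N_a:
        return N_a       # Case 1.2
    if x_LL <= N_s:
        return x_LL      # Case 1.3
    if x_LH <= N_s:
        return N_s       # Case 1.4
    return x_LH          # Case 1.5


def simulate(x0, lam_L, lam_H, mu_L, mu_H, N, N_a, N_s, dt=1e-3, horizon=20.0):
    """Forward Euler integration of the threshold-controlled fluid ODE from x0."""
    x = x0
    for _ in range(int(horizon / dt)):
        lam = lam_H if x < N_a else lam_L   # admission control at or above N_a
        mu = mu_L if x < N_s else mu_H      # speedup at or above N_s
        x += dt * (lam - mu * min(x, N))
    return x


# Hypothetical rates with N > lam_H / mu_L; one threshold pair per case.
RATES = dict(lam_L=2.0, lam_H=10.0, mu_L=1.0, mu_H=1.5)
THRESHOLDS = [(10.5, 11.5), (5.0, 8.0), (1.5, 3.0), (1.0, 1.7), (0.5, 1.0)]

for N_a, N_s in THRESHOLDS:
    target = equilibrium(N_a=N_a, N_s=N_s, **RATES)
    for x0 in (0.0, 11.9):
        x_end = simulate(x0, N=12, N_a=N_a, N_s=N_s, **RATES)
        print(f"N_a={N_a}, N_s={N_s}, x0={x0}: x(T)={x_end:.3f}, equilibrium={target:.3f}")
```

In this illustration, trajectories started below and above the equilibrium both end close to the value returned by `equilibrium`, in line with the global asymptotic stability result for Case 1.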

References

Adusumilli KM, Hasenbein JJ (2010) Dynamic admission and service rate control of a queue. Queueing Systems 66:131–154

Allon G, Deo S, Lin W (2013) The impact of hospital size and occupancy of hospital on the extent of ambulance diversion: Theory and evidence. Operations Research 61(3):544–562

Anand K, Pac MF, Veeraraghavan S (2010) Quality-speed conundrum: Tradeoffs in customer-intensive services. Management Science 57:40–56

Ata B, Shneorson S (2006) Dynamic control of an M/M/1 service system with adjustable arrival and service rates. Management Science 52:1778–1791

Bekker R, Borst S (2006) Optimal admission control in queues with workload-dependent service rates. Probability in the Engineering and Informational Sciences 20:543–570

Bekker R, Boxma O (2007) An M/G/1 queue with adaptable service speed. Stochastic Models 23:373–396

Bekker R, Borst S, Boxma O, Kella O (2004) Queues with workload-dependent arrival and service rates. Queueing Systems 46:537–556

Bekker R, Boxma O, Resing J (2008) Queues with adaptable service speed. Statistica Neerlandica 62:441–457

Boxma OJ, Vlasiou M (2007) On queues with service and interarrival times depending on waiting times. Queueing Systems 56:121–132

di Bernardo M, Budd C, Champneys A, Kowalczyk P (2008) Piecewise-smooth dynamical systems: Theory and applications. Springer

Bertsekas D (2001) Dynamic Programming and Optimal Control. Athena Scientific

de Bruin A, Bekker R, van Zanten L, Koole G (2010) Dimensioning hospital wards using the Erlang loss model. Annals of Operations Research 178:23–43

Chalfin DB, Trzeciak S, Likourezos A, Baumann BM, Dellinger RP (2007) Impact of delayed transfer of critically ill patients from the emergency department to the intensive care unit. Critical Care Medicine 35:1477–1483

Chan CW, Farias VF, Bambos N, Escobar G (2012) Optimizing ICU discharge decisions with patient readmissions. Operations Research 60:1323–1342

Chan CW, Farias VF, Escobar G (2017) The impact of delays on service times in the intensive care unit. Management Science 63(7):2049–2072

Chan CW, Yom-Tov G, Escobar G (2014) When to use speedup: An examination of service systems with returns. Operations Research 62:462–482

Choi DI, Lim DE (2016) Performance analysis of novel overload control with threshold mechanism. Mathematical Problems in Engineering

Dimitrakopoulos Y, Burnetas A (2017) The value of service rate flexibility in an M/M/1 queue with admission control. IISE Transactions 49(6):603–621

Dimitrakopoulos Y, Burnetas A (2016) Customer equilibrium and optimal strategies in an M/M/1 queue with dynamic service control. European Journal of Operational Research 252:477–486

Dong J, Feldman P, Yom-Tov GB (2015) Service system with slowdowns: Potential failures and proposed solutions. Operations Research 63(2):305–324

Dong J, Perry O (2019) Queueing models for patient-flow dynamics in inpatient wards. Operations Research, to appear

Filippov A (1988) Differential equations with discontinuous righthand sides. Kluwer Academic Publishers, Dordrecht

Green LV (2003) How many hospital beds? Inquiry 39:400–412

Hasija S, Pinker E, Shumsky RA (2010) OM Practice—Work expands to fill the time available: Capacity estimation and staffing under Parkinson’s law. Manufacturing & Service Operations Management 12:1–18

Hazewinkel M (ed) (2001) Stirling formula, Encyclopedia of Mathematics. Springer

Huang J, Carmeli B, Mandelbaum A (2015) Control of patient flow in emergency departments, or multiclass queues with deadlines and feedback. Operations Research 63(4):892–908

Janssen AJEM, van Leeuwaarden JSH, Zwart B (2011) Refining square root safety staffing by expanding Erlang C. Operations Research 59(6):1512–1522

Kc D, Terwiesch C (2012) An econometric analysis of patient flows in the cardiac intensive care unit. Manufacturing & Service Operations Management 14(1):50–65

Kim SH, Chan CW, Olivares M, Escobar G (2015) ICU admission control: An empirical study of capacity allocation and its implication on patient outcomes. Management Science 61(1):19–38

Lee N, Kulkarni V (2014) Optimal arrival rate and service rate control of multi-server queues. Queueing Systems 76:37–50

Mandelbaum A, Momcilovic P, Tseytlin Y (2012) On fair routing from emergency departments to hospital wards: QED queues with heterogeneous servers. Management Science 58(7):1273–1291

Ormeci EL (2004) Dynamic admission control in a call center with one shared and two dedicated service facilities. IEEE Transactions on Automatic Control 49:1157–1161

Powell SG, Schultz KL (2004) Throughput in serial lines with state-dependent behavior. Management Science 50:1095–1105

Perry O, Whitt W (2011) An ODE for an overloaded X model involving a stochastic averaging principle. Stochastic Systems 1:17–66

Shevitz D, Paden B (1994) Lyapunov stability theory of nonsmooth systems. IEEE Transactions on Automatic Control 39(9):1910–1914

Shi P, Chou MC, Dai JG, Ding D, Sim J (2015) Models and insights for hospital inpatient operations: Time-dependent ED boarding time. Management Science 62(1):1–28

Shmueli A, Sprung C, Kaplan E (2003) Optimizing admissions to an intensive care unit. Health Care Management Science 6(3):131–136

Song H, Tucker AL, Graue R, Moravick S, Yang JJ (2019) Capacity pooling in hospitals: The hidden consequences of off-service placement. Working paper

State of California Office of Statewide Health Planning & Development (2010-2011) Annual Financial Data. URL http://www.oshpd.ca.gov/HID/Products/Hospitals/AnnFinanData/CmplteDataSet/index.asp

Yom-Tov GB, Mandelbaum A (2014) Erlang-R: A time-varying queue with reentrant customers, in support of healthcare staffing. Manufacturing & Service Operations Management 16:283–299