Operations Research Letters 38 (2010) 502–504

Grid scheduling with NBU service times

Yusik Kim a, Rhonda Righter b,∗, Ronald Wolff b

a INRIA Bordeaux - Sud-Ouest, 351 cours de la Libération, 33405 Talence Cedex, France
b Department of IEOR, University of California, 4141 Etcheverry Hall, Berkeley, CA 94720, USA

Article info

Article history: Received 31 January 2009; Accepted 16 August 2010; Available online 16 September 2010

Keywords: Multi-server queues; Stochastic scheduling; Grid computing; NBU processing times

Abstract

In the highly unpredictable environment of grid computing, it often makes sense to replicate the same job on several of the available servers. We prove that never replicating is optimal for a heterogeneous multi-server system in a random environment when service times are New Better than Used.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Grid computing [2] is a new distributed computing paradigm where Internet-connected servers are integrated to form a fast virtual computer. Examples of existing projects include SETI@Home and Folding@Home. Servers participating in the grid have different owners, a system load that varies over time, and unknown local scheduling policies. The service time (the time between submission and completion) of a job on a server depends on the server rather than properties of the job, and is highly variable. In this situation, it may make sense to assign replicas of a job to several servers simultaneously and use the output of whichever finishes first. While this is clearly not optimal when processing times are deterministic, various amounts of replication can be shown to be optimal depending on the variability of the processing times, where generally more variability implies more replication [3]. By replicating, we complete individual jobs more quickly at the expense of occupying servers that could otherwise be used to work on different jobs. Whether or not it is a worthwhile trade-off depends on the service time distribution.

We model the computing grid as a heterogeneous multi-server queueing system, with an infinite input queue. The infinite queue is reasonable for large-scale computing applications such as SETI@Home in which there are a vast number of similar computations to be done. We also assume that there is a single job class, and that all randomness and heterogeneity in processing is due to the server. This is the case when all jobs perform the same subroutine on different data, as in Single-Program-Multiple-Data (SPMD) parallel programs, and for many grid applications. There is an environmental state that may affect server speeds and availabilities and that evolves independently of the scheduling actions. We assume that given the environmental state, the processing times of (replicas of) the same job on different servers are independent, which is reasonable in a grid environment in which different servers are at different locations and belong to different entities. We assume that communication delays can either be ignored, which is realistic when computation times are significantly longer than communication times, and is generally the case for applications in which grid computing makes sense, or they can be incorporated into the processing times. We also assume that if a job is replicated and finishes on one server, there is no cost for deleting, or overwriting, (replicas of) that job from other servers. This assumption also is reasonable in grid environments in which the reason one server takes longer than another to process the same job is because it is either unavailable or busy with its own, higher priority, work.

∗ Corresponding author. E-mail addresses: [email protected] (Y. Kim), [email protected] (R. Righter), [email protected] (R. Wolff).

0167-6377/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.orl.2010.08.009

Koole and Righter [4] also modeled the grid as a multi-server queueing system but with general job arrival processes. They proved that when service times are New Worse than Used (NWU, defined later), max-replication, i.e., replicating each job so that all servers are occupied by a single job and its replicas, stochastically maximizes the number of job completions by any time. They also proved an analogous result for New Better than Used (NBU, defined later) service times: that no-replication, i.e., not replicating jobs at all and processing distinct jobs in parallel, stochastically maximizes the number of jobs completed up to time t for all t ≥ 0 when there are two servers and at least two jobs.
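The NWU case can be illustrated with a small Monte Carlo sketch. This is our own illustration, not taken from the paper: the hyperexponential parameters, the two-server setup, the horizon, and the seed are all arbitrary choices. With a highly variable NWU (hyperexponential) service time, max-replication on two servers completes many more jobs than running distinct jobs in parallel, because the minimum of two replicas usually avoids the slow branch.

```python
import random

def hyperexp(rng):
    """Hyperexponential sample: rate 10 w.p. 1/2, rate 0.1 w.p. 1/2 (NWU)."""
    rate = 10.0 if rng.random() < 0.5 else 0.1
    return rng.expovariate(rate)

def completions_no_replication(rng, horizon):
    """Two servers each work through their own stream of distinct jobs."""
    done = 0
    for _ in range(2):
        t = 0.0
        while True:
            t += hyperexp(rng)
            if t > horizon:
                break
            done += 1
    return done

def completions_max_replication(rng, horizon):
    """Both servers run replicas of the same job; it completes at the min."""
    t, done = 0.0, 0
    while True:
        t += min(hyperexp(rng), hyperexp(rng))
        if t > horizon:
            break
        done += 1
    return done

rng = random.Random(1)
horizon = 20000.0
no_rep = completions_no_replication(rng, horizon)
max_rep = completions_max_replication(rng, horizon)
print(no_rep, max_rep)  # for this NWU distribution, replication completes far more jobs
```

Here no-replication completes jobs at rate 2/EX ≈ 0.40 per unit time, while replication completes them at rate 1/E[min(X1, X2)] ≈ 0.76, matching the qualitative NWU result.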

Borst et al. [1] showed that minimal replication is optimal for geometric service times. Here, no job is replicated when the number of jobs is at least as large as the number of servers, and otherwise the difference in the number of replicas for each job is minimized while keeping all servers busy.

Clearly, in the deterministic case, never replicating is optimal even when there are fewer jobs than servers. In the exponential case, any replication policy is optimal, as long as all servers are kept busy.
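The exponential claim is easy to check numerically. The sketch below is our illustration (two rate-1 servers, an arbitrary seed, and a long horizon): the completion rate is about 2 per unit time whether or not jobs are max-replicated, because by memorylessness the minimum of two rate-1 exponentials is itself exponential with rate 2.

```python
import random

def sim_throughput(rng, horizon, replicate):
    """Two rate-1 exponential servers; optionally max-replicate each job.

    With memoryless (exponential) service, completions occur at total
    rate 2 in both cases, so the long-run throughput is the same.
    """
    done = 0
    if replicate:
        # One job replicated on both servers; it completes at
        # min of two Exp(1) draws, which is an Exp(2) draw.
        t = 0.0
        while True:
            t += min(rng.expovariate(1.0), rng.expovariate(1.0))
            if t > horizon:
                break
            done += 1
        return done / horizon
    # No replication: two independent servers, each completing at rate 1.
    for _ in range(2):
        t = 0.0
        while True:
            t += rng.expovariate(1.0)
            if t > horizon:
                break
            done += 1
    return done / horizon

rng = random.Random(3)
h = 50000.0
ninr_rate = sim_throughput(rng, h, replicate=False)
repl_rate = sim_throughput(rng, h, replicate=True)
print(ninr_rate, repl_rate)  # both ≈ 2.0
```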

In this paper, we complement Koole and Righter's [4] result for NBU service times by generalizing the number of servers to n ≥ 2 while requiring an infinite number of jobs available at time 0. We prove that for NBU service times, the policy that never idles or replicates jobs, among all scheduling policies that allow both idling and replication, stochastically maximizes the job completion process. Note that this policy has the additional practically attractive feature of decoupling the servers. That is, we can assume that each server has its own infinite input queue of jobs, and that they all operate independently of each other. Also note that the non-idling, non-replicating policy will still be optimal even if we relax our assumption that replication can be done at no cost.
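A numerical sanity check of this result, under assumptions of ours rather than the paper's: we take Uniform(0.5, 1.5) service times (IFR, hence NBU) on two servers, with an arbitrary seed and horizon. The never-replicating policy achieves throughput 2/EX = 2, while max-replication only achieves 1/E[min(X1, X2)] = 1.2.

```python
import random

def nbu_service(rng):
    """Uniform(0.5, 1.5) service time: IFR, hence NBU; EX = 1."""
    return rng.uniform(0.5, 1.5)

def throughput_ninr(rng, horizon):
    """Each of the two servers serves its own infinite queue (no replication)."""
    done = 0
    for _ in range(2):
        t = 0.0
        while (s := nbu_service(rng)) + t <= horizon:
            t += s
            done += 1
    return done / horizon

def throughput_max_replication(rng, horizon):
    """Both servers carry replicas of one job; it finishes at the minimum."""
    t, done = 0.0, 0
    while (s := min(nbu_service(rng), nbu_service(rng))) + t <= horizon:
        t += s
        done += 1
    return done / horizon

rng = random.Random(7)
h = 20000.0
ninr = throughput_ninr(rng, h)
repl = throughput_max_replication(rng, h)
print(ninr)  # ≈ 2/EX = 2.0
print(repl)  # ≈ 1/E[min] = 1/(5/6) = 1.2
```

E[min] = 5/6 here because the minimum of two Uniform(0.5, 1.5) draws is 0.5 plus the minimum of two Uniform(0, 1) draws, whose mean is 1/3.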

Note that when there are not an infinite number of jobs (jobs arrive, say) and there are more than two servers, the optimal policy is unclear. In particular, if there are fewer jobs than servers, and there is no penalty for preempting and replacing a replicate (when a new job arrives), it is clear that jobs should be replicated. However, it is unclear which jobs should be replicated, and which, if any, should be preempted when a new job arrives.

2. Results

Let X be a non-negative random variable. We say that X is New Better (Worse) than Used, or NBU (NWU), if X ≥st (≤st) [X − t | X > t] for all t ≥ 0. We assume that nominal service times on a server, given the environmental state, are NBU, and independent of other servers. This is weaker than IFR (increasing failure rate), where X is IFR if [X − t | X > t] is stochastically decreasing in t for t ≥ 0.
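An equivalent way to state the NBU property is that the survival function F̄(x) = P(X > x) is submultiplicative: F̄(s + t) ≤ F̄(s)F̄(t) for all s, t ≥ 0 (dividing both sides by F̄(t) recovers the definition above). A quick numerical sketch of this check, using Weibull distributions as the example (shape ≥ 1 gives IFR, hence NBU; shape < 1 gives NWU); the grid and tolerance are our choices:

```python
import math

def weibull_sf(x, shape, scale=1.0):
    """Survival function F̄(x) = P(X > x) of a Weibull distribution."""
    return math.exp(-((x / scale) ** shape))

def is_nbu_on_grid(sf, grid):
    """Check the NBU characterization F̄(s+t) <= F̄(s)·F̄(t) on a finite grid.

    X is NBU iff F̄(s+t) <= F̄(s)F̄(t) for all s, t >= 0; here we only
    verify the inequality on sample points, up to floating-point slack.
    """
    return all(sf(s + t) <= sf(s) * sf(t) + 1e-12 for s in grid for t in grid)

grid = [0.1 * k for k in range(1, 40)]

# Weibull with shape >= 1 is IFR, hence NBU: the check passes.
nbu_ok = is_nbu_on_grid(lambda x: weibull_sf(x, shape=2.0), grid)
# Weibull with shape < 1 is DFR, hence NWU: the NBU check fails.
nwu_ok = is_nbu_on_grid(lambda x: weibull_sf(x, shape=0.5), grid)
print(nbu_ok, nwu_ok)  # True False
```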

Let Nt (N′t) denote the number of successful job completions under policy π (π′) up to time t. We define the preference relation π′ ≽ π if {N′t}∞t=0 ≥st {Nt}∞t=0. That is, we can couple the departure processes such that every departure is earlier under π′ than under π with probability 1. The NINR (Non-Idling No-Replication) policy is the policy that never idles a server and never replicates a job. In other words, each server independently serves its own infinite supply of jobs.

For a single server, say server 1, with an infinite number of jobs and NBU service times, it is not hard to show that any non-idling, non-preemptive policy stochastically maximizes the departure process from that server. When server 1 is part of a multi-server system with the option of replication, it is also easy to see intuitively that the effect of replication is to introduce the possibility, from server 1's point of view, that some of its jobs may be preempted and discarded. Hence, intuitively, replication can be seen to be non-optimal, and indeed, these observations basically prove that not replicating stochastically maximizes {N(i)t}∞t=0 for each i, where N(i)t is the number of departures from server i by time t. However, we want to show the optimality of never replicating in the sense of maximizing the joint departure process when servers' policies have an effect on each other, which is more difficult. A careful coupling argument is necessary to relate the standard multi-server model without server interaction to the model where server interaction is induced by job replication. The standard approach is to consider an arbitrary policy that replicates, at time t on server 1 say, and couple its sample path with that of a policy that does not replicate at that time on that server, so that the latter sample path has more total departures from all servers across all time. However, changing the replication decision at server 1 will change the future sample path in terms of job completions on the other servers. Hence, we need to be a bit careful.

Call our original Model A, and consider a relaxed Model B, in which we can decide the identity of a job on any server at any time, and, in particular, when there is a job completion. Clearly this gives us more freedom than having to determine job identities upon the start of service. When a job completes, if we suppose that some other jobs are replicates, then we basically induce a preemption at the servers of those replicates. Hence, a further relaxation of our model is Model C, in which we assume that each server has its own independent set of jobs, and that we can preempt any server at any time based on the histories of all the servers. Preempted jobs are assumed to be discarded. For any policy and any sample path in Model B, there is a corresponding sample path and policy for Model C, and similarly, for any policy and any sample path in Model A, there is a corresponding sample path and policy for Model B. Therefore, if the optimal policy for the relaxed Model C can be achieved with a policy that is permitted in Model A, this policy must also be optimal for Model A. Thus, since a non-idling non-preemptive policy for Model C can be achieved by the NINR policy in Model A, we need only show the following.

Theorem 2.1. For Model C, a heterogeneous multi-server system in a random environment, in which idling and preemption are permitted at any time on any server, and may depend on the history of the process, if there are an infinite number of jobs and service times are NBU, for any idling and/or preemptive policy π, there exists a non-idling non-preemptive policy π′ such that π′ ≽ π.

Proof. I. Non-idling: Suppose an idling policy π idles some server j during some time interval [a, b]. Since there are an infinite number of jobs in the queue, construct a policy π′, with coupled environment and service times, that agrees with π before time a and agrees with π for the other servers during [a, b], but assigns a job that is distinct from all other assigned jobs to server j at time a. If the job finishes before time b, then assign another distinct job. Repeat until time b. At time b, preempt and discard the job currently running on server j and let π′ agree with π from then on. Then π′ has more departures than π up to any time t. The result follows by repeating the argument for all servers and idle intervals.

II. Non-preemptive: Now suppose that under π some job is preempted and discarded, and suppose the first preemption happens at time t1, on server j, say. We argue that we can improve π by not preempting server j at time t1. By repeating the argument, and using our non-idling result above, we can conclude that a policy that never preempts nor idles is optimal. Consider a policy π′ that agrees with π before t1 but does not preempt on server j at t1. Because of our NBU assumption, we can construct coupled nominal potential remaining service times Y, Y′ on the same probability space such that P(Y ≥ Y′) = 1, where Y is the nominal service time for a new job started at time t1 under π, and Y′ is the remaining nominal service time of the job in process at time t1 under π′. Let us also couple the future environmental state under both policies, so that the speeds and availabilities of the servers are identical. Then, assuming the job on server j just after time t1 is not preempted, its completion time under policies π and π′ will be C(Y) ≥ C(Y′) respectively, where C is an increasing function that depends on the server speed over the course of processing. Let π′ agree with π on the other servers from time t1 until either server j is preempted again under π or the job completes on server j under π′ (at time t1 + C(Y′) ≤ t1 + C(Y)). If the preemption under π occurs first, let π′ also preempt, and let it agree with π from then on for all servers, so π′ ≽ π in the non-strict sense since all job completion times under π′ and π are identical. Otherwise, if the job completes first under π′, let π′ idle server j and agree with π for the other servers from time t1 + C(Y′) until there is either a job completion or a preemption on server j under π, whichever comes first. Let us call this time t2. If t2 is induced by a job completion, the numbers of jobs completed under π and π′ are equal at t2. If t2 is induced by a preemption, there is exactly one more job completed under π′ than under π. In either case, the number of jobs completed under π′ is at least as large as that under π at time t2. Finally, let π′ agree with π for all servers after time t2 and couple the sample paths under the two policies thereafter. Now we have π′ ≽ π. By our non-idling argument above, we can find a non-idling π′′ such that π′′ ≽ π′. We apply the same reasoning to π′′ until there are no more preemptions and we arrive at π∗. □
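The coupling of Y and Y′ used in the proof can be made concrete by the inverse-transform method. The sketch below is our illustration, not the paper's construction: we take X ~ Uniform(0, 1) (which is NBU) and an arbitrary age t = 0.3. Since X is NBU, [X − t | X > t] ≤st X, so generating both variables from one common uniform yields a pair with Y ≥ Y′ on every sample path.

```python
import random

def coupled_pair(u, t):
    """Inverse-transform coupling for X ~ Uniform(0, 1), which is NBU.

    Y  = Q(u), the quantile of a fresh service time X at level u.
    Y' = Q_t(u), the quantile of [X - t | X > t] ~ Uniform(0, 1 - t).
    Driving both by the same uniform u gives P(Y >= Y') = 1.
    """
    y = u                    # quantile of Uniform(0, 1)
    y_prime = u * (1.0 - t)  # quantile of Uniform(0, 1 - t)
    return y, y_prime

rng = random.Random(0)
samples = [coupled_pair(rng.random(), 0.3) for _ in range(10000)]
dominates = all(y >= y_prime for y, y_prime in samples)
print(dominates)  # True
```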

The following corollaries for our original Model A are immediate.

Corollary 2.2. The NINR policy stochastically maximizes {Nt}∞t=0.

Corollary 2.3. The NINR policy maximizes the long-run throughput, limt→∞ Nt/t. Hence the maximal throughput is n/EX where X is a generic service time.

Corollary 2.4. For a system with a finite number of initial jobs and an arbitrary arrival process, the NINR policy maximizes the stability region, i.e., the system will be stable under NINR as long as the arrival rate, λ, is smaller than n/EX.

Corollary 2.5. For a system with a finite number m of initial jobs where m > n and an arbitrary arrival process with arrival rate smaller than n/EX, the NINR policy stochastically minimizes the time until the system empties its queue, i.e., until only n jobs are left in the system.

Acknowledgements

We would like to thank the referee for helpful feedback that improved our presentation.

References

[1] S. Borst, O. Boxma, J.F. Groote, S. Mauw, Task allocation in a multi-server system, Journal of Scheduling 6 (2003) 423–436.

[2] I. Foster, C. Kesselman, S. Tuecke, The anatomy of the grid: enabling scalable virtual organizations, International Journal of High Performance Computing Applications 15 (3) (2001) 200–222.

[3] Y. Kim, Resource management for large scale unreliable distributed systems, Ph.D. Dissertation, UC Berkeley, 2009.

[4] G. Koole, R. Righter, Resource allocation in grid computing, Journal of Scheduling 11 (2008) 163–173.