
UNIVERSIDAD DE CHILE

FACULTAD DE CIENCIAS FÍSICAS Y MATEMÁTICAS

DEPARTAMENTO DE INGENIERÍA MATEMÁTICA

APPROXIMATION ALGORITHMS FOR SCHEDULING ORDERS ON

PARALLEL MACHINES

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF MATHEMATICAL CIVIL ENGINEER

SUPERVISOR:

JOSÉ RAFAEL CORREA HAEUSSLER

COMMITTEE:

MARCOS ABRAHAM KIWI KRAUSKOPF

ROBERTO MARIO COMINETTI COTTI-COMETTI

SANTIAGO, CHILE

AUGUST 2008


SUBMITTED AS PARTIAL FULFILLMENT FOR THE DEGREE OF MATHEMATICAL CIVIL ENGINEER
BY: JOSÉ C. VERSCHAE T.
DATE: 18/08/2008
SUPERVISOR: JOSÉ R. CORREA

“APPROXIMATION ALGORITHMS FOR SCHEDULING ORDERS ON PARALLEL MACHINES”

The purpose of this thesis was to study the problem of scheduling orders on machines. In this problem a producer has a number of machines on which it must process a set of jobs. Each job belongs to an order, corresponding to the request of a client. The jobs have processing times, which may depend on the machine on which they are processed, and release dates. Finally, each order has an associated weight reflecting how important it is to the producer. The completion time of an order is the point in time when all of its jobs have been processed. The producer must decide when and on which machine each job is processed, with the objective of minimizing the weighted sum of completion times of the orders.

This model generalizes several classical scheduling problems. First, the objective function in our problem includes as special cases the objectives of minimizing the maximum completion time (makespan) and the weighted sum of completion times of jobs. Furthermore, in this thesis it is shown that our model also generalizes the problem of minimizing the sum of weighted completion times of jobs on one machine under precedence constraints.

Since all these problems are NP-hard, their apparent intractability suggests searching for efficient algorithms that yield a solution whose cost is close to the optimum. It is with this objective that, based on time-indexed linear relaxations, a 27/2-approximation algorithm was developed for the general setting previously described. This is the first algorithm with a constant approximation guarantee for this problem, improving on the result of Leung, Li and Pinedo (2007). Based on similar techniques, for the case where jobs can be preempted, an algorithm with an approximation guarantee arbitrarily close to 4 was obtained.

Also, a polynomial time approximation scheme (PTAS) was found for the case where the orders are disjoint and the machines are identical and constant in number. Furthermore, it was concluded that a variant of this approximation scheme can be applied to the case where the number of machines is part of the input, but the number of jobs per order or the number of orders is constant.

Finally, the problem of minimizing the makespan on unrelated machines was studied, obtaining an algorithm that transforms any preemptive solution into one where no job is preempted, while increasing the makespan by at most a factor of 4. Moreover, it was proven that no algorithm with a better guarantee is possible.


Acknowledgments

First of all, I want to thank my parents and brothers for instilling in me the love of thinking. Their constant support helped me throughout my whole career. I thank my brother Rodrigo for always listening to me and discussing my writing.

To my loving wife Natalia, who with her help, love, patience and unconditional support helped me finish this thesis.

I especially thank my advisor José R. Correa, who through long hours of discussions introduced me to the world of research. More than only helping me in my work, he gave me friendship and support in general. Without his constant support this thesis would not have been carried out successfully.

To all the students of the mathematics department of the University of Chile, for always being willing to talk and cheer me up.

I also thank Martin Skutella, who hosted me during my stay in Germany in September and October 2007. His collaboration and important contributions made Chapter 4 of this work possible. I also thank all of his group at TU-Berlin for making my stay in Berlin pleasant. Finally, I thank Nicole Megow for offering me her friendship and support.


Contents

1 Introduction
  1.1 Machine scheduling problems
  1.2 Approximation algorithms
  1.3 Polynomial time approximation schemes
  1.4 Problem definition
  1.5 Previous work
    1.5.1 Single machine
    1.5.2 Parallel machines
    1.5.3 Unrelated machines
  1.6 Contributions of this work

2 On the power of preemption on R||C_max
  2.1 R|pmtn|C_max is polynomially solvable
  2.2 A new rounding technique for R||C_max
  2.3 Power of preemption of R||C_max
    2.3.1 Base case
    2.3.2 Iterative procedure

3 Approximation algorithms for minimizing ∑w_L C_L on unrelated machines
  3.1 A (4 + ε)-approximation algorithm for R|r_ij, pmtn|∑w_L C_L
  3.2 A constant factor approximation for R|r_ij|∑w_L C_L

4 A PTAS for minimizing ∑w_L C_L on parallel machines
  4.1 Algorithm overview
  4.2 Localization
  4.3 Polynomial Representation of Order's Subsets
  4.4 Polynomial Representation of Frontiers
  4.5 A PTAS for a specific block
  4.6 Variations

5 Concluding remarks and open problems

Chapter 1

Introduction

1.1 Machine scheduling problems

Machine scheduling problems deal with the allocation of scarce resources over time. They arise in many different situations: for example, a construction site where the boss has to assign jobs to each worker, a CPU that must process tasks requested by several users, or a factory's production lines that must manufacture products for its clients.

In general, an instance of a scheduling problem contains a set of n jobs J, and a set of m machines M on which the jobs in J must be processed. A solution of the problem is a schedule, i.e., an assignment that specifies when and on which machine i ∈ M each job j ∈ J is executed.

To classify scheduling problems we have to look at the different characteristics or attributes that the machines and jobs have, as well as the objective function to be optimized.

One of these is the machine environment, i.e., the characteristics of the machines in our model. For example, we can consider identical or parallel machines, where each machine is an identical copy of all the others. In this setting each job j ∈ J takes a time p_j to be processed, independent of the machine on which it is scheduled. On the other hand, we can consider a more general situation where each machine i ∈ M has a different speed s_i, and then the time it takes to process job j on it is inversely proportional to the speed of the machine.

Additionally, scheduling problems can be classified depending on the jobs' characteristics. Just to name a few, our model may consider nonpreemptive jobs, i.e., jobs that cannot be interrupted until they are completed, or preemptive jobs, i.e., jobs that can be interrupted at any time and later resumed on the same or on a different machine.


Also, we can classify problems depending on the objective function. One of the most natural objective functions is the makespan, i.e., the point in time at which the last job finishes. More precisely, if for some schedule we define the completion time of a job j ∈ J, denoted C_j, as the time at which job j finishes processing, then the objective is to minimize C_max := max_{j∈J} C_j. Another classical example consists of minimizing the number of late jobs. In this setting, each job j ∈ J has a deadline d_j and the objective is to minimize the number of jobs that finish processing after their deadline. Besides these, there are several other objective functions that can be considered.

A large number of scheduling problems can be obtained by combining the characteristics just mentioned. It thus becomes necessary to introduce a standard notation for all these different problems. For this, Graham, Lawler, Lenstra and Rinnooy Kan [20] introduced the "three field notation", where a scheduling problem is represented by an expression of the form α|β|γ. Here, the first field α denotes the machine environment, the second field β contains extra constraints or characteristics of the problem, and the last field γ denotes the objective function. In the following we describe the most common values for α, β and γ.

1. Values of α.

• α = 1: Single Machine. There is only one machine at our disposal to process the jobs. Each job j ∈ J takes a given time p_j to be processed.

• α = P: Parallel Machines. We have a number m of identical or parallel machines to process the jobs. The processing time of job j is given by p_j, independently of the machine where job j is processed.

• α = Q: Related Machines. In this setting each machine i ∈ M has an associated speed s_i. The processing time of job j ∈ J on machine i ∈ M then equals p_j/s_i, where p_j is the time it takes to process j on a machine of speed 1.

• α = R: Unrelated Machines. In this more general setting there is no a priori relation between the processing times of jobs on each machine, i.e., the processing time of job j ∈ J on machine i ∈ M is an arbitrary number denoted by p_ij.

Additionally, in the case that α = P, Q or R, we can add the letter m at the end of the field, indicating that the number of machines m is constant. For example, if under a parallel machine environment the number of machines is constant, then α = Pm. The value of m can also be specified, e.g., α = P2 means that there are exactly 2 parallel machines to process the jobs.


2. Values of β.

• β = pmtn: Preemptive Jobs. In this setting we consider jobs that can be preempted, i.e., jobs that can be interrupted and resumed later on the same or on a different machine.

• β = r_j: Release Dates. Each job j ∈ J has an associated release date r_j, such that j cannot start processing before that time.

• β = prec: Precedence Constraints. Consider a partial order relation ≺ over the jobs (J, ≺). If for some pair of jobs j and k we have j ≺ k, then k must start processing after the completion time of job j.

3. Values of γ.

• γ = C_max: Makespan. The objective is to minimize the makespan C_max := max_{j∈J} C_j.

• γ = ∑C_j: Average Completion Time. We must minimize the average of the completion times, or equivalently ∑_{j∈J} C_j.

• γ = ∑w_j C_j: Sum of Weighted Completion Times. Consider a weight w_j for each j ∈ J. Then, the objective is to minimize the sum of weighted completion times ∑_{j∈J} w_j C_j.

It is worth noticing that by default we consider nonpreemptive jobs. In other words, if the field β is empty, then jobs cannot be preempted. For example, R||∑w_j C_j denotes the problem of finding a nonpreemptive schedule of a set of jobs J on a set of machines M, where each job j ∈ J takes p_ij units of time to process on machine i ∈ M, minimizing ∑_{j∈J} w_j C_j. As a second example, R|r_j|∑w_j C_j denotes the same problem as before, with the only difference that a job j can only start processing after r_j. Also, note that the field β can take more than one value. For example, R|prec, r_j|∑w_j C_j is the same as the last problem, but adding precedence constraints.

Among all scheduling problems, most non-trivial ones are NP-hard, and therefore there is no polynomial time algorithm to solve them unless P = NP. In particular, as we will show later, one of the fundamental problems in scheduling, P2||C_max, can easily be proven NP-hard. In the following section we describe some general techniques to address NP-hard optimization problems and some basic applications to scheduling.


1.2 Approximation algorithms

The introduction of the NP-complete class by Cook [11], Karp [24] and, independently, Levin [31] left big challenges about how these problems could be tackled given their apparent intractability. One option that has been widely studied is the use of algorithms that completely solve the problem, but have no polynomial upper bound on their running time. This kind of algorithm can be useful on small to medium instances, or on instances with some special structure where the algorithm runs fast enough in practice. Nevertheless, there may be other instances where the algorithm takes exponential time to finish, becoming impractical. The most common of these approaches are Branch & Bound, Branch & Cut and Integer Programming techniques.

For NP-hard optimization problems, another alternative is to use algorithms that run in polynomial time, but may not solve the problem to optimality. Among these, a particularly interesting class is that of "approximation algorithms", i.e., algorithms whose solution is guaranteed to be, in some sense, close to the optimal solution.

More formally, let us consider a minimization problem P with cost function c. For α ≥ 1, we say that a solution S to P is an α-approximation if its cost c(S) is within a factor α of the cost OPT of an optimal solution, i.e., if

    c(S) ≤ α · OPT.   (1.1)

Now, consider a polynomial-time algorithm A whose output on instance I is A(I). Then, A is an α-approximation algorithm if for every instance I, A(I) is an α-approximation. The number α is called the approximation factor of algorithm A, and if α does not depend on the input we say that A is a constant factor approximation algorithm.

Analogously, if P is a maximization problem with objective function c, a solution S is an α-approximation, for α ≤ 1, if

    c(S) ≥ α · OPT.

As before, for α ≤ 1, an algorithm A is an α-approximation algorithm if A(I) is an α-approximation for every instance I. In the remainder of this document we will only study minimization problems, and therefore we will not use this definition.

One of the first approximation algorithms for an NP-hard optimization problem was presented by R.L. Graham [19] in 1966, even before the notion of NP-completeness was formally introduced. Graham studied the problem of minimizing the makespan on parallel machines, P||C_max. He proposed a greedy algorithm consisting of: (1) order the jobs arbitrarily, (j_1, ..., j_n); (2) for k = 1, ..., n, schedule job j_k on the machine where it would begin processing first. Such a procedure is called a list-scheduling algorithm.
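As an illustration, the following is a minimal Python sketch of this list-scheduling procedure, keeping a priority queue of machine loads (the function name and interface are ours):

    import heapq

    def list_schedule(p, m):
        """Greedy list scheduling for P||Cmax: take the jobs in the given
        (arbitrary) order and place each on the machine that is free first."""
        loads = [(0, i) for i in range(m)]      # (current load, machine index)
        heapq.heapify(loads)
        assignment = []
        for pj in p:
            load, i = heapq.heappop(loads)      # least loaded machine so far
            assignment.append(i)
            heapq.heappush(loads, (load + pj, i))
        return max(load for load, _ in loads), assignment

    # 5 jobs on 2 machines; the resulting makespan is at most (2 - 1/2)*OPT.
    print(list_schedule([3, 1, 4, 1, 5], 2))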

Lemma 1.1 (Graham 1966 [19]). List-scheduling is a (2 − 1/m)-approximation algorithm for P||C_max.

Proof. First notice that if OPT denotes the makespan of the optimal solution, then

    OPT ≥ (1/m) ∑_{j∈J} p_j,   (1.2)

since otherwise the total machine time available, m · OPT, would be less than the ∑_{j∈J} p_j needed to process all jobs. Let ℓ be such that C_{j_ℓ} = C_max, and denote by S_j = C_j − p_j the starting time of job j ∈ J. Then, noting that at the ℓ-th step of the algorithm all machines were busy at time S_{j_ℓ},

    S_{j_ℓ} ≤ (1/m) ∑_{k=1}^{ℓ−1} p_{j_k},

and therefore,

    C_max = S_{j_ℓ} + p_{j_ℓ} ≤ (1/m) ∑_{k=1}^{ℓ} p_{j_k} + (1 − 1/m) p_{j_ℓ} ≤ (2 − 1/m) OPT,   (1.3)

where the last inequality follows from (1.2) and the fact that p_{j_ℓ} ≤ OPT, since no schedule can finish before p_j for any j ∈ J.

As we could observe, a crucial step in the previous analysis is to obtain a good lower bound on the optimal solution (for example, Equation (1.2) in the last lemma), and then use it to upper bound the solution given by the algorithm (as in Equation (1.3)). Most techniques for finding lower bounds are problem specific, and therefore it is hard to give general rules for finding them. One of the few exceptions, which has proven useful in a wide variety of problems, consists of formulating the optimization problem as an integer program and then relaxing its integrality constraints. Clearly, the optimal value of the relaxed problem is a lower bound on the optimal value of the original problem. An algorithm that uses this technique is called an LP-based approximation algorithm. To illustrate this idea, consider the following problem.


Minimum Cost Vertex-Cover:

Input: A graph G = (V, E), and a cost function c : V → Q over the vertices.

Objective: Find a vertex cover, i.e., a set B ⊆ V that intersects every edge in E, minimizing the cost c(B) = ∑_{v∈B} c(v).

It is easy to see that this problem is equivalent to the following integer program:

    [LP]  min ∑_{v∈V} y_v c(v)   (1.4)
          y_v + y_w ≥ 1   for all vw ∈ E,   (1.5)
          y_v ∈ {0, 1}   for all v ∈ V.   (1.6)

Therefore, by replacing Equation (1.6) by y_v ≥ 0, we obtain a linear program whose optimal value is a lower bound on the optimum of the Minimum Cost Vertex-Cover problem. To get a constant factor approximation algorithm, we proceed as follows. First solve [LP] (for example, using the ellipsoid method), and call the solution y*. To round this fractional solution, first note that Equation (1.5) implies that for every edge vw ∈ E either y*_v ≥ 1/2 or y*_w ≥ 1/2. Then, the set B = {v ∈ V : y*_v ≥ 1/2} is a vertex cover, and furthermore we can bound its cost as

    c(B) = ∑_{v : y*_v ≥ 1/2} c(v) ≤ 2 ∑_{v∈V} y*_v c(v) = 2 OPT_LP ≤ 2 OPT,   (1.7)

where OPT denotes the cost of an optimum solution of the vertex-cover problem and OPT_LP is the optimal value of [LP]. Thus, the algorithm just described is a 2-approximation algorithm.
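As a small illustration of the two steps (solve the relaxation, then keep every vertex with y*_v ≥ 1/2), here is a Python sketch using scipy's linprog; the encoding and names are ours:

    import numpy as np
    from scipy.optimize import linprog

    def vertex_cover_2approx(n, edges, cost):
        """LP rounding for Minimum Cost Vertex-Cover on vertices 0..n-1."""
        # One row per edge vw:  -y_v - y_w <= -1, i.e. y_v + y_w >= 1.
        A = np.zeros((len(edges), n))
        for r, (v, w) in enumerate(edges):
            A[r, v] = A[r, w] = -1.0
        res = linprog(c=cost, A_ub=A, b_ub=-np.ones(len(edges)),
                      bounds=[(0, None)] * n)
        # Keep every vertex whose fractional value is at least 1/2.
        return [v for v in range(n) if res.x[v] >= 0.5 - 1e-9]

    # Triangle, unit costs: OPT_LP = 3/2; the rounded cover costs 3 <= 2*OPT_LP.
    print(vertex_cover_2approx(3, [(0, 1), (1, 2), (0, 2)], [1.0, 1.0, 1.0]))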

Noting that OPT ≤ c(B), Equation (1.7) implies that

    OPT / OPT_LP ≤ 2

for any instance I of Minimum Cost Vertex-Cover. More generally, any α-approximation algorithm that uses OPT_LP as a lower bound must satisfy

    max_I OPT / OPT_LP ≤ α.

The left hand side of this last equation is called the integrality gap of the linear program.


Finding a lower bound on the integrality gap is a common technique to determine the best approximation factor that a linear program can yield. To do this we just need to find an instance with a large ratio OPT/OPT_LP. For example, it is easy to show that the rounding we just described for Minimum Cost Vertex-Cover is best possible. Indeed, taking G to be the complete graph on n vertices and the cost function c ≡ 1, we get that OPT = n − 1 and OPT_LP = n/2, and thus OPT/OPT_LP → 2 as n → ∞.

1.3 Polynomial time approximation schemes

For a given NP-hard problem, it is natural to ask for the best possible approximation algorithm in terms of its approximation factor. Clearly, this depends on the problem. On one side, there are problems that do not admit any kind of approximation algorithm unless P = NP. For example, the travelling salesman problem with binary costs cannot be approximated within any factor. Indeed, if there existed an α-approximation algorithm for this problem, then we could use it to decide whether or not there exists a Hamiltonian circuit of cost zero: if the optimum solution is zero, then the approximation algorithm must return a solution of cost zero by (1.1), independently of the value of α; if the optimum solution is greater than zero, then the algorithm will also return a solution with cost greater than zero.

On the other hand, there are some problems that admit arbitrarily good approximation algorithms. To formalize this idea we define a polynomial time approximation scheme (PTAS) as a collection of algorithms {A_ε}_{ε>0} such that each A_ε is a (1 + ε)-approximation algorithm that runs in polynomial time. Let us remark that ε is not considered part of the input, and therefore the running time of the algorithm may depend exponentially on 1/ε.

A common technique to find a PTAS is to "round" the instance so that the solution space is significantly decreased, but the value of the optimal solution is only slightly changed. Then, we can use exhaustive search or dynamic programming to find an optimal or near-optimal (i.e., (1 + ε)-approximate) solution to the rounded problem. To obtain an almost-optimal solution to the original problem, we transform the solution of the rounded instance back without increasing its cost by more than a 1 + O(ε) factor.

We briefly illustrate this technique by applying it to P2||C_max, i.e., the problem of minimizing the makespan on two parallel machines. Consider a fixed 0 < ε < 1, and call OPT the makespan of the optimal solution. We will show how to find a schedule of makespan less than (1 + ε)² OPT ≤ (1 + 3ε) OPT, which is enough after redefining ε ← ε/3. Begin by rounding up the value of each p_j to a power of (1 + ε),

    p_j ← (1 + ε)^⌈log_{1+ε} p_j⌉.

With this, the processing time of each job is increased by at most a (1 + ε) factor, and so is the optimal makespan. In other words, denoting by OPT_r the optimal makespan of the rounded instance, OPT_r ≤ (1 + ε) OPT. Then, it is enough to find a (1 + ε)-approximation for the rounded instance, since using that assignment of jobs to machines in the original problem can only decrease the makespan of the solution, thus yielding a (1 + ε)²-approximation. For this, let P = max_j p_j, and define a job to be "big" if p_j ≥ εP and "small" otherwise. Thanks to our rounding, the number of different values that the processing time of a big job can take is at most ⌊log_{1+ε} 1/ε⌋ + 1 = O(1). Also, notice that a schedule of big jobs is determined by specifying how many jobs of each size are assigned to each of the two machines. Thus, we can enumerate all schedules of big jobs in time n^{⌊log_{1+ε} 1/ε⌋+1} = n^{O(1)} = poly(n), and take the one with the smallest makespan.

To schedule the small jobs, a list-scheduling algorithm is enough: process the jobs one at a time, in any order, each on the machine that would finish first. Clearly, this yields a (1 + ε)-approximation for the rounded instance. Indeed, if after adding the small jobs the makespan was not increased, then the solution constructed is optimal. On the other hand, if adding the small jobs increased the makespan, then the difference between the makespans of the two machines is less than εP ≤ ε OPT_r. Therefore, the makespan of the solution constructed is less than (1 + ε) OPT_r ≤ (1 + ε)² OPT. Thus, we can construct a (1 + ε)²-approximation of the original problem in polynomial time.
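The scheme just described can be sketched in Python as follows; this is illustrative code under our own naming, and for brevity it enumerates subsets of big jobs rather than multiplicity vectors per distinct rounded size (the latter is what makes the real scheme polynomial):

    import itertools, math

    def ptas_p2_cmax(p, eps):
        """Round times up to powers of (1+eps), try all big-job splits,
        list-schedule the small jobs; a (1+eps)^2-approximate makespan."""
        rounded = [(1 + eps) ** math.ceil(math.log(pj, 1 + eps)) for pj in p]
        P = max(rounded)
        big = [q for q in rounded if q >= eps * P]
        small = [q for q in rounded if q < eps * P]
        best = float("inf")
        for mask in itertools.product([0, 1], repeat=len(big)):
            loads = [0.0, 0.0]
            for q, side in zip(big, mask):
                loads[side] += q
            for q in small:                       # list-schedule small jobs
                loads[loads.index(min(loads))] += q
            best = min(best, max(loads))
        return best

    print(ptas_p2_cmax([3, 1, 4, 1, 5, 9, 2, 6], eps=0.5))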

Although the algorithm we just showed runs in polynomial time for any fixed ε, its running time increases exponentially as ε decreases. Thus, we may ask whether we can do even better, e.g., whether we can find a PTAS whose running time is also polynomial in 1/ε. Such a scheme is called a fully polynomial time approximation scheme (FPTAS). Unfortunately, only few problems admit an FPTAS. Indeed, it can be shown that no strongly NP-hard problem admits an FPTAS, unless P = NP (see for example [42], Ch. 8).

In the next section we describe the problem that we work on in this thesis. Not surprisingly, the problem is NP-hard, and thus the tools discussed in this and the previous sections will be helpful to study it.


1.4 Problem definition

In this thesis we study a natural scheduling problem arising in manufacturing environments. Consider a setting where clients place orders, consisting of one or more products, with a given manufacturer. Each product has a machine dependent processing requirement, and has to be processed on any of the m machines available for production. The manufacturer has to find a schedule so as to give the best possible service to its clients.

In its most general form, the problem we consider is as follows. We are given a set of jobs J and a set of orders O ⊆ P(J), such that ⋃_{L∈O} L = J. Each job j ∈ J is associated with a value p_ij representing its processing time on machine i, while each order L has a weight factor w_L depending on how important it is to the manufacturer. Also, job j is associated with a machine dependent release date r_ij, so that it can only start being processed on machine i from time r_ij on. An order is completed once all of its jobs have been processed. Therefore, if C_j denotes the point in time at which job j is completed, C_L = max{C_j : j ∈ L} denotes the completion time of order L. The goal of the manufacturer is to find a nonpreemptive schedule on the m available machines so as to minimize the sum of weighted completion times of orders, i.e.,

    min ∑_{L∈O} w_L C_L.

We refer to this objective function as the sum of weighted completion times of orders. Let us remark that in this general framework we are not restricted to the case where the orders are disjoint, and therefore one job may participate in the completion time of several orders.

To adopt the three field scheduling notation, we denote this problem by R|r_ij|∑w_L C_L, or R||∑w_L C_L in case all release dates are zero. When the processing times p_ij do not depend on the machine, we exchange the "R" for a "P". Also, when we impose the additional constraint that orders are disjoint subsets of jobs, we add "part" in the second field β of the notation.

As will be shown later, our problem generalizes several classic machine scheduling problems. Most notably, these include R||C_max, R|r_ij|∑w_j C_j and 1|prec|∑w_j C_j. Since all of these are NP-hard in the strong sense (see for example [17]), so is our more general setting. It is somewhat surprising that the best known approximation algorithms for all these problems have an approximation guarantee of 2 [4, 35, 37]. However, for our more general setting, no constant factor approximation was previously known. The best known result, due to Leung, Li, Pinedo and Zhang [29], is an algorithm for the special case of related machines (i.e., p_ij = p_j/s_i, where s_i is the speed of machine i) and without release dates on jobs. The approximation factor of the algorithm is 1 + ρ(m − 1)/(ρ + m − 1), where ρ is the ratio of the speed of the fastest machine to that of the slowest machine. In general this guarantee is not constant and can be as bad as m/2.

1.5 Previous work

To illustrate the flexibility of our model, we now review some relevant scheduling models in

different machine environments that lie in our framework.

1.5.1 Single machine

We begin by considering the problem of minimizing the sum of weighted completion times of orders on one machine. First we study the simple case where no job belongs to more than one order, 1|part|∑w_L C_L, showing that it is equivalent to 1||∑w_j C_j. The latter, as shown by Smith [41], can be solved to optimality by scheduling the jobs in non-increasing order of w_j/p_j. In the literature, this greedy algorithm is known as Smith's rule.
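In code, Smith's rule is a single pass after sorting; a minimal sketch (our notation) is:

    def smiths_rule(jobs):
        """1||sum w_j C_j: schedule in non-increasing order of w_j / p_j
        and return the objective value.  jobs is a list of (w, p) pairs."""
        order = sorted(jobs, key=lambda wp: wp[0] / wp[1], reverse=True)
        t = total = 0
        for w, p in order:
            t += p                 # completion time of this job
            total += w * t
        return total

    # (w, p) = (3, 1) goes first: objective 3*1 + 1*3 = 6, which is optimal.
    print(smiths_rule([(1, 2), (3, 1)]))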

To see that these two problems are indeed equivalent, we first show that there is an optimal schedule of 1|part|∑w_L C_L in which all jobs of an order L ∈ O are processed consecutively. To see this, consider an optimal schedule where this does not hold. Then, there exist jobs j, ℓ ∈ L and k ∈ L′ ≠ L, such that k starts processing at C_j, and ℓ is processed after k. Thus, swapping jobs j and k, i.e., delaying j by p_k units of time and bringing forward k by p_j units of time, does not increase the cost of the solution. Indeed, job k decreases its completion time, and so C_{L′} is not increased. Also, order L does not increase its completion time, since job ℓ ∈ L, which is always processed after j, remains untouched. By iterating this argument, we end up with a schedule where all jobs in an order are processed consecutively. Therefore, each order can be seen as a larger job with processing time ∑_{j∈L} p_j, and thus our problem is equivalent to 1||∑w_j C_j.

We now consider the more general problem 1||∑w_L C_L, where we allow jobs to belong to several orders at the same time. We will prove that this problem is equivalent to single machine scheduling with precedence constraints, denoted by 1|prec|∑w_j C_j. Recall that in this problem there is a partial order ≺ over the jobs, meaning that if j ≺ k, then job j must finish being processed before job k begins processing. If j ≺ k we say that j is a predecessor of k and that k is a successor of j. This classic scheduling problem has attracted much attention since the sixties. Lenstra and Rinnooy Kan [26] showed that it is strongly NP-hard even with unit weights or unit processing times. On the other hand, several 2-approximation algorithms have been proposed: Hall, Schulz, Shmoys & Wein [21] gave an LP-relaxation based 2-approximation, while Chudak & Hochbaum [6] proposed another 2-approximation based on a half-integral programming relaxation. Also, Chekuri & Motwani [4], and Margot, Queyranne & Wang [32], independently developed a very simple combinatorial 2-approximation. Furthermore, the results in [2, 12] imply that 1|prec|∑w_j C_j is a special case of vertex cover. However, hardness of approximation results were unknown until recently, when Ambühl, Mastrolilli & Svensson [3] proved that there is no PTAS for this problem unless NP-hard problems can be solved in randomized subexponential time.

We now show that 1||∑w_L C_L and 1|prec|∑w_j C_j are equivalent, and therefore all results known for the latter can also be applied to 1||∑w_L C_L. First, let us see that every α-approximation for 1|prec|∑w_j C_j implies an α-approximation for 1||∑w_L C_L. Let I = (J, O) be an instance of 1||∑w_L C_L, where J is the job set and O the set of orders. We construct an instance I′ = (J′, ≺) of 1|prec|∑w_j C_j as follows. For each job j ∈ J there is a job j′ ∈ J′ with p_{j′} = p_j and w_{j′} = 0. Also, for every order L ∈ O we consider an extra job j(L) ∈ J′ with processing time p_{j(L)} = 0 and weight w_{j(L)} = w_L. The only precedence constraints we impose are j′ ≺ j(L) for all j ∈ L and every L ∈ O. Since p_{j(L)} = 0, we can restrict ourselves to schedules of I′ where each j(L) is processed when the last job of L is completed. Thus, it is clear that the optimal solutions to both problems have the same total cost. Furthermore, it is straightforward to note that given an algorithm for 1|prec|∑w_j C_j (approximate or not) we can simply apply it to the instance I′ above and impose that j(L) is processed exactly when the last job of L is completed, without a cost increase. The resulting schedule for I′ can then be directly applied to the original instance I of 1||∑w_L C_L and its cost will remain the same.
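A small Python sketch of this construction (the data representation and names are ours) builds I′ from I = (J, O):

    def orders_to_precedence(p, orders, w):
        """Build the 1|prec|sum w_j C_j instance I' described above.
        p: job -> processing time; orders: order -> set of jobs;
        w: order -> weight.  Returns processing times, weights and the
        precedence pairs j -> j(L)."""
        proc = dict(p)                    # original jobs keep p_j ...
        weight = {j: 0 for j in p}        # ... and get weight 0
        prec = []
        for L, jobs in orders.items():
            jL = ("order", L)             # extra job j(L): p = 0, w = w_L
            proc[jL] = 0
            weight[jL] = w[L]
            prec.extend((j, jL) for j in jobs)
        return proc, weight, prec

    print(orders_to_precedence({"a": 2, "b": 1},
                               {"L1": {"a", "b"}, "L2": {"b"}},
                               {"L1": 5, "L2": 3}))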

To see the other direction, let I = (J, ≺) be an instance of 1|prec|∑w_j C_j. To construct an instance I′ = (J′, O) of 1||∑w_L C_L, consider the same set of jobs J′ = J and, for every job j ∈ J′, let L(j) ∈ O be the order {k ∈ J : k ≺ j} ∪ {j}, and let w_{L(j)} = w_j. With this construction the following lemma holds.

Lemma 1.2. Any schedule of I ′ can be efficiently transformed into a schedule of the same

instance, respecting the underlying precedence constraints and without increasing the cost.

Proof. Let k be the last job that violates a precedence constraint, and let j be the last job that is a successor of k but is scheduled before k. We will show that delaying job j to right after job k (see Figure 1.1) does not violate any new precedence constraint and does not increase the total cost. Indeed, if moving j after k violated a precedence constraint, then there would exist a job j′, originally processed between j and k, such that j ≺ j′. Thus k ≺ j′, contradicting the choice of j and k.

[Figure 1.1: Top: Original schedule. Bottom: Schedule after delaying j.]

Also, note that every job but j diminishes its completion time. Furthermore, the completion time of each order containing j is not increased, since each such order also contains job k, and the completion time of j in the new schedule is the same as the completion time of k in the old schedule.

With this lemma we conclude that the optimal schedule for instance I of 1|prec|∑w_j C_j has the same cost as that for instance I′ of 1||∑w_L C_L. Moreover, any α-approximate schedule for instance I′ of 1||∑w_L C_L can be transformed into a schedule for instance I of 1|prec|∑w_j C_j of the same cost. Thus, the following holds.

Theorem 1.3. The approximability thresholds of 1|prec|∑w_j C_j and 1||∑w_L C_L coincide.

1.5.2 Parallel machines

In this section we discuss scheduling on parallel machines, where the processing time of each job j, p_ij = p_j, does not depend on the machine where it is processed.

Recall the previously defined problem of minimum makespan scheduling on parallel machines, P||C_max, which consists of finding a schedule of n jobs on m parallel machines so as to minimize the maximum completion time. Notice that if in our setting O contains only one order, then the objective function becomes max_{j∈J} C_j = C_max, and therefore P||C_max is a special case of P||∑w_L C_L, which in turn is a special case of our more general model R|r_ij|∑w_L C_L.

wLCL.

12

Page 19: UNIVERSIDAD DE CHILE FACULTAD DE CIENCIAS F´ISICAS Y ...jverschae/memoria.pdf · approximation algorithms for scheduling orders on parallel machines submitted in partial fulfillment

The problem P||C_max is a classical machine scheduling problem. It can easily be proven NP-hard, even for 2 machines. Indeed, consider the 2Partition problem where, for a given multiset of positive integers A = {a_1, ..., a_n}, we must decide whether there exists a partition R ∪ T = A, R ∩ T = ∅, such that ∑_{j∈R} a_j = ∑_{j∈T} a_j = (1/2) ∑_{j∈A} a_j. Then, for a given multiset A, consider n jobs where job j = 1, ..., n has processing time p_j = a_j. Finding the minimum makespan schedule on two parallel machines would then let us solve 2Partition: the minimum makespan equals (1/2) ∑_{j∈J} p_j if and only if there exist sets J_1 ∪ J_2 = J, J_1 ∩ J_2 = ∅, corresponding to the sets of jobs processed on each machine, such that ∑_{j∈J_1} p_j = ∑_{j∈J_2} p_j = (1/2) ∑_{j∈J} p_j. Thus, since 2Partition is NP-complete [24, 17], we conclude that P2||C_max is NP-hard. On the other hand, as shown in Lemma 1.1, a list-scheduling approach yields a 2-approximation algorithm. Furthermore, Hochbaum and Shmoys [22] presented a PTAS for the problem (see also [42, Chapter 10]).
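The reduction can also be checked computationally. Note that 2Partition is decidable in pseudo-polynomial time by a subset-sum dynamic program (consistent with it being NP-complete only in the weak sense); a sketch of this check, with our naming:

    def has_perfect_partition(a):
        """Decide 2Partition for a multiset a of positive integers.
        With p_j = a_j, the optimal P2||Cmax makespan equals sum(a)/2
        exactly when such a partition exists."""
        total = sum(a)
        if total % 2:
            return False
        reachable = {0}                   # achievable subset sums
        for x in a:
            reachable |= {s + x for s in reachable}
        return total // 2 in reachable

    print(has_perfect_partition([3, 1, 1, 2, 2, 1]))   # True: 3+2 = 1+1+2+1 = 5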

On the other hand, when in our model each order contains only one job, the problem becomes equivalent to minimizing the sum of weighted completion times of jobs, ∑_{j∈J} w_j C_j. Thus, in this case, the parallel machine version of our problem with no release dates becomes P||∑w_j C_j. The study of this problem also goes back to the sixties (see [9] for an early treatment). As in the makespan case, the problem becomes NP-hard already for two machines. On the other hand, a sequence of approximation algorithms had been proposed until Skutella and Woeginger [40] found a PTAS for the problem. Later, Afrati et al. [1] extended this result to the case of non-trivial release dates.

A natural question is thus whether there exists a PTAS for P|part|∑w_L C_L (notice that, as shown in Section 1.5.1, the slightly more general problem P||∑w_L C_L is unlikely to have a PTAS). Although we do not know whether the latter holds, Leung, Li, and Pinedo [28] (see also Yang and Posner [44]) presented a 2-approximation algorithm for this problem. We briefly give an alternative analysis of Leung et al.'s algorithm by using a classic linear programming framework, first developed by Queyranne [33] for the single machine problem.

Let M_j be the midpoint of job j in a given schedule, in other words, M_j = C_j − p_j/2. Eastman et al. [15] implicitly showed that for any set of jobs S ⊆ J and any feasible schedule on m parallel machines, the inequality ∑_{j∈S} p_j M_j ≥ p(S)²/2m is satisfied, where p(S) = ∑_{j∈S} p_j. These inequalities are called the parallel inequalities. It follows that if OPT denotes the value of an optimal schedule, then OPT is lower bounded by the following linear program:

    [LP]  min ∑_{L∈O} w_L C_L
          C_L ≥ M_j + p_j/2   for all L ∈ O and j ∈ L,
          ∑_{j∈S} p_j M_j ≥ p(S)²/2m   for all S ⊆ J.

Queyranne [33] showed that [LP] can be solved in polynomial time, since separating the parallel inequalities reduces to submodular function minimization. Let M*_1, ..., M*_n be an optimal solution and assume without loss of generality that M*_1 ≤ M*_2 ≤ ... ≤ M*_n. Clearly, C*_L = max{M*_j + p_j/2 : j ∈ L}, so the optimal solution is completely determined by the M values. Consider the algorithm that first solves [LP] and then schedules jobs using a list-scheduling algorithm according to the order M*_1 ≤ M*_2 ≤ ... ≤ M*_n. Let C^A_j denote the completion time of job j in the schedule given by the algorithm, so that C^A_L = max{C^A_j : j ∈ L}. It is easy to see that C^A_j equals the time S^A_j at which job j is started by the algorithm, plus p_j. Furthermore, at any point in time before S^A_j all machines were busy processing jobs in {1, ..., j − 1}, thus S^A_j ≤ p({1, ..., j − 1})/m. It follows that

    C^A_L ≤ max_{j∈L} { p({1, ..., j − 1})/m + p_j }.

Also, M*_j p({1, ..., j}) ≥ ∑_{l∈{1,...,j}} p_l M*_l ≥ p({1, ..., j})²/2m. Then,

    C*_L ≥ max_{j∈L} { p({1, ..., j})/2m + p_j/2 }.

We conclude that C^A_L ≤ 2 C*_L, which implies that the algorithm returns a solution within a factor of 2 of OPT. Furthermore, note that this approach works not only for P|part|∑w_L C_L but also for P||∑w_L C_L.

1.5.3 Unrelated machines

In the unrelated machine setting, our problem is also a common generalization of several classic machine scheduling problems. As before, if there is a single order and r_ij = 0, our problem becomes minimum makespan scheduling (R||C_max), in which the goal is to find a schedule of the n tasks on m unrelated machines so as to minimize the makespan. In a seminal work, Lenstra, Shmoys and Tardos [27] gave a 2-approximation algorithm for R||C_max, and showed that it is NP-hard to approximate it within a constant better than 3/2. Thus, the same hardness result holds for R||∑w_L C_L.

On the other hand, if orders are singletons and r_ij = 0, our problem becomes minimum sum of weighted completion times scheduling (R||∑w_j C_j). In this setting each job j ∈ J is associated with a processing time p_ij and a weight w_j. The goal is to find a schedule minimizing the sum of weighted completion times of jobs. As in the makespan case, the latter problem was shown to be APX-hard [23], and therefore there is no PTAS unless P = NP. On the positive side, Schulz and Skutella [35] used a linear program relaxation to design an approximation algorithm with performance guarantee 3/2 + ε in the case without release dates, and 2 + ε in the more general case. Furthermore, Skutella [38] refined this result by means of a convex quadratic programming relaxation, obtaining a 3/2-approximation algorithm in the case of trivial release dates, and a 2-approximation algorithm in the more general case.

Finally, it is worth mentioning that our problem also generalizes assembly scheduling problems that have received attention recently, which we denote by A||∑w_j C_j (see e.g. [7, 8, 30]). As explained before, in this setting we are given a set M of m machines and a set of jobs J with associated weights w_j. Each job has m parts, one to be processed by each machine. So, p_ij denotes the processing time of the i-th part of job j, which must be processed on machine i. The goal is to minimize the sum of weighted completion times ∑w_j C_j, where the completion time C_j of job j is the time by which all of its parts have been processed. Thus, in our setting, a job with its m parts can be modelled as an order containing m jobs. To ensure that each job of each order can only be processed on its corresponding machine, we give it infinite (or sufficiently large) processing time on all the other machines.

Besides proving that the assembly line problem is NP-hard, Chen and Hall [7] and Leung, Li, and Pinedo [30] independently gave a simple 2-approximation algorithm based on the following linear programming relaxation of the problem, where p_i(S) = ∑_{j∈S} p_ij and p_i²(S) = ∑_{j∈S} p_ij²:

    [LP]  min ∑_{j∈N} w_j C_j
          ∑_{j∈S} p_ij C_j ≥ (p_i(S)² + p_i²(S))/2   for all i = 1, ..., m, S ⊆ N.

Similarly to the 2-approximation described for P||∑w_L C_L in Section 1.5.2, the algorithm consists of processing the jobs according to the order given by an optimal LP solution. Clearly, this is a 2-approximation. Indeed, consider C_1 ≤ ... ≤ C_n the optimal LP solution (after reordering if needed) and let S = {1, ..., k}. Call C^H and C* the heuristic and the optimal completion time vectors, respectively. Clearly, p_i(S) C_k ≥ ∑_{j∈S} p_ij C_j ≥ p_i(S)²/2, hence 2C_k ≥ p_i(S) for all i ∈ M. It follows that C^H_k = max_{1≤i≤m} p_i(S) ≤ 2C_k, and then ∑w_j C^H_j ≤ 2∑w_j C_j ≤ 2∑w_j C*_j, and thus the solution constructed is a 2-approximation.

1.6 Contributions of this work

In this thesis we develop approximation algorithms for R|r_ij|∑w_L C_L and some of its particular cases. In Chapter 2 we begin by presenting some techniques used in the subsequent sections. First, we review the result of Lawler and Labetoulle [25] showing that R|pmtn|C_max, i.e., the problem of minimizing the makespan of preemptive jobs on unrelated machines, is polynomially solvable. Then, we propose a way of rounding any solution of R|pmtn|C_max to a solution of R||C_max such that the cost of the solution is increased by at most a factor of 4. For this we use the classic rounding technique of Shmoys and Tardos [37] for the generalized assignment problem. We conclude the chapter by showing that this rounding is best possible. To this end we construct a sequence of instances for which the ratio between the optimal nonpreemptive makespan and the optimal preemptive makespan is arbitrarily close to 4.

In Chapter 3 we generalize the techniques previously developed. We begin by giving a (4 + ε)-approximation for R|pmtn, r_ij|∑w_L C_L, i.e., for each fixed ε > 0 we show a (4 + ε)-approximation algorithm. The algorithm is based on a time-indexed linear programming relaxation of the problem, following that of Dyer and Wolsey [13]. The rounding uses Lawler and Labetoulle's [25] result, described in the previous chapter. We also show a 27/2-approximation algorithm for R|r_ij|∑w_L C_L. This is the first constant factor approximation algorithm for this problem, and thus improves on the non-constant factor approximation algorithm for Q|part|∑w_L C_L proposed by Leung et al. [29]. Our approach is based on an interval-indexed linear program proposed by Hall et al. [21], and uses a rounding very similar to the one shown in Chapter 2.

In Chapter 4 we design a PTAS for P||∑w_L C_L for the cases where the number of orders is constant, the number of jobs inside each order is constant, or the number of machines is constant. Our algorithm works in all three cases and thus generalizes the known PTASs in [1, 22, 40]. Our approach closely follows the PTAS of Afrati et al. [1] for P|r_j|∑w_j C_j. However, the main extra difficulty compared to the setting of Afrati et al. is that we might have orders that are processed over a long period of time, and whose cost is only realized when they are completed. To overcome this issue, and thus be able to apply the dynamic programming ideas in [1], we simplify the instance and prove that there is a near-optimal solution in which every order is fully processed in a restricted time span. This requires some careful enumeration plus the introduction of artificial release dates.

Finally, in Chapter 5 we summarize all the results, and propose some possible directions for future research.


Chapter 2

On the power of preemption on R||C_max

In this chapter we study the problem of minimizing the makespan on unrelated machines, R||C_max, which, as explained before, is a special case of our more general problem of minimizing the sum of weighted completion times of orders on unrelated machines, R||∑w_L C_L. The techniques in this chapter will give insight on how to obtain approximation algorithms for the more general problems R|r_ij|∑w_L C_L and R|r_ij, pmtn|∑w_L C_L.

In Section 2.1 we begin by reviewing the technique developed by Lawler and Labetoulle [25] to solve R|pmtn|C_max, which shows that this problem is equivalent to solving a linear program. In Section 2.2 we give a quick overview of Lenstra, Shmoys and Tardos's [27] 2-approximation algorithm for R||C_max, and discuss why it is difficult to apply those ideas to our more general setting. Then, we show how to modify this result to obtain one that is easier to generalize. By doing this we obtain a rounding that turns any preemptive schedule into a nonpreemptive one such that the makespan is increased by at most a factor of 4. On the other hand, in Section 2.3 we prove that this factor is best possible, i.e., there is no rounding that converts a preemptive schedule into a nonpreemptive one with a guarantee better than 4. We achieve this by iteratively constructing a family of almost tight instances.

2.1 R|pmtn|Cmax is polynomially solvable

We now present the algorithm developed by Lawler and Labetoulle, which computes an optimal solution of R|pmtn|C_max. It is based on a linear programming formulation that uses assignment variables x_ij, indicating the fraction of job j ∈ J that is processed on machine i ∈ M. With this, it is enough to give a way of converting any feasible solution of this linear program into a preemptive schedule of equal makespan, i.e., we need a way of distributing the fractions of each job inside each machine such that no two fractions of the same job are processed in parallel.

More precisely, let us consider the following linear program:

    [LL]  min C
          ∑_{i∈M} x_ij = 1   for all j ∈ J,   (2.1)
          ∑_{j∈J} p_ij x_ij ≤ C   for all i ∈ M,   (2.2)
          ∑_{i∈M} p_ij x_ij ≤ C   for all j ∈ J,   (2.3)
          x_ij ≥ 0   for all i, j.   (2.4)
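Since [LL] has only nm + 1 variables and polynomially many constraints, it can be solved directly by any LP solver. The following Python sketch (our encoding, using scipy's linprog) sets it up with the variables x_ij in row-major order followed by C:

    import numpy as np
    from scipy.optimize import linprog

    def solve_LL(p):
        """Solve [LL] for an m x n matrix p of processing times p_ij.
        Returns the optimal makespan C and the fractions x (m x n)."""
        m, n = p.shape
        N = m * n
        c = np.zeros(N + 1); c[-1] = 1.0                  # objective: min C
        A_eq = np.zeros((n, N + 1))                       # (2.1): sum_i x_ij = 1
        for j in range(n):
            for i in range(m):
                A_eq[j, i * n + j] = 1.0
        A_ub = np.zeros((m + n, N + 1)); A_ub[:, -1] = -1.0
        for i in range(m):                                # (2.2): machine loads
            A_ub[i, i * n: i * n + n] = p[i]
        for j in range(n):                                # (2.3): job fractions
            for i in range(m):
                A_ub[m + j, i * n + j] = p[i, j]
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m + n),
                      A_eq=A_eq, b_eq=np.ones(n),
                      bounds=[(0, None)] * (N + 1))
        return res.fun, res.x[:N].reshape(m, n)

    C, x = solve_LL(np.array([[2.0, 4.0], [3.0, 1.0]]))
    print(C, x)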

It is clear that each preemptive schedule induces a feasible solution to [LL]. Indeed, given any preemptive solution, denote by C its makespan and by x_ij the fraction of job j that is processed on machine i. In other words, if y_ij denotes the amount of time that the schedule uses to process job j on machine i, then x_ij = y_ij/p_ij. With this definition, the solution must satisfy Equation (2.1), since every job is always completely scheduled. Furthermore, Equation (2.2) is also satisfied, since no machine i ∈ M can finish processing before ∑_j p_ij x_ij. Similarly, Equation (2.3) holds since no job j can be processed on two machines at the same time, and thus the left hand side of this equation is a lower bound on the completion time of job j.

Let xij and C be any feasible solution of [LL]. Consider the following algorithm that

creates a preemptive schedule of makespan C.

Algorithm: Nonparallel Assignment

1. Define the values z_ij := p_ij x_ij / C, for all i ∈ M and j ∈ J. Note that the vector (z_ij)_{ij} belongs to the matching polyhedron P of all y ∈ R^{nm} satisfying the following inequalities:

    ∑_{i∈M} y_ij ≤ 1   for all j ∈ J,   (2.5)
    ∑_{j∈J} y_ij ≤ 1   for all i ∈ M,   (2.6)
    y_ij ≥ 0   for all i, j.   (2.7)

Also, note that P is integral, since the matrix that defines it is totally unimodular (see for example [34], Ch. 18).

2. Note that by Carathéodory's theorem [14, 16] it is possible to decompose the vector z as a convex combination of a polynomial number of vertices of P. More precisely, we can find vectors Z^k ∈ {0, 1}^{nm} ∩ P and scalars λ_k ≥ 0, for k = 1, ..., nm + 1, such that z_ij = ∑_{k=1}^{nm+1} λ_k Z^k_ij and ∑_{k=1}^{nm+1} λ_k = 1.

3. Build the schedule as follows. For each i ∈ M and k = 1, ..., nm + 1 such that Z^k_ij = 1, schedule job j on machine i between times C ∑_{ℓ=1}^{k−1} λ_ℓ and C ∑_{ℓ=1}^{k} λ_ℓ.

We first show the correctness of the algorithm, and later show that it can be executed in polynomial time.

Lemma 2.1. Let us consider x_ij and C satisfying Equations (2.2), (2.3) and (2.4). Algorithm: Nonparallel Assignment constructs a preemptive schedule of makespan at most C in which the fraction of job j ∈ J processed on machine i ∈ M is x_ij.

Proof. First, note that for each i ∈ M and j ∈ J, Algorithm: Nonparallel Assignment processes job j during p_ij x_ij units of time on machine i. Indeed, for each k = 1, ..., nm + 1, i ∈ M and j ∈ J such that Z^k_ij = 1, the amount of time job j is processed on machine i equals Cλ_k. Then, since Z^k is binary, the total amount of time job j is processed on machine i equals

    ∑_{k=1}^{nm+1} C λ_k Z^k_ij = C z_ij = p_ij x_ij.

Hence, the fraction of job j that is processed on machine i is x_ij.

Furthermore, no job is processed on two machines at the same time. Indeed, if by contradiction we assume that some job is processed in parallel, then there exist k ∈ {1, ..., nm + 1}, j ∈ J and i, d ∈ M, i ≠ d, such that Z^k_ij = Z^k_dj = 1. This implies that ∑_{i∈M} Z^k_ij ≥ 2, contradicting that Z^k belongs to P.

Finally, the makespan of the schedule is at most C, since the algorithm only assigns jobs between time 0 and C ∑_{k=1}^{nm+1} λ_k = C.

With this, the following holds.

Corollary 2.2. To each feasible solution x_ij, C of [LL] corresponds a preemptive schedule of makespan C, and vice versa.

Thus, to solve R|pmtn|C_max it is enough to compute an optimal solution of [LL] and then turn it into a preemptive schedule using Algorithm: Nonparallel Assignment. Finally, we show that this algorithm runs in polynomial time.

Lemma 2.3. Algorithm: Nonparallel Assignment runs in polynomial time.

Proof. We just need to show that step (2) can be done in polynomial time. For this, consider

any polytope P = x ∈ RN |Ax ≤ b for some matrix A ∈M(R)K×N and vector b ∈ RK . For

any z ∈ P , we need to show how to decompose z as a convex combinations of vertices of P .

Clearly, it is enough to decompose z = λZ + (1− λ)z′, where λ ∈ [0, 1], Z is a vertex of P ,

and z′ belong to some proper face P ′ of P . Indeed, if this can be done, we can then interate

the argument over z′ ∈ P ′. This procedure will finish after N steps since the dimension of

the polytope is decreased after each iteration.

For this, consider $z \in P$. Find any vertex $Z \in P$, which can be done, for example, by minimizing a generic linear function over the polytope $P$. We define $z'$ by projecting $z$ onto the boundary of $P$ along the ray that starts at $Z$ and passes through $z$. For this, let $\bar\gamma = \max\{\gamma \ge 1 \mid Z + \gamma(z - Z) \in P\}$. In other words, if $A_i$ denotes the $i$-th row of $A$, then
$$\bar\gamma = \min_{i=1,\dots,K} \left\{ \frac{b_i - A_i \cdot Z}{A_i \cdot (z - Z)} \;\middle|\; A_i \cdot (z - Z) > 0 \right\}.$$
With this, define $z' := Z + \bar\gamma (z - Z) \in P$, implying that $z = z'/\bar\gamma + Z(\bar\gamma - 1)/\bar\gamma$. Thus, defining $\lambda := 1/\bar\gamma \le 1$, we get that $z = \lambda z' + (1 - \lambda) Z$. Finally, note that $z'$ belongs to a proper face of $P$. For this, it is enough to show that there is $i^* \in \{1, \dots, K\}$ such that $A_{i^*} \cdot z' = b_{i^*}$ and $A_{i^*} \cdot Z < b_{i^*}$, which is clear from the choice of $\bar\gamma$. Then, the face $P' \ni z'$ equals
$$P' := \left\{ x \in \mathbb{R}^N \mid A' x \le b' \right\},$$


where
$$A' := \begin{pmatrix} A \\ -A_{i^*} \end{pmatrix} \quad \text{and} \quad b' := \begin{pmatrix} b \\ -b_{i^*} \end{pmatrix}.$$

Note that the complexity of this algorithm is $O((V + KN) \cdot N)$, where $V$ denotes the complexity of finding a vertex of $P$. In general, a vertex can be found using the ellipsoid method, but in our particular problem it can be done much faster. Indeed, finding a vertex of a face of the matching polyhedron of a bipartite graph can be formulated as finding a matching in a bipartite graph with the extra restriction that a given subset of vertices must be covered. Clearly, this can be done by finding a maximum-weight matching, which can be solved in $O(n^2 \cdot m)$ time ([34], Ch. 17.2), where $n$ is the number of jobs and $m$ the number of machines. Finally, since $N = nm$, the time complexity of the algorithm is $O(n^3 \cdot m^2)$.
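For concreteness, the decomposition just described can be sketched in a few lines of code. The following is a minimal numerical sketch, not part of the thesis: the function name decompose_into_vertices is ours, and we assume that scipy's linprog returns a vertex (basic feasible solution) of the feasible region, which holds for simplex-type methods.

    import numpy as np
    from scipy.optimize import linprog

    def decompose_into_vertices(A, b, z, tol=1e-9):
        """Write z in P = {x : Ax <= b} as a convex combination of vertices,
        by the ray-shooting argument in the proof of Lemma 2.3.
        Assumes P is bounded and that linprog returns a vertex."""
        A = np.asarray(A, dtype=float)
        b = np.asarray(b, dtype=float)
        N = A.shape[1]
        A_eq = np.empty((0, N)); b_eq = np.empty(0)   # constraints forced tight
        pieces, mass = [], 1.0                        # mass = weight left on z
        for _ in range(N + 1):                        # dimension drops each round
            c = np.random.rand(N)                     # generic objective -> a vertex
            res = linprog(c, A_ub=A, b_ub=b,
                          A_eq=A_eq if len(b_eq) else None,
                          b_eq=b_eq if len(b_eq) else None,
                          bounds=[(None, None)] * N)
            Z = res.x
            d = z - Z
            if np.linalg.norm(d) <= tol:              # z is itself a vertex
                pieces.append((mass, Z))
                break
            Ad = A @ d
            hit = Ad > tol                            # rows the ray can make tight
            ratios = (b[hit] - A[hit] @ Z) / Ad[hit]
            gamma = ratios.min()                      # largest step staying in P
            lam = 1.0 / gamma                         # z = lam*z' + (1-lam)*Z
            pieces.append((mass * (1.0 - lam), Z))
            z = Z + gamma * d                         # z' lies on a proper face
            mass *= lam
            i_star = np.flatnonzero(hit)[np.argmin(ratios)]
            A_eq = np.vstack([A_eq, A[i_star:i_star + 1]])  # force A_i* x = b_i*
            b_eq = np.append(b_eq, b[i_star])
        return pieces                                 # [(lambda_k, Z_k)] summing to 1

On the polytope $P$ of step (2) this loop runs at most $nm+1$ times, matching the bound used above.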

2.2 A new rounding technique for R||Cmax

In 1990, Lenstra, Shmoys and Tardos [27] gave a 2-approximation algorithm for the problem of minimizing the makespan on unrelated machines. For this, they noticed that if the value of the optimal makespan $C_{\max}$ were known, the problem could be formulated as finding an integral feasible solution of a polytope. This polytope, which uses assignment variables $x_{ij}$ of jobs to machines, is defined by the following set of linear inequalities.

$$[\text{LST}] \qquad \sum_{i \in M} x_{ij} = 1 \quad \text{for all } j \in J,$$
$$\sum_{j \in J} p_{ij} x_{ij} \le C \quad \text{for all } i \in M,$$
$$x_{ij} = 0 \quad \text{if } p_{ij} > C, \qquad (2.8)$$
$$x_{ij} \ge 0 \quad \text{for all } i, j.$$

If we could find a feasible integral solution of this polytope in polynomial time, then we could solve $R||C_{\max}$ by doing binary search on $C$ to estimate $C_{\max}$.

To obtain a 2-approximation algorithm, Lenstra et al. relaxed the integrality constraints of this feasibility problem and proposed a rounding technique that turns any vertex of [LST] into a feasible schedule with makespan at most $2C$. Later, Shmoys and Tardos [37] refined this rounding so that any feasible solution of [LST] (not just a vertex) can be turned into a schedule without increasing the makespan by more than a factor of 2. Shmoys and Tardos used this new technique to design an approximation algorithm for the generalized assignment problem.

The main technical difficulty in generalizing Lenstra et al.'s rounding technique to our more general problem $R||\sum w_L C_L$ lies in the fact that the value of the optimal makespan must be known in advance or guessed by a binary search procedure, and it is not clear how to do this for $R||\sum w_L C_L$. To overcome this, we further relax [LST] by replacing Equation (2.8) with Equation (2.3), thus removing the nonlinearity in the value of the makespan. With this, we have removed the need to estimate $C_{\max}$ by binary search, since we can simply minimize the makespan $C$ over a polytope. In other words, we can use the solution of the linear program [LL] as a lower bound for our problem. In what follows we show how to round any fractional solution of [LL] to an integral one so that the makespan increases by at most a factor of 4. By Corollary 2.2, this is equivalent to turning any preemptive schedule into a nonpreemptive one so that the makespan increases by no more than a factor of 4.

Let $x$ and $C$ be a feasible solution of [LL]. The rounding proceeds in two steps: first, we eliminate fractional variables whose corresponding processing time is too large; then, we use the rounding technique of Shmoys and Tardos [37] as a subroutine. This technique is summarized in the next theorem.

Theorem 2.4 (Shmoys and Tardos [37]). Given a nonnegative fractional solution to the following system of equations:
$$\sum_{j \in J} \sum_{i \in M} c_{ij} x_{ij} \le C, \qquad (2.9)$$
$$\sum_{i \in M} x_{ij} = 1 \quad \text{for all } j \in J, \qquad (2.10)$$
there exists an integral solution $\bar{x}_{ij} \in \{0,1\}$ satisfying (2.9), (2.10), and also
$$x_{ij} = 0 \implies \bar{x}_{ij} = 0 \quad \text{for all } i \in M, j \in J, \qquad (2.11)$$
$$\sum_{j \in J} p_{ij} \bar{x}_{ij} \le \sum_{j \in J} p_{ij} x_{ij} + \max\{p_{ij} : x_{ij} > 0\} \quad \text{for all } i \in M.$$
Furthermore, such an integral solution can be found in polynomial time.


To begin our rounding, we first define a modified solution $x'_{ij}$ as follows:
$$x'_{ij} = \begin{cases} 0 & \text{if } p_{ij} > 2C, \\ x_{ij}/X_j & \text{otherwise,} \end{cases} \qquad (2.12)$$
where $X_j = \sum_{i : p_{ij} \le 2C} x_{ij}$ for all $j \in J$. Note that

$$1 - X_j = \sum_{i : p_{ij} > 2C} x_{ij} \le \sum_{i : p_{ij} > 2C} x_{ij} \frac{p_{ij}}{2C} < \frac{1}{2},$$
where the last inequality comes from Equation (2.3). Thus $X_j > 1/2$, which implies that $x'_{ij}$ satisfies
$$x'_{ij} \le 2 x_{ij} \quad \text{for all } j \in J, i \in M,$$
$$\sum_{j \in J} p_{ij} x'_{ij} \le 2C \quad \text{for all } i \in M.$$

Also, note that by construction the following is also satisfied:
$$\sum_{i \in M} x'_{ij} = 1 \quad \text{for all } j \in J,$$
$$x'_{ij} = 0 \quad \text{if } p_{ij} > 2C.$$

Then, we can apply Theorem 2.4 to $x'_{ij}$ (with $c_{ij} = 0$) to obtain a feasible integral solution $\bar{x}_{ij}$ of [LL] such that, for all $i \in M$,
$$\sum_{j \in J} p_{ij} \bar{x}_{ij} \le \sum_{j \in J} p_{ij} x'_{ij} + \max\{p_{ij} : \bar{x}_{ij} > 0\} \le 2C + 2C = 4C.$$
Thus, the rounded solution is within a factor 4 of the fractional solution.
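As a small illustration, the filtering step (2.12) is straightforward to implement. The sketch below is ours: filter_large_assignments only performs the truncation and renormalization, while shmoys_tardos_round stands for a hypothetical black-box implementation of Theorem 2.4, which we do not give here.

    import numpy as np

    def filter_large_assignments(x, p, C):
        """Equation (2.12): zero out x_ij whenever p_ij > 2C, then renormalize
        each job's column. x and p are m-by-n arrays (machines x jobs).
        By Equation (2.3) every X_j > 1/2, so the division is safe."""
        keep = p <= 2 * C                 # entries allowed to stay positive
        X = (x * keep).sum(axis=0)        # X_j = sum over i with p_ij <= 2C
        return np.where(keep, x / X, 0.0)

    # x_prime = filter_large_assignments(x, p, C)
    # x_bar = shmoys_tardos_round(x_prime, p)   # hypothetical Theorem 2.4 oracle
    # Every machine load then satisfies sum_j p_ij * x_bar_ij <= 2C + 2C = 4C.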

2.3 Power of preemption of R||Cmax

We now show that the integrality gap of [LL] is at least 4. Together with the rounding developed in the previous section, this implies that the integrality gap of [LL] is exactly 4. As discussed in Section 1.2, this means that it is not possible to construct a rounding with a factor better than 4, and thus the naive rounding developed in the previous section is best possible.

Let us fix $\beta \in [2, 4)$ and $\varepsilon > 0$ such that $1/\varepsilon \in \mathbb{N}$. We now construct an instance $I = I(\beta, \varepsilon)$ whose optimal preemptive makespan is at most $C(1+\varepsilon)$ and such that any nonpreemptive solution of $I$ has makespan at least $\beta C$. The construction is done iteratively, maintaining at each iteration a preemptive schedule of makespan $(1+\varepsilon)C$, while the makespan of any nonpreemptive solution increases. During the construction of the instance, we will interchangeably use the equivalence between feasible solutions of [LL] and preemptive schedules given by Corollary 2.2.

2.3.1 Base case

We begin by constructing an instance $I_0$, which will later be our first iteration. To this end, consider a set of $1/\varepsilon$ jobs $J_0 = \{j(0;1), j(0;2), \dots, j(0;1/\varepsilon)\}$ and a set of $1/\varepsilon + 1$ machines $M_0 = \{i(1), i(0;1), \dots, i(0;1/\varepsilon)\}$. Every job $j(0;\ell)$ can only be processed on machine $i(0;\ell)$, where it takes $\beta C$ units of time, and on machine $i(1)$, where it takes a very short time. More precisely, for all $\ell = 1, \dots, 1/\varepsilon$ we define
$$p_{i(0;\ell)j(0;\ell)} := \beta C, \qquad p_{i(1)j(0;\ell)} := \varepsilon C \frac{\beta}{\beta-1}.$$
The rest of the processing times are defined as infinite. Note that a feasible fractional assignment is given by setting $x_{i(0;\ell)j(0;\ell)} = 1/\beta$ and $x_{i(1)j(0;\ell)} = f_0 := (\beta-1)/\beta$, and setting all other variables to zero. The makespan of this fractional solution is exactly $(1+\varepsilon)C$. Indeed, the load of each machine $i \in M_0$, $\sum_{j \in J_0} x_{ij} p_{ij}$, equals $C$. Also, the load associated to each job $j \in J_0$, $\sum_{i \in M_0} x_{ij} p_{ij}$, equals $C + \varepsilon C$. Furthermore, no nonpreemptive solution with makespan less than $\beta C$ can have a job $j(0;\ell)$ processed on machine $i(0;\ell)$, and therefore all jobs must be processed on $i(1)$. This yields a makespan of $C/f_0 = \beta C/(\beta-1)$. Therefore, the makespan of any nonpreemptive solution is $\min\{\beta C, C/f_0\}$. Note that if $\beta$ is chosen as 2, the makespan of any nonpreemptive solution must be at least $2C$, and therefore the gap of the instance tends to 2 as $\varepsilon$ tends to zero.


[Figure 2.1: Instance $I_0$ and its fractional assignment. The values $x_{ij}$ and $p_{ij}$ over the arrows denote the fractional assignment and the processing time, respectively.]

2.3.2 Iterative procedure

To increase the integrality gap we proceed iteratively as follows. Starting from instance $I_0$, which will be the base case, we show how to construct instance $I_1$. As we will show later, an analogous procedure can be used to construct instance $I_{n+1}$ from instance $I_n$.

Begin by making $1/\varepsilon$ copies of instance $I_0$, denoted $I^\ell_0$ for $\ell = 1, \dots, 1/\varepsilon$, and denote the sets of jobs and machines of $I^\ell_0$ by $J^\ell_0$ and $M^\ell_0$ respectively. Also, denote by $i(1;\ell)$ the copy of machine $i(1)$ belonging to $M^\ell_0$ (see Figure 2.2). Consider a new job $j(1)$ for which $p_{i(1;\ell)j(1)} = C(\beta - \beta/(\beta-1))$ for all $\ell = 1, \dots, 1/\varepsilon$ (and $\infty$ otherwise), and define $x_{i(1;\ell)j(1)} = \varepsilon C / p_{i(1;\ell)j(1)}$. This way, the load of each machine $i(1;\ell)$ in the fractional solution is $(1+\varepsilon)C$, and the load corresponding to job $j(1)$ is exactly $C$. Nevertheless, depending on the value of $\beta$, job $j(1)$ may not be completely assigned. A simple calculation shows that for $\beta = (3+\sqrt{5})/2$, job $j(1)$ is completely assigned in the fractional assignment.

[Figure 2.2: Instance $T_1$ and its fractional assignment. The values $x_{ij}$ and $p_{ij}$ over the arrows denote the fractional assignment and the processing time, respectively.]

Furthermore, as justified before, in any nonpreemptive schedule of makespan less than $\beta C$, all jobs of instance $I^\ell_0$ must be processed on machine $i(1;\ell)$. Since job $j(1)$ must also be processed on some machine $i(1;\ell)$, the load of that machine must be $\sum_{j \in J^\ell_0} p_{i(1;\ell)j} + p_{i(1;\ell)j(1)} = C\beta/(\beta-1) + C(\beta - \beta/(\beta-1)) = \beta C$. Then, the gap of the instance constructed so far converges to $\beta = (3+\sqrt{5})/2$ as $\varepsilon$ tends to 0, thus improving the gap of 2 shown before.

On the other hand, for $\beta > (3+\sqrt{5})/2$ (as we would like) there will be some fraction of job $j(1)$,
$$f_1 := 1 - \sum_{\ell=1}^{1/\varepsilon} x_{i(1;\ell)j(1)} = \frac{(\beta-1)^2 - \beta}{\beta(\beta-1) - \beta},$$
that must be processed elsewhere. To overcome this, we do as follows. Let us denote by $T_1$ the instance consisting of jobs $\bigcup_{\ell=1}^{1/\varepsilon} J^\ell_0$ and machines $\bigcup_{\ell=1}^{1/\varepsilon} M^\ell_0$, and construct $1/\varepsilon$ copies of instance $T_1$, denoted $T^k_1$ for $k = 1, \dots, 1/\varepsilon$. Also, consider $1/\varepsilon$ copies of job $j(1)$, denoted $j(1;k)$ for $k = 1, \dots, 1/\varepsilon$ (see Figure 2.3). As shown before, we can assign a fraction $1 - f_1$ of each job $j(1;k)$ to machines of $T^k_1$. To assign the remaining fraction $f_1$, we add an extra machine $i(2)$, with $p_{i(2)j(1;\ell)} := \varepsilon C / f_1$ (and $\infty$ for all other jobs), so that the fraction $f_1$ of each job $j(1;\ell)$ takes exactly $\varepsilon C$ to process on $i(2)$. Then, defining $x_{i(2)j(1;\ell)} = f_1$, the total load of each job $j(1;\ell)$ does not exceed $(1+\varepsilon)C$, while the load of machine $i(2)$ is exactly $C$.


Let us denote the instance we have constructed so far as I1.

Notice that $I_1$ is analogous to $I_0$ in the sense that both satisfy the following properties for $n = 0, 1$:

(i) In any nonpreemptive solution of makespan less than $\beta C$, every job $j(n;\ell)$ must be processed on machine $i(n+1)$. Therefore the makespan of any nonpreemptive solution is at least $\min\{\beta C, C/f_n\}$.

(ii) The makespan of the fractional solution constructed is $(1+\varepsilon)C$. In particular, the load of machine $i(n+1)$ is $C$, and therefore a fraction of a job which takes less than $\varepsilon C$ can still be processed on this machine without increasing the makespan.

Furthermore, it is easy to show that $C/f_0 < C/f_1$ for $\beta > 2$, i.e., the makespan of any nonpreemptive solution increased from $I_0$ to $I_1$, and thus the integrality gap of the instance also increased.

In the following we generalize the ideas shown before, and describe the construction of

an instance with integrality gap arbitrarily close to β, for any β ∈ [2, 4).

Procedure I

1. Construct $I_0$, $f_0$, and $i(1)$ as in Section 2.3.1, and let $n = 0$.

2. While $f_n > 1/(\beta-1)$, construct instance $I_{n+1}$ as follows.

(a) Construct an instance $T_{n+1}$ consisting of $1/\varepsilon$ copies of instance $I_n$, denoted $I^\ell_n$ for $\ell = 1, \dots, 1/\varepsilon$, where the copy of machine $i(n+1)$ belonging to $I^\ell_n$ is denoted by $i(n+1;\ell)$.

(b) Create $1/\varepsilon$ copies of $T_{n+1}$, denoted $T^k_{n+1}$ for $k = 1, \dots, 1/\varepsilon$. Denote the $\ell$-th copy of instance $I_n$ belonging to instance $T^k_{n+1}$ by $I^{\ell k}_n$, and the copy of machine $i(n+1)$ that belongs to instance $I^{\ell k}_n$ by $i(n+1;\ell,k)$.

(c) Create $1/\varepsilon$ new jobs $j(n+1;k)$, for $k = 1, \dots, 1/\varepsilon$, and let $p_{i(n+1;\ell,k)j(n+1;k)} = C(\beta - 1/f_n)$ for all $k, \ell = 1, \dots, 1/\varepsilon$ (and $\infty$ for all other machines). We define the assignment variables for these new jobs as
$$x_{i(n+1;\ell,k)j(n+1;k)} := \frac{\varepsilon}{\beta - 1/f_n} \quad \text{for all } k, \ell = 1, \dots, 1/\varepsilon.$$


This way, the unassigned fraction of each job $j(n+1;k)$ equals
$$f_{n+1} := 1 - \sum_{\ell=1}^{1/\varepsilon} x_{i(n+1;\ell,k)j(n+1;k)} \qquad (2.13)$$
$$= \frac{(\beta-1)f_n - 1}{\beta f_n - 1}. \qquad (2.14)$$

(d) To assign the remaining fraction of jobs $j(n+1;k)$ for $k = 1, \dots, 1/\varepsilon$, we create a new machine $i(n+2)$, and define $p_{i(n+2)j(n+1;k)} = \varepsilon C / f_{n+1}$ for all $k = 1, \dots, 1/\varepsilon$ (and $\infty$ for all other jobs). With this we can let $x_{i(n+2)j(n+1;k)} = f_{n+1}$, so that the loads of each job $j(n+1;k)$ and of machine $i(n+2)$ are $(1+\varepsilon)C$ and $C$ respectively.

(e) Call $I_{n+1}$ the instance constructed so far, and redefine $n \leftarrow n+1$. Observe that the defined assignment guarantees that the optimal preemptive makespan for $I_{n+1}$ is at most $(1+\varepsilon)C$.

3. If $f_n \le 1/(\beta-1)$, that is, the first time the condition of step (2) is not satisfied, we do half an iteration as follows.

(a) Make $1/\varepsilon$ copies of $I_n$, denoted $I^\ell_n$ for $\ell = 1, \dots, 1/\varepsilon$, and call $i(n+1;\ell)$ the copy of machine $i(n+1)$ belonging to $I^\ell_n$.

(b) Create a new job $j(n+1)$, and define $p_{i(n+1;\ell)j(n+1)} := C(\beta - 1/f_n)$ and $x_{i(n+1;\ell)j(n+1)} := \varepsilon$. Notice that this way job $j(n+1)$ is completely processed in the preemptive solution, and the makespan of the preemptive solution is still $(1+\varepsilon)C$, since the load of job $j(n+1)$ equals $C(\beta - 1/f_n) \le C$.

(c) Return $I_{n+1}$, the instance thus constructed.

Lemma 2.5. If Procedure I finishes, then it returns an instance with a gap of at least

β/(1 + ε).

Proof. It is enough to show that if the procedure finishes, then the makespan of any nonpreemptive solution is at least $\beta C$. We proceed by contradiction, assuming that the instance $I_{n^*}$ returned by Procedure I admits a nonpreemptive schedule of makespan strictly less than $\beta C$. Note that for the latter to hold, every job $j$ in $I_{n^*}$ has to be assigned to the last machine $i$ added by Procedure I for which $p_{ij} < \infty$ (this is obvious for jobs in $I_0$, and follows inductively for jobs in $I_n$, $n \le n^*$).


[Figure 2.3: Construction of instance $I_{n+1}(\beta)$.]

This implies that the load of all machines $i(n^*;\ell)$ (which were the last machines included) due to jobs different from $j(n^*)$ equals $C/f_{n^*-1}$. Indeed, each job $j$ that was fractionally assigned to any of these machines had $x_{i(n^*;\ell)j} = f_{n^*-1}$, and $i(n^*;\ell)$ was the last machine for which $p_{ij}$ was bounded. Thus, as all machines $i(n^*;\ell)$ had load $C$ in the fractional assignment, they have load $C/f_{n^*-1}$ in the nonpreemptive solution.

Furthermore, job $j(n^*)$, for which $p_{i(n^*;\ell)j(n^*)} = C(\beta - 1/f_{n^*-1})$, must be processed on some machine $i(n^*;\ell)$. Thus, the load of that machine is $C/f_{n^*-1} + C(\beta - 1/f_{n^*-1}) = \beta C$, which is a contradiction.

To prove that the procedure in fact finishes, we first show a technical lemma.

Lemma 2.6. For each $\beta \in [2, 4)$, if $f_n > 1/\beta$, then $f_{n+1} \le f_n$.

Proof. It follows from Equation (2.14) that
$$f_{n+1} - f_n = \frac{-\beta f_n^2 + \beta f_n - 1}{\beta f_n - 1}.$$
Note that the numerator of this expression is always negative, since the quadratic $-\beta x^2 + \beta x - 1$ has no real roots for $0 \le \beta < 4$ and takes the value $-1$ at $x = 0$. The result follows since, by hypothesis, the denominator of this expression is positive.

Lemma 2.7. Procedure I finishes.

Proof. We need to show that for every $\beta \in [2, 4)$ there exists $n^* \in \mathbb{N}$ such that $f_{n^*} \le 1/(\beta-1)$. If this does not hold, then $f_n > 1/(\beta-1) > 1/\beta$ for all $n \in \mathbb{N}$. Then Lemma 2.6 implies that $(f_n)_{n \in \mathbb{N}}$ is a decreasing sequence. Therefore $f_n$ must converge to some real number $L \ge 1/(\beta-1)$. Thus, Equation (2.14) implies that
$$L = \frac{(\beta-1)L - 1}{\beta L - 1},$$
and therefore $L$ is a root of the equation $-\beta x^2 + \beta x - 1 = 0$, which is a contradiction.
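The termination argument is also easy to check numerically. The following sketch (ours, for illustration only) iterates recursion (2.14) from $f_0 = (\beta-1)/\beta$ until the stopping condition of Procedure I holds:

    def gap_iterations(beta):
        """Iterate f_{n+1} = ((beta-1)*f_n - 1)/(beta*f_n - 1) from
        f_0 = (beta-1)/beta until f_n <= 1/(beta-1), as in Lemma 2.7.
        Returns the number of iterations and the final value."""
        f = (beta - 1.0) / beta
        n = 0
        while f > 1.0 / (beta - 1.0):     # condition of step (2) of Procedure I
            f = ((beta - 1.0) * f - 1.0) / (beta * f - 1.0)
            n += 1
        return n, f

For example, gap_iterations(3.0) stops after a single iteration, while for $\beta$ closer to 4 the number of iterations grows, consistently with Lemma 2.7.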

We have proved the following theorem.

Theorem 2.8. For each β ∈ [2, 4) and ε > 0, there is an instance I of R||Cmax, for which the

optimal preemptive makespan is at most C(1 + ε), and the optimal nonpreemptive makespan

is at least βC.

Corollary 2.9. The integrality gap of [LL] is 4.


Chapter 3

Approximation algorithms for minimizing $\sum w_L C_L$ on unrelated machines

In this chapter we present approximation algorithms for the general case $R|r_{ij}|\sum w_L C_L$ and its preemptive version $R|r_{ij}, pmtn|\sum w_L C_L$. Most of the techniques used here are generalizations of the methods shown in the previous chapter.

3.1 A $(4+\varepsilon)$-approximation algorithm for $R|r_{ij}, pmtn|\sum w_L C_L$

In the following we present a $(4+\varepsilon)$-approximation algorithm for the preemptive version of $R|r_{ij}|\sum w_L C_L$. This means that for each $\varepsilon > 0$ we give a $(4+\varepsilon)$-approximation algorithm whose running time is polynomial in the size of the input and in $1/\varepsilon$.

From now on we assume without loss of generality that all processing times $p_{ij}$ are integers greater than or equal to 1. If this is not the case, we can discard the cases where $p_{ij} = 0$ as trivial and scale the remaining processing times.

The algorithm developed in this section is based on a time-indexed linear program whose variables represent the fraction of each job that is processed at each (discrete) point in time on each machine. This kind of linear relaxation was originally introduced by Dyer and Wolsey [13] for the problem $1|r_j|\sum_j w_j C_j$, and was later extended by Schulz and Skutella [35], who used it to obtain a $(3/2+\varepsilon)$-approximation and a $(2+\varepsilon)$-approximation for $R||\sum w_j C_j$ and $R|r_j|\sum w_j C_j$ respectively.

Let us consider a time horizon $T$ large enough to upper bound the greatest completion time of any reasonable schedule, for instance $T = \max_{i \in M, k \in J} r_{ik} + \sum_{j \in J} p_{ij}$. We divide the time horizon into exponentially-growing time intervals, so that there are only polynomially many of them. For that, let $\varepsilon$ be a fixed parameter, and let $q$ be the first integer such that $(1+\varepsilon)^{q-1} \ge T$. Then, we consider the intervals
$$[0, 1], (1, (1+\varepsilon)], ((1+\varepsilon), (1+\varepsilon)^2], \dots, ((1+\varepsilon)^{q-2}, (1+\varepsilon)^{q-1}].$$
To simplify the notation, let us define $\tau_0 = 0$ and $\tau_\ell = (1+\varepsilon)^{\ell-1}$, for each $\ell = 1, \dots, q$. With this, the $\ell$-th interval corresponds to $(\tau_{\ell-1}, \tau_\ell]$.
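This discretization is simple to set up; the following sketch (the helper name make_intervals is ours) computes the breakpoints $\tau_0, \dots, \tau_q$:

    def make_intervals(T, eps):
        """Return tau_0 = 0 and tau_ell = (1+eps)**(ell-1) for ell = 1,...,q,
        where q is the first integer with (1+eps)**(q-1) >= T. The number of
        intervals (tau_{ell-1}, tau_ell] is q = O(log T / eps)."""
        q = 1
        while (1 + eps) ** (q - 1) < T:
            q += 1
        return [0.0] + [(1 + eps) ** (l - 1) for l in range(1, q + 1)]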

Given any preemptive schedule, let $y_{ji\ell}$ be the fraction of job $j$ that is processed on machine $i$ in the $\ell$-th interval. Then, $p_{ij} y_{ji\ell}$ is the amount of time that job $j$ is processed on machine $i$ in the $\ell$-th interval. Consider the following linear program:

$$[\text{DW}] \qquad \min \sum_{L \in O} w_L C_L$$
$$\sum_{i \in M} \sum_{\ell=1}^{q} y_{ji\ell} = 1 \quad \text{for all } j \in J, \qquad (3.1)$$
$$\sum_{j \in J} p_{ij} y_{ji\ell} \le \tau_\ell - \tau_{\ell-1} \quad \text{for all } \ell = 1, \dots, q \text{ and } i \in M, \qquad (3.2)$$
$$\sum_{i \in M} p_{ij} y_{ji\ell} \le \tau_\ell - \tau_{\ell-1} \quad \text{for all } \ell = 1, \dots, q \text{ and } j \in J, \qquad (3.3)$$
$$\sum_{i \in M} \left( y_{ji1} + \sum_{\ell=2}^{q} \tau_{\ell-1} y_{ji\ell} \right) \le C_L \quad \text{for all } L \in O, j \in L, \qquad (3.4)$$
$$y_{ji\ell} = 0 \quad \text{for all } j, i, \ell : r_{ij} > \tau_\ell, \qquad (3.5)$$
$$y_{ji\ell} \ge 0 \quad \text{for all } i, j, \ell. \qquad (3.6)$$

It is easy to see that this is a relaxation of our problem. Indeed, Equation (3.1) ensures that every job is completely processed. Equation (3.2) must hold since in each interval $\ell$ and machine $i$ the total amount of time available is at most $\tau_\ell - \tau_{\ell-1}$. Similarly, Equation (3.3) holds since no job can be simultaneously processed on two machines at the same time, and therefore for a fixed interval the total amount of time that can be used to process a job is at most the length of the interval. To see that Equation (3.4) is valid, notice that $p_{ij} \ge 1$, and thus $C_L \ge 1$ for all $L \in O$. Also notice that $C_L \ge \tau_{\ell-1}$ for all $L$, $j \in L$, $i$, $\ell$ such that $y_{ji\ell} > 0$. Thus, the left-hand side of Equation (3.4) is a convex combination of values smaller than $C_L$. Finally, Equation (3.5) must hold since no part of a job can be assigned to an interval that finishes before the job's release date on the given machine.

As usual in approximation algorithms based on linear relaxations, we first compute an optimal solution of [DW], and then transform it into a preemptive schedule whose cost is within a constant factor of the optimal cost of [DW]. To construct the schedule we do as follows. For any job $j \in L$, we truncate to zero all its variables that assign part of it to an interval considerably later than $C_L$. Afterwards, we use Algorithm: Nonparallel Assignment (see Section 2.1) to construct a feasible schedule inside each interval, making sure that no job is processed on two machines at the same time.

More precisely, let $y^*_{ji\ell}$ and $C^*_L$ be the optimal solution of [DW]. Let $j \in J$, and let $L = \arg\min\{C^*_{L'} \mid L' \in O, L' \ni j\}$. For a given parameter $\beta > 1$ (which will be appropriately chosen later), we define
$$y'_{ji\ell} = \begin{cases} 0 & \text{if } \tau_{\ell-1} > \beta C^*_L, \\ y^*_{ji\ell}/Y_j & \text{if } \tau_{\ell-1} \le \beta C^*_L, \end{cases} \qquad (3.7)$$
where
$$Y_j = \sum_{i \in M} \sum_{\ell : \tau_{\ell-1} \le \beta C^*_L} y^*_{ji\ell}.$$

The modified solution y′ satisfies the following lemma.

Lemma 3.1. The modified solution $y'_{ji\ell}$, obtained by applying Equation (3.7) to $y^*_{ji\ell}$, satisfies
$$\sum_{i \in M} \sum_{\ell=1}^{q} y'_{ji\ell} = 1 \quad \text{for all } j, \qquad (3.8)$$
$$\sum_{j \in J} p_{ij} y'_{ji\ell} \le \frac{\beta}{\beta-1}(\tau_\ell - \tau_{\ell-1}) \quad \text{for all } i, \ell, \qquad (3.9)$$
$$\sum_{i \in M} p_{ij} y'_{ji\ell} \le \frac{\beta}{\beta-1}(\tau_\ell - \tau_{\ell-1}) \quad \text{for all } j, \ell, \qquad (3.10)$$
$$y'_{ji\ell} = 0 \quad \text{if } \tau_{\ell-1} > \beta C^*_L, \text{ for all } L \in O, j \in L. \qquad (3.11)$$


Proof. It is clear that $y'_{ji\ell}$ satisfies (3.8), since
$$\sum_{i \in M} \sum_{\ell=1}^{q} y'_{ji\ell} = \sum_{i \in M} \sum_{\ell : y'_{ji\ell} > 0} \frac{y^*_{ji\ell}}{Y_j} = \frac{1}{Y_j} \sum_{i \in M} \sum_{\ell : \tau_{\ell-1} \le \beta C^*_L} y^*_{ji\ell} = 1.$$
Furthermore, to show that equations (3.9) and (3.10) hold, note that
$$1 - Y_j = \sum_{i \in M} \sum_{\ell : \tau_{\ell-1} > \beta C^*_L} y^*_{ji\ell} \le \sum_{i \in M} \sum_{\ell : \tau_{\ell-1} > \beta C^*_L} y^*_{ji\ell} \frac{\tau_{\ell-1}}{\beta C^*_L} \le \frac{C^*_L}{\beta C^*_L} = \frac{1}{\beta}.$$
The last inequality follows from Equation (3.4), noting that $\ell \ge 2$ whenever $\tau_{\ell-1} > \beta C^*_L$. Then $Y_j \ge (\beta-1)/\beta$, and thus $y'_{ji\ell} \le \frac{\beta}{\beta-1} y^*_{ji\ell}$. With this, equations (3.9) and (3.10) follow from equations (3.2) and (3.3). Finally, note that Equation (3.11) follows directly from the definition of $y'$.

Equation (3.11) in the previous lemma implies that the variables $y'_{ji\ell}$ only assign a job $j \in L$ to intervals that start no later than $\beta C^*_L$. On the other hand, as shown by equations (3.9) and (3.10), the amount of load assigned to each interval may not fit in the available time span $\tau_\ell - \tau_{\ell-1}$. Thus, we will have to increase the size of every interval by a factor $\beta/(\beta-1)$. With these observations, we are ready to describe the algorithm.

Algorithm: Greedy Preemptive LP

1. Solve [DW] to optimality and call the solution $y^*$ and $(C^*_L)_{L \in O}$.

2. Define $y'_{ji\ell}$ using Equation (3.7).

3. Construct a preemptive schedule $S$ as follows.

(a) For each $\ell = 1, \dots, q$, define $x_{ij} = y'_{ji\ell}$ and $C = (\tau_\ell - \tau_{\ell-1})\beta/(\beta-1)$, and apply Algorithm: Nonparallel Assignment to this fractional solution. Call the preemptive schedule obtained $S_\ell$.

(b) For each job $j \in J$ that is processed by schedule $S_\ell$ at time $t \in [0, C]$ on machine $i \in M$, make schedule $S$ process $j$ on machine $i$ at time $t + \tau_{\ell-1}\beta/(\beta-1)$.


Lemma 3.2. Algorithm: Greedy Preemptive LP constructs a feasible schedule where the completion time of each order $L \in O$ is less than $C^*_L(1+\varepsilon)\beta^2/(\beta-1)$.

Proof. Note that equations (3.9) and (3.10) imply that, for each $\ell = 1, \dots, q$, the values $x_{ij} = y'_{ji\ell}$ and $C = (\tau_\ell - \tau_{\ell-1})\beta/(\beta-1)$ satisfy equations (2.2) and (2.3). Then, by Lemma 2.1, the makespan of each schedule $S_\ell$ is less than $(\tau_\ell - \tau_{\ell-1})\beta/(\beta-1)$, and thus the schedule $S_\ell$ defines the schedule $S$ in the disjoint amplified interval $[\tau_{\ell-1}\beta/(\beta-1), \tau_\ell\beta/(\beta-1))$. Also, it follows from Lemma 2.1 and Equation (3.8) that the schedule $S$ completely processes every job.

To bound the completion times of the orders, consider a fixed order $L \in O$ and a job $j \in L$. Let $\ell^*$ be the last interval for which $y'_{ji\ell} > 0$ for some machine $i \in M$, i.e.,
$$\ell^* = \max_{i \in M} \max\{\ell \in \{1, \dots, q\} \mid y'_{ji\ell} > 0\}.$$
Then, the completion time $C_j$ is smaller than $\tau_{\ell^*}\beta/(\beta-1)$. To further bound $C_j$, we consider two cases. If $\ell^* = 1$ then
$$C_j \le \frac{\beta}{\beta-1} \le C^*_L(1+\varepsilon)\frac{\beta}{\beta-1},$$
where the last inequality follows since $C^*_L \ge 1$. On the other hand, if $\ell^* > 1$, Equation (3.11) implies that
$$C_j \le \tau_{\ell^*}\frac{\beta}{\beta-1} \le \tau_{\ell^*-1}(1+\varepsilon)\frac{\beta}{\beta-1} \le C^*_L(1+\varepsilon)\frac{\beta^2}{\beta-1}.$$
Thus, taking the maximum over all $j \in L$, the completion time of order $L$ is upper bounded by $C^*_L(1+\varepsilon)\beta^2/(\beta-1)$.

Theorem 3.3. Algorithm: Greedy Preemptive LP is a $(4+\varepsilon)$-approximation for $\beta = 2$.

Proof. Let $C_L$ be the completion time of order $L$ given by Algorithm: Greedy Preemptive LP. Taking $\beta = 2$ in the last lemma, which is the optimal choice, it follows that $C_L \le C^*_L(1+\varepsilon)\beta^2/(\beta-1) = 4(1+\varepsilon)C^*_L$. Then, multiplying $C_L$ by its weight $w_L$ and adding over all $L \in O$, we conclude that the cost of the schedule constructed is no larger than $4(1+\varepsilon)$ times the cost of the optimal solution of [DW], which is a lower bound on the cost of the optimal preemptive schedule.
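That $\beta = 2$ is indeed the optimal choice can be checked by minimizing the factor $\beta^2/(\beta-1)$ over $\beta > 1$:
$$\frac{d}{d\beta}\left(\frac{\beta^2}{\beta-1}\right) = \frac{2\beta(\beta-1) - \beta^2}{(\beta-1)^2} = \frac{\beta(\beta-2)}{(\beta-1)^2} = 0 \iff \beta = 2, \qquad \left.\frac{\beta^2}{\beta-1}\right|_{\beta=2} = 4.$$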


3.2 A constant factor approximation for $R|r_{ij}|\sum w_L C_L$

We now give the first constant factor approximation algorithm for the general problem $R|r_{ij}|\sum w_L C_L$. Our algorithm is based on an interval-indexed linear programming relaxation developed by Hall, Schulz, Shmoys, and Wein [21], and on the rounding technique developed in Section 2.2.

As before, we consider a time horizon $T$ large enough to upper bound the greatest completion time of any reasonable schedule, for example $T = \max_{i \in M, k \in J} r_{ik} + \sum_{j \in J} p_{ij}$. We also divide the time horizon into exponentially-growing time intervals, so that there are only polynomially many. For that, let $\alpha > 1$ be a parameter to be determined later, and let $q$ be the first integer such that $\alpha^{q-1} \ge T$. With this, consider the intervals
$$[1, 1], (1, \alpha], (\alpha, \alpha^2], \dots, (\alpha^{q-2}, \alpha^{q-1}].$$
To simplify the notation, let us define $\tau_0 = 1$ and $\tau_\ell = \alpha^{\ell-1}$, for each $\ell = 1, \dots, q$. With this, the $\ell$-th interval corresponds to $(\tau_{\ell-1}, \tau_\ell]$. Let us remark that, in this setting, the first interval starts and finishes at 1, contrary to the definition in the previous section, where the first interval started at 0 and finished at 1.

To model the scheduling problem we consider variables $y_{ji\ell}$ indicating whether job $j$ finishes on machine $i$ in interval $\ell$. These variables allow us to write the following linear program, based on that in [21], which is a relaxation of the scheduling problem even when integrality constraints are imposed.

$$[\text{HSSW}] \qquad \min \sum_{L \in O} w_L C_L$$
$$\sum_{i \in M} \sum_{\ell=1}^{q} y_{ji\ell} = 1 \quad \text{for all } j \in J, \qquad (3.12)$$
$$\sum_{s=1}^{\ell} \sum_{j \in J} p_{ij} y_{jis} \le \tau_\ell \quad \text{for all } i \in M \text{ and } \ell = 1, \dots, q, \qquad (3.13)$$
$$\sum_{i \in M} \sum_{\ell=1}^{q} \tau_{\ell-1} y_{ji\ell} \le C_L \quad \text{for all } L \in O, j \in L, \qquad (3.14)$$
$$y_{ji\ell} = 0 \quad \text{for all } i, \ell, j : p_{ij} + r_{ij} > \tau_\ell, \qquad (3.15)$$
$$y_{ji\ell} \ge 0 \quad \text{for all } i, \ell, j. \qquad (3.16)$$

It is clear that [HSSW] is a relaxation of our problem. Indeed, for any nonpreemptive schedule, define $y_{ji\ell} = 1$ iff job $j$ finishes processing on machine $i$ in the $\ell$-th interval. Then, Equation (3.12) holds since each job finishes in exactly one interval and on one machine. The left-hand side of (3.13) corresponds to the total load processed on machine $i$ in the interval $[0, \tau_\ell]$, and therefore the inequality is valid. The double sum in inequality (3.14) corresponds exactly to $\tau_{\ell-1}$, where $\ell$ is the interval in which job $j$ finishes, which is at most $C_j$, and therefore it is upper bounded by $C_L$ if $j \in L$. The rule (3.15) imposes that some variables must be set to zero before the LP is solved. This is valid since if $p_{ij} + r_{ij} > \tau_\ell$ then job $j$ cannot finish before $\tau_\ell$ on machine $i$, and therefore $y_{ji\ell}$ must be zero.

Let $(y^*_{ji\ell})_{ji\ell}$ and $(C^*_L)_L$ be an optimal solution to [HSSW]. To obtain a feasible schedule we need to round this solution into an integral one. For the special case where all orders are singletons (as in Hall et al.'s [21] situation), (3.14) becomes an equality, so that one can directly use Theorem 2.4, regarding each machine-interval pair of our problem as one machine in the algorithm, to round a fractional solution into an integral solution of smaller total cost. When doing this, the right-hand side of Equation (3.13) is increased to $\tau_\ell + \max\{p_{ij} : y_{ji\ell} > 0\} \le 2\tau_\ell$, where the last inequality follows from (3.15). This can be used to derive a constant factor approximation algorithm for the problem. In our setting, however, it is not possible to apply the theorem directly, due to the nonlinearity of the objective function. We thus take a detour in the same manner as in Section 3.1: we round down to zero all variables $y^*_{ji\ell}$ for which $\tau_{\ell-1}$ is considerably bigger than a certain parameter $\beta$ times $C^*_L$, for $L = \arg\min\{C^*_{L'} \mid L' \ni j\}$ (we will optimize over $\beta$ later on). For that we define the variables $y'_{ji\ell}$ using Equation (3.7). With this, the next lemma follows from calculations similar to those of Lemma 3.1.

Lemma 3.4. The modified solution $y'_{ji\ell} \ge 0$ satisfies:
$$\sum_{i \in M} \sum_{\ell=1}^{q} y'_{ji\ell} = 1 \quad \text{for all } j \in J, \qquad (3.17)$$
$$\sum_{s=1}^{\ell} \sum_{j \in J} p_{ij} y'_{jis} \le \frac{\beta}{\beta-1}\tau_\ell \quad \text{for all } i \in M \text{ and } \ell = 1, \dots, q, \qquad (3.18)$$
$$y'_{ji\ell} = 0 \quad \text{if } p_{ij} + r_{ij} > \tau_\ell \text{ or } \tau_{\ell-1} > \beta C^*_L, \quad \text{for all } i, j, \ell, L : j \in L. \qquad (3.19)$$

With the previous lemma at hand we are in position to apply Theorem 2.4. To do this we regard each interval-machine pair of our problem as one machine of the algorithm. In other words, defining the set of machines $M' = M \times \{1, \dots, q\}$ and $x_{hj} = y'_{jh_1h_2}$ for each $h = (h_1, h_2) \in M'$, we round $x_{hj}$ to an integral solution $\bar{x}_{hj} := \bar{y}_{jh_1h_2} \in \{0,1\}$ satisfying equations (3.17), (3.19) and
$$\sum_{j \in J} \bar{y}_{ji\ell} p_{ij} \le \sum_{j \in J} y'_{ji\ell} p_{ij} + \max\{p_{ij} : y'_{ji\ell} > 0, j \in J\} \le \sum_{j \in J} y'_{ji\ell} p_{ij} + \tau_\ell, \qquad (3.20)$$
where the first inequality follows from (2.11) and the second from (3.19).

We are now ready to give the algorithm for $R|r_{ij}|\sum w_L C_L$.

Algorithm: Greedy-LP

(1) Solve [HSSW], obtaining an optimal solution $(y^*_{ji\ell})$ and $(C^*_L)_L$.

(2) Modify the solution according to (3.7) to obtain $(y'_{ji\ell})$ satisfying (3.17), (3.18), and (3.19).

(3) Round $(y'_{ji\ell})$ using Theorem 2.4 to obtain an integral solution $(\bar{y}_{ji\ell})$ as above.

(4) Let $J_{i\ell} = \{j \in J : \bar{y}_{ji\ell} = 1\}$. Greedily schedule on each machine $i$ all jobs in $\bigcup_{\ell=1}^{q} J_{i\ell}$, starting from those in $J_{i1}$ until we reach $J_{iq}$ (with an arbitrary order inside each set $J_{i\ell}$), respecting the release dates (a sketch of this step appears below).
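The greedy placement of step (4) on a single machine can be sketched as follows (the function name and data layout are ours; J_by_interval[ℓ] plays the role of $J_{i\ell}$):

    def greedy_place(J_by_interval, p, r):
        """Schedule the sets J_by_interval[0..q-1] on one machine i in
        increasing interval order, with an arbitrary order inside each set,
        starting each job as early as possible but not before its release
        date. p[j] and r[j] are the processing time and release date of j
        on this machine. Returns the completion times."""
        t, C = 0.0, {}
        for jobs in J_by_interval:        # interval index increases
            for j in jobs:                # arbitrary order inside the set
                t = max(t, r[j]) + p[j]   # wait for the release date, then run
                C[j] = t                  # completion time of job j
        return C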

To break down the analysis, let us first show that Greedy-LP is a constant factor approximation for the case in which all release dates are zero.

Theorem 3.5. Algorithm Greedy-LP is a $(27/2)$-approximation for $R||\sum w_L C_L$.

Proof. Let us fix a machine $i$ and take a job $j \in L$ such that $\bar{y}_{ji\ell} = 1$, so that $j \in J_{i\ell}$. Clearly $C_j$, the completion time of job $j$ in algorithm Greedy-LP, is at most the total processing time of jobs in $\bigcup_{s=1}^{\ell} J_{is}$. Then,
$$C_j \le \sum_{s=1}^{\ell} \sum_{k \in J} p_{ik} \bar{y}_{kis} \le \sum_{s=1}^{\ell} \left( \sum_{k \in J} p_{ik} y'_{kis} + \tau_s \right) \le \frac{\beta}{\beta-1}\tau_\ell + \sum_{s=1}^{\ell} \tau_s \le \left( \frac{\beta\alpha}{\beta-1} + \frac{\alpha^2}{\alpha-1} \right) \tau_{\ell-1} \le \left( \frac{\beta\alpha}{\beta-1} + \frac{\alpha^2}{\alpha-1} \right) \beta C^*_L.$$
The second inequality follows from (3.20), the third from (3.18), and the fourth from the definition of $\tau_s$. The last inequality follows since, by condition (2.11), $\bar{y}_{ji\ell} = 1$ implies $y'_{ji\ell} > 0$, so that by (3.19) we have $\tau_{\ell-1} \le \beta C^*_L$. Optimizing over the approximation factor, the best possible factor given by this method is attained at $\alpha = \beta = 3/2$, and therefore we conclude that $C_j \le 27/2 \cdot C^*_L$. As this holds for all $j \in L$, we conclude that $C_L = \max_{j \in L} C_j \le 27/2 \cdot C^*_L$. The claimed approximation factor follows directly by multiplying this inequality by $w_L$, adding over all $L \in O$, and using the fact that [HSSW] is a lower bound on the optimal schedule.

Theorem 3.6. Algorithm Greedy-LP is a $(27/2)$-approximation for $R|r_{ij}|\sum w_L C_L$.

Proof. Similarly to the proof of the previous theorem, we show that the solution given by Algorithm Greedy-LP satisfies $C_j \le 27/2 \cdot C^*_L$, even in the presence of release dates. Let us define
$$\bar\tau_\ell := \frac{1}{\alpha-1} + \sum_{s=1}^{\ell} \left( \sum_{k \in J} p_{ik} y'_{kis} + \tau_s \right).$$
We will see that it is possible to schedule every set of jobs $J_{i\ell}$ on machine $i$ between time $\bar\tau_{\ell-1}$ and time $\bar\tau_\ell$ (with an arbitrary order inside each interval), respecting all release dates. Indeed, assuming that $1 < \alpha \le 2$, it follows from (2.11) and (3.19) that for every $j \in J_{i\ell}$,
$$r_{ij} \le \tau_\ell \le \frac{1}{\alpha-1} + \frac{\tau_\ell - 1}{\alpha-1} = \frac{1}{\alpha-1} + \sum_{k=1}^{\ell-1} \tau_k \le \bar\tau_{\ell-1}.$$
Thus job $j$ is available for processing on machine $i$ at time $\bar\tau_{\ell-1}$. On the other hand, note that $\bar\tau_\ell - \bar\tau_{\ell-1} = \sum_{j \in J} p_{ij} y'_{ji\ell} + \tau_\ell$, so it follows from (3.20) that all jobs in $J_{i\ell}$ fit inside $(\bar\tau_{\ell-1}, \bar\tau_\ell]$. We conclude that in the schedule constructed by Greedy-LP any job $j \in J_{i\ell}$ is processed before $\bar\tau_\ell$. Therefore, as in the previous theorem,
$$C_j \le \frac{1}{\alpha-1} + \sum_{s=1}^{\ell} \left( \sum_{k \in J} p_{ik} y'_{kis} + \tau_s \right) \le \left( \frac{\beta\alpha}{\beta-1} + \frac{\alpha^2}{\alpha-1} \right) \tau_{\ell-1} \le \left( \frac{\beta\alpha}{\beta-1} + \frac{\alpha^2}{\alpha-1} \right) \beta C^*_L.$$
Again, choosing $\alpha = \beta = 3/2$, we obtain $C_j \le 27/2 \cdot C^*_L$.


Chapter 4

A PTAS for minimizing $\sum w_L C_L$ on parallel machines

In this chapter we design a PTAS for $P|part|\sum w_L C_L$ under some additional constraint: we assume that there is either a constant number of machines, a constant number of jobs per order, or a constant number of orders. First, we describe the case where the number of jobs of each order is bounded by a constant $K$, and then we justify that this implies the existence of PTASs for the other cases. The results in this chapter closely follow the PTAS developed for $P|r_j|\sum w_j C_j$ in [1]. However, our setting is technically more involved, mainly for three reasons: first, it is crucial to show that there is a near-optimal schedule such that the time-span of every order is small, and, furthermore, the precise localization of orders is significantly more complicated; also, as we shall see later, it is important that all the near-optimal solutions that we construct satisfy the properties of Lemma 4.1; finally, we need to be slightly more careful in the final placing of jobs.

As usual in the design of approximation schemes, the general idea is to add structure to the solution by modifying the instance in a way such that the cost of the optimal solution worsens by at most a $(1+\varepsilon)$ factor. Also, by applying several modifications to the optimal solution of this new instance, we prove that there exists a near-optimal solution satisfying several extra properties. The structure given by these properties allows us to find this solution by enumeration or dynamic programming. As each of the modifications that we apply to the optimal solution generates a loss of at most a factor $(1+\varepsilon)$ in the cost, and we apply only a constant number of them, we end up with a solution whose cost is within a factor $(1+\varepsilon)^{O(1)}$ of the cost of the optimal schedule. Then, choosing $\varepsilon$ small enough, we can approximate the optimum up to any desired factor. To simplify the notation, in what follows we assume that all processing times are positive integers and that $1/\varepsilon$ is also an integer. Also, in what follows all logarithms are taken with base $(1+\varepsilon)$, unless explicitly stated otherwise. Besides, we denote by $p(L) = \sum_{j \in L} p_j$ the total processing time of a set $L \subseteq J$.

As in the previous chapter, we partition the time horizon into exponentially increasing intervals. For every integer $t$ we denote by $I_t$ the interval $[(1+\varepsilon)^t, (1+\varepsilon)^{t+1})$, and we denote the size of the interval by $|I_t|$; then $|I_t| = \varepsilon(1+\varepsilon)^t$.

Besides rounding and partitioning, a basic procedure we use repeatedly is stretching, which consists in stretching the time axis by a factor of $(1+\varepsilon)$. Of course, this worsens the solution by at most a factor of $(1+\varepsilon)$. The two basic stretching procedures we use are:

1. Stretch Completion Times: This procedure consists in delaying all jobs, so that the completion time of a job $j$ becomes $C'_j = (1+\varepsilon)C_j$ in the new schedule. This increases the cost of the solution by exactly a factor of $(1+\varepsilon)$. This procedure creates a gap of idle time of size $\varepsilon p_j$ before each job $j$. Indeed, if $k$ is the job processed just before $j$, then $C'_j - C'_k = C_j - C_k + \varepsilon(C_j - C_k) \ge C_j - C_k + \varepsilon p_j$.

2. Stretch Intervals: The objective of this procedure is to create idle time in every interval, except for those having a job that completely crosses them. As before, it consists in shifting jobs to the following interval. More precisely, if job $j$ finishes in $I_t$ and occupies $d_j$ time units in $I_t$, we move $j$ to $I_{t+1}$ by pushing it exactly $|I_t|$ time units forward, so that it also uses exactly $d_j$ time units in $I_{t+1}$. Then, the completion time in the new schedule is at most $(1+\varepsilon)C_j$, and therefore the overall cost increases by at most a factor $(1+\varepsilon)$.

Note that if $j$ started processing in $I_t$ and was processed in $I_t$ for $d_j$ time units, after the shifting it is processed in $I_{t+1}$ for at most $d_j$ time units. Since $I_{t+1}$ has $\varepsilon|I_t| = \varepsilon^2(1+\varepsilon)^t$ more time units than $I_t$, at least that much idle time is created in $I_{t+1}$. Also, we can assume that this idle time is consecutive in each interval. Indeed, this can be accomplished by moving all jobs that are scheduled completely inside an interval as far to the left as possible.

After applying these procedures we also shift the index of the intervals to the right, so that if a job was processed in interval $I_t$, it is still processed in $I_t$ in the new schedule. This gives the illusion that we have stretched time or intervals by a $(1+\varepsilon)$ factor. Before giving a general description of the algorithm, we show that there exists a $(1+\varepsilon)$-approximate schedule where no order crosses more than $O(1)$ intervals. For this, we first show the following basic property, which is stated for the more general case of unrelated machines.

Lemma 4.1. For any instance of $R|part|\sum w_L C_L$ there exists an optimal schedule such that:

1. For any order $L \in O$ and any machine $i = 1, \dots, m$, all jobs in $L$ assigned to $i$ are processed consecutively.

2. The sequence in which the orders are arranged inside each machine is independent of the machine.

Proof. Let us consider an optimal schedule of the problem. For a given order-machine pair $(L, i)$, let $j^*$ be the last job of $L$ assigned to $i$. It is easy to see that any job in $L$ that is processed on $i$ before $j^*$ can be processed just before $j^*$ without increasing the completion time of any order. With this we conclude the first property of the lemma. For the rest of the proof we assume that the optimal solution satisfies this property.

For the second property, note that inside each machine the orders can be arranged following their completion times $C_{L_1} \le C_{L_2} \le \dots \le C_{L_k}$ without increasing the cost of the solution. If this does not hold, then there exist two orders $L, L'$ such that $C_L \le C_{L'}$ and on some machine $i$ the jobs of $L'$ are processed just before the ones of $L$. If this is the case, then it is clear that interchanging these two sets of jobs on machine $i$ does not increase the cost of the solution, since jobs in $L$ decrease their completion times while jobs in $L'$ still complete before $C_L$. Therefore, due to the fact that $C_L \le C_{L'}$, the completion time of $L'$ in this new schedule remains the same. The procedure can be iterated until the second property of the lemma is satisfied.

Lemma 4.2. Let $s := \lceil \log(1 + 1/\varepsilon) \rceil$. Then there exists a $(1+\varepsilon)$-approximate schedule in which every order is fully processed within $s+1$ consecutive intervals.

Proof. Let us consider an optimal schedule as in Lemma 4.1 and apply Stretch Completion Times. Then we move all jobs to the right as much as possible without increasing the completion time of any order. Note that for any order $L$, each job $j \in L$ increases its completion time by at least $\varepsilon C_L$. Indeed, if this is not the case, let $L$ be the last order (in terms of completion time) for which there exists $j \in L$ that increased its completion time by less than $\varepsilon C_L$. Let $i$ be the machine processing $j$. Lemma 4.1 implies that all jobs processed on $i$ after job $j$ belong to orders that finish later than $C_L$, and thus they increase their completion times by at least $\varepsilon C_L$. As the completion time of order $L$ was also increased by $\varepsilon C_L$, we conclude that job $j$ could be moved to the right by $\varepsilon C_L$, contradicting the assumption.

This implies that after moving jobs to the right, the starting point $S_L$ of each order $L \in O$ is at least $\varepsilon C_L$, and therefore $C_L - S_L \le S_L/\varepsilon$. Let $I_x$ and $I_y$ be the intervals where $L$ starts and finishes respectively; then $(1+\varepsilon)^y - (1+\varepsilon)^{x+1} \le C_L - S_L \le S_L/\varepsilon \le (1/\varepsilon)(1+\varepsilon)^{x+1}$, which implies that $y - x - 1 \le \log(1 + \frac{1}{\varepsilon}) \le s$.

4.1 Algorithm overview.

In the following we describe the general idea of the PTAS. Let us divide the time horizon into blocks of $s+1 = \lceil \log(1+1/\varepsilon) \rceil + 1$ intervals, and denote by $B_\ell$ the block $[(1+\varepsilon)^{\ell(s+1)}, (1+\varepsilon)^{(\ell+1)(s+1)})$. Lemma 4.2 suggests optimizing over each block separately, and later putting the pieces together to construct a global solution.

Since there may be orders that cross from one block to the next, it is necessary to perturb the "shape" of the blocks. For that we introduce the concept of frontier. The outgoing frontier of block $B_\ell$ is a vector with $m$ entries; its $i$-th coordinate is a guess on the completion time of the last job scheduled on machine $i$ among jobs that belong to orders that began processing in $B_\ell$ (in Section 4.4 we will see that there is a concise description of frontiers). On the other hand, the incoming frontier of a block is the outgoing frontier of the previous one. For a given block and incoming and outgoing frontiers, we say that an order is scheduled inside block $B_\ell$ if on each machine all jobs of that order begin processing after the incoming frontier and finish processing before the outgoing frontier.

Assume that we know how to compute a near-optimal solution for a given subset of orders

V ⊆ O inside a block Bℓ, with fixed incoming and outgoing frontiers F ′ and F , respectively.

Let W (ℓ, F ′, F, V ) be the cost (sum of weighted completion times) of this solution.

Let $\mathcal{F}_\ell$ be the set of possible outgoing frontiers of block $B_\ell$. Using dynamic programming, we can fill a table $T(\ell, F, U)$ containing the cost of a near-optimal schedule for the subset of orders $U \subseteq O$ scheduled in block $B_\ell$ or before, respecting the outgoing frontier $F$ of $B_\ell$. To compute this quantity, we can use the recursive formula:
$$T(\ell+1, F, U) = \min_{F' \in \mathcal{F}_\ell,\, V \subseteq U} \; T(\ell, F', V) + W(\ell+1, F', F, U \setminus V).$$


Unfortunately, the table $T$ is not of polynomial size, or even finite. Thus, it is necessary to reduce its size, as done in [1]. Summarizing, the outline of the algorithm is the following.

Algorithm: PTAS-DP

1. Localization: In this step we bound the time-span of the intervals in which each order may be processed. We give extra structure to the instance and define a release date $r_L$ for each order $L$, such that there exists a near-optimal solution where each order begins processing after $r_L$ and ends processing no later than a constant number of intervals after $r_L$. More precisely, we prove that each order $L$ is scheduled in the interval $[r_L, r_L \cdot (1+\varepsilon)^{g(\varepsilon,K)}]$, for some function $g$ that will be specified later. This plays a crucial role in the next step.

2. Polynomial Representation of Orders' Subsets: The goal of this step is to reduce the number of subsets of orders that need to be tried in the dynamic programming. To do this, for each $\ell$ we find a polynomial-size set $\Theta_\ell \subseteq 2^O$ of possible subsets of orders that are processed in $B_\ell$ or before in some near-optimal schedule.

3. Polynomial Representation of Frontiers: In this step we reduce the number of frontiers we need to try in the dynamic programming. For each $\ell$, we find a polynomial-size set $\hat{\mathcal{F}}_\ell \subseteq \mathcal{F}_\ell$ such that for each block the outgoing frontier of a near-optimal schedule belongs to $\hat{\mathcal{F}}_\ell$.

4. Dynamic Programming: For all $\ell$, $F \in \hat{\mathcal{F}}_\ell$, $U \in \Theta_\ell$, compute:
$$T(\ell, F, U) = \min_{F' \in \hat{\mathcal{F}}_{\ell-1},\, V \subseteq U,\, V \in \Theta_{\ell-1}} T(\ell-1, F', V) + W(\ell, F', F, U \setminus V).$$
It is not necessary to compute $W(\ell, F', F, U \setminus V)$ exactly; a $(1+\varepsilon)$-approximation of this value, which moves the frontiers by at most a factor of $(1+\varepsilon)$, is enough. To compute this approximation we partition jobs into small and large. For large jobs we use enumeration and essentially try all possible schedules, while small jobs are scheduled greedily using Smith's rule (a skeleton of this dynamic program is sketched after this list).
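The following skeleton (ours, purely illustrative) shows the shape of the dynamic program; frontiers, subsets, and W stand for the families $\hat{\mathcal{F}}_\ell$, $\Theta_\ell$ and the per-block cost oracle, whose actual constructions are the subject of the steps above and are not implemented here.

    def ptas_dp(num_blocks, frontiers, subsets, W):
        """Sketch of the DP of Algorithm: PTAS-DP.
        frontiers[l] : polynomial-size family of outgoing frontiers of B_l
        subsets[l]   : polynomial-size family Theta_l (frozensets of orders)
        W(l,Fp,F,S)  : near-optimal cost of scheduling orders S inside block l
                       with incoming frontier Fp and outgoing frontier F."""
        INF = float("inf")
        # Base case: nothing is scheduled before the first block.
        T = {(0, F0, frozenset()): 0.0 for F0 in frontiers[0]}
        for l in range(1, num_blocks + 1):
            for F in frontiers[l]:
                for U in subsets[l]:
                    best = INF
                    for Fp in frontiers[l - 1]:
                        for V in subsets[l - 1]:
                            if V <= U and (l - 1, Fp, V) in T:
                                cand = T[(l - 1, Fp, V)] + W(l, Fp, F, U - V)
                                best = min(best, cand)
                    T[(l, F, U)] = best
        return T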

One of the main difficulties of this approach is that all the modifications applied to the optimal solution must preserve the properties given by Lemma 4.1. This is necessary to be able to describe the interaction between one block and the next using only the frontier. In other words, if this were not true, it could happen that some jobs of an order that begins processing in a block $B_\ell$ are processed after a job of an order that begins processing in block $B_{\ell+1}$. This would greatly increase the complexity of the algorithm, since this interaction would need to be considered in the dynamic programming, which would become too large. This is also the main reason why our result does not directly generalize to the case where we have release dates, since then Lemma 4.1 does not hold. In the sequel we analyze each of the previous steps separately.

4.2 Localization

Lemma 4.2 shows that each order is completely processed in at most a constant number, $s+1$, of consecutive intervals. However, we do not know a priori when in time each order is processed. In what follows, we refine this result by explicitly finding a constant number of consecutive intervals in which each order is processed in a near-optimal schedule. This property will be helpful in Step 2 to guess the specific block in which each order is processed.

The localization is done by introducing artificial release dates: for each order $L$ we give a point in time $r_L$ such that, losing a factor of at most $(1+\varepsilon)$ in the cost, $L$ starts processing after $r_L$. Naturally, it is enough to consider release dates that are powers of $(1+\varepsilon)$. The release dates are chosen so that the total amount of processing time released at any point $(1+\varepsilon)^t$ is $(1+\varepsilon)^t \cdot O(m)$. This is sufficient to show that in a $(1+\varepsilon)$-approximate schedule all orders finish processing within a constant number of intervals after they are released. The following definition will be useful in the description of the algorithm.

Definition 4.3. A job $j$ is said to be small with respect to a time instant $T$ if $p_j \le \varepsilon^3 T$; otherwise, we say that $j$ is big. Also, an order $L$ is said to be small with respect to a time instant $T$ if $p(L) \le \varepsilon^2 T$; otherwise, we say that $L$ is big.

Algorithm: Localization

1. Initialize $r_j := (1+\varepsilon)^{\lfloor \log \varepsilon p_j \rfloor}$, $u := \log(\min_{j \in J} r_j)$, $v := \lceil \log(\sum_{j \in J} p_j) \rceil$, and for all $L \in O$, $r_L := \max_{j \in L} r_j (1+\varepsilon)^{-s}$. Also let $P := \lceil \log(\max_{j \in J} p_j) \rceil$.

2. (i) For all orders $L \in O$, sort the jobs in $L$ in nonincreasing order of size. Then greedily assign jobs to groups until the total processing time of each group just surpasses $\varepsilon^2 r_L$. After this process, there may be at most one group of size smaller than $\varepsilon^2 r_L$.

(ii) If this smaller group has total processing time smaller than $\varepsilon^3 r_L$, we add it to the biggest group; otherwise we leave it as a group.

After this process, we redefine the jobs of $L$ as the newly created groups, and define the release dates of the new jobs as $r_j := (1+\varepsilon)^{\lfloor \log \varepsilon p_j \rfloor}$ (a sketch of Steps (1) and (2) appears after this algorithm).

3. For all $j \in J$, round its processing time up to the next power of $(1+\varepsilon)$, i.e., $p_j := (1+\varepsilon)^{\lceil \log p_j \rceil}$. Recall that $K = O(1)$ is the maximum number of jobs per order. For all $L \in O$ define its order type $T(L) \in \{0, \dots, K\}^P$ as the vector whose $p$-th component is the number of jobs in $L$ with processing time equal to $(1+\varepsilon)^p$, i.e., $T(L)_p := |\{j \in L : \log p_j = p\}|$.

4. For all $t = u, \dots, v$:

(i) Define the set $O_t := \{L \in O : L$ is big with respect to $(1+\varepsilon)^t$ and $r_L = (1+\varepsilon)^t\}$.

(ii) For $\alpha \in \{0, \dots, K\}^P$, let $O^\alpha_t$ be the set that contains the $K(1+\varepsilon)^{s+2}m/\varepsilon^5$ orders of largest weight in $\{L \in O_t : T(L) = \alpha\}$.

(iii) Define $Q_t := \bigcup_{\alpha \in \{0, \dots, K\}^P} O^\alpha_t$.

(iv) For all $L \in O_t \setminus Q_t$, redefine $r_L := (1+\varepsilon)^{t+1}$.

5. For all $t = u, \dots, v$:

(i) Define $S_t := \{L \in O : L$ is small with respect to $(1+\varepsilon)^t$ and $r_L = (1+\varepsilon)^t\}$, and sort all orders $L \in S_t$ in nonincreasing order of $w_L/p(L)$.

(ii) Define $R_t$ as the set that contains the first orders in $S_t$ such that their total processing time is in $[m\varepsilon(1+\varepsilon)^t, m\varepsilon(1+\varepsilon)^t + m\varepsilon^3(1+\varepsilon)^t]$.

(iii) For all $L \in S_t \setminus R_t$, redefine $r_L := (1+\varepsilon)^{t+1}$.
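For concreteness, Steps (1) and (2) admit the following short sketch (ours; the logarithm is taken base $1+\varepsilon$ as in the text, and the jobs of one order are represented by their processing times):

    import math

    def init_release_date(p_j, eps):
        """Step (1): r_j = (1+eps)**floor(log_{1+eps}(eps * p_j))."""
        return (1 + eps) ** math.floor(math.log(eps * p_j, 1 + eps))

    def group_small_jobs(jobs, r_L, eps):
        """Step (2): greedily pack the jobs of an order (sorted by
        nonincreasing size) into groups whose total size just surpasses
        eps^2 * r_L; a leftover group of total size below eps^3 * r_L is
        merged into the biggest group. Each group becomes one new job."""
        groups, cur = [], []
        for p in sorted(jobs, reverse=True):
            cur.append(p)
            if sum(cur) > eps ** 2 * r_L:   # group just surpassed the threshold
                groups.append(cur)
                cur = []
        if cur:                              # the single smaller leftover group
            if sum(cur) < eps ** 3 * r_L and groups:
                max(groups, key=sum).extend(cur)
            else:
                groups.append(cur)
        return groups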

In Step (1) we begin by defining a release date $r_j$ for every job, i.e., a point in time such that, in a $(1+\varepsilon)$-approximate schedule, job $j$ starts processing after it. It is easy to see that this is valid, since applying Stretch Completion Times ensures that no job starts processing before $\varepsilon p_j$. Afterwards, we define the values $u$ and $v$, which give lower and upper bounds on the indices of the time intervals in which jobs may be processed. In other words, we can restrict our attention to intervals $I_t$ with $t \in \{u, u+1, \dots, v\}$. Finally, for every order $L \in O$ we initialize a release date $r_L := \max_{j \in L} r_j (1+\varepsilon)^{-s}$. This is valid because for every order $L$ at least one of its jobs begins processing after $\max_{j \in L} r_j$, and Lemma 4.2 assures that no order is processed in more than $s+1$ consecutive intervals.


Clearly, this initial definition of the release dates is not enough to ensure that at each time instant $(1+\varepsilon)^t$ no more than $(1+\varepsilon)^t \cdot O(m)$ total processing time is released. To amend this, we delay the release dates of orders that cannot start before the next integer power of $(1+\varepsilon)$. For that we first classify the orders by the sizes of their jobs. Note that given two orders that are indistinguishable except for their weight, i.e., such that there is a one-to-one correspondence between the sizes of their jobs, the jobs of the one with larger weight will always have priority over the jobs of the other order. Therefore, among a set of orders that are indistinguishable except for their weight (we say that these orders are of the same type), we can greedily process the ones with larger weight first. This is the key argument justifying the delaying of release dates. Nevertheless, to successfully apply it we need to bound the number of types of orders. To do this we proceed in two steps: first we get rid of jobs that are too small; then, we round the processing time of every job to bound the number of values a processing time can take.

Step (2) gets rid of every job that is small with respect to the release date of its order. This is done by treating several small jobs as one bigger job. The procedure is justified by the following lemma.

Lemma 4.4. There is a $(1+O(\varepsilon))$-approximate solution to the scheduling problem in which all jobs in each group defined in Step (2) of Algorithm: Localization are processed together on the same machine.

Proof. Let us first consider the groups of jobs defined in Step (2.i), and let us consider a

1 + O(ε)-approximate schedule of the original instance. We will find another 1 + O(ε)-

approximate schedule in which all jobs inside one of the groups are processed together.

Then, losing at most a factor of 1 + O(ε) in the cost, every group can be consider as a

single larger job. Notice that the groups consisting of only one job need not be considered

in the proof. The rest of the groups consist only of jobs smaller than ε2rL, and therefore by

construction their total processing time will be smaller than 2ε2rL. Also, since all of this jobs

have pj ≤ ε2rL, we can consider a near-optimal schedule where none of them is processed in

more than one interval. Indeed, using Stretch Intervals we create enough space to schedule

all these crossing jobs completely inside the interval where they begin processing.

Let us thus fix a group, and consider the machines and intervals in which the jobs that belong to it are being processed. Interpreting a machine-interval pair as a virtual machine, the group can be interpreted as a virtual job that is fractionally assigned to the virtual machines containing its jobs. Now we can apply Shmoys and Tardos' theorem (Theorem 2.4) to round this fractional solution so that each virtual job is processed completely inside one virtual machine. The rounding guarantees that the total processing time assigned to each virtual machine increases by at most 2ε^2 r_L, since this is the largest a virtual job can be. By applying Stretch Intervals twice we create the extra space needed. Also, the completion time of the virtual job increases by at most a factor 1+ε, since the rounding only assigns a virtual job to an interval if some of its jobs were previously assigned to it. Therefore no order's completion time is further increased. We conclude that we can consider each of these groups as one larger job.

To finish the proof we must show that if the smallest job of an order has processing time smaller than ε^3 r_L we can merge it with the biggest job. Indeed, if L was left with more than one job after merging jobs into groups, the biggest job has processing time at least ε^2 r_L. By applying Stretch Completion Times we create a gap of idle time of at least ε^3 r_L before it, leaving enough space to fit the smallest job in there.
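To make the grouping idea concrete, the following is a minimal sketch of a greedy packing that is consistent with the bounds used in the proof; the function name and the list-based representation are our own illustrative assumptions, not part of Algorithm: Localization as stated.

```python
# Hedged sketch: jobs of size < threshold (= eps^2 * r_L in the text) are
# packed greedily, closing a group as soon as its load reaches the threshold;
# hence every multi-job group has total size in [threshold, 2 * threshold).
def greedy_groups(small_job_sizes, threshold):
    groups, current, load = [], [], 0.0
    for p in small_job_sizes:
        current.append(p)
        load += p
        if load >= threshold:       # close the group; its total is < 2*threshold
            groups.append(current)
            current, load = [], 0.0
    if current:                     # possibly one last group below the threshold
        groups.append(current)
    return groups
```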

Remark that after this step we can guarantee that no big order L contains a small job with respect to r_L. In Step (3) we first reduce the number of possible values a processing time can take by rounding them to powers of 1+ε. It is easy to see that this does not increase the cost of the solution by more than a factor 1+ε. Indeed, applying Stretch Completion Times leaves enough space so that we can increase the size of every job j up to (1+ε)p_j, and (1+ε)^{⌈log_{1+ε} p_j⌉} ∈ [p_j, (1+ε)p_j]. In the second part of Step (3) we classify orders according to how many jobs of each size they contain. Since every processing time is assumed greater than one and is a power of (1+ε), there are only P := ⌈log(max_{j∈J} p_j)⌉ possible values a processing time can take.
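As a small illustration of this rounding (with logarithms taken base 1+ε, and the function name being our own):

```python
import math

# Minimal sketch of the geometric rounding in Step (3): each processing time
# p >= 1 is rounded up to the next integer power of (1 + eps), which lies in
# [p, (1 + eps) * p]; at most ceil(log_{1+eps} max_j p_j) values remain.
def round_to_power(p, eps):
    k = math.ceil(math.log(p) / math.log(1 + eps))
    return (1 + eps) ** k
```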

In Step (4) we delay the release dates of big orders that do not have any chance of being processed at their current release date. For that, we let O_t be the set of big orders released at (1+ε)^t. We further partition O_t by the type of the orders. As explained before, for each order type, the orders with largest weight will be processed first, and therefore we can delay the release dates of the orders with smallest weight that do not fit in I_t. In the following we will show that for any type α of big order and for any t, at most K(1+ε)^{s+2} m/ε^5 = O(m) orders that belong to O^α_t can be processed in I_t, and therefore the rest can be delayed. The next lemma helps us to accomplish this.

Lemma 4.5. After delaying at most ⌈log(K/ε^3) + s + 1⌉ times the release date of an order, the order becomes small with respect to its new release date.


Proof. Let r_L be the release date of an order L as it was initialized in Step (1). Note that the definitions of r_L and r_j imply that p_j ≤ r_L(1+ε)^{s+1}/ε, and since there are only K jobs per order, p(L) ≤ r_L K(1+ε)^{s+1}/ε. If the release date has been delayed r times, for some r ≥ ⌈log(K/ε^3) + s + 1⌉, then p(L) ≤ (1+ε)^{log(r_L)+r} K(1+ε)^{s−r+1}/ε ≤ ε^2 (1+ε)^{log(r_L)+r}, and therefore L is a small order with respect to its new release date.

Recall that for the original release dates, every job belonging to a big order satisfies p_j ≥ ε^3 r_L. Therefore the last lemma implies that at any point in the algorithm, if L is a big order with respect to its current release date r_L, then any job j belonging to L satisfies p_j ≥ ε^6 r_L/(K(1+ε)^{s+2}). Thus, at most

m · |I_{log r_L}| · K(1+ε)^{s+2}/(ε^6 r_L) = K(1+ε)^{s+2} m/ε^5

orders of each type in each O^α_t can start before (1+ε)^{t+1}. The rest can have their release dates increased to (1+ε)^{t+1} without further affecting the cost. With this, at each point in time (1+ε)^t, and for each type of big order, there will be only

(1+ε)^t m K(1+ε)^{2s+3}/ε^6 = (1+ε)^t · O(mK/ε^7) = (1+ε)^t · O(m)

total processing time released at (1+ε)^t. To conclude that there is in total (1+ε)^t · O(m) processing time of big orders released at (1+ε)^t, it is sufficient to show that there are only O(1) different types of big orders in O_t.

Lemma 4.6. At any point in the algorithm and at any time index t, there is only a constant number, K^{O(log(K/ε))}, of different types of big orders released at (1+ε)^t.

Proof. As shown before, every job j that belongs to an order L ∈ O_t satisfies

ε^6 (1+ε)^t/(K(1+ε)^{s+2}) ≤ p_j ≤ (1+ε)^t (1+ε)^{s+1}/ε,

where the second inequality follows since p_j is smaller than (1+ε)^{s+1}/ε times the release date of L ∋ j. Thus, p_j can only take ⌈2s + 3 − 7 log ε + log K⌉ = O(log(K/ε)) different values, and by definition of type there will not be more than (K+1)^{O(log(K/ε))} different ones.
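For concreteness, a type can be encoded as the vector counting, for each admissible (rounded) size class, how many jobs of that size the order contains; this encoding is our own illustration, not notation from the text.

```python
from collections import Counter

# Hedged sketch: encode an order's "type" as the tuple of job counts per
# rounded size class. Orders with equal encodings are indistinguishable
# except for their weight, exactly as in the definition of type.
def order_type(job_sizes, size_classes):
    counts = Counter(job_sizes)
    return tuple(counts.get(s, 0) for s in size_classes)
```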

Summarizing, we have proved the following.


Theorem 4.7. At the end of the algorithm, there will be at most f_1(ε,K) m (1+ε)^t total processing time corresponding to big orders (w.r.t. (1+ε)^t) released at (1+ε)^t, for every t ∈ {u, . . . , v}. Here the function f_1(ε,K) is given by K^{O(log(K/ε))}/ε^7 = K^{O(log(K/ε))}.

With this we have completely dealt with big orders, but in the process we have created several orders that are small with respect to their release dates. In Step (5) we deal with these newly created orders. As before, we must define release dates such that at any instant (1+ε)^t there is at most (1+ε)^t · O(m) total processing time corresponding to small orders (w.r.t. (1+ε)^t) released at (1+ε)^t. As in the big-order case, we delay orders that cannot begin processing until the following release date. The following lemma explains why this is possible.

Lemma 4.8. Given a feasible schedule of the big orders (w.r.t. their release dates), losing at most a factor of 1+O(ε), the small orders (w.r.t. their release dates) can be scheduled by a list scheduling procedure in nonincreasing order of w_L/p(L).

Proof. Let us consider a fixed schedule of big orders. Notice that each small order can be considered as just one job. Indeed, applying the same argument as in Lemma 4.4, we can consider an order as a virtual job partially assigned to machine-interval pairs, and apply Shmoys and Tardos' theorem (Theorem 2.4).

Let us call the midpoint of a job j the value M_j = C_j − p_j/2. Note that since we are only considering orders that are small with respect to their release dates, a near-optimal schedule minimizing the sum of weighted midpoints is also near optimal for minimizing the sum of weighted completion times. Indeed, this follows since in this case the starting time of a job is within a (1+O(ε))-factor of its completion time.

The last observation leads us to consider the problem of optimizing the sum of weighted midpoints on a single variable-speed machine. The speed of the machine at time s is given by v(s), where v(s) is the number of machines that are free at s in the schedule of big orders. This clearly lower bounds the cost of our original problem for the sum of weighted midpoints objective. Note that the definition of the midpoint of job j in this setting should be M_j = (1/p_j) ∫_0^∞ I_j(s) v(s) s ds, where I_j is the indicator function of job j in the schedule, i.e., I_j(s) equals one if j is being processed at instant s, and zero otherwise.

In other words, it is enough for our purpose to find an algorithm minimizing the sum of weighted midpoints on a single machine of variable speed, and then turn it into a solution on our original schedule of big orders, increasing the cost by at most a (1+O(ε))-factor.

Interestingly, finding the schedule minimizing the sum of weighted midpoints on one variable-speed machine can be achieved by scheduling in nonincreasing order of weight to processing time ratio (known as Smith's rule).

Claim: Let J be a set of jobs with associated processing times p_j and weights w_j. Consider a single machine with variable speed v(s) for s ∈ [0,∞). Then scheduling jobs in nonincreasing order of w_j/p_j gives a solution minimizing the sum of weighted midpoints.

To prove the claim, we proceed by contradiction. Let us consider an optimal solution in which there exist two jobs j and k such that j is processed right before k and w_k/p_k > w_j/p_j. Let M_j and M_k be the midpoints of these jobs in this schedule. Observe that swapping these two jobs decreases the cost of the schedule. To see this, let M'_j and M'_k be the midpoints of jobs j and k, respectively, in the schedule where k is processed before j, so that M'_j > M_j and M'_k < M_k. Noting that w_k/p_k > w_j/p_j and p_j M'_j + p_k M'_k = p_j M_j + p_k M_k, the difference in cost can be evaluated as

w_j M'_j + w_k M'_k − w_j M_j − w_k M_k < (w_k/p_k) [(p_j M'_j + p_k M'_k) − (p_j M_j + p_k M_k)] = 0,

where the inequality follows by bounding w_j/p_j from above by w_k/p_k on the terms involving job j, whose coefficient p_j(M'_j − M_j) is positive. This proves the claim.
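The exchange argument can also be checked numerically. The sketch below uses a unit-speed machine (so that M_j = C_j − p_j/2) and illustrative job data of our choosing.

```python
# Minimal numeric check of the claim on a unit-speed machine, where the
# midpoint of a job is C_j - p_j / 2. Reordering into Smith order never
# increases the sum of weighted midpoints.
def weighted_midpoints(jobs):
    """jobs: list of (p_j, w_j) pairs, given in processing order."""
    t, total = 0.0, 0.0
    for p, w in jobs:
        t += p                      # completion time C_j
        total += w * (t - p / 2)    # weighted midpoint contribution
    return total

jobs = [(3, 1), (2, 5)]             # ratios 1/3 < 5/2: violates Smith's rule
smith = sorted(jobs, key=lambda jw: jw[1] / jw[0], reverse=True)
assert weighted_midpoints(smith) <= weighted_midpoints(jobs)
```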

Finally, we show that if we apply Stretch Intervals to the schedule of big orders on the m machines, we can use Smith's rule (list scheduling) over the small orders, yielding a near-optimal schedule. Indeed, applying Stretch Intervals to a schedule introduces an extra ε^2(1+ε)^t of idle time on every machine in each interval I_t, as long as no job of a big order completely crosses it. Clearly this increases the processing capacity of the m machines enough to ensure that the load corresponding to small orders processed in any interval surpasses that processed in the same interval on the single machine with variable speed. This implies that, for every job j of a small order, its starting time S_j in the parallel machine schedule and its completion time C^S_j in the single machine schedule satisfy S_j ≤ (1+ε) C^S_j. The result follows.

Remark that the variable-speed single machine scheduling problem defined in the proof of the last lemma is NP-hard when the objective is to minimize the sum of weighted completion times. This follows by a simple reduction from number partition to the restricted case in which the speed satisfies v(s) ∈ {0, 1}. We do not know whether there exists a PTAS for this problem, which would also be enough for the purpose of the proof.

Lemma 4.8 implies that at each time index t we can sort the small orders by w_L/p(L) and delay the release dates of the orders that do not fit inside I_t. After doing this, at most ε(1+ε)^t m + ε^3(1+ε)^t m ≤ ε(1+ε)^{t+1} m processing time of small orders will be released at (1+ε)^t. Putting this fact together with Theorem 4.7, we obtain the following result and its corresponding corollary.

Theorem 4.9. At the end of the algorithm the following holds: for every time index t, there is at most (f_1(ε,K) + ε(1+ε)) m (1+ε)^t total processing time released at (1+ε)^t.

Corollary 4.10. Let g(ε,K) := ⌈log((f_1(ε,K) + ε(1+ε))ε^{−2})⌉ + s + 1. There exists a (1+O(ε))-approximate schedule where every order L ∈ O is processed between r_L and r_L(1+ε)^{g(ε,K)}.

Proof. Let us consider any (1+O(ε))-approximate schedule. Applying Stretch Intervals generates ε^2(1+ε)^t extra idle space for each interval-machine pair (I_t, i), and mε^2(1+ε)^t in total for each interval. For a fixed t, we move to the left the orders released at (1+ε)^t that are completely scheduled after (1+ε)^{t+g(ε,K)−s}, by using the space created by Stretch Intervals in the interval starting at (1+ε)^{t+g(ε,K)−s}. The rest of the orders must start processing before (1+ε)^{t+g(ε,K)−s}, and since they do not cross more than s intervals they will finish before (1+ε)^{t+g(ε,K)}. In this way the cost is not increased by more than a factor (1+ε), since after applying Stretch Intervals we only move jobs to the left. Also, the structure of the near-optimal solution given in Lemma 4.1 is preserved because the orders that are moved can be processed consecutively.

To conclude we must show that we can process all orders released at (1+ε)^t in the idle space created by Stretch Intervals in the interval starting at (1+ε)^{t+g(ε,K)−s}. For that it is enough to notice that the total processing time released at (1+ε)^t is smaller than the extra idle space in the interval starting at (1+ε)^{t+g(ε,K)−s−1}. Indeed,

m ε^2 (1+ε)^{t+g(ε,K)−s−1} = m ε^2 (1+ε)^{t+⌈log((f_1(ε,K)+ε(1+ε))ε^{−2})⌉} ≥ m (1+ε)^t (f_1(ε,K) + ε(1+ε)).

Finally, since for every sufficiently small ε every order released at (1+ε)^t is small w.r.t. (1+ε)^{t+g(ε,K)−s−1}, we can accommodate all its jobs in (1+ε)^{t+g(ε,K)−s}. Clearly this can be done simultaneously for every t = u, . . . , v.

4.3 Polynomial Representation of Order Subsets

The objective of this section is to find a collection Θ_ℓ of sets of orders that are processed in block B_ℓ or before in a near-optimal schedule. To do this we will give a collection U_ℓ of sets of orders that are processed in B_{ℓ+1} or later. Clearly, this also defines the sets in Θ_ℓ by simply taking the complement of each set in U_ℓ. Note that every set in U_ℓ must contain all orders with release date larger than or equal to (1+ε)^{(s+1)(ℓ+1)}, so it is enough to decide which orders are going to be processed in B_{ℓ+1} or later among those released before (1+ε)^{(s+1)(ℓ+1)}. Also, by Corollary 4.10 no order finishes more than g(ε,K) intervals after its release date. Then, to guess which orders are going to be processed after (1+ε)^{(s+1)(ℓ+1)} we only consider orders with release dates between (1+ε)^{(s+1)(ℓ+1)−g(ε,K)} and (1+ε)^{(s+1)(ℓ+1)}. For the sake of clarity, we define the sets Θ_ℓ using the following algorithm.

Algorithm: Polynomial Representation of Order Subsets

1. Let r_j := (1+ε)^{⌊log ε p_j⌋}, u := log(min_{j∈J} r_j), v := ⌈log(∑_{j∈J} p_j)⌉, and P := ⌈log(max_{j∈J} p_j)⌉ be as in Algorithm: Localization. Define ū := ⌊u/(s+1)⌋ and v̄ := ⌊v/(s+1)⌋ as lower and upper bounds for the indices of blocks.

2. For all t = u, . . . , v:

(i) Consider A_t ⊆ {0, . . . , K}^P, the set of possible types of big orders released at (1+ε)^t. Note that by Lemma 4.6, |A_t| = K^{O(log(K/ε))} = O(1).

(ii) For every α ∈ A_t, consider the set O^α_t defined in Algorithm: Localization and sort its elements by nonincreasing order of weight. Define the collection of nested sets of heavier orders in O^α_t as:

W^α_t := { W ⊆ O^α_t : W contains the k orders with largest w_L in O^α_t, for some k = 0, . . . , n }.

(iii) Consider the set R_t defined in Algorithm: Localization and sort its elements by nonincreasing order of w_L/p(L). Define the collection of nested sets of orders having larger weight to processing time ratio in R_t as:

V_t := { V ⊆ R_t : V contains the k orders of largest w_L/p(L) in R_t, for some k = 0, . . . , n }.

(iv) Define the collection of possible sets of orders released at (1+ε)^t, formed by all possible unions of sets in V_t and W^α_t, for all α ∈ A_t, as:

N_t := { V ∪ ⋃_{α∈A_t} W^α : V ∈ V_t and W^α ∈ W^α_t }.


3. For all ℓ = ū, . . . , v̄, define U_ℓ as the collection of all sets of the form

{ L ∈ O : L is released after time (1+ε)^{(s+1)(ℓ+1)} } ∪ ⋃_{t=0}^{g(ε,K)} N_t,

where N_t ∈ N_{(s+1)(ℓ+1)−g(ε,K)+t} for t = 0, . . . , g(ε,K).

4. For all ℓ = ū, . . . , v̄, let Θ_ℓ be the collection containing the complement of every set in U_ℓ.

In Step (2) of the algorithm we construct, for every time index t, a collection N_t containing the sets of orders released at (1+ε)^t that could be processed after an arbitrary time instant in a near-optimal schedule. Let us consider only orders released at (1+ε)^t. As described in Step (2.ii), we first construct the collection of possible sets of big orders (w.r.t. (1+ε)^t) of a given type α. Since for any two orders of the same type the one with largest weight is scheduled first, the orders of type α that are processed the latest are those with the smallest weight. Hence we can restrict ourselves to considering at most n + 1 sets for every order type. Analogously, by Lemma 4.8, small orders (w.r.t. (1+ε)^t) can be scheduled in nonincreasing order of w_L/p(L). Finally, in Step (2.iv) we construct N_t as the collection containing all possible combinations of sets formed as the union of one set for every type of big order and a set of small orders. Since we consider only n + 1 sets of orders for each type of big order, and there are only K^{O(log(K/ε))} different types, we have |N_t| = n^{K^{O(log(K/ε))}}.
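Computationally, each of the nested families above is just the family of prefixes of a sorted list, as the following hedged sketch shows (function name and representation are ours):

```python
# Hedged sketch of Steps (2.ii)-(2.iii): the candidate sets are exactly the
# n + 1 prefixes of the orders sorted by the relevant key (weight w_L for
# each big-order type, ratio w_L / p(L) for the small orders).
def nested_prefixes(orders, key):
    ranked = sorted(orders, key=key, reverse=True)
    return [ranked[:k] for k in range(len(ranked) + 1)]
```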

In Step (3) we define the collection of sets U_ℓ by combining all possible sets in N_t for the values of t corresponding to orders that can be processed in B_{ℓ+1} or later, i.e., for t = (s+1)(ℓ+1)−g(ε,K), . . . , v. Clearly |U_ℓ| ≤ |N_t|^{g(ε,K)} = n^{K^{O(log(K/ε))}/ε^2}, which is polynomial in n. Finally, in Step (4) we construct Θ_ℓ by taking the complement of every set in U_ℓ.

4.4 Polynomial Representation of Frontiers

We now prove that, for a given block, it is enough to consider only a polynomial number of outgoing frontiers. This is necessary to apply the dynamic programming algorithm described earlier. Let us consider a fixed block B_ℓ.

Recall that an outgoing frontier can be seen as a vector whose i-th component denotes the value of the frontier corresponding to machine i. It is important to observe that, whenever the frontier corresponds to an outgoing frontier of block B_ℓ, we can restrict ourselves to frontiers whose entries belong to

Γ_ℓ := { (1+ε)^{(s+1)(ℓ+1)} + kε^3(1+ε)^{(s+1)(ℓ+1)−1} : k = 0, . . . , ⌈(1+ε)[(1+ε)^s + 1]/ε^3⌉ }.

Indeed, notice first that if for some schedule F_{ℓ−1} and F_ℓ are outgoing frontiers of blocks B_{ℓ−1} and B_ℓ respectively, then there is a schedule such that for every machine either the difference between the two frontiers is greater than ε^2(1+ε)^{(s+1)(ℓ+1)−1}, or the frontier F_ℓ coincides with the beginning of block B_{ℓ+1}. Otherwise, the set of jobs processed between F_{ℓ−1} and F_ℓ on machine i has total processing time smaller than ε^2(1+ε)^{(s+1)(ℓ+1)−1}, and thus Stretch Intervals will create enough room in I_{(s+1)(ℓ+1)−1} to fit all these jobs. Then F_ℓ can be considered to coincide with the beginning of B_{ℓ+1}.

In this latter case it is clear that by taking k = 0 the corresponding entry of the outgoing frontier belongs to Γ_ℓ. On the other hand, in the former case the difference between the frontiers is greater than ε^2(1+ε)^{(s+1)(ℓ+1)−1}, and by Stretch Completion Times we can generate at least ε^3(1+ε)^{(s+1)(ℓ+1)−1} total idle time in between the frontiers. By moving all jobs in between the frontiers as far as possible to the left (without modifying the left frontier), all created idle time can be assumed to be next to F_ℓ. Then we can move F_ℓ to the left in order to bring its corresponding component to an element of Γ_ℓ. Clearly this whole procedure increases the cost by at most a factor 1+O(ε).

With the previous observations we can restrict ourselves to only |Γ_ℓ|^m different outgoing frontiers for each block. However, this is not of polynomial size. To overcome this difficulty, we consider a more concise representation of outgoing frontiers.

Concise description of frontier: A concise outgoing frontier F_ℓ of block B_ℓ is a vector of |Γ_ℓ| entries, in which the k-th component is the number of machines that have (1+ε)^{(s+1)(ℓ+1)} + (k−1)ε^3(1+ε)^{(s+1)(ℓ+1)−1} as frontier value. Then the set of all possible concise outgoing frontiers is given by F_ℓ := {0, . . . , m}^{|Γ_ℓ|}.
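Since machines are identical, this histogram view is straightforward to compute; a hedged sketch (names are ours):

```python
from collections import Counter

# Hedged sketch of the concise representation: summarize a per-machine
# frontier (one value of Gamma_l per machine) by the number of machines
# taking each candidate value, yielding a vector in {0, ..., m}^{|Gamma_l|}.
def concise_frontier(per_machine_values, gamma_l):
    counts = Counter(per_machine_values)
    return [counts.get(v, 0) for v in gamma_l]
```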

This description of the concise outgoing frontier is not enough to represent every possible block B_ℓ lying in between F_{ℓ−1} and F_ℓ. Nevertheless, with some extra enumeration the representation is achieved. Since all machines are equal, a block B_ℓ with incoming and outgoing frontiers F_{ℓ−1} and F_ℓ can be fully described in the following way: for each pair of elements t_1 ∈ Γ_{ℓ−1} and t_2 ∈ Γ_ℓ we need

|{ i = 1, . . . , m : (F_{ℓ−1})_i = t_1 and (F_ℓ)_i = t_2 }|,


i.e., the number of machines available from t_1 to t_2 in the block. Clearly, for each pair F_{ℓ−1} and F_ℓ we can enumerate all possible such descriptions coinciding with F_{ℓ−1} and F_ℓ. Indeed, an element z ∈ {0, . . . , m}^{|Γ_{ℓ−1}|×|Γ_ℓ|} coincides with F_{ℓ−1} and F_ℓ if and only if

∑_{j∈Γ_ℓ} z_{ij} = (F_{ℓ−1})_i for all i ∈ Γ_{ℓ−1}, and ∑_{i∈Γ_{ℓ−1}} z_{ij} = (F_ℓ)_j for all j ∈ Γ_ℓ.

This requires checking m^{|Γ_{ℓ−1}|·|Γ_ℓ|} = m^{O(1/ε^8)} possible block descriptions. Also, the number of concise frontiers is bounded by |F_ℓ| ≤ m^{|Γ_ℓ|} = m^{O(1/ε^4)}.
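Checking whether a candidate block description z agrees with the two concise frontiers is a simple marginal test, as in the following hedged sketch (our names and data layout):

```python
# Hedged sketch of the consistency test for a block description z: entry
# z[i][j] counts machines with incoming frontier value indexed by i and
# outgoing value indexed by j; z must have the concise frontiers as marginals.
def consistent(z, f_in, f_out):
    rows_ok = all(sum(row) == fi for row, fi in zip(z, f_in))
    cols_ok = all(sum(col) == fj for col, fj in zip(zip(*z), f_out))
    return rows_ok and cols_ok
```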

4.5 A PTAS for a specific block

To conclude our algorithm we need to compute the table W(ℓ, F′, F, U) as a subroutine of Algorithm: PTAS-DP. That is, for a given block B_ℓ, concise incoming and outgoing frontiers F′ and F, and subset of orders U, we need to find a (1+ε)-approximate solution of the schedule minimizing the sum of weighted completion times of the jobs in U inside B_ℓ. Note that it is possible to move the frontiers by a factor (1+ε) without increasing the cost of a global solution by more than a factor (1+ε).

In the sequel, we consider orders and jobs as big or small with respect to the beginning of block B_ℓ. In other words, a job will be small if its processing time is smaller than ε^3(1+ε)^{(s+1)ℓ}, and big otherwise. Additionally, an order will be small if its total processing time is smaller than ε^2(1+ε)^{(s+1)ℓ}, and big otherwise. Following the ideas of the previous sections, we enumerate over schedules of big orders, and apply Lemma 4.8 to greedily assign small orders using Smith's rule.

Algorithm: PTAS for a block

1. Redefine the release dates r_L := (1+ε)^{(s+1)ℓ} for all L ∈ U.

2. Apply Step (2) of Algorithm: Localization, and round the processing time of the

new jobs to the next power of (1 + ε).

3. Let Q_ℓ := { p ∈ R : log(p) ∈ N ∩ [3 log(ε) + (s+1)ℓ, (s+1)(ℓ+2)] } be the set of possible sizes that a job belonging to a big order L ∈ U could have. Also, define the set of possible types of big orders in U as C_ℓ ⊆ {0, . . . , K}^P. Additionally set

Ω_ℓ := { (1+ε)^{(s+1)ℓ}(1 + kε^4) : k = 0, . . . , ⌈((1+ε)^{2(s+1)} − 1)/ε^4⌉ } = {ω_1, . . . , ω_{|Ω_ℓ|}}.

We will see that we can restrict ourselves to schedules where every big job starts processing only at an instant that belongs to Ω_ℓ.

As we are rounding processing times and starting times we will require some extra room. Therefore, we redefine Ω_ℓ := (1+ε)^5 · Ω_ℓ and Γ_ℓ := (1+ε)^5 · Γ_ℓ. Here, multiplying a set by a scalar means that every element gets multiplied.

4. Define a single machine configuration as a vector S with |Ω_ℓ| + 2 entries. For k = 1, . . . , |Ω_ℓ|, its k-th entry contains a pair (q_k, c_k) ∈ (Q_ℓ ∪ {0}) × (C_ℓ ∪ {∅}), where q_k can be interpreted as the processing time of a job that starts processing at ω_k, and c_k as the type of the order containing that job. To represent that no job starts processing at time instant ω_k we set S_k = (0, ∅). The last two entries of S, namely S_{|Ω_ℓ|+1} ∈ Γ_{ℓ−1} and S_{|Ω_ℓ|+2} ∈ Γ_ℓ, represent the values of the incoming and outgoing frontiers of block B_ℓ on that machine. It is sufficient to consider vectors S for which there is enough space to schedule jobs of the sizes described in S without overlapping, respecting the corresponding incoming and outgoing frontiers. In other words, a valid S must satisfy (see the sketch after this step):

(a) For each k = 1, . . . , |Ω_ℓ|, if i > k and ω_i < ω_k + q_k, then S_i = (0, ∅).

(b) For all k = 1, . . . , |Ω_ℓ|, if ω_k < S_{|Ω_ℓ|+1}, then S_k = (0, ∅).

(c) For all k = 1, . . . , |Ω_ℓ|, if ω_k + q_k > S_{|Ω_ℓ|+2}, then S_k = (0, ∅).

Thus let S ⊆ ((Q_ℓ ∪ {0}) × (C_ℓ ∪ {∅}))^{|Ω_ℓ|} × Γ_{ℓ−1} × Γ_ℓ be the set of valid single machine configurations. Notice that S = {S^1, . . . , S^{|S|}} is of constant size.
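The following hedged sketch implements the validity test (a)-(c); the list-based representation of a configuration is our own assumption.

```python
# Hedged sketch of conditions (a)-(c) for a single machine configuration:
# entries[k] = (q, c), with q = 0 meaning no job starts at omega[k];
# f_in and f_out are the machine's incoming and outgoing frontier values.
def is_valid_configuration(entries, omega, f_in, f_out):
    for k, (q, _) in enumerate(entries):
        if q == 0:
            continue
        if omega[k] < f_in or omega[k] + q > f_out:         # (b) and (c)
            return False
        for i in range(k + 1, len(entries)):                # (a): no overlap
            if omega[i] < omega[k] + q and entries[i][0] != 0:
                return False
    return True
```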

5. For a given schedule we define its parallel machine configuration as a vector M ∈ {0, . . . , m}^{|S|} whose i-th component denotes the number of machines having S^i as single machine configuration. We only consider vectors M that agree with the concise incoming and outgoing frontiers F and F′. In other words, if s_k denotes the k-th element in Γ_{ℓ−1}, and v_k the k-th element in Γ_ℓ, we can restrict ourselves to vectors M satisfying

∑_{i : S^i_{|Ω_ℓ|+1} = s_k} M_i = F_k for all k = 1, . . . , |Γ_{ℓ−1}|, and ∑_{i : S^i_{|Ω_ℓ|+2} = v_k} M_i = F′_k for all k = 1, . . . , |Γ_ℓ|.

We also only consider vectors M in which all big orders are completely processed, i.e., for every p ∈ Q_ℓ and c ∈ C_ℓ,

|{ j ∈ J : p_j = p and j ∈ L ∈ U for L of type c }| = ∑_{i=1}^{|S|} M_i · |{ k ∈ {1, . . . , |Ω_ℓ|} : S^i_k = (p, c) }|.

Define the set of all such possible parallel machine configurations as M.

6. For every parallel machine configuration M in M do the following.

(a) For each k = 1, . . . , |S|, associate M_k of the machines with the single machine configuration S^k, arbitrarily. After this process, let T(i) denote the single machine configuration associated with machine i. Then, for k = 1, . . . , |Ω_ℓ| and for each machine i = 1, . . . , m do:

i. Let (q, c) = T(i)_k be the job size and order type given by the single machine configuration associated with machine i at time ω_k.

ii. Choose the order of type c of largest weight that still has an unscheduled job of size q, and process that job at time ω_k on machine i (see the sketch after this algorithm).

The schedule thus constructed is a best possible schedule of big orders that agrees with the parallel machine configuration M.

(b) Consider each small order as a single job, and schedule the small orders in the available space using list scheduling in nonincreasing order of w_L/p(L), respecting the incoming and outgoing frontiers defined by the configuration M, i.e., on each machine i schedule jobs only between T(i)_{|Ω_ℓ|+1} and T(i)_{|Ω_ℓ|+2}. If this is not possible, consider the cost of the schedule as infinity.

7. Among all the schedules constructed in the last two steps, choose the one of lowest cost.
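Step (6.a.ii) is a plain greedy choice; a hedged sketch using one max-heap per (size, type) pair follows (the data layout is our assumption, not part of the algorithm as stated).

```python
import heapq

# Hedged sketch of Step (6.a.ii): among the orders of the required type that
# still have an unscheduled job of the given size, always take the one of
# largest weight. Heaps store (-weight, order_id) so the heaviest pops first.
def pop_heaviest_order(heaps, size, order_type):
    heap = heaps.get((size, order_type), [])
    if not heap:
        return None                  # no matching unscheduled job remains
    _, order_id = heapq.heappop(heap)
    return order_id
```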

As in Algorithm: Localization, Steps (1) and (2) are useful for reducing the number of possible types of big orders to a constant. In Steps (3) and (4) we classify single machine schedules by defining single machine configurations. For that we define three sets. The first set, Q_ℓ, contains the possible sizes that a job belonging to a big order can take. As seen in Section 4.2, the grouping in Step (2) ensures that all of these jobs have processing time greater than ε^3(1+ε)^{(s+1)ℓ}. Also, since all jobs must be processed inside B_ℓ, we can assume that p_j ≤ (1+ε)^{(s+1)(ℓ+2)}. This justifies the definition of the set Q_ℓ. Likewise, it is easy to see that |Q_ℓ| = 2(s+1) − 3 log(ε) = O(log(1/ε)). Additionally, this implies that the set C_ℓ ⊆ {0, . . . , K}^P of possible types of big orders in U has cardinality K^{O(log(1/ε))} = O(1).

Lemma 4.11. Losing at most a factor (1+ε), we can assume that in any schedule all jobs belonging to big orders start at an instant contained in Ω_ℓ.

Proof. Note that after Step (2) all jobs belonging to big orders are big jobs. Since Stretch Completion Times produces a gap of at least ε^4(1+ε)^{(s+1)ℓ} before each of these jobs, we can move each of them to the left so that its starting time is (1+ε)^{(s+1)ℓ} + iε^4(1+ε)^{(s+1)ℓ} for some i = 0, . . . , ⌈((1+ε)^{2(s+1)} − 1)/ε^4⌉.

With this it is easy to see that the cardinality of S is K^{O(log(1/ε)/ε^6)}/ε^8 = K^{O(log(1/ε)/ε^6)} = O(1). Therefore, by construction, the cardinality of the set M of possible parallel machine configurations is at most (m+1)^{K^{O(log(1/ε)/ε^6)}} = m^{O(1)}.

In Step (6) we enumerate over all possible parallel machine configurations and construct the schedule of smallest cost. This can easily be done by following the same argument as in previous sections: for a given order type, the order with the largest weight must be processed first. This justifies the schedule of big orders constructed in Step (6.a). Finally, following Lemma 4.8, in Step (6.b) we schedule the small orders greedily using Smith's rule.

Overall, the rounding of processing times, the rounding of starting times, and the grouping and stretching needed to successfully apply Smith's rule (see Lemma 4.8) require extra space in the block to guarantee that our enumeration and greedy processes actually find a near-optimal solution. This extra room is added at the end of Step (3) and amounts to no more than a factor (1+ε)^5. We can conclude that Algorithm: PTAS for a block gives a near-optimal schedule in block B_ℓ for the orders in U between the concise frontiers F and F′, with time moved to the right by a factor (1+ε)^5.


It is important to remark that the same cost is achieved for any permutation of machines. This is useful for reconstructing the optimal solution once the table of the dynamic program is filled. Since we are only describing the frontier in a concise manner, we do not know precisely which machine has which frontier value. A way to overcome this is to construct the schedule from right to left. First we fix the machine permutation of the outgoing concise frontier of the last block (i.e., we fix a precise outgoing frontier); with this we can compute a specific schedule of the block that complies with such a frontier and with the concise incoming frontier, using the previous PTAS. This in turn fixes a specific incoming frontier, which we use as the outgoing frontier of the previous block. We have thus proved the following.

Theorem 4.12. Algorithm: PTAS-DP is a polynomial time approximation scheme for the restricted version of P|part|∑ w_L C_L when the number of jobs per order is bounded by a constant K.

Note that, since n > m for any nontrivial instance, a straightforward calculation shows that the running time of this algorithm is

n^{K^{O(log(K/ε))}/ε^2} · m^{K^{O(log(1/ε)/ε^6)}} = n^{K^{O(log(K/ε)/ε^6)}},

which is polynomial for fixed K and ε.

4.6 Variations

In the last section we presented a PTAS for minimizing the sum of weighted completion times of orders on parallel machines when the number of jobs per order is a constant. Now we show how to bypass this assumption by instead assuming that the number of machines m is a constant independent of the input. Indeed, we will show that the exact same algorithm gives a PTAS for this case.

Theorem 4.13. Algorithm: PTAS-DP is a PTAS for Pm|part|∑ w_L C_L.

Proof. It is sufficient to notice that after applying Step (2) of Algorithm: Localization every order that is small w.r.t. its release date consists of only one job, and that every order that is big w.r.t. its release date contains only jobs bigger than ε^3 r_L. Then, since every order finishes within s intervals, no order can have more than m r_L(1+ε)^s/(ε^3 r_L) = O(m) = O(1) jobs in it.


The restricted case in which the number of orders is constant is considerably simpler, for two reasons. First, the number of possible subsets of orders is also constant, and therefore Steps (1) and (2) of Algorithm: PTAS-DP are not necessary: simply define Θ_ℓ as the power set of O. Also, the number of possible order types is constant, and therefore Algorithm: PTAS for a block takes polynomial time. Let us call this modified version Algorithm: PTAS-DP II; then:

Theorem 4.14. Algorithm: PTAS-DP II is a polynomial time approximation scheme for the restricted version of P|part|∑ w_L C_L when the number of orders is bounded by a constant C.

A simple, though careful, calculation shows that the running time of Algorithm: PTAS-DP II is O(n) · m^{C^{O(1/ε^6)}}, which is polynomial.


Chapter 5

Concluding remarks and open problems

In this work we studied the machine scheduling problem of minimizing the sum of weighted completion times of orders under different environments. In Chapter 2 we first studied some rounding techniques for the special case of minimizing makespan on unrelated machines. We showed how a very naive rounding can transform any preemptive schedule into a nonpreemptive one without increasing the makespan by more than a factor of 4. Then, we proved that this rounding method is best possible by exhibiting a family of almost tight instances.

In Chapter 3 we presented approximation algorithms for R|r_{ij}|∑ w_L C_L and its preemptive version R|r_{ij}, pmpt|∑ w_L C_L. Both algorithms are based on linear programming relaxations, and use, among other things, a rounding technique very similar to the one developed in the previous chapter for the makespan case. Even though these are the only constant factor approximation algorithms known for these problems, the large approximation factor leaves several questions open in terms of the approximability of each of them. First, we may ask whether the roundings used in the algorithms can be improved. At first glance, what seems most feasible to improve is the naive trimming of the y values (Step 2 of Algorithm: Greedy Preemptive LP and of Algorithm: Greedy-LP). Although not a proof, we showed in Chapter 2 that truncating the variables in a very similar way is best possible for the special case of minimizing makespan. This suggests that this step cannot be significantly improved in the more complex algorithms Greedy Preemptive LP and Greedy-LP. To reach a more precise conclusion, it would be interesting to find tight instances for the polyhedra used in each of the algorithms. One possible direction would be to generalize the family of almost tight instances shown in Section 2.3, although it is not clear how to do this.

Recall that the best known hardness result for R||∑ w_L C_L derives from the fact that it is NP-hard to approximate R||C_max within a factor better than 3/2. Considering that the algorithm given in this work achieves a performance guarantee of 27/2, it would be interesting to close this gap, or at least diminish it. Given the generality of our model, it seems easier to do this by giving a reduction specifically designed for our problem, improving the 3/2 hardness result for our case.

In Chapter 4 we gave a PTAS for P|part|∑ w_L C_L, where either the number of jobs per order, the number of orders, or the number of machines is constant. This generalizes several previously known PTASs, such as the ones for P||C_max and P||∑ w_j C_j. Thus, it would be interesting to settle whether the unrestricted case P|part|∑ w_L C_L is APX-hard. Also, in this chapter we introduced the problem of minimizing the sum of weighted midpoints of jobs on a variable-speed machine, proving that it can be solved in polynomial time by a greedy algorithm. We also briefly discussed the problem of minimizing the sum of weighted completion times on a variable-speed machine. This problem, which can be proved to be NP-hard, has no known constant factor approximation algorithm, nor a proof showing that such an algorithm cannot be obtained. Answering this question would be of great interest given the very natural settings in which this problem arises.

Finally, another possible direction for further research is to study the problem of minimizing the sum of weighted completion times of orders in an online setting. In this variant orders arrive over time and no information is known about an order before it arrives. In online problems we are interested in comparing the cost of our solution to the cost of the optimal solution in the offline setting, where all the information is known from time 0. To this end the notion of α-points (see for example [18, 5, 35, 10]) has proven useful for the problem of minimizing the sum of weighted completion times of jobs, and thus it would be interesting to study the use of this technique in our more general setting.


Bibliography

[1] F. Afrati, E. Bampis, C. Chekuri, D. Karger, C. Kenyon, S. Khanna, I. Milis, M.

Queyranne, M. Skutella, C. Stein, M. Sviridenko, 1999. “Approximation schemes for

minimizing average weighted completion time with release dates.” Proceedings of the

40th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 32–43.

[2] C. Ambühl and M. Mastrolilli, 2006. “Single Machine Precedence Constrained Scheduling is a Vertex Cover Problem”, Proceedings of the 14th Annual European Symposium on Algorithms (ESA), LNCS 4168, 28–39.

[3] C. Ambühl, M. Mastrolilli, O. Svensson, 2007. “Inapproximability Results for Sparsest Cut, Optimal Linear Arrangement, and Precedence Constrained Scheduling.” Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 329–337.

[4] C. Chekuri, R. Motwani, 1999. “Precedence constrained scheduling to minimize sum of

weighted completion times on a single machine”. Discrete Applied Mathematics, 98:29–

38.

[5] C. Chekuri, R. Motwani, B. Natarajan, C. Stein, 2001. “Approximation techniques for

average completion time scheduling”, SIAM Journal on Computing, 31:146–166.

[6] F. Chudak, D. S. Hochbaum, 1999. “A half-integral linear programming relaxation for

scheduling precedence-constrained jobs on a single machine”. Operations Research Let-

ters, 25:199–204.

[7] Z. Chen and N.G. Hall, 2001. “Supply chain scheduling: assembly systems.” Working

Paper, Department of Systems Engineering, University of Pennsylvania.


[8] Z. Chen and N.G. Hall, 2007. “Supply chain scheduling: conflict and cooperation in

assembly systems.” Operations Research, 55:1072–1089.

[9] R. W. Conway, W. L. Maxwell, and L. W. Miller, 1967. “Theory of Scheduling”, Addison-

Wesley, Reading, Mass.

[10] J. R. Correa, M. Wagner, 2005. “LP-Based Online Scheduling: From Single to Parallel

Machines”. Proceedings of the 11th Conference on Integer Programming and Combina-

torial Optimization (IPCO), 3509:196–209.

[11] S. Cook, 1971. “The complexity of theorem proving procedures”. Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, 151–158.

[12] J.R. Correa and A.S. Schulz, 2005. “Single Machine Scheduling with Precedence Con-

straints.” Mathematics of Operations Research, 30:1005–1021.

[13] M.E. Dyer and L. A. Wolsey, 1999. “Formulating the single machine sequencing problem

with release dates as a mixed integer program.” Discrete Applied Mathematics, 26:255–

270.

[14] L. Danzer, B. Grünbaum, and V. Klee, 1963. “Helly’s theorem and its relatives”. In Convexity, Proceedings of the Symposium in Pure Mathematics, 7:101–180.

[15] W.L. Eastman, S. Even, I.M. Isaacs, 1964. “Bounds for the optimal scheduling of n jobs

on m processors”, Management Science, 11:268–279.

[16] J. Eckhoff, 1993. “Helly, Radon, and Carathéodory type theorems”. In P. M. Gruber and J. M. Wills, editors, Handbook of Convex Geometry, 389–448, North-Holland, Amsterdam.

[17] M. R. Garey, D. S. Johnson, 1979. “Computers and Intractability: A Guide to the Theory of NP-completeness”. Freeman, New York.

[18] M. X. Goemans, 1997. “Improved approximation algorithms for scheduling with release

dates”. Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms,

New Orleans, 591–598.

[19] R. L. Graham, 1966. “Bounds for certain multiprocessing anomalies,” Bell Systems

Technical Journal, 45:1563–1581.


[20] R.L. Graham, E. L. Lawler, J. K. Lenstra, A. H. G. Rinnooy Kan, 1979. “Optimization

and approximation in deterministic sequencing and scheduling: a survey”. Annals of

Discrete Mathematics, 5:287–326.

[21] L. A. Hall, A. S. Schulz, D. B. Shmoys, J. Wein, 1997. “Scheduling to minimize aver-

age completion time: off-line and on-line approximation algorithms”. Mathematics of

Operations Research, 22:513–544.

[22] D. Hochbaum and D. Shmoys, 1987. “Using dual approximation algorithm for scheduling

problems: Theoretical and practical results”, Journal of the ACM, 34:144–162.

[23] H. Hoogeveen, P. Schuurman, G. J. Woeginger, 2001, “Non-approximability results for

scheduling problems with minsum criteria”, INFORMS Journal on Computing, 13:157–

168.

[24] R. M. Karp, 1972. “Reducibility Among Combinatorial Problems”. In R. E. Miller and

J. W. Thatcher, editors, Complexity of Computer Computations, Plenum, New York,

85–103.

[25] E. L. Lawler and J. Labetoulle, 1978. “On Preemptive Scheduling of Unrelated Parallel Processors by Linear Programming”. Journal of the ACM, 25:612–619.

[26] J. K. Lenstra, A. H. G. Rinnooy Kan, 1978. “Complexity of scheduling under precedence constraints”. Operations Research, 26:22–35.

[27] J. K. Lenstra, D. B. Shmoys and E. Tardos, 1990. “Approximation algorithms for scheduling unrelated parallel machines”, Mathematical Programming, 46:259–271.

[28] J. Leung, H. Li, and M. Pinedo, 2006. “Approximation algorithm for minimizing to-

tal weighted completion time of orders on identical parallel machines.” Naval Research

Logistics, 53:243–260.

[29] J. Leung, H. Li, M. Pinedo and J. Zhang, 2007. “Minimizing Total Weighted Comple-

tion Time when Scheduling Orders in a Flexible Environment with Uniform Machines.”

Information Processing Letters, 103:119–129.

[30] J. Leung, H. Li, and M. Pinedo, 2007. “Scheduling orders for multiple product types to

minimize total weighted completion time.” Discrete Applied Mathematics, 155:945–970.


[31] L. Levin, 1973. “Universal sorting problems”, Problems in Information Transmission,

9:165–266.

[32] F. Margot, M. Queyranne, Y. Wang, 2003. “Decompositions, network flows, and a prece-

dence constrained single machine scheduling problem.” Operations Research, 51:981–992.

[33] M. Queyranne, 1993. “Structure of a simple scheduling polyhedron”, Mathematical Pro-

gramming, 58:263–285.

[34] A. Schrijver, 2004. “Combinatorial Optimization”, Springer-Verlag, Germany, Volume A.

[35] A. Schulz and M. Skutella, 2002. “Scheduling unrelated machines by randomized rounding”, SIAM Journal on Discrete Mathematics, 15:450–469.

[36] A. S. Schulz and M. Skutella, 1997. “Random-based scheduling: New approximations

and LP lower bounds”. In J. Rolim, editor, Randomization and Approximation Tech-

niques in Computer Science, LNCS 1269, 119–133.

[37] D. B. Shmoys, E. Tardos, 1993. “An approximation algorithm for the generalized assignment problem”. Mathematical Programming, 62:461–474.

[38] M. Skutella, 2001. “Convex quadratic and semidefinite programming relaxations in

scheduling”, Journal of the ACM, 48:206–242.

[39] M. Skutella, 2002. “List Scheduling in Order of α-Points on a Single Machine”. In

E. Bampis, K. Jansen and C. Kenyon, editors, Efficient Approximation and Online

Algorithms, Springer-Verlag, Berlin, 250–291.

[40] M. Skutella and G. J. Woeginger, 2000. “Minimizing the total weighted completion time

on identical parallel machines,” Mathematics of Operations Research, 25:63–75.

[41] W. E. Smith, 1956. “Various optimizers for single-stage production.” Naval Research Logistics Quarterly, 3:59–66.

[42] V. Vazirani, 2001. “Approximation Algorithms”. Springer-Verlag, New York.

[43] G. J. Woeginger, 2003. “On the approximability of average completion time scheduling under precedence constraints”. Discrete Applied Mathematics, 131:237–252.


[44] J. Yang and M.E. Posner, 2005. “Scheduling parallel machines for the customer order

problem.” Journal of Scheduling, 8:49–74.
