HANDBOOK OF SCHEDULING: ALGORITHMS, MODELS, AND PERFORMANCE ANALYSIS
Joseph Y-T. Leung
FORTHCOMING TITLES
HANDBOOK OF COMPUTATIONAL MOLECULAR BIOLOGY, Srinivas Aluru
HANDBOOK OF ALGORITHMS FOR WIRELESS AND MOBILE NETWORKS AND COMPUTING, Azzedine Boukerche
DISTRIBUTED SENSOR NETWORKS, S. Sitharama Iyengar and Richard R. Brooks
SPECULATIVE EXECUTION IN HIGH PERFORMANCE COMPUTER ARCHITECTURES, David Kaeli and Pen-Chung Yew
HANDBOOK OF DATA STRUCTURES AND APPLICATIONS, Dinesh P. Mehta and Sartaj Sahni
HANDBOOK OF BIOINSPIRED ALGORITHMS AND APPLICATIONS, Stephan Olariu and Albert Y. Zomaya
HANDBOOK OF DATA MINING, Sanjay Ranka
THE PRACTICAL HANDBOOK OF INTERNET COMPUTING, Munindar P. Singh
SCALABLE AND SECURE INTERNET SERVICES AND ARCHITECTURE, Cheng-Zhong Xu
Series Editor: Sartaj Sahni
CHAPMAN & HALL/CRC COMPUTER and INFORMATION SCIENCE SERIES
CHAPMAN & HALL/CRC
Preface
Scheduling is a form of decision-making that plays an important role in many disciplines. It is concerned with the allocation of scarce resources to activities with the objective of optimizing one or more performance measures. Depending on the situation, resources and activities can take on many different forms. Resources may be nurses in a hospital, bus drivers, machines in an assembly plant, CPUs, mechanics in an automobile repair shop, etc. Activities may be operations in a manufacturing process, duties of nurses in a hospital, executions of computer programs, car repairs in an automobile repair shop, and so on. There are also many different performance measures to optimize. One objective may be the minimization of the mean flow time, while another objective may be the minimization of the number of jobs completed after their due dates.
Scheduling has been studied intensively for more than 50 years, by researchers in management, industrial engineering, operations research, and computer science. There is now an astounding body of knowledge in this field. This book is the first handbook on scheduling. It is intended to provide comprehensive coverage of the most advanced and timely topics in scheduling. A major goal of this project is to bring together researchers in the above disciplines in order to facilitate cross-fertilization. The authors and topics chosen cut across all these disciplines.
I would like to thank Sartaj Sahni for inviting me to edit this handbook. I am grateful to all the authors and co-authors (more than 90 in total) who took time from their busy schedules to contribute to this handbook. Without their efforts, this handbook would not have been possible. Edmund Burke and Michael Pinedo have given me valuable advice in picking topics and authors. Helena Redshaw and Jessica Vakili at CRC Press have done a superb job in managing the project.
I would like to thank Ed Coffman for teaching me scheduling theory when I was a graduate student at Penn State. My wife, Maria, gave me encouragement and strong support for this project.
This work was supported in part by the Federal Aviation
Administration (FAA) and in part by the
National Science Foundation (NSF). Findings contained herein are
not necessarily those of the FAA or
NSF.
The Editor
Joseph Y-T. Leung, Ph.D., is Distinguished Professor of Computer Science at the New Jersey Institute of Technology. He received his B.A. in Mathematics from Southern Illinois University at Carbondale and his Ph.D. in Computer Science from the Pennsylvania State University. Since receiving his Ph.D., he has taught at Virginia Tech, Northwestern University, the University of Texas at Dallas, the University of Nebraska at Lincoln, and the New Jersey Institute of Technology. He has been chairman at the University of Nebraska at Lincoln and the New Jersey Institute of Technology.
Dr. Leung is a member of ACM and a senior member of IEEE. His research interests include scheduling theory, computational complexity, discrete optimization, real-time systems, and operating systems. His
science, industrial engineering, operations research, and management science so that cross-fertilization can be facilitated. The authors and topics chosen cut across all of these disciplines.
1.2 Overview of the Book
The book comprises six major parts, each of which has several
chapters.
Part I presents introductory materials and notation. Chapter 1 gives an overview of the book and the α|β|γ notation for classical scheduling problems. Chapter 2 is a tutorial on complexity theory. It is included for those readers who are unfamiliar with the theory of NP-completeness and NP-hardness. Complexity theory plays an important role in scheduling theory. Anyone who wants to engage in theoretical scheduling research should be proficient in this topic. Chapter 3 describes some of the basic scheduling algorithms for classical scheduling problems. They include Hu's, Coffman-Graham, LPT, McNaughton's, and Muntz-Coffman algorithms for makespan minimization; SPT, Ratio, Baker's, Generalized Baker's, Smith's, and Generalized Smith's rules for the minimization of total (weighted) completion time; algorithms for dual objectives (makespan and total completion time); EDD, Lawler's, and Horn's algorithms for the minimization of maximum lateness; the Hodgson-Moore algorithm for minimizing the number of late jobs; and Lawler's pseudo-polynomial algorithm for minimizing the total tardiness.
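Two of the single-machine rules listed above are simple enough to sketch here. The following Python sketch (illustrative code, not taken from the handbook; function names are ours) implements the SPT rule, which minimizes total completion time on one machine, and the Hodgson-Moore algorithm, which maximizes the number of on-time jobs:

```python
import heapq

def spt_order(p):
    """SPT rule: sequencing jobs in nondecreasing processing time
    minimizes the total completion time on a single machine."""
    return sorted(range(len(p)), key=lambda j: p[j])

def moore_hodgson(jobs):
    """Hodgson-Moore algorithm for minimizing the number of late jobs on
    one machine.  `jobs` is a list of (p_j, d_j) pairs; returns the
    maximum number of jobs that can complete on time."""
    on_time = []          # max-heap of processing times (stored negated)
    t = 0                 # completion time of the current on-time set
    for p, d in sorted(jobs, key=lambda jd: jd[1]):   # EDD order
        heapq.heappush(on_time, -p)
        t += p
        if t > d:         # some job is late: drop the longest job so far
            t += heapq.heappop(on_time)   # pops -max(p), so t decreases
    return len(on_time)

jobs = [(2, 3), (4, 5), (3, 6), (1, 7)]   # (p_j, d_j) pairs
print(spt_order([2, 4, 3, 1]))    # job indices in SPT order: [3, 0, 2, 1]
print(moore_hodgson(jobs))        # 3 jobs can finish on time
```

Both rules are exact for their respective single-machine problems; the weighted and multi-machine variants covered in Chapter 3 require additional machinery.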
Part II is devoted to classical scheduling problems. These problems are among the first studied by scheduling theorists, and it was for them that the 3-field notation (α|β|γ) was introduced for classification.
Chapters 4 to 7 deal with job shop, flow shop, open shop, and cycle shop, respectively. Job shop problems are among the most difficult scheduling problems. One instance of a job shop with 10 machines and 10 jobs remained unsolved for a very long time. Exact solutions are obtained by enumerative search. Chapter 4 gives a concise survey of elimination rules and extensions, which are among the most powerful tools for enumerative search designed in the last two decades.
Hybrid flow shops are flow shops where
each stage consists of parallel and identical
machines. Chapter 5 describes a number of
approximation
algorithms for two-stage flexible hybrid flow shops with
the objective of minimizing the makespan. Open
shops are like flow shops, except that the order of processing
on the various machines is immaterial.
Chapter 6 discusses the complexity of generating exact and approximate solutions for both nonpreemptive and preemptive schedules, under several classical objective
functions. Cycle shops are like job shops,
except that each job passes through the same route on the
machines. Chapter 7 gives polynomial-time
and pseudo-polynomial algorithms for cycle shops, as well as
NP-hardness results and approximation
algorithms.
Chapter 8 shows a connection between an NP-hard preemptive scheduling problem on parallel and identical machines and the corresponding problem in a job shop or open shop environment for a set of
chains of equal-processing-time jobs. The author shows that a
number of NP-hardness proofs for parallel
and identical machines can be used to show the NP-hardness of the
corresponding problem in a job shop
or open shop.
Chapters 9 to 13 cover the five major objective functions in
classical scheduling theory: makespan,
maximum lateness, total weighted completion time, total weighted
number of late jobs, and total weighted
tardiness. Chapter 9 discusses the makespan objective on parallel and identical machines. The author presents polynomial solvability and approximability results, enumerative algorithms, and polynomial-time approximations under this framework. Chapter 10 deals with
the topic of minimizing maximum lateness
on parallel and identical machines. Complexity results and exact
and approximation algorithms are given
for nonpreemptive and preemptive jobs, as well as jobs with
precedence constraints. Chapter 11 gives a
comprehensive review of recently developed approximation algorithms
and approximation schemes for
minimizing the total weighted completion time on parallel and
identical machines. The model includes
jobs with release dates and/or precedence
constraints. Chapter 12 gives a survey of the problem of minimizing the total weighted number of late jobs. The chapter
concentrates mostly on exact algorithms and
their correctness proofs. Total tardiness is among the most difficult objective functions to solve, even
for a single machine. Chapter 13 gives branch-and-bound
algorithms for minimizing the total weighted
tardiness on one machine, where jobs are nonpreemptible and have
release dates (but not precedence
constraints).
Many NP-hard scheduling problems become solvable in polynomial time
when the jobs have identical
processing times. Chapter 14 gives polynomial-time algorithms for several of these cases, concentrating on single-machine as well as parallel and identical machine environments.
The scheduling problems dealt with in the above-mentioned chapters are all offline deterministic scheduling problems. This means that the jobs' characteristics are known to the decision maker before a schedule is constructed. In contrast, online scheduling restricts the decision maker to scheduling jobs based on the currently available information. In particular, the jobs' characteristics are not known until they arrive.
Chapter 15 surveys the literature in online scheduling.
A number of approximation algorithms for scheduling problems have
been developed that are based
on linear programming. The basic idea is to formulate the
scheduling problem as an integer programming
problem, solve the underlying linear programming relaxation to
obtain an optimal fractional solution,
and then round the fractional solution to a feasible integer
solution in such a way that the error can be
bounded. Chapter 16 describes this technique as applied
to the problem of minimizing the total weighted
completion time on unrelated machines.
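As a toy illustration of the final rounding step (a sketch only; the algorithms surveyed in Chapter 16 use much more careful rounding to obtain provable error bounds), suppose an LP relaxation has already produced a fractional assignment of jobs to machines:

```python
def round_fractional_assignment(x):
    """Given a fractional assignment x[j][i] in [0, 1] with the fractions
    of each job j summing to 1 (as an LP relaxation might produce),
    assign each job to the machine carrying its largest fraction.
    This is the simplest conceivable rounding rule."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in x]

# Hypothetical fractional solution for 3 jobs on 2 machines
x = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]
print(round_fractional_assignment(x))   # [0, 1, 0]
```

The interesting part of the LP-based algorithms lies in choosing a rounding rule for which the resulting integer solution is provably close to the fractional optimum.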
Part III is devoted to scheduling models that are different
from the classical scheduling models. Some of
these problems come from applications in computer science and some
from the operations research and
management community.
Chapter 17 discusses the master-slave scheduling model. In this model, each job consists of three stages that are processed in the same order: preprocessing, slave processing, and postprocessing. The preprocessing
and postprocessing of a job are done on a master machine (which is
limited in quantity), while the slave
processing is done on a slave machine (which is unlimited in
quantity). Chapter 17 gives NP-hardness
results, polynomial-time algorithms, and approximation algorithms
for makespan minimization.
Local area networks (LAN) and wide area networks (WAN) have been
the two most studied networks in
the literature. With the proliferation of hand-held computers, Bluetooth networks are gaining importance. Bluetooth networks span an even shorter range than LANs. Chapter 18 discusses
scheduling problems that arise in Bluetooth networks.
Suppose a manufacturer needs to produce d_i units of a certain product for customer i, 1 ≤ i ≤ n. Assume that each unit takes one unit of time to produce. The total time taken to satisfy all customers is D = Σ d_i. If we produce all units for a customer before we produce for the next customer, then the last customer will have to wait for a long time. Fair sequences are those such that each customer i would ideally receive (d_i/D)t units by time t. Chapter 19 gives a review of fair sequences. Note that fair sequences are related to Pfair scheduling in Chapter 27 and fair scheduling of real-time tasks on multiprocessors in Chapter 30.
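The notion of fairness above can be made concrete with a small sketch (illustrative code, not from the chapter): given the demands d_i and a production sequence, measure how far actual cumulative production drifts from the ideal rate (d_i/D)t:

```python
def max_unfairness(demands, sequence):
    """For a production sequence (one customer index per unit-time slot),
    return the largest deviation |produced_i(t) - (d_i/D)*t| over all
    customers i and all times t.  Smaller values mean a fairer sequence."""
    D = sum(demands)
    produced = [0] * len(demands)
    worst = 0.0
    for t, i in enumerate(sequence, start=1):
        produced[i] += 1
        for j, d in enumerate(demands):
            worst = max(worst, abs(produced[j] - d * t / D))
    return worst

# Two customers with demands 2 and 2 (D = 4): alternating is fairer
print(max_unfairness([2, 2], [0, 1, 0, 1]))   # 0.5
print(max_unfairness([2, 2], [0, 0, 1, 1]))   # 1.0
```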
In scheduling problems with due date-related objectives, the due
date of a job is given a priori and the
scheduler needs to schedule jobs with the given due dates. In
modern day manufacturing operations, the
manufacturer can negotiate due dates with customers. If the due
date is too short, the manufacturer runs
the risk of missing the due date. On the other hand, if the due date is too long, the manufacturer runs the risk of losing the customer. Thus, due date assignment and scheduling should be integrated to make
better decisions. Chapters 20 and 21 discuss
due date assignment problems.
In classical scheduling problems, machines are assumed to be
continuously available for processing. In
practice, machines may become unavailable for processing due to
maintenance or breakdowns. Chapter 22
describes scheduling problems with availability constraints,
concentrating on NP-hardness results and
approximation algorithms.
So far we have assumed that a job only needs a machine for
processing without any additional resources.
For certain applications, we may need additional resources, such as disk drives, memory, and tape drives. Chapters 23 and 24 present scheduling
problems with resource constraints. Chapter 23 discusses
discrete resources, while Chapter 24 discusses continuous
resources.
In classical scheduling theory, we assume that each job is
processed by one machine at a time. With the
advent of parallel algorithms, this assumption is no longer valid.
It is now possible to process a job with
several machines simultaneously so as to reduce the time needed to
complete the job. Chapters 25 and 26
deal with this model. Chapter 25 gives complexity results
and exact algorithms, while Chapter 26 presents
approximation algorithms.
Part IV is devoted to scheduling problems that arise in
real-time systems. Real-time systems are those
that control real-time processes. As such, the primary concern is
to meet hard deadline constraints, while
the secondary concern is to maximize machine utilization. Real-time
systems will be even more important
in the future, as computers are used more often to control our
daily appliances.
Chapter 27 surveys the pinwheel scheduling problem, which is motivated by the following application. Suppose we have n satellites and one receiver in a ground station. When satellite j wants to send information to the ground, it will repeatedly send the same information in a_j consecutive time slots, after which it will cease to send that piece of information. The receiver in the ground station must reserve one time slot for satellite j during those a_j consecutive time slots, or else the information is lost. Information is sent by the satellites dynamically. How do we schedule the receiver to serve the n satellites so that no information is ever lost? The question is equivalent to the following: Is it possible to write an infinite sequence of integers, drawn from the set {1, 2, . . . , n}, so that each integer j, 1 ≤ j ≤ n, appears at least once in any a_j consecutive positions? The answer, of course, depends on the values of a_j. Sufficient conditions and algorithms to construct a schedule are presented in Chapter 27.
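The sequence formulation lends itself to a direct check. The sketch below (illustrative code, assuming 0-based satellite indices) verifies whether repeating a finite period forever satisfies every window constraint; j appears in every window of a_j consecutive slots exactly when the cyclic gap between consecutive occurrences of j never exceeds a_j:

```python
def is_pinwheel_schedule(a, period):
    """Check whether repeating `period` forever serves every satellite j
    in time: j must appear in every window of a[j] consecutive slots,
    i.e., the cyclic gap between occurrences of j is at most a[j]."""
    L = len(period)
    for j, aj in enumerate(a):
        pos = [t for t, s in enumerate(period) if s == j]
        if not pos:
            return False
        gaps = [pos[k + 1] - pos[k] for k in range(len(pos) - 1)]
        gaps.append(pos[0] + L - pos[-1])   # wrap-around gap
        if max(gaps) > aj:
            return False
    return True

# Two satellites with a = [2, 2]: alternating works, but [0, 0, 1] fails
print(is_pinwheel_schedule([2, 2], [0, 1]))      # True
print(is_pinwheel_schedule([2, 2], [0, 0, 1]))   # False
```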
In the last two decades, a lot of attention has been paid to the following scheduling problem. There are n periodic, real-time jobs. Each job i has an initial start time s_i, a computation time c_i, a relative deadline d_i, and a period p_i. Job i initially makes a request for execution at time s_i, and thereafter at times s_i + kp_i, k = 1, 2, . . . . Each request for execution requires c_i time units, and it must finish its execution within d_i time units from the time the request is made. Given m ≥ 1 machines, is it possible to schedule the requests of these jobs so that the deadline of each request is met? Chapter 28 surveys the current state of the art of this scheduling problem.
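For intuition, here is a minimal feasibility check for the special case m = 1 with integer parameters, s_i = 0, and d_i ≤ p_i, simulating the earliest-deadline-first (EDF) policy in unit time steps (a sketch under those assumptions only; Chapter 28 treats the general problem):

```python
from math import lcm

def edf_feasible(tasks):
    """Preemptive EDF simulation on one machine in unit time steps.
    `tasks` is a list of (c_i, d_i, p_i): integer computation time,
    relative deadline, and period, with every task first released at
    time 0.  For such synchronous sets with d_i <= p_i, simulating one
    hyperperiod (the lcm of the periods) suffices."""
    horizon = lcm(*(p for _, _, p in tasks))
    pending = []                          # [absolute_deadline, remaining]
    for t in range(horizon):
        for c, d, p in tasks:
            if t % p == 0:                # new request released at time t
                pending.append([t + d, c])
        pending.sort()                    # earliest deadline first
        if pending:
            pending[0][1] -= 1            # run that job for one time unit
            if pending[0][1] == 0:
                del pending[0]
        if any(dl <= t + 1 for dl, _ in pending):
            return False                  # an unfinished job missed its deadline
    return not pending

print(edf_feasible([(1, 2, 2), (1, 3, 3)]))   # True
print(edf_feasible([(1, 1, 2), (1, 1, 2)]))   # False
```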
Chapter 29 discusses an important issue in the scheduling of periodic, real-time jobs: a high-priority job can be blocked by a low-priority job due to priority inversion. This can occur when a low-priority job gains access to shared data, which will not be released by the job until it is finished; in other words, the low-priority job cannot be preempted while it is holding the shared data. Chapter 29 discusses some solutions to this problem.
Chapter 30 presents Pfair scheduling algorithms for real-time
jobs. Pfair algorithms produce schedules
in which jobs are executed at a steady rate. This is similar to
fair sequences in Chapter 19, except that the
jobs are periodic, real-time jobs.
Chapter 31 discusses several approaches in scheduling
periodic, real-time jobs on parallel and identical
machines. One possibility is to partition the jobs so that each
partition is assigned to a single machine.
Another possibility is to treat the machines as a pool and allocate machines on demand. Chapter 31 compares the approaches in terms of the effectiveness of the optimal algorithms under each approach.
Chapter 32 describes several approximation algorithms for
partitioning a set of periodic, real-time jobs
into a minimum number of partitions so that each partition can be
feasibly scheduled on one machine.
Worst-case analyses of these algorithms are also presented.
When a real-time system is overloaded, some time-critical jobs will surely miss their deadlines. Assuming that each time-critical job will earn a value if it is completed on time, how do we maximize the total value? Chapter 33 presents several algorithms, analyzes their competitive ratios, and gives lower bounds on the achievable competitive ratios. Note that this problem is equivalent to online scheduling of independent jobs with the goal of minimizing the weighted number of late jobs.
One way to cope with an overloaded system is to completely abandon
a job that cannot meet its deadline.
Another way is to execute less of each job with the hope that more
jobs can meet their deadlines. This model
is called the imprecise computation model. In this model each job i has a minimum execution time min_i and a maximum execution time max_i, and the job is expected to execute α_i time units, min_i ≤ α_i ≤ max_i. If job i executes less than max_i time units, then it incurs a cost equal to max_i − α_i. The objective is to find a schedule that minimizes the total (weighted) cost or the maximum (weighted) cost. Chapter 34 presents algorithms that minimize total weighted cost, and Chapter 35 presents algorithms that minimize maximum weighted cost as well as dual criteria (total weighted cost and maximum weighted cost). Chapter 36 studies the same problem with arbitrary cost functions. It is noted there that this problem has some connections with power-aware scheduling.
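For concreteness, a small sketch (illustrative names, not from the chapters) that evaluates the two cost criteria for a given allocation α:

```python
def imprecise_costs(jobs, alloc, weights=None):
    """Imprecise-computation costs: job i runs alpha_i time units with
    min_i <= alpha_i <= max_i and incurs cost max_i - alpha_i.  Returns
    (total weighted cost, maximum weighted cost) for the allocation."""
    weights = weights or [1] * len(jobs)
    costs = []
    for (lo, hi), a, w in zip(jobs, alloc, weights):
        assert lo <= a <= hi, "allocation outside [min_i, max_i]"
        costs.append(w * (hi - a))
    return sum(costs), max(costs)

# Three jobs with (min_i, max_i) bounds and a feasible allocation alpha
jobs = [(1, 4), (2, 5), (0, 3)]
alloc = [3, 5, 1]
print(imprecise_costs(jobs, alloc))   # (3, 2)
```

The algorithmic question treated in Chapters 34 to 36 is, of course, choosing the allocation and the schedule together so that these costs are minimized.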
Chapter 37 presents routing problems of real-time messages on a network. A set of n messages reside at various nodes in the network. Each message M_i has a release time r_i and a deadline d_i. The message is to be routed from its origin node to its destination node. Both online and offline routing are discussed. NP-hardness results and optimal algorithms are presented.
Part V is devoted to stochastic scheduling and queueing
networks. The chapters in this part differ from
the previous chapters in that the characteristics of the jobs (such
as processing times and arrival times) are
not deterministic; instead, they are governed by some probability
distribution functions.
Chapter 38 compares the three classes of scheduling: offline deterministic scheduling, stochastic scheduling, and online deterministic scheduling. The author points out the similarities and differences among these three classes.
Chapter 39 deals with the earliness and tardiness penalties.
In Just-in-Time (JIT) systems, a job should
be completed close to its due date. In other words, a job should
not be completed too early or too late. This
is particularly important for products that are perishable, such as
fresh vegetables and fish. Harvesting is
another activity that should be completed close to its due date.
The authors study this problem in the stochastic setting, comparing the results with their deterministic counterparts.
The methods to solve queueing network problems can be classified into exact solution methods and approximate solution methods. Chapter 40 reviews the latest developments in queueing networks with exact solutions. The author presents sufficient conditions for the network to possess a product-form solution, and in some cases necessary conditions are also presented.
Chapter 41 studies disk scheduling problems. Magnetic disks are based on technology developed 50 years ago. There have been tremendous advances in magnetic recording density, resulting in disks whose capacity is several hundred gigabytes, but the mechanical nature of disk access remains a serious bottleneck. This chapter presents scheduling techniques to improve the performance of disk access.
The Internet has become an indispensable part of our life. Millions of messages are sent over the Internet every day. Globally managing traffic in such a large-scale communication network is almost impossible. In the absence of global control, it is typically assumed in traffic modeling that the network users follow the most rational approach; i.e., they behave selfishly to optimize their own individual welfare. Under these assumptions, the routing process should arrive at a Nash equilibrium. It is well known that Nash equilibria do not always optimize the overall performance of the system. Chapter 42 reviews the analysis of the coordination ratio, which is the ratio between the cost of the worst possible Nash equilibrium and the overall optimum.
Part VI is devoted to applications. There are chapters
that discuss scheduling problems that arise in the
airline industry, process industry, hospitals, transportation
industry, and educational institutions.
Suppose you are running a professional training firm.
Your firm offers a set of training programs, with
each program yielding a different payoff. Each employee can teach a
subset of the training programs.
Client requests arrive dynamically, and the firm must decide
whether to accept the request, and if so
which instructor to assign to the training program(s). The goal of
the decision maker is to maximize
the expected payoff by intelligently utilizing the limited
resources to meet the stochastic demand for the
training programs. Chapter 43 describes a formulation of
this problem as a stochastic dynamic program
and proposes solution methods for some special cases.
Constructing timetables of work for personnel in healthcare institutions is a highly constrained and difficult problem to solve. Chapter 44 presents an overview of the algorithms that underpin a commercial nurse rostering decision support system that is in use in over 40 hospitals in Belgium.
University timetabling problems can be classified into two main
categories: course and examination
timetabling. Chapter 45 discusses the constraints for each of them
and provides an overview of some recent
research advances made by the authors and members of their research
team.
Chapter 46 describes a solution method for assigning teachers
to classes. The authors have developed a
system (GATES) that schedules incoming and outgoing
airline flights to gates at the JFK airport in New
York City. Using the GATES framework, the authors extend its application to the new domain of assigning teachers to classes.
Chapter 47 provides an introduction to constraint programming
(CP), focusing on its application to
production scheduling. The authors provide several examples of
classes of scheduling problems that lend themselves to this approach and that are either impossible or clumsy to formulate using conventional Operations Research methods.
Chapter 48 discusses batch scheduling problems in the process
industry (e.g., chemical, pharmaceutical,
or metal casting industries), which consist of scheduling batches
on processing units (e.g., reactors, heaters,
dryers, filters, or agitators) such that a time-based
objective function (e.g., makespan, maximum lateness,
or weighted earliness plus tardiness) is minimized.
The classical vehicle routing problem is known to be NP-hard. Many
different heuristics have been
proposed in the past. Chapter 49 surveys most of these
methods and proposes a new heuristic, called Very
Large Scale Neighborhood Search, for the problem. Computational
tests indicate that the proposed heuristic
is competitive with the best local search methods.
Being in a time-sensitive and mission-critical business, the airline industry runs into all sorts of scheduling problems. Chapter 50 discusses the challenges posed by aircraft scheduling,
crew scheduling, manpower scheduling, and other long-term business
planning and real-time operational
problems that involve scheduling.
Chapter 51 discusses bus and train driver scheduling. Driver wages represent a large percentage (about 45 percent for the bus sector in the U.K.) of the running costs of transport operations. Efficient scheduling of drivers is vital to the survival of transport operators. This
chapter describes several approaches that have
been successful in solving these problems.
Sports scheduling is interesting from both a practical and
theoretical standpoint. Chapter 52 surveys the
current body of sports scheduling literature covering a period of
time from the early 1970s to the present
day. While the emphasis is on the Single Round Robin Tournament Problem and the Double Round Robin Tournament Problem, the chapter also discusses the Balanced Tournament Design Problem and the Bipartite Tournament Problem.
1.3 Notation
In all of the scheduling problems considered in this book, the number of jobs (n) and the number of machines (m) are assumed to be finite. Usually, the subscript j refers to a job and the subscript i refers to a machine. The following data are associated with job j:
Processing Time (p_ij) — If job j requires processing on machine i, then p_ij represents the processing time of job j on machine i. The subscript i is omitted if job j is only to be processed on one machine (any machine).
Release Date (r_j) — The release date r_j of job j is the time the job arrives at the system, which is the earliest time at which job j can start its processing.
Due Date (d_j) — The due date d_j of job j represents the date the job is expected to complete. Completion of a job after its due date is allowed, but it will incur a cost.
Deadline (d̄_j) — The deadline d̄_j of job j represents the hard deadline that the job must respect; i.e., job j must be completed by d̄_j.
Weight (w_j) — The weight w_j of job j reflects the importance of the job.
Graham et al. [12] introduced the α|β|γ notation to classify scheduling problems. The α field describes the machine environment and contains a single entry. The β field provides details of job characteristics and scheduling constraints. It may contain multiple entries or no entry at all. The γ field contains the objective function to optimize. It usually contains a single entry.
The possible machine environments in the α field are as follows:
Single Machine (1) — There is only one machine in the system. This case is a special case of all other more complicated machine environments.
Parallel and Identical Machines (Pm) — There are m identical machines in parallel. In the remainder of this section, if m is omitted, the number of machines is arbitrary; i.e., the number of machines will be specified as a parameter in the input. Each job j requires a single operation and may be processed on any one of the m machines.
Uniform Machines (Qm) — There are m machines in parallel, but the machines have different speeds. Machine i, 1 ≤ i ≤ m, has speed s_i. The time p_ij that job j spends on machine i is equal to p_j/s_i, assuming that job j is completely processed on machine i.
Unrelated Machines (Rm) — There are m machines in parallel, but each machine can process the jobs at a different speed. Machine i can process job j at speed s_ij. The time p_ij that job j spends on machine i is equal to p_j/s_ij, assuming that job j is completely processed on machine i.
Job Shop (Jm) — In a job shop with m machines, each job has its own predetermined route to follow. It may visit some machines more than once and it may not visit some machines at all.
Flow Shop (Fm) — In a flow shop with m machines, the machines are linearly ordered and the jobs all follow the same route (from the first machine to the last machine).
Open Shop (Om) — In an open shop with m machines, each job needs to be processed exactly once on each of the machines. But the order of processing is immaterial.
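The differences among the parallel-machine environments amount to how the processing time p_ij is derived from the job and machine data, which a short sketch makes explicit (illustrative code; "P", "Q", and "R" follow the definitions above):

```python
def processing_time(env, p, s, i, j):
    """Time job j spends on machine i under the parallel-machine
    environments of the alpha field.  p[j] is the base processing time;
    s is the speed data: a list s[i] for Q (uniform machines), a matrix
    s[i][j] for R (unrelated machines), and ignored for P (identical)."""
    if env == "P":
        return p[j]
    if env == "Q":
        return p[j] / s[i]
    if env == "R":
        return p[j] / s[i][j]
    raise ValueError("unknown machine environment")

p = [6, 8]
print(processing_time("P", p, None, 0, 1))              # 8
print(processing_time("Q", p, [1, 2], 1, 0))            # 3.0
print(processing_time("R", p, [[1, 2], [3, 4]], 1, 1))  # 2.0
```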
The job characteristics and scheduling constraints specified in the β field may contain multiple entries. The possible entries are β1, β2, β3, β4, β5, β6, β7, β8.
Preemptions (pmtn) — Jobs can be preempted and later resumed, possibly on a different machine. If preemptions are allowed, pmtn is included in the β field; otherwise, it is not included in the β field.
No-Wait (nwt) — The no-wait constraint is for flow shops only. Jobs are not allowed to wait between two successive machines. If nwt is not specified in the β field, waiting is allowed between two successive machines.
Precedence Constraints (prec) — The precedence constraints specify the scheduling constraints of the jobs, in the sense that certain jobs must be completed before certain other jobs can start processing. The most general form of precedence constraints, denoted by prec, is represented by a directed acyclic graph, where each vertex represents a job, and job i precedes job j if there is a directed arc from i to j. If each job has at most one predecessor and at most one successor, the constraints are referred to as chains. If each job has at most one successor, the constraints are referred to as an intree. If each job has at most one predecessor, the constraints are referred to as an outtree. If prec is not specified in the β field, the jobs are not subject to precedence constraints.
Release Dates (r_j) — The release date r_j of job j is the earliest time at which job j can begin processing. If this symbol is not present, then the processing of job j may start at any time.
Restrictions on the Number of Jobs (nbr) — If this symbol is present, then the number of jobs is restricted; e.g., nbr = 5 means that there are at most five jobs to be processed. If this symbol is not present, then the number of jobs is unrestricted and is given as an input parameter n.
Restrictions on the Number of Operations in Jobs (n_j) — This subfield is only applicable to job shops. If this symbol is present, then the number of operations of each job is restricted; e.g., n_j = 4 means that each job is limited to at most four operations. If this symbol is not present, then the number of operations is unrestricted.
Restrictions on the Processing Times (p_j) — If this symbol is present, then the processing time of each job is restricted; e.g., p_j = p means that each job's processing time is p units. If this symbol is not present, then the processing times are not restricted.
Deadlines (d̄_j) — If this symbol is present, then each job j must be completed by its deadline d̄_j. If the symbol is not present, then the jobs are not subject to deadline constraints.
The objective to be minimized is always a function of the completion times of the jobs. With respect to a schedule, let C_j denote the completion time of job j. The lateness of job j is defined as
L_j = C_j − d_j
The tardiness of job j is defined as
T_j = max(L_j, 0)
The unit penalty of job j is defined as U_j = 1 if C_j > d_j; otherwise, U_j = 0.
The objective functions to be minimized are as follows:
Makespan (C_max) — The makespan is defined as max(C_1, . . . , C_n).
Maximum Lateness (L_max) — The maximum lateness is defined as max(L_1, . . . , L_n).
Total Weighted Completion Time (Σ w_j C_j) — The total (unweighted) completion time is denoted by Σ C_j.
Total Weighted Tardiness (Σ w_j T_j) — The total (unweighted) tardiness is denoted by Σ T_j.
Weighted Number of Tardy Jobs (Σ w_j U_j) — The total (unweighted) number of tardy jobs is denoted by Σ U_j.
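These definitions can be collected into a few lines of code (a sketch; the objective names used as dictionary keys are ad hoc):

```python
def objectives(C, d, w):
    """Compute the five classical objectives from completion times C,
    due dates d, and weights w (lists indexed by job)."""
    L = [c - dd for c, dd in zip(C, d)]              # lateness L_j
    T = [max(l, 0) for l in L]                       # tardiness T_j
    U = [1 if c > dd else 0 for c, dd in zip(C, d)]  # unit penalty U_j
    return {
        "Cmax": max(C),
        "Lmax": max(L),
        "sum wC": sum(wj * c for wj, c in zip(w, C)),
        "sum wT": sum(wj * t for wj, t in zip(w, T)),
        "sum wU": sum(wj * u for wj, u in zip(w, U)),
    }

print(objectives(C=[3, 7, 10], d=[4, 6, 9], w=[1, 2, 1]))
```

For this three-job example the makespan is 10, the maximum lateness is 1, and the weighted sums are 27, 3, and 3, respectively.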
Acknowledgment
This work is supported in part by the NSF Grant DMI-0300156.
fixed amount of information from external media into memory.
Computers cannot, in one step, add two
vectors of numbers, where the dimension of the vectors is
unbounded. To add two vectors of numbers
with dimension n, we need n basic steps to
accomplish this.
We measure the running time of an algorithm as a function of the
size of the input. This is reasonable
since we expect the algorithm to take longer time when the input
size grows larger. Let us illustrate the
process of analyzing the running time of an algorithm by means of a
simple example. Shown below is an
algorithm that implements bubble sort. Step 1 reads n, the number of numbers to be sorted, and Step 2 reads the n numbers into the array A. Steps 3 to 5 sort the numbers in ascending order. Finally, Step 6 prints the numbers in sorted order.
2.2.1 Bubble Sort
1. Read n;
2. For i = 1 to n do { Read A(i); }
3. For i = 1 to n − 1 do {
4.   For j = 1 to n − i do {
5.     If A(j) > A(j + 1) then swap A(j) and A(j + 1);
   }
}
6. For i = 1 to n do { Print A(i); }
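For readers who prefer running code, the listing above can be transcribed directly into Python (the function name is ours; Python indexing is 0-based, so the loop bounds shift accordingly):

```python
def bubble_sort(a):
    """Python transcription of the pseudocode above: Steps 3-5
    repeatedly compare adjacent elements and swap them when they
    are out of order."""
    a = list(a)                    # work on a copy
    n = len(a)
    for i in range(1, n):          # Step 3: passes i = 1, ..., n-1
        for j in range(n - i):     # Step 4: n - i comparisons per pass
            if a[j] > a[j + 1]:    # Step 5: compare and swap
                a[j], a[j + 1] = a[j + 1], a[j]
    return a
```

After pass i, the i largest elements occupy their final positions at the end of the array.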
Step 1 takes c_1 basic steps, where c_1 is a constant that is dependent on the machine, but independent of the input size. Step 2 takes c_2·n basic steps, where c_2 is a constant dependent on the machine only. Step 5 takes c_3 basic steps each time it is executed, where c_3 is a constant dependent on the machine only. However, Step 5 is nested inside a double loop given by Steps 3 and 4. We can calculate the number of times Step 5 is executed as follows. The outer loop in Step 3 is executed n − 1 times. In the i th iteration of Step 3, Step 4 is executed exactly n − i times. Thus, the number of times Step 5 is executed is

Σ_{i=1}^{n−1} (n − i) = n(n − 1)/2

Therefore, Step 5 takes a total of c_3·n(n − 1)/2 basic steps. Finally, Step 6 takes c_4·n basic steps, where c_4 is a constant dependent on the machine only. Adding them together, the running time of the algorithm, T(n), is

T(n) = c_1 + c_2·n + c_3·n(n − 1)/2 + c_4·n
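The double-loop count can be checked empirically. This short sketch (function name ours) tallies how often Step 5 executes and compares the tally against the closed form n(n − 1)/2:

```python
def step5_count(n):
    """Count how many times Step 5 of the bubble sort executes."""
    count = 0
    for i in range(1, n):    # outer loop: i = 1, ..., n-1
        count += n - i       # inner loop runs n - i times in pass i
    return count

# For every n this equals the closed form n(n - 1)/2.
```

For example, step5_count(10) gives 9 + 8 + ... + 1 = 45 = 10·9/2.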
In practice, it is not necessary, or desirable, to get such a detailed function for T(n). Very often, we are only interested in the growth rate of T(n). We can see from the above function that T(n) is dominated by the n^2 term. Thus, we say that T(n) is O(n^2), ignoring the constants and any terms that grow slower than n^2. Formally, we say that a function f(n) is O(g(n)) if there are constants c and n_0 such that f(n) ≤ c·g(n) for all n ≥ n_0.
In the remainder of this chapter, we will be talking about the running time of an algorithm in terms of its growth rate O(·) only. Suppose an algorithm A has running time T(n) = O(g(n)). We say that A is a polynomial-time algorithm if g(n) is a polynomial function of n; otherwise, it is an exponential-time algorithm. For example, if T(n) = O(n^100), then A is a polynomial-time algorithm. On the other hand, if T(n) = O(2^n), then A is an exponential-time algorithm.
Since exponential functions grow much faster than polynomial functions, it is clearly more desirable to have polynomial-time algorithms than exponential-time algorithms. Indeed, exponential-time algorithms are impractical for even moderate input sizes. Consider an algorithm with running time T(n) = O(2^n). The fastest computer known today executes one trillion (10^12) instructions per second. If n = 100, the algorithm will take more than 30 billion years using the fastest computer! This is clearly infeasible since nobody lives long enough to see the algorithm terminate.
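The arithmetic behind this claim is easy to verify (taking 365.25 days per year):

```python
# Rough check of the claim above: 2^100 instructions executed at
# 10^12 instructions per second.
ops = 2 ** 100                           # about 1.27e30 instructions
seconds = ops / 1e12                     # about 1.27e18 seconds
years = seconds / (365.25 * 24 * 3600)   # roughly 4e10, i.e., ~40 billion years
```

Forty billion years comfortably exceeds the 30 billion quoted in the text (and, for that matter, the age of the universe).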
We say that a problem is tractable if there is a polynomial-time algorithm for it; otherwise, it is intractable. The theory of NP-hardness suggests that there is a large class of problems, namely, the NP-hard problems, that may be intractable. We emphasize the words "may be" since it is still an open question whether the NP-hard problems can be solved in polynomial time. However, there is circumstantial evidence suggesting that they are intractable. Notice that we are only making a distinction between polynomial time and exponential time. This is reasonable since exponential functions grow much faster than polynomial functions, regardless of the degree of the polynomial.
Before we leave this section, we should revisit the issue of "the size of the input." How do we define "the size of the input"? The official definition is the number of "symbols" (drawn from a fixed set of symbols) necessary to represent the input. This definition still leaves a lot of room for disagreement. Let us illustrate
this by means of the bubble sort algorithm given above. Most people
would agree that n, the number of
numbers to be sorted, should be part of the size of the input. But
what about the numbers themselves? If
we assume that each number can fit into a computer word (which
has a fixed size), then the number of
symbols necessary to represent each number is bounded above by a
constant. Under this assumption, we
can say that the size of the input is O (n). If this
assumption is not valid, then we have to take into account
the representation of the numbers. Suppose a is the magnitude
of the largest number out of the n numbers.
If we represent each number as a binary number (base 2), then we
can say that the size of the input is
O(n log a). On the other hand, if we represent each number as a
unary number (base 1), then the size of
the input becomes O(na). Thus, the size of the input can
differ greatly, depending on the assumptions you
make. Since the running time of an algorithm is a function of the
size of the input, they differ greatly as
well. In particular, a polynomial-time algorithm with respect to
one measure of the size of the input may
become an exponential-time algorithm with respect to another. For example, a polynomial-time algorithm with respect to O(na) may in fact be an exponential-time algorithm with respect to O(n log a).
In our analysis of the running time of bubble sort, we have implicitly assumed that each integer fits into a computer word. If this assumption is not valid, the running time of the algorithm should be T(n) = O(n^2 log a).
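To make the gap between the two encodings concrete, the following sketch measures both encoding lengths for a small made-up list (the helper names and the sample numbers are ours):

```python
def binary_size(nums):
    """Symbols needed to write each number in binary: O(n log a)."""
    return sum(max(1, x.bit_length()) for x in nums)

def unary_size(nums):
    """Symbols needed to write each number in unary: O(n a)."""
    return sum(nums)

nums = [10 ** 6] * 5   # five copies of a = 1,000,000 (made up)
# binary: 5 * 20 = 100 symbols; unary: 5,000,000 symbols
```

The same five numbers need 100 symbols in binary but five million in unary, so an algorithm polynomial in the unary size can easily be exponential in the binary size.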
For scheduling problems, we usually assume that the number of jobs,
n, and the number of machines,
m, should be part of the size of the input. Precedence constraints pose no problem, since there are at most O(n^2) precedence relations for n jobs. What about processing times, due dates, weights, etc.? They can be
represented by binary numbers or unary numbers, and the two
representations can affect the complexity
of the problem. As we shall see later in the chapter, there are
scheduling problems that are NP-hard with
respect to binary encodings but not unary encodings. We say that
these problems are NP-hard in the
ordinary sense. On the other hand, there are scheduling
problems that are NP-hard with respect to unary
encodings. We say that these problems are NP-hard in
the strong sense.
The above is just a rule of thumb. There are always exceptions to this rule. For example, consider the problem of scheduling a set of chains of unit-length jobs to minimize C_max. Suppose there are k chains, C_1, C_2, ..., C_k, with n_j jobs in chain C_j, 1 ≤ j ≤ k. Clearly, the number of jobs, n, is n = Σ n_j. According to the above, the size of the input should be at least proportional to n. However, some authors insist that each n_j should be encoded in binary, and hence the size of the input should be proportional to Σ log n_j. Consequently, a polynomial-time algorithm with respect to n becomes an exponential-time algorithm with respect to Σ log n_j. Thus, when we study the complexity of a problem, we should bear in mind the encoding scheme we use for the problem.
2.3 Polynomial Reduction
Central to the theory of NP-hardness is the notion of polynomial
reduction. Before we get to this topic,
we want to differentiate between decision problems and optimization
problems. Consider the following
three problems.
2.3.1 Partition
Given a list A = (a_1, a_2, ..., a_n) of n integers, can A be partitioned into A_1 and A_2 such that

Σ_{a_j ∈ A_1} a_j = Σ_{a_j ∈ A_2} a_j = (1/2) Σ a_j ?
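Partition can be decided by the classical subset-sum dynamic program, whose running time is proportional to n · Σ a_j. This is polynomial under a unary encoding but exponential in the binary encoding length, which foreshadows the ordinary/strong NP-hardness distinction discussed earlier. A minimal Python sketch (function name ours):

```python
def partition(a):
    """Decide Partition by the subset-sum dynamic program.
    Pseudo-polynomial: O(n * sum(a)) time -- polynomial in a unary
    encoding but exponential in the binary encoding length."""
    total = sum(a)
    if total % 2 != 0:
        return False            # an odd total cannot be split evenly
    half = total // 2
    reachable = {0}             # subset sums reachable so far
    for x in a:
        reachable |= {s + x for s in reachable if s + x <= half}
    return half in reachable
```

For instance, (1, 5, 11, 5) splits into {11} and {1, 5, 5}, while (1, 2, 5) admits no even split.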
2.3.2 Traveling Salesman Optimization
Given n cities, c_1, c_2, ..., c_n, and a distance function d(i, j) for every pair of cities c_i and c_j (d(i, j) = d(j, i)), find a tour of the n cities so that the total distance of the tour is minimum. That is, find a permutation σ = (i_1, i_2, ..., i_n) such that

Σ_{j=1}^{n−1} d(i_j, i_{j+1}) + d(i_n, i_1)

is minimum.
2.3.3 0/1-Knapsack Optimization
Given a set U of n items, U = {u_1, u_2, ..., u_n}, with each item u_j having a size s_j and a value v_j, and a knapsack with size K, find a subset U′ ⊆ U such that all the items in U′ can be packed into the knapsack and such that the total value of the items in U′ is maximum.
The first problem, Partition, is a decision problem. It has only a "Yes" or "No" answer. The second problem, Traveling Salesman Optimization, is a minimization problem. It seeks a tour such that the total distance of the tour is minimum. The third problem, 0/1-Knapsack Optimization, is a maximization problem. It seeks a packing of a subset of the items such that the total value of the items packed is maximum.
All optimization (minimization or maximization) problems can be
converted into a corresponding
decision problem by providing an additional parameter ω, and simply asking whether there is a feasible solution such that the cost of the solution is ≤ ω (or ≥ ω, in the case of a maximization problem). For example,
the above optimization problems can be converted into the following
decision problems.
2.3.4 Traveling Salesman Decision
Given n cities, c_1, c_2, ..., c_n, a distance function d(i, j) for every pair of cities c_i and c_j (d(i, j) = d(j, i)), and a bound B, is there a tour of the n cities so that the total distance of the tour is less than or equal to B? That is, is there a permutation σ = (i_1, i_2, ..., i_n) such that

Σ_{j=1}^{n−1} d(i_j, i_{j+1}) + d(i_n, i_1) ≤ B ?
2.3.5 0/1-Knapsack Decision
Givenaset U of n items, U = {u1 ,
u2 , . . . , un}, with each item
u j having a sizes j andavalue
v j , a knapsack
with size K , and a bound B , is there a
subset U ⊆ U such that
u j ∈U s j ≤
K and
u j ∈U v j ≥
B?
It turns out that the theory of NP-hardness applies to decision
problems only. Since almost all of
the scheduling problems are optimization problems, it seems that
the theory of NP-hardness is of little
use in scheduling theory. Fortunately, as far as polynomial-time solvability is concerned, the complexity
of an optimization problem is closely related to the complexity of
its corresponding decision problem.
That is, an optimization problem is solvable in polynomial time if
and only if its corresponding decision
problem is solvable in polynomial time. To see this, let us first
assume that an optimization problem can be
solved in polynomial time. We can solve its corresponding decision
problem by simply finding an optimal
solution and comparing its objective value against the given bound.
Conversely, if we can solve the decision
problem, we can solve the optimization problem by conducting a
binary search in the interval bounded
by a lower bound (LB) and an upper bound (UB) of its optimal value.
For most scheduling problems, the
objective functions are integer-valued, and LB and UB have values
at most a polynomial function of its
input parameters. Let the length of the interval between LB and UB be l. In O(log l) iterations, the binary search will converge to the optimal value. Thus, if the decision problem can be solved in polynomial time, then the algorithm for finding an optimal value also runs in polynomial time, since log l is bounded above by a polynomial function of the size of the input.
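The binary-search argument can be sketched as follows. Here `decide` stands for any decision procedure that is monotone in the bound (once it answers Yes, it stays Yes as the bound grows); the function names and the toy single-machine oracle are ours.

```python
def minimize(decide, lb, ub):
    """Binary search for the smallest integer v in [lb, ub] with
    decide(v) == True, assuming decide is monotone. Uses
    O(log(ub - lb)) oracle calls, as in the argument above."""
    while lb < ub:
        mid = (lb + ub) // 2
        if decide(mid):
            ub = mid          # a Yes at mid: optimum is at most mid
        else:
            lb = mid + 1      # a No at mid: optimum exceeds mid
    return lb

# Toy oracle (made up): on one machine, "is there a schedule with
# makespan <= v?" is simply "is v >= the total processing time?".
p = [3, 1, 4]
opt = minimize(lambda v: v >= sum(p), 0, sum(p))   # optimal makespan
```

Each oracle call halves the remaining interval, so a polynomial-time `decide` yields a polynomial-time optimizer whenever the interval length is at most exponential in the input size.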
FIGURE 2.1 Illustrating polynomial reducibility.
Because of the relationship between the complexity of an
optimization problem and its corresponding
decision problem, from now on we shall concentrate only on the
complexity of decision problems. Recall
that decision problems have only a "Yes" or "No" answer. We say that an instance I is a "Yes"-instance if I has a "Yes" answer; otherwise, I is a "No"-instance.
Central to the theory of NP-hardness is the notion of polynomial reducibility. Let P and Q be two decision problems. We say that P is polynomially reducible (or simply reducible) to Q, denoted by P ∝ Q, if there is a function f that maps every instance I_P of P into an instance I_Q of Q such that I_P is a Yes-instance if and only if I_Q is a Yes-instance. Further, f can be computed in polynomial time.
Figure 2.1 depicts the function f. Notice that f does not have to be one-to-one or onto. Also, f maps an instance I_P of P without knowing whether I_P is a Yes-instance or a No-instance. That is, the status of I_P is unknown to f. All that is required is that Yes-instances are mapped to Yes-instances and No-instances are mapped to No-instances. From the definition, it is clear that P ∝ Q does not imply that Q ∝ P. Further, reducibility is transitive, i.e., if P ∝ Q and Q ∝ R, then P ∝ R.
Theorem 2.1
Suppose we have two decision problems P and Q such that
P ∝ Q. If Q is solvable in polynomial time, then
P is also solvable in polynomial time. Equivalently, if P cannot be
solved in polynomial time, then Q cannot
be solved in polynomial time.
Proof
Since P ∝ Q, we can solve P indirectly through Q. Given an instance I_P of P, we use the function f to map it into an instance I_Q of Q. This mapping takes polynomial time, by definition. Since Q can be solved in polynomial time, we can decide whether I_Q is a Yes-instance. But I_Q is a Yes-instance if and only if I_P is a Yes-instance. So we can decide if I_P is a Yes-instance in polynomial time.
We shall show several reductions in the remainder of this section.
As we shall see later, sometimes we can
reduce a problem P from one domain to another
problem Q in a totally different domain. For
example,
a problem in logic may be reducible to a graph problem.
Theorem 2.2
The Partition problem is reducible to the 0/1-Knapsack Decision problem.
Proof
Let A = (a_1, a_2, ..., a_n) be a given instance of Partition. We create an instance of 0/1-Knapsack Decision as follows. Let there be n items, U = {u_1, u_2, ..., u_n}, with u_j having a size s_j = a_j and a value v_j = a_j. In essence, each item u_j corresponds to the integer a_j in the instance of Partition. The knapsack size K and the bound B are chosen to be K = B = (1/2) Σ a_j. It is clear that the mapping can be done in polynomial time.
It remains to be shown that the given instance of Partition is a Yes-instance if and only if the constructed instance of 0/1-Knapsack Decision is a Yes-instance. Suppose I ⊆ {1, 2, ..., n} is an index set such that Σ_{i∈I} a_i = (1/2) Σ a_j. Then U′ = {u_i | i ∈ I} forms a solution for the instance of 0/1-Knapsack Decision, since Σ_{i∈I} s_i = (1/2) Σ a_j = K and Σ_{i∈I} v_i = (1/2) Σ a_j = B. Conversely, if there is an index set I ⊆ {1, 2, ..., n} such that U′ = {u_i | i ∈ I} forms a solution for the instance of 0/1-Knapsack Decision, then A_1 = {a_i | i ∈ I} forms a solution for Partition. This is because Σ_{i∈I} s_i ≤ K = (1/2) Σ a_j and Σ_{i∈I} v_i ≥ B = (1/2) Σ a_j. For both inequalities to hold, we must have Σ_{i∈I} a_i = (1/2) Σ a_j.
In the above proof, we have shown how to obtain a solution for
Partition from a solution for
0/1-Knapsack Decision. This is the characteristic of reduction. In
essence, we can solve Partition indirectly
via 0/1-Knapsack Decision.
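The mapping f in this proof is simple enough to state in code. The sketch below (names ours) builds the knapsack instance and pairs it with a deliberately naive exponential checker, purely to illustrate the Yes/No equivalence on tiny examples:

```python
from itertools import combinations

def partition_to_knapsack(a):
    """The mapping f of Theorem 2.2: each integer a_j becomes an item
    with size s_j = value v_j = a_j, and K = B = (1/2) * sum(a).
    (If sum(a) is odd, the Partition instance is trivially a No-instance,
    so we assume an even total here.)"""
    items = [(x, x) for x in a]    # (size, value) pairs
    K = B = sum(a) // 2
    return items, K, B

def knapsack_decision(items, K, B):
    """Brute-force 0/1-Knapsack Decision -- exponential, tiny instances only."""
    for r in range(len(items) + 1):
        for sub in combinations(items, r):
            if sum(s for s, _ in sub) <= K and sum(v for _, v in sub) >= B:
                return True
    return False
```

Running the checker on the image of a Partition instance answers the original Partition question, which is exactly what solving Partition "indirectly via 0/1-Knapsack Decision" means.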
Theorem 2.3
The Partition problem is reducible to the decision version of P2 || C_max.
Proof
Let A = (a_1, a_2, ..., a_n) be a given instance of Partition. We create an instance of the decision version of P2 || C_max as follows. Let there be n jobs, with job i having processing time a_i. In essence, each job corresponds to an integer in A. Let the bound B be (1/2) Σ a_j. Clearly, the mapping can be done in polynomial time. It is easy to see that there is a partition of A if and only if there is a schedule with makespan no larger than B.
In the above reduction, we create a job with processing time equal
to an integer in the instance of the
Partition problem. The given integers can be partitioned into two
equal groups if and only if the jobs can be
scheduled on two parallel and identical machines with makespan
equal to one half of the total processing
time.
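This equivalence can be observed directly on tiny instances with a brute-force makespan computation (function name ours; exponential, for illustration only):

```python
from itertools import product

def min_makespan_two_machines(p):
    """Brute-force optimal makespan on two identical machines
    (P2 || C_max): try every assignment of jobs to the two machines.
    Exponential in len(p) -- illustration only."""
    best = sum(p)
    for assign in product((0, 1), repeat=len(p)):
        loads = [0, 0]
        for pj, m in zip(p, assign):
            loads[m] += pj
        best = min(best, max(loads))
    return best

# Theorem 2.3's reduction: A has a partition iff the optimal makespan
# equals half the total processing time.
```

For A = (1, 5, 11, 5) the optimal makespan is 11 = (1/2) Σ a_j, so a partition exists; for A = (1, 2, 5) the optimum is 5 > 4, so none does.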
Theorem 2.4
The 0/1-Knapsack Decision problem is reducible to the decision
version of 1 | d j = d
|
w j U j .
Proof
Let U = {u1 , u2 , . . . ,
un}, K , and B be a given instance of
0/1-Knapsack Decision, where u j has a
size s j
and a value v j . We create an instance of the
decision version of 1 | d j = d
|
w j U j as follows. For
each
item u j in U ,wecreateajob
j with processing time p j = s
j and weight
w j = v j . The jobs
have a common
due date d = K . The threshold ω
for the decision version of 1 |
d j = d |
w j U j is ω
=
v j − B .
Suppose the given instance of 0/1-Knapsack Decision is a Yes-instance. Let I ⊆ {1, 2, ..., n} be the index set such that Σ_{j∈I} s_j ≤ K and Σ_{j∈I} v_j ≥ B. Then I is a subset of jobs that can be scheduled on time, since Σ_{j∈I} p_j ≤ K = d. The total weight of all the jobs in I is Σ_{j∈I} w_j = Σ_{j∈I} v_j ≥ B. Hence the total weight of all the tardy jobs is less than or equal to Σ v_j − B. Thus, the constructed instance of the decision version of 1 | d_j = d | Σ w_j U_j is a Yes-instance.
Conversely, suppose the constructed instance of the decision version of 1 | d_j = d | Σ w_j U_j is a Yes-instance, and let I be the index set of the on-time jobs in a schedule meeting the threshold. Then Σ_{i∈I} p_i ≤ d = K, and the total weight of the on-time jobs is at least Σ v_j − (Σ v_j − B) = B. The set U′ = {u_i | i ∈ I} forms a solution to the instance of the 0/1-Knapsack Decision problem.
In the above reduction, we create, for each item u_j in the 0/1-Knapsack Decision, a job with a processing time equal to the size of u_j and a weight equal to the value of u_j. We make the knapsack size K the common due date of all the jobs. The idea is that if an item is packed into the knapsack, then the corresponding job is an on-time job; otherwise, it is a tardy job. Thus, there is a packing into the knapsack with value greater than or equal to B if and only if there is a schedule with the total weight of all the tardy jobs less than or equal to Σ v_j − B.
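The mapping itself is mechanical; a minimal sketch (function name ours), which takes the knapsack items as (size, value) pairs and emits the job data, common due date, and threshold used in the proof:

```python
def knapsack_to_scheduling(items, K, B):
    """Mapping of Theorem 2.4: item (s_j, v_j) becomes a job with
    processing time p_j = s_j and weight w_j = v_j; the common due
    date is d = K; the threshold is omega = sum(v_j) - B."""
    jobs = list(items)                      # (p_j, w_j) pairs
    d = K
    omega = sum(v for _, v in items) - B
    return jobs, d, omega
```

Note that the mapping only copies numbers and computes one sum, so it clearly runs in polynomial time, as the proof requires.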
Before we proceed further, we need to define several decision
problems.
Hamiltonian Circuit. Given an undirected graph G = (V, E), is there a circuit that goes through each vertex in G exactly once?
3-Dimensional Matching. Let A = {a_1, a_2, ..., a_q}, B = {b_1, b_2, ..., b_q}, and C = {c_1, c_2, ..., c_q} be three disjoint sets of q elements each. Let T = {t_1, t_2, ..., t_l} be a set of triples such that each t_j consists of one element from A, one element from B, and one element from C. Is there a subset T′ ⊆ T such that every element in A, B, and C appears in exactly one triple in T′?
Deadline Scheduling. Given one machine and a set of n jobs, with each job j having a processing time p_j, a release time r_j, and a deadline d_j, is there a nonpreemptive schedule of the n jobs such that each job is executed within its executable interval [r_j, d_j]?
Theorem 2.5
The Hamiltonian Circuit problem is reducible to the Traveling
Salesman Decision problem.
Proof
Let G = (V, E) be an instance of Hamiltonian Circuit, where V consists of n vertexes. We construct an instance of Traveling Salesman Decision as follows. For each vertex v_i in V, we create a city c_i. The distance function d(i, j) is defined as follows: d(i, j) = 1 if (v_i, v_j) ∈ E; otherwise, d(i, j) = 2. We choose B to be n.
Suppose (v_{i_1}, v_{i_2}, ..., v_{i_n}) is a Hamiltonian Circuit. Then (c_{i_1}, c_{i_2}, ..., c_{i_n}) is a tour of the n cities with total distance equal to n; i.e., Σ_{j=1}^{n−1} d(i_j, i_{j+1}) + d(i_n, i_1) = n = B. Thus, the constructed instance of Traveling Salesman Decision is a Yes-instance.
Conversely, suppose (c_{i_1}, c_{i_2}, ..., c_{i_n}) is a tour of the n cities with total distance less than or equal to B. Then the distance between any pair of adjacent cities is exactly 1, since the total distance is the sum of n distances, and the smallest value of the distance function is 1. By the definition of the distance function, if d(i_j, i_{j+1}) = 1, then (v_{i_j}, v_{i_{j+1}}) ∈ E. Thus, (v_{i_1}, v_{i_2}, ..., v_{i_n}) is a Hamiltonian Circuit.
The idea in the above reduction is to create a city for each vertex
in G . We define the distance function
in such a way that the distance between two cities is smaller if
their corresponding vertexes are adjacent in
G than if they are not. In our reduction we use the values 1
and 2, respectively, but other values will work
too, as long as they satisfy the above condition. We choose the
distance bound B in such a way that there
is a tour with total distance less than or equal to B
if and only if there is a Hamiltonian Circuit in G .
For
our choice of distance values of 1 and 2, the choice
of B equal to n (which
is n times the smaller value of
the distance function) will work.
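The construction can be stated in a few lines of Python (function name and the 4-cycle example are ours):

```python
def hamiltonian_to_tsp(n, edges):
    """Mapping of Theorem 2.5: one city per vertex; distance 1 between
    adjacent vertices, 2 otherwise; distance bound B = n."""
    E = {frozenset(e) for e in edges}   # undirected edges

    def dist(i, j):
        return 1 if frozenset((i, j)) in E else 2

    return dist, n

# Example: a 4-cycle (made up). Adjacent vertices sit at distance 1,
# non-adjacent ones at distance 2, and the bound is B = 4.
dist, B = hamiltonian_to_tsp(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
```

Any tour of total distance B = n must use only distance-1 legs, i.e., only edges of the original graph, which is the crux of the proof.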
Theorem 2.6
The Partition problem is reducible to the Deadline Scheduling problem.
Proof
Let A = (a_1, a_2, ..., a_n) be an instance of Partition. We create an instance of Deadline Scheduling as follows. There will be n + 1 jobs. The first n jobs are called "Partition" jobs and the last job is called the "Divider" job. For each job j, 1 ≤ j ≤ n, p_j = a_j, r_j = 0, and d_j = Σ a_j + 1. For job n + 1, p_{n+1} = 1, r_{n+1} = (1/2) Σ a_j, and d_{n+1} = r_{n+1} + 1.
Suppose the given instance of Partition has a solution. Let I ⊆ {1, 2, ..., n} be an index set such that Σ_{i∈I} a_i = (1/2) Σ a_j. We schedule the jobs in the index set from time 0 until time (1/2) Σ a_j. We then schedule job n + 1 from time (1/2) Σ a_j until time (1/2) Σ a_j + 1. Finally, we schedule the remaining Partition jobs from time (1/2) Σ a_j + 1 onward, in any order. It is easy to see that all jobs can meet their deadlines.
Conversely, if there is a schedule such that every job is executed in its executable interval, then the Divider job must be scheduled in the time interval [(1/2) Σ a_j, (1/2) Σ a_j + 1]. This divides the timeline into two disjoint intervals – [0, (1/2) Σ a_j] and [(1/2) Σ a_j + 1, Σ a_j + 1] – into which the Partition jobs are scheduled. Clearly, there must be a partition of A.
The idea behind the above reduction is to create a Divider job with a very tight executable interval ([(1/2) Σ a_j, (1/2) Σ a_j + 1]). Because of the tightness of the interval, the Divider job must be scheduled entirely in its executable interval. This means that the timeline is divided into two disjoint intervals, each of which has length exactly (1/2) Σ a_j. Since the Partition jobs are scheduled in these two intervals, there is a feasible schedule if and only if there is a partition.
Theorem 2.7
The 3-Dimensional Matching problem is reducible to the decision version of R || C_max.
Proof
Let A = {a_1, a_2, ..., a_q}, B = {b_1, b_2, ..., b_q}, C = {c_1, c_2, ..., c_q}, and T = {t_1, t_2, ..., t_l} be a given instance of 3-Dimensional Matching. We construct an instance of the decision version of R || C_max as follows. Let there be l machines and 3q + (l − q) jobs. For each 1 ≤ j ≤ l, machine j corresponds to the triple t_j. The first 3q jobs correspond to the elements in A, B, and C. For each 1 ≤ i ≤ q, job i (resp. q + i and 2q + i) corresponds to the element a_i (resp. b_i and c_i). The last l − q jobs are dummy jobs. For each 3q + 1 ≤ i ≤ 3q + (l − q), the processing time of job i on any machine is 3 units. In other words, the dummy jobs have processing time 3 units on any machine. For each 1 ≤ i ≤ 3q, job i has processing time 1 unit on machine j if the element corresponding to job i is in the triple t_j; otherwise, it has processing time 2 units. The threshold ω for the decision version of R || C_max is ω = 3.
Suppose T′ = {t_{i_1}, t_{i_2}, ..., t_{i_q}} is a matching. Then we can schedule the first 3q jobs on machines i_1, i_2, ..., i_q. In particular, the three jobs that correspond to the three elements in t_{i_j} will be scheduled on machine i_j. The finishing time of each of these q machines is 3. The dummy jobs will be scheduled on the remaining machines, one job per machine. Again, the finishing time of each of these machines is 3. Thus, there is a schedule with C_max = 3 = ω.
Conversely, if there is a schedule with C_max ≤ ω, then the makespan of the schedule must be exactly 3 (since the dummy jobs have processing time 3 units on any machine). Each of the dummy jobs must be scheduled one job per machine; otherwise, the makespan will be larger than ω. This leaves q machines to schedule the first 3q jobs. These q machines must also finish at time 3, which implies that each job scheduled on these machines must have processing time 1 unit. But this means that the triples corresponding to these q machines must form a matching, by the definition of the processing times of the first 3q jobs.
The idea in the above reduction is to create a machine for each triple. We add l − q dummy jobs to jam up l − q machines. This leaves q machines to schedule the other jobs. We then create one job for each element in A, B, and C. Each of these jobs has a smaller processing time (1 unit) if it is scheduled on a machine whose corresponding triple contains its element; otherwise, it will have a larger processing time (2 units). Therefore, there is a schedule with C_max = 3 if and only if there is a matching.
Using the same idea, we can prove the following theorem.
Theorem 2.8
The 3-Dimensional Matching problem is reducible to the decision version of R | d_j = d | Σ C_j.
Proof
Let A = {a_1, a_2, ..., a_q}, B = {b_1, b_2, ..., b_q}, C = {c_1, c_2,