CSE 550: Combinatorial Algorithms and Intractability

Instructor: Arun Sen
Office: BYENG 530
Tel: 480-965-6153
E-mail: [email protected]
Office Hours: MW 3:30-4:30 or by appointment
Two additional (recommended) books
Approximation Algorithms for NP-hard Problems – Dorit S. Hochbaum
Approximation Algorithms – Vijay V. Vazirani
Grading Policy

There will be one mid-term and a final. In addition, there will be programming and homework assignments.
- Mid-term: 30%
- Final: 40%
- Assignments: 30%

90% will ensure an A, 80% a B, 70% a C, and so on.

Loss of points due to late submission of assignments:
- 1 day: 50%
- 2 days: 75%
- 3 days: 100%
A combinatorial algorithm is an algorithm for a combinatorial problem.
What is a Combinatorial Problem?
Combinatorics is the branch of mathematics concerned with the study of arrangements, patterns, designs, assignments, schedules, connections and configurations.
Examples
A shop supervisor prepares assignments of workers to tools or work areas
An Industrial Engineer considers production schedules and workplace configurations to maximize production
A geneticist considers arrangements of bases into chains of DNA and RNA
What is a Combinatorial Algorithm?
Types of Combinatorial Problems
Three types of problems in Combinatorics:
- Existence Problems
- Counting Problems
- Optimization Problems
Optimization Problems are concerned with the choice of the “best” (according to some criterion) solution among all possible solutions.
In this class, we will focus on optimization and related problems.
How many binary trees can you draw with n nodes?
n=1: b1=1
n=2: b2=2
n=3: b3=5
n=4: b4=14
Suppose b(n) is the number of binary trees that can be constructed with n nodes. Then b(n) can be expressed with the following recurrence relation:

b(n) = b(0)·b(n−1) + b(1)·b(n−2) + … + b(n−1)·b(0), with b(0) = 1 (by definition)

Generating functions: letting B(x) = Σ_{n≥0} b(n)·x^n, the recurrence gives B(x) = 1 + x·B(x)^2, and solving for the coefficients yields

b(n) = (1/(n+1)) · C(2n, n)   [the nth Catalan number]
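The recurrence and the closed form can be checked against the hand counts above with a short sketch (Python, standard library only):

```python
from math import comb

def b(n):
    """Number of binary trees with n nodes, via the recurrence
    b(n) = sum_{k=0}^{n-1} b(k) * b(n-1-k), with b(0) = 1."""
    table = [1]  # b(0) = 1 by definition
    for m in range(1, n + 1):
        table.append(sum(table[k] * table[m - 1 - k] for k in range(m)))
    return table[n]

def catalan(n):
    """Closed form: the nth Catalan number, C(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

# The slide's hand counts: b(1)=1, b(2)=2, b(3)=5, b(4)=14
for n in range(8):
    assert b(n) == catalan(n)
print([b(n) for n in range(1, 5)])  # [1, 2, 5, 14]
```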
Combinatorial Problem in Manufacturing
Various wafers (tasks) are to be processed in a series of stations. The processing times of the wafers on the different stations differ. Once a wafer has been processed on a station, it must be processed on the next station immediately, i.e., there cannot be any wait. In what order should the wafers be supplied to the assembly line so that the completion time of processing of all wafers is minimized?
      S1    S2    …    S8
w1    t11   t12   …    t18
w2    t21   t22   …    t28
w3    t31   t32   …    t38
Example with two wafers and two stations:
w1: t11 = 4, t12 = 5
w2: t21 = 2, t22 = 4

Ordering (w1, w2): w1 occupies S1 during [0, 4] and S2 during [4, 9]; with no wait allowed, w2 must occupy S1 during [7, 9] and S2 during [9, 13]. Completion time in the first ordering = 13.

Ordering (w2, w1): w2 occupies S1 during [0, 2] and S2 during [2, 6]; w1 occupies S1 during [2, 6] and S2 during [6, 11]. Completion time in the second ordering = 11.
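The two completion times can be reproduced with a small simulator. This is a straightforward reading of the no-wait constraint, not code from the slides: each job's start time is the earliest moment at which every one of its station intervals fits after that station's last use.

```python
def completion_time(order, times):
    """No-wait flow shop: each job, once started, moves through the stations
    with no waiting.  times[j][k] is job j's processing time on station k.
    Returns the completion time of the last job under the given order."""
    num_stations = len(times[0])
    free = [0] * num_stations            # when each station next becomes free
    finish = 0
    for j in order:
        t = times[j]
        # station k starts offsets[k] time units after the job's start
        offsets = [sum(t[:k]) for k in range(num_stations)]
        start = max(0, max(free[k] - offsets[k] for k in range(num_stations)))
        for k in range(num_stations):
            free[k] = start + offsets[k] + t[k]
        finish = free[-1]
    return finish

times = [(4, 5), (2, 4)]                 # w1 and w2 from the slide
print(completion_time((0, 1), times))    # 13
print(completion_time((1, 0), times))    # 11
```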
Sensor Placement Problem
Sensor Placement in a Temperature Sensitive Environment
Bio-sensors implanted in the human body dissipate energy during their operation and consequently raise the temperature of their surroundings.

A temperature-sensitive environment like the human body or brain can tolerate an increase in temperature only up to a certain threshold.

One needs to make sure that the rise in temperature due to the operation of implanted bio-sensors in temperature-sensitive environments such as a human/animal body does not exceed the threshold and cause any adverse impact.
Thermal Model and Analysis

A sensor is expected to operate for a certain duration of time.

The rise in temperature in the area surrounding the sensor will depend on the duration of operation.

The sensor surroundings will attain a maximum temperature during the time of operation (the steady-state temperature).

If the steady-state temperature of a sensor exceeds the maximum allowable threshold in the surroundings, such a sensor cannot be deployed.
Question: Is it possible that a sensor whose steady state temperature does not exceed the threshold when operating in isolation, may exceed the threshold when operating with multiple other sensors?
Thermal Model and Analysis
We perform a thorough analysis of the heat distribution phenomenon in a temperature sensitive environment and come to the following conclusion:
There exists a critical inter-sensor distance dcr, such that if the distance between any two deployed sensors is less than dcr, then the temperature in the vicinity of the sensors will exceed the maximum allowable threshold.
Therefore, attention must be paid during sensor deployment to ensure that the distance between any two sensors is at least as large as dcr.
Sensor Coverage Problem

Given:
- A set of locations (or points pi) to be sensed
- A set of potential locations (or points qi) for the placement of the sensors
- A minimum separation distance (dcr) between each pair of sensors

Objective: To deploy as few sensors as possible in the potential placement locations such that all points pi are sensed and the distance between any two sensors is at least as large as dcr.

Assumption: Each sensor is capable of sensing a circular area of radius rsen with the location of the sensor being the center of the circle.
Sensor Coverage Problem Formal definition:
Set Cover Problem
Sensor Coverage as Generalized Set Cover Problem
Search Space

The solution is somewhere in the search space.
- The solution can be found by exhaustive search of the search space.
- The search space for the solution may be very large.
- Does a large search space imply a long computation time to find the solution? Not necessarily: the search space for the sorting problem is very large (all n! orderings), yet sorting has efficient algorithms.
- The trick in the design of efficient algorithms lies in finding ways to reduce the search space.
Evaluating Quality of Algorithms

Often there are several different ways to solve a problem, i.e., there are several different algorithms to solve a problem.

What is the “best” way to solve a problem? What is the “best” algorithm? How do you measure the “goodness” of an algorithm? What metric(s) should be used to measure the “goodness” of an algorithm? Time? Space? What about power?
Problem and Instance

Algorithms are designed to solve problems. What is a problem?

A problem is a general question to be answered, usually possessing several parameters, or free variables, whose values are left unspecified. A problem is described by giving (i) a general description of all its parameters and (ii) a statement of what properties the answer, or the solution, is required to satisfy.

What is an instance? An instance of a problem is obtained by specifying particular values for all the problem parameters.
Traveling Salesman Problem
Instance: A finite set C = {c1, c2, …, cm} of cities, a distance d(ci, cj) ∈ Z+ for each pair of cities ci, cj ∈ C, and a bound B ∈ Z+ (where Z+ denotes the positive integers).

Question: Is there a tour of all cities in C having total length no more than B, that is, an ordering <cπ(1), cπ(2), …, cπ(m)> of C such that
Σ_{i=1}^{m−1} d(c_{π(i)}, c_{π(i+1)}) + d(c_{π(m)}, c_{π(1)}) ≤ B
Algorithms are general step-by-step procedures for solving problems.
An algorithm is said to solve a problem Π if that algorithm can be applied to any instance I of Π and is guaranteed always to produce a solution for that instance I.
In general we are interested in finding the most efficient algorithm for solving a problem.
The time requirements of an algorithm are expressed in terms of a single variable, the size of a problem instance, which is intended to reflect the amount of input data needed to describe the instance.
Measuring efficiency of algorithms
One possible way to measure efficiency may be to note the execution time on some machine.
- Suppose that the problem P can be solved by two different algorithms A1 and A2.
- Algorithms A1 and A2 were coded and, using a data set D, the programs were executed on some machine M.
- A1 and A2 took 10 and 15 seconds, respectively, to run to completion.
- Can we now say that A1 is more efficient than A2?
What happens if instead of data set D we use a different data set D'? A1 may end up taking more time than A2.

What happens if instead of machine M we use a different machine M'? A1 may end up taking more time than A2.

If one wants to make a statement about the efficiency of two algorithms based on timing values, it should read “A1 is more efficient than A2 on machine M, using data set D”, instead of an unqualified statement like “A1 is more efficient than A2”.

The qualified statement “A1 is more efficient than A2 on machine M, using data set D” is of limited value, as someone may use a different data set or a different machine. Ideally, one would like to make an unqualified statement like “A1 is more efficient than A2” that is independent of data set and machine. We cannot make such an unqualified statement by observing execution time on a machine.

A data- and machine-independent statement can be made if we count the number of “basic operations” needed by the algorithms. The “basic” or “elementary” operations are operations of the form addition, multiplication, comparison, etc.
Analysis of Algorithms

Time complexity function vs. problem size n:

                n=10        n=20        n=30        n=40        n=50        n=60
(A1) n          .00001 sec  .00002 sec  .00003 sec  .00004 sec  .00005 sec  .00006 sec
(A2) n^2        .0001 sec   .0004 sec   .0009 sec   .0016 sec   .0025 sec   .0036 sec
(A3) n^3        .001 sec    .008 sec    .027 sec    .064 sec    .125 sec    .216 sec
(A4) n^5        .1 sec      3.2 sec     24.3 sec    1.7 min     5.2 min     13.0 min
(A5) 2^n        .001 sec    1.0 sec     17.9 min    12.7 days   35.7 years  366 centuries
(A6) 3^n        .059 sec    58 min      6.5 years   3855        2*10^8      1.3*10^13
                                                    centuries   centuries   centuries
Size of Largest Problem Instance Solvable in 1 Hour

Time complexity   With present   With computer      With computer
function          computer       100 times faster   1000 times faster
n                 N1             100 N1             1000 N1
n^2               N2             10 N2              31.6 N2
n^3               N3             4.64 N3            10 N3
n^5               N4             2.5 N4             3.98 N4
2^n               N5             N5 + 6.64          N5 + 9.97
3^n               N6             N6 + 4.19          N6 + 6.29
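The entries in the table can be derived mechanically: a machine s times faster does s times as many basic operations in the hour, so the new instance size N' satisfies f(N') = s·f(N). A quick check of the factors (Python):

```python
from math import log

def polynomial_gain(degree, s):
    """For n^d algorithms, the solvable size grows by the factor s**(1/d)."""
    return s ** (1 / degree)

def exponential_gain(base, s):
    """For b^n algorithms, the solvable size grows only by log_b(s)."""
    return log(s) / log(base)

print(round(polynomial_gain(2, 100), 2))    # 10.0  (n^2 row, 100x machine)
print(round(polynomial_gain(3, 100), 2))    # 4.64
print(round(polynomial_gain(5, 1000), 2))   # 3.98
print(round(exponential_gain(2, 100), 2))   # 6.64  (2^n row: N5 + 6.64)
print(round(exponential_gain(3, 1000), 2))  # 6.29
```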
Growth of Functions: Asymptotic Notations

O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 <= f(n) <= c*g(n) for all n >= n0}

Ω(g(n)) = {f(n): there exist positive constants c and n0 such that 0 <= c*g(n) <= f(n) for all n >= n0}

Θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 <= c1*g(n) <= f(n) <= c2*g(n) for all n >= n0}

o(g(n)) = {f(n): for any positive constant c > 0 there exists a constant n0 > 0 such that 0 <= f(n) < c*g(n) for all n >= n0}

ω(g(n)) = {f(n): for any positive constant c > 0 there exists a constant n0 such that 0 <= c*g(n) < f(n) for all n >= n0}

A function f(n) is said to be of the order of another function g(n), denoted O(g(n)), if there exist positive constants c and n0 such that 0 <= f(n) <= c*g(n) for all n >= n0.
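As a concrete instance of the O-definition, consider the made-up function f(n) = 3n^2 + 10n. It is in O(n^2), witnessed by c = 4 and n0 = 10, since 3n^2 + 10n <= 4n^2 exactly when n >= 10. A one-line machine check over a finite range:

```python
# Witness for f(n) = 3n^2 + 10n being O(n^2): c = 4, n0 = 10.
f = lambda n: 3 * n**2 + 10 * n
g = lambda n: n**2
c, n0 = 4, 10
assert all(0 <= f(n) <= c * g(n) for n in range(n0, 10_000))
assert not f(9) <= c * g(9)   # n0 really is needed: 333 > 324
```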
Basic Operations and Data Set
To evaluate efficiency of an algorithm, we decided to count the number of basic operations performed by the algorithm
This is usually expressed as a function of the input data size
The number of basic operations in an algorithm Is it independent of the data set ? Is it dependent on the data set?
Given a set of records R1, …, Rn with keys k1, …,kn.
Sort the records in ascending order of the keys.
Basic Operations and Data Set
The number of basic operations in an algorithm Is it independent of the data set ? Is it dependent on the data set?
If the number of basic operations in an algorithm depends on the data set then one needs to consider Best case complexity Worst case complexity Average case complexity
What does “average” mean? Average over what?
Given n elements X[1], …, X[n], the algorithm finds m and j such that m = X[j] = max 1<=k<=n X[k], and for which j is as large as possible.
Algorithm FindMax
Step 1. Set j ← n, k ← n − 1, m ← X[n].
Step 2. If k = 0, the algorithm terminates.
Step 3. If X[k] <= m, go to step 5.
Step 4. Set j ← k, m ← X[k].
Step 5. Decrease k by 1, and return to step 2.
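FindMax transcribed to Python (0-based indexing; the step-4 counter is an addition of mine, to make visible that the number of updates depends on the data, which is what motivates best/worst/average-case analysis):

```python
def find_max(X):
    """Right-to-left scan: returns (m, j, updates) with m = X[j] = max(X)
    and j as large as possible.  Mirrors steps 1-5 of FindMax; `updates`
    counts how often step 4 (the data-dependent step) executes."""
    n = len(X)
    j, m = n - 1, X[n - 1]            # step 1: j <- n, m <- X[n] (0-based)
    updates = 0
    for k in range(n - 2, -1, -1):    # steps 2 and 5: k runs n-1 down to 1
        if X[k] > m:                  # steps 3-4: update only on strict improvement
            j, m = k, X[k]
            updates += 1
    return m, j, updates

print(find_max([3, 9, 2, 9, 5]))      # (9, 3, 1): rightmost maximum wins
print(find_max([5, 4, 3, 2, 1]))      # (5, 0, 4): worst case, n-1 updates
print(find_max([1, 2, 3]))            # (3, 2, 0): best case, no updates
```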
Moore’s law says that computing power (hardware speed) doubles every eighteen months.

How long will it take to have a thousand-fold speed-up in computation if we rely on hardware speed alone?
Answer: 15 years (1000 ≈ 2^10, i.e., ten doublings at eighteen months each). Expected cost: significant.

How long will it take to have a thousand-fold speed-up in computation if we rely on the design of clever algorithms? A thousand-fold speed-up is attained if a currently used O(n^5) complexity algorithm is replaced by a new algorithm with complexity O(n^2), for n = 10.

How long will it take to develop an O(n^2) complexity algorithm which does the same thing as the currently used O(n^5) complexity algorithm?
Answer: Maybe as little as one afternoon. Ingredients needed: pencil, paper, a beautiful mind.
Expected cost: significantly less than what will be needed if we rely on hardware alone.
Computational Speed-up and the Role of Algorithms
A clever algorithm can achieve overnight what progress in hardware would require decades to accomplish.
“The algorithm things are really startling, because when you get those right you can jump three orders of magnitude in one afternoon.”
William Pulleyblank, Senior Scientist, IBM Research
Algorithm Design Techniques
Divide and Conquer Dynamic Programming Greedy Algorithms Backtracking Branch and Bound Approximation Algorithms Probabilistic (Randomized) Algorithms Mathematical Programming Parallel and Distributed Algorithms Simulated Annealing Genetic Algorithms Tabu Search
How do you “prove” a problem to be “difficult”?

Suppose that the algorithm you developed for the problem to be solved (after many sleepless nights) turned out to be very time consuming.

Possibilities:
- You haven’t designed an efficient algorithm for the problem.
  - Maybe you are not that great an algorithm designer.
  - Maybe you are a better fashion designer.
  - Maybe you have not taken CSE 450/598.
- Maybe the problem is difficult and a more efficient algorithm cannot be designed.

How do you know that a more efficient algorithm cannot be designed?
- It is difficult to substantiate a claim that a more efficient algorithm cannot be designed. Your inability to design an efficient algorithm does not necessarily mean that the problem is “difficult”.
- It may be easier to claim that the problem “probably” is “difficult”. How do you substantiate the claim that the problem “probably” is “difficult”?
- What if you line up a bunch of “smart” people who will testify that they also think that the problem is difficult?

Theory of NP-Completeness
Taxonomy of Problems

Problems divide into undecidable and decidable. Decidable problems are either tractable or intractable deterministically, and either tractable or intractable non-deterministically.

NP-Complete Problems: most likely deterministically intractable.
Complexity of Algorithms and Problems
In algorithms classes (e.g., CSE 450) we make distinctions between algorithms of complexity O(n2), O(n3), and O(n5).
In this class, we take a much coarser-grained view and divide algorithms into only two classes: polynomial time algorithms and non-polynomial time algorithms.
Polynomial time algorithms – Good
Non-polynomial time algorithms - Bad
Good vs. Bad (in Algorithms)
Exponential time algorithms should not be considered “good” algorithms.
Most exponential time algorithms are merely variations of exhaustive search.
Polynomial time algorithms generally are made possible only through gain of some deeper insight into the structure of a problem.
Easy and Difficult Problems
A problem is easy if a polynomial time algorithm is known for it.
A problem may be suspected to be difficult if a polynomial time algorithm cannot be developed for it, even after significant time and effort.
Theory of NP-Completeness

The complexity of an algorithm for a problem says more about the algorithm and less about the problem.
- If a low complexity algorithm can be found for the solution of a problem, we can say that the problem is not difficult.
- If we are unable to find a low complexity algorithm for the solution of a problem, can we say that the problem is difficult? Answer: No.
NP-Completeness of a problem says something about the problem
Problems may or may not be NP-Complete – not the algorithms
Problems and Algorithms for their solution

Problem P may admit several algorithms of very different complexities:
- Algorithm 1, Complexity: O(n)
- Algorithm 2, Complexity: O(n^4)
- Algorithm 3, Complexity: O(2^n)
Complexity of a Problem
How to prove a problem difficult?
Is the approach of lining up a group of famous people really going to work?
Answer: Probably not
Why would a group of famous people be interested in working on your problem?
“If the mountain does not come to Mohammed, Mohammed goes to the mountain”
If the famous people are not interested in working on your problem, you transform their problem into yours.
If such a transformation is possible, you can now claim that if your problem can easily be solved, so can theirs.
In other words, if their problem is difficult, so is yours.
Problem Transformation – Hamiltonian Cycle Problem
A cycle in a graph G = (V, E) is a sequence <v1, v2, …, vk> of distinct vertices of V such that {vi, vi+1} ∈ E for 1 <= i < k and such that {vk, v1} ∈ E.
A Hamiltonian cycle in G is a simple cycle that includes all the vertices of G.
Hamiltonian Cycle Problem
Instance: A graph G = (V, E)
Question: Does G contain a Hamiltonian cycle?
Traveling Salesman Problem
Instance: A finite set C = {c1, c2, …, cm} of cities, a distance d(ci, cj) ∈ Z+ for each pair of cities ci, cj ∈ C, and a bound B ∈ Z+ (where Z+ denotes the positive integers).

Question: Is there a tour of all cities in C having total length no more than B, that is, an ordering <cπ(1), cπ(2), …, cπ(m)> of C such that
Σ_{i=1}^{m−1} d(c_{π(i)}, c_{π(i+1)}) + d(c_{π(m)}, c_{π(1)}) ≤ B
No-wait Flow-shop Scheduling Problem
      S1    S2    …    S8
w1    t11   t12   …    t18
w2    t21   t22   …    t28
w3    t31   t32   …    t38
Problem Transformation

The No-wait Flow-shop Scheduling Problem can be transformed into the Traveling Salesman Problem. How? We will see it later.

The Hamiltonian Cycle problem can be transformed to the Traveling Salesman Problem. How? From an instance of the HC Problem, the graph G = (V, E) (|V| = n), construct an instance of the TSP problem as follows:
- Construct a completely connected graph G' = (V', E') where |V'| = |V|.
- Associate a distance with each edge of E': for each edge e' ∈ E', if e' ∈ E then dist(e') = 1, otherwise dist(e') = 2.
- Set B, a problem parameter of the TSP problem, equal to n.
Problem Transformation

Claim: Graph G contains a Hamiltonian Cycle if and only if there is a tour of all the cities in G' that has a total length no more than B.

If G has a HC <v1, v2, …, vn>, then G' has a TSP tour of length n = B, because each intercity distance traveled in the tour corresponds to an edge in G and hence has length 1.

If G' has a TSP tour of length n = B, then each edge e that contributes to the tour must have dist(e) = 1 (because the tour is made up of n edges). This implies that these edges are present in G as well. This set of edges makes up a Hamiltonian Cycle in G.
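The transformation and the claim can be exercised on toy graphs. A sketch in Python; the two 4-vertex test graphs are chosen for illustration, and the brute-force TSP check is only feasible for tiny n:

```python
from itertools import permutations

def hc_to_tsp(n, edges):
    """The slide's transformation: edges of G get distance 1, non-edges
    distance 2, and the TSP bound B is set to n."""
    E = {frozenset(e) for e in edges}
    dist = [[1 if frozenset((u, v)) in E else 2 for v in range(n)]
            for u in range(n)]
    return dist, n

def has_tour_within(dist, B):
    """Brute-force TSP decision: try every tour starting at city 0."""
    n = len(dist)
    for q in permutations(range(1, n)):
        t = (0,) + q
        if sum(dist[t[i]][t[(i + 1) % n]] for i in range(n)) <= B:
            return True
    return False

# C4 (a 4-cycle) has a Hamiltonian cycle; the star K1,3 does not.
dist, B = hc_to_tsp(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
print(has_tour_within(dist, B))   # True
dist, B = hc_to_tsp(4, [(0, 1), (0, 2), (0, 3)])
print(has_tour_within(dist, B))   # False
```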
Coping with NP-Complete Problems

A combinatorial optimization problem P is either a minimization or a maximization problem and consists of the following three parts:
- A set DP of instances
- For each instance I, a finite set SP(I) of candidate solutions for I
- A function mP that assigns to each instance I and each candidate solution s ∈ SP(I) a positive rational number mP(I, s), called the solution value for s

If a combinatorial optimization problem is NP-Complete, it might take an absurdly long time (e.g., 300 centuries) to find the optimal solution for the problem. One probably cannot wait 300 centuries to find the solution. However, the problem does not go away. One still has to find a solution!
Approximation Algorithms
If the optimal solution is unattainable then it is reasonable to sacrifice optimality and settle for a “good” (close to optimal) feasible solution that can be computed efficiently (i.e., within some reasonable amount of time).
We would like to sacrifice as little optimality as possible, while gaining as much as possible in efficiency.
Trading-off optimality in favor of tractability is the paradigm of approximation algorithms
Dorit S. Hochbaum, Approximation Algorithms for NP-Hard Problems
“Cost” vs. “Quality” of a solution

- “Cost” of a solution: time spent in finding the solution
- “Quality” of a solution: closeness of the solution to the optimal

There is a tradeoff between “thinking time” (effort spent designing a better algorithm) and “execution time”.
Approximation Algorithms: the ε-approximation algorithm

Let Π be an optimization (minimization or maximization) problem and A an algorithm which, given an instance I of Π, returns a feasible (but not necessarily optimal) solution whose value is denoted by APP(I). Let the optimal solution value be denoted by OPT(I). The algorithm A is called an ε-approximation algorithm for Π, for some ε ≥ 0, if and only if

| APP(I) − OPT(I) | / OPT(I) ≤ ε  for all instances I
Yet Another Job Scheduling Problem

P1, …, Pm: a set of m independent processors with similar (or dissimilar) performance characteristics.

T1, …, Tn: a set of n independent tasks with no ordering relationship between them.

If the processors are dissimilar (heterogeneous computing environment), the execution times of a task on different processors are different: tij = execution time of task Ti on processor Pj.

If the processors are similar (homogeneous computing environment), the execution times of a task on different processors are the same.
Yet Another Job Scheduling Problem
The total execution time used by processor Pj is the sum of the execution times of the tasks assigned to that processor.

The makespan of an assignment is the maximum of the total execution times of the individual processors.

Objective: Find the assignment that minimizes the makespan.
Question: How difficult is it to find the schedule with the minimum makespan?
Algorithm for the Job Scheduling Problem (Heterogeneous Environment)

Step 1: for j := 1 to m do Fj := 0;
Step 2: for i := 1 to n do
begin
  Step 2.1: Find k, where k is the smallest integer j for which Fj + ti,j is minimum (1 <= j <= m)
  Step 2.2: Assign task Ti to processor Pk
  Step 2.3: Update Fk ← Fk + ti,k
end
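A direct transcription of Steps 1-2.3 in Python. The homogeneous 10-job, 4-processor example data is taken from the following slides; Case 2's value t10 = 5 is my reading of the garbled original:

```python
def greedy_schedule(times):
    """Greedy list scheduling per the steps above.  times[i][j] is task i's
    execution time on processor j; ties go to the smallest processor index.
    Returns (assignment, finish times F)."""
    m = len(times[0])
    F = [0] * m                        # Step 1: all finish times start at 0
    assignment = []
    for t in times:                    # Step 2: tasks in the given order
        k = min(range(m), key=lambda j: (F[j] + t[j], j))   # Step 2.1
        assignment.append(k)           # Step 2.2
        F[k] += t[k]                   # Step 2.3
    return assignment, F

# Homogeneous special case: every row of `times` is constant.
case1 = [[t] * 4 for t in [4, 7, 3, 5, 9, 2, 10, 3, 6, 8]]
print(max(greedy_schedule(case1)[1]))     # 17 = APP(I) in Case 1
case2 = [[t] * 4 for t in [4, 7, 3, 5, 9, 2, 10, 3, 6, 5]]
print(max(greedy_schedule(case2)[1]))     # 15 = APP(I) in Case 2
```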
Example: m = 4 identical processors, jobs {J1, J2, …, J10}.

Execution times (Case 1): 4, 7, 3, 5, 9, 2, 10, 3, 6, 8
Execution times (Case 2): the same, except t10 = 5.

The greedy algorithm produces the assignment:
P1: J1 (4), J6 (2), J8 (3), J10
P2: J2 (7), J9 (6)
P3: J3 (3), J5 (9)
P4: J4 (5), J7 (10)

Case 1: The last job J10 determines the makespan: APP(I) = F1 + t10 = 9 + 8 = 17, where F1 is P1's finish time just before J10 was assigned.

Since the greedy rule assigned Jn to the least-loaded processor,
APP(I) <= Fj + tn for all j, 1 <= j <= m.

Summing over all m processors:
m · APP(I) <= F1 + F2 + … + Fm + m·tn
           = Σ_{i=1}^{n−1} ti + m·tn
           = Σ_{i=1}^{n} ti + (m − 1)·tn

so that
APP(I) <= (1/m) Σ_{i=1}^{n} ti + (1 − 1/m)·tn
       <= OPT(I) + (1 − 1/m)·OPT(I)
       = (2 − 1/m)·OPT(I)

using OPT(I) >= (1/m) Σ_{i=1}^{n} ti  … (1)
and   OPT(I) >= max { ti | 1 <= i <= n }  … (2)

Case 2: With t10 = 5, P1 finishes at F1 + t10 = 9 + 5 = 14 < F4, so the makespan is not determined by the last job: APP(I) = F4 = 15, and F4 was already fixed at the end of scheduling jobs J1, …, J7. Let F'1, F'2, …, F'm be the finish times of the processors just before job Jk (here J7) was scheduled. The greedy rule gives
APP(I) = F'4 + t7, and APP(I) <= F'j + tk for all j, 1 <= j <= m.

Summing over all m processors:
m · APP(I) <= F'1 + F'2 + … + F'm + m·tk
           <= Σ_{i=1}^{k−1} ti + m·tk
           <= Σ_{i=1}^{n} ti + (m − 1)·tk

and, exactly as in Case 1,
APP(I) <= (1/m) Σ_{i=1}^{n} ti + (1 − 1/m)·tk
       <= OPT(I) + (1 − 1/m)·OPT(I)
       = (2 − 1/m)·OPT(I)
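For an instance this small the (2 − 1/m) bound can be checked exhaustively: with 4 processors and 10 jobs there are only 4^10 ≈ one million assignments, so OPT can be found by brute force (OPT here is computed by search, not given on the slides):

```python
from itertools import product

def makespan(assignment, times, m):
    F = [0] * m
    for t, proc in zip(times, assignment):
        F[proc] += t
    return max(F)

def brute_force_opt(times, m):
    """Exhaustive search over all m^n assignments (tiny instances only)."""
    return min(makespan(a, times, m)
               for a in product(range(m), repeat=len(times)))

def greedy(times, m):
    """Homogeneous greedy: each job to the currently least-loaded processor."""
    F = [0] * m
    for t in times:
        F[F.index(min(F))] += t
    return max(F)

times, m = [4, 7, 3, 5, 9, 2, 10, 3, 6, 8], 4     # Case 1 of the example
app, opt = greedy(times, m), brute_force_opt(times, m)
print(app, opt)                     # 17 15
assert app <= (2 - 1/m) * opt       # the (2 - 1/m) bound holds
```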
Steiner Tree Problem

Instance: A graph G = (V, E), a weight w(e) ∈ Z0+ for each edge e ∈ E, a subset R ⊆ V, and a positive integer bound B.

Question: Is there a subtree of G that includes all the vertices of R such that the sum of the weights of the edges in the subtree is no more than B?
Heuristic Algorithm for Steiner Trees (H)

INPUT: An undirected graph G = (V, E) and a set of Steiner points S ⊆ V.
OUTPUT: A Steiner tree TH for G and S.

Step 1: Construct the complete undirected graph G1 = (V1, E1) from G and S, in such a way that V1 = S and, for every {vi, vj} ∈ E1, the weight (length/cost) on the edge {vi, vj} is equal to the length of the shortest path from vi to vj in G.
Step 2: Find a minimal spanning tree T1 of G1. (If there are several minimal spanning trees, pick an arbitrary one.)
Step 3: Construct the subgraph Gs of G by replacing each edge in T1 by its corresponding shortest path in G. (If there are several shortest paths, pick an arbitrary one.)
Step 4: Find a minimal spanning tree Ts of Gs. (If there are several minimal spanning trees, pick an arbitrary one.)
Step 5: Construct a Steiner tree TH from Ts by deleting edges in Ts, if necessary, so that all the leaves in TH are Steiner points.
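A compact sketch of heuristic H (Steps 1-5) in Python; the test graph and its weights at the bottom are made up for illustration and are not the slides' 9-vertex example:

```python
import heapq

def dijkstra(adj, src):
    """Shortest-path lengths and predecessors from src in an undirected
    graph given as adj = {u: {v: weight, ...}, ...}."""
    dist, prev, pq = {src: 0}, {}, [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(pq, (d + w, v))
    return dist, prev

def prim(nodes, weight):
    """MST of the complete graph on `nodes` under `weight`; edge list."""
    nodes, edges, in_tree = list(nodes), [], set()
    in_tree.add(nodes[0])
    while len(in_tree) < len(nodes):
        u, v = min(((a, b) for a in in_tree for b in nodes if b not in in_tree),
                   key=lambda e: weight(*e))
        in_tree.add(v)
        edges.append((u, v))
    return edges

def steiner_heuristic(adj, S):
    sp = {s: dijkstra(adj, s) for s in S}              # Step 1: G1 weights
    t1 = prim(S, lambda a, b: sp[a][0][b])             # Step 2: MST of G1
    gs = {}                                            # Step 3: expand paths
    for a, b in t1:
        v = b
        while v != a:
            u = sp[a][1][v]
            gs.setdefault(u, {})[v] = adj[u][v]
            gs.setdefault(v, {})[u] = adj[u][v]
            v = u
    ts = prim(list(gs), lambda a, b: gs[a].get(b, float("inf")))  # Step 4
    th = {tuple(sorted(e)) for e in ts}                # Step 5: prune leaves
    while True:
        deg = {}
        for u, v in th:
            deg[u] = deg.get(u, 0) + 1
            deg[v] = deg.get(v, 0) + 1
        removable = {e for e in th
                     if any(deg[x] == 1 and x not in S for x in e)}
        if not removable:
            break
        th -= removable
    return th

# Hypothetical test graph: terminals {0, 1, 2} are each one hop from hub 3.
adj = {0: {3: 1, 1: 3}, 1: {3: 1, 0: 3, 2: 3},
       2: {3: 1, 1: 3}, 3: {0: 1, 1: 1, 2: 1, 4: 1}, 4: {3: 1}}
tree = steiner_heuristic(adj, [0, 1, 2])
print(sum(adj[u][v] for u, v in tree))   # 3: the hub star, here also optimal
```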
[Figure (panels a-f): worked example of heuristic H on a 9-vertex graph (v1, …, v9) with Steiner points {v1, v2, v3, v4}: the input graph G (edge weights include 10, 9, 8, 2, 1, and 1/2), the complete graph G1 on the Steiner points with all pairwise shortest-path distances equal to 4 and its minimal spanning tree T1, and the expanded subgraph Gs, its minimal spanning tree Ts, and the final Steiner tree TH.]
Analysis of the Algorithm

G = (V, E): input graph; S: Steiner points (|S| = k); H: heuristic algorithm.

DH = total length of the edges of the Steiner tree TH produced by the algorithm H.
DMIN = total length of the edges of the minimal Steiner tree TMIN.
m = total number of leaves in TMIN.

- Construct a loop L around TMIN that traverses each edge of TMIN exactly two times.
- Every leaf in TMIN appears exactly once in L.
- If ui, uj are two “consecutive” leaves in the loop, then the subpath connecting ui to uj is a simple path.
- We may regard the loop L as composed of m simple subpaths, each connecting a leaf to another leaf.
- Construct a path P by deleting the longest “leaf to leaf” subpath from L.
- Every edge in TMIN appears at least once in P.
- Let (w1, w2, …, wk) be the k distinct Steiner points appearing in P, in that order.
- Length(P) <= (1 − 1/m) Length(L) = 2 (1 − 1/m) Length(TMIN) = 2 (1 − 1/m) DMIN
- Length(P) >= length of a spanning tree for G1 consisting of the edges {w1, w2}, {w2, w3}, …, {wk-1, wk}
- Length(P) >= length of the minimal spanning tree for G1
- Length(P) >= DH  --- (1)
- Length(P) <= 2 (1 − 1/m) DMIN  --- (2)

From (1) and (2): DH <= 2 (1 − 1/m) DMIN, i.e., DH / DMIN <= 2 (1 − 1/m). The performance bound is 2.
Approximation Ratios in Graphs

2-approximation [3 independent papers, 1979-81]

Last decade of the second millennium:
- 11/6 ≈ 1.83 [Zelikovsky]
- 16/9 ≈ 1.78 [Berman & Ramayer]
PTAS with the limit ratios:
- 1.73 [Borchers & Du]
- 1 + ln 2 ≈ 1.69 [Zelikovsky]
- 5/3 ≈ 1.67 [Promel & Steger]
- 1.64 [Karpinski & Zelikovsky]
- 1.59 [Hougardy & Promel]

2000: 1 + (ln 3)/2 ≈ 1.55

Cannot be approximated better than 1.004.
The description of a problem instance that we provide as input to the computer can be viewed as a single finite string of symbols chosen from a finite input alphabet.
Each problem has associated with it a fixed encoding scheme which maps problem instances into the strings describing them. The input length for an instance I of a problem Π is defined to be the number of symbols in the description of I obtained from the encoding scheme for Π.
The time complexity function for an algorithm expresses its time requirements by giving, for each possible input length, the largest amount of time needed by the algorithm to solve a problem instance of that size.
A function f(n) is said to be O(g(n)) whenever there exist constants c and n0 such that |f(n)| ≤ c*|g(n)| for all n >= n0. A polynomial time algorithm is defined to be one whose time complexity function is O(p(n)) for some polynomial function p, where n is used to denote the input length. Any algorithm whose time complexity function cannot be so bounded is called an exponential time algorithm.
A problem is intractable if it is so hard that no polynomial time algorithm can possibly solve it.
Time complexity as defined is a worst-case measure, and the fact that an algorithm has time complexity 2^n means that at least one problem instance of size n requires that much time.
The intractability of a problem turns out to be independent of the particular encoding scheme and computer model used for determining time complexity.
“Reasonable” encoding scheme: Although we do not formalize the notion of “reasonableness”, the following two conditions capture much of the notion:
(1) the encoding of an instance I should be concise and not “padded” with unnecessary information or symbols, and
(2) numbers occurring in I should be represented in binary (or any fixed base other than 1).
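Condition (2) matters because a unary (base-1) encoding inflates the input length exponentially relative to binary, distorting what “polynomial in the input length” means. A quick comparison (Python):

```python
def binary_length(n):
    """Number of symbols needed to write n in base 2."""
    return n.bit_length()

def unary_length(n):
    """Number of symbols needed to write n in base 1 (n marks)."""
    return n

n = 10**6
print(binary_length(n), unary_length(n))   # 20 vs 1000000: exponential gap
```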
Two different causes of intractability
The problem is so difficult that exponential amount of time is needed to discover a solution.
The solution itself is required to be so extensive that it cannot be described with an expression having length bounded by a polynomial function of the input length.
Undecidable Problem: No algorithm can be developed for solving it.
“It is impossible to specify any algorithm which, given an arbitrary computer program and an arbitrary input to that program, can decide whether or not the program will eventually halt when applied to that input.” [Alan Turing, 1936]
Decidable Intractable Problem: Such problems cannot be solved in polynomial time using even a “nondeterministic” computer model, which has the ability to pursue an unbounded number of independent computational sequences in parallel.
All the provably intractable problems known to date are either undecidable or “non-deterministically” intractable.
However, most of the apparently intractable problems encountered in practice are decidable and can be solved in polynomial time with the aid of a non-deterministic computer.
Problem Reduction: The technique used to demonstrate that two problems are related is that of reducing one to the other, by giving a constructive transformation that maps any instance of the first problem into an instance of the second.
Decision Problem: a problem with only a yes/no answer.

A decision problem Π consists simply of a set DΠ of instances and a subset YΠ ⊆ DΠ of yes-instances.
Decision problems and optimization problems: So long as the cost function is relatively easy to evaluate, the decision problem can be no harder than the corresponding optimization problem. (Many decision problems, including TSP, can be shown to be “no easier” than the corresponding optimization problem.)

The reason that the study of the theory of NP-completeness is restricted to decision problems is that they have a very natural, formal counterpart, which is a suitable object to study in a mathematically precise theory of computation; this counterpart is known as a “language”.

The correspondence between decision problems and languages is brought about by the encoding scheme we use for specifying the instances.