Computational complexity

Heuristic Algorithms

Giovanni Righini

University of Milan

Department of Computer Science (Crema)

Definitions: problems and instances

A problem is a general question expressed in mathematical terms. Usually the same question can be posed for many examples: they are instances of the problem. For instance:

• Problem: “Is n prime?”

• Instance: “Is 7 prime?”

A solution S is the answer corresponding to a specific instance. Formally, a problem P is a function that maps instances from a set I into solutions (set S):

P : I → S

A priori, we do not know how to compute it: we need an algorithm.

Definitions: algorithms

An algorithm is a procedure with the following properties:

• it is formally defined

• it is deterministic

• it is made of elementary operations

• it is finite.

An algorithm for a problem P is an algorithm whose steps are determined by an instance I ∈ I of P and produce a solution S ∈ S:

A : I → S

An algorithm defines a function and it also computes it.

If the function computed by A is the same as P, the algorithm is exact; otherwise, it is heuristic.

Algorithms characteristics

A heuristic algorithm should be

1. effective: it should compute solutions with a value close to the optimum;

2. efficient: its computational complexity should be low, at least compared with an exact algorithm;

3. robust: it should remain effective and efficient for any possible input.

To compute a solution, an algorithm needs some resources. The two most important ones are

• space (amount of memory required to store data);

• time (number of elementary steps to be performed to compute the final result).

Complexity

Time is usually considered as the most critical resource because:

• time is subtracted from other computations more often than space;

• it is often possible to use very large amounts of space at a very low cost, but not the same for time;

• the need for space is upper bounded by the need for time, because space is re-usable.

It is intuitive that, in general, the larger an instance is, the larger the amount of resources needed to compute its solution. However, how the computational cost grows with the instance size is not always the same: it depends on the problem and on the algorithm.

By computational complexity of an algorithm we mean the rate at which the consumption of computational resources grows when the size of the instance grows.

Measuring the time complexity

The time needed to solve a problem depends on:

• the specific instance to be solved

• the algorithm used

• the machine that executes the algorithm

• . . .

We want a measure of the time complexity with the following characteristics:

• independent of the technology, i.e. it must be the same when the computation is done on different hardware;

• synthetic and formally defined, i.e. it must be represented by a simple and well-defined mathematical expression;

• ordinal, i.e. it must allow us to rank the algorithms according to their complexity.

The observed computing time does not satisfy these requirements.

Time complexity

The asymptotic worst-case time complexity of an algorithm provides the required measure in this way:

1. we measure the number T of elementary operations executed (which is computer-independent);

2. we define a number n that measures the size of an instance, i.e. that determines the number of bits needed to encode it (e.g., the number of elements in the ground set in a combinatorial optimization problem);

3. we find the maximum number of elementary operations needed to solve instances of size n

T(n) = max_{I ∈ I_n} T(I)    n ∈ N

(this reduces the complexity to a function T : N → N);

4. we approximate T(n) with a simpler function f(n), for which we are only interested in the asymptotic trend for n → +∞ (complexity is more important when instances are larger);

5. finally we can collect these functions in complexity classes.
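Steps 1 to 3 above can be illustrated on a very small scale. The sketch below is only an example under stated assumptions: the algorithm is insertion sort (not mentioned in the slides), the elementary operation counted is the key comparison, and the instances of size n are taken to be the permutations of n distinct keys.

```python
from itertools import permutations


def insertion_sort_comparisons(items):
    """Sort a copy of `items` and return T(I), the number of key comparisons."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1              # one elementary operation
            if a[j - 1] <= a[j]:
                break                     # element is already in place
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return comparisons


def worst_case(n):
    """T(n): maximum of T(I) over all instances of size n.

    Assumption: the instances of size n are the permutations of 0..n-1.
    """
    return max(insertion_sort_comparisons(p) for p in permutations(range(n)))


for n in range(1, 7):
    print(n, worst_case(n), n * (n - 1) // 2)
```

Under these assumptions the enumerated maximum coincides with n(n − 1)/2, so f(n) = n^2 would be a natural choice in step 4.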

Notation: Θ

T(n) ∈ Θ(f(n))

means that

∃c1, c2 ∈ R+, n0 ∈ N : c1 f(n) ≤ T(n) ≤ c2 f(n) for all n ≥ n0

where c1, c2 and n0 are constant values, independent of n.

T(n) is between c1 f(n) and c2 f(n)

• for a suitable “small value” c1

• for a suitable “large value” c2

• for any size larger than n0

[Figure: plot of T(n) against n for an algorithm A, showing the curves c1 f(n), f(n) and c2 f(n) and the threshold n0.]

Asymptotically, f(n) is an estimate of T(n) within a constant factor:

• for large instances, the computing time is proportional to f(n).
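As a worked check of the definition, consider a hypothetical operation count T(n) = 3n^2 + 5n + 2 (chosen here only for illustration) and f(n) = n^2. The constants c1 = 3, c2 = 4, n0 = 6 should work: the lower bound is immediate, and 4n^2 − T(n) = n^2 − 5n − 2 is non-negative for every n ≥ 6. The sketch below checks the inequalities numerically on a finite range; the algebra above, not the loop, is what establishes the bound for all n.

```python
def T(n):
    """Hypothetical number of elementary operations (an assumption)."""
    return 3 * n * n + 5 * n + 2


def f(n):
    return n * n


C1, C2, N0 = 3, 4, 6


def theta_bounds_hold(n):
    """Check c1 f(n) <= T(n) <= c2 f(n) for a single n."""
    return C1 * f(n) <= T(n) <= C2 * f(n)


print(all(theta_bounds_hold(n) for n in range(N0, 10_000)))
print(theta_bounds_hold(N0 - 1))  # the upper bound is violated just below n0
```

The same constants c2 and n0 also witness T(n) ∈ O(n^2), and c1 witnesses T(n) ∈ Ω(n^2).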

Notation: O

T(n) ∈ O(f(n))

means that

∃c ∈ R+, n0 ∈ N : T(n) ≤ c f(n) for all n ≥ n0

where c and n0 do not depend on n.

T(n) is upper bounded by c f(n)

• for a suitable “large value” c

• for any n larger than a suitable n0

[Figure: plot of T(n) against n for an algorithm A, showing the curves f(n) and c f(n) and the threshold n0.]

Asymptotically, f(n) is an upper bound for T(n) within a constant factor:

• for large instances the computing time is at most proportional to f(n).

Notation: Ω

T(n) ∈ Ω(f(n))

means that

∃c > 0, n0 ∈ N : T(n) ≥ c f(n) for all n ≥ n0

where c and n0 do not depend on n.

T(n) is “lower bounded” by c f(n)

• for some suitable “small value” c

• for any n larger than n0

[Figure: plot of T(n) against n for an algorithm A, showing the curves f(n) and c f(n) and the threshold n0.]

Asymptotically, f(n) is a lower bound of T(n) within a constant factor:

• for large instances the computing time is at least proportional to f(n).

Combinatorial optimization

In combinatorial optimization problems it is natural to define the size of an instance as the cardinality n = |E| of its ground set E. An explicit enumeration algorithm

• considers each subset S ⊆ E,

• evaluates whether it is feasible (x ∈ X) in α(n) time,

• evaluates the objective function f(x) in β(n) time,

• records the best value found.

Since the number of solutions is exponential in n, its complexity is at least exponential, even if α(n) and β(n) are polynomials (as often occurs).
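The four steps of explicit enumeration can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes a minimisation problem, and the toy instance (item weights and costs, a feasibility rule requiring total weight at least 10) is invented for the example.

```python
from itertools import combinations


def all_subsets(ground_set):
    """Yield every subset of the ground set E: 2^n subsets for |E| = n."""
    items = list(ground_set)
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def explicit_enumeration(ground_set, is_feasible, objective):
    """Return (best_value, best_subset, subsets_examined), minimising."""
    best_value, best_subset, examined = None, None, 0
    for subset in all_subsets(ground_set):
        examined += 1
        if not is_feasible(subset):        # feasibility test, alpha(n) time
            continue
        value = objective(subset)          # objective evaluation, beta(n) time
        if best_value is None or value < best_value:
            best_value, best_subset = value, subset
    return best_value, best_subset, examined


# Hypothetical toy instance: choose items of total weight >= 10 at minimum cost.
WEIGHT = {"a": 4, "b": 3, "c": 6, "d": 5}
COST = {"a": 3, "b": 2, "c": 7, "d": 4}

result = explicit_enumeration(
    WEIGHT,
    is_feasible=lambda s: sum(WEIGHT[e] for e in s) >= 10,
    objective=lambda s: sum(COST[e] for e in s),
)
print(result)
```

The counter of examined subsets always equals 2^n, whatever the feasibility rule and the objective are, which is the source of the exponential lower bound.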

Polynomial and exponential complexity

In combinatorial optimization, the main distinction is between

• polynomial complexity: T(n) ∈ O(n^d) for a constant d > 0

• exponential complexity: T(n) ∈ Ω(d^n) for a constant d > 1

The algorithms of the former type are efficient; those of the latter type are inefficient.

In general, heuristic algorithms are polynomial and they are used when the corresponding exact algorithms are exponential.

Assuming 1 operation/µsec:

n     n^2 op.      2^n op.
1     1 µsec       2 µsec
10    0.1 msec     1 msec
20    0.4 msec     1 sec
30    0.9 msec     17.9 min
40    1.6 msec     12.7 days
50    2.5 msec     35.7 years
60    3.6 msec     366 centuries
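The figures in the table can be recomputed with a few lines of code. The sketch below keeps the table's assumption of one operation per microsecond; the conversion to years uses 365-day years, which is an assumption of this example.

```python
def seconds(operations, ops_per_second=1_000_000):
    """Computing time in seconds at the given speed (default: 1 op/µsec)."""
    return operations / ops_per_second


SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY   # assumption: 365-day years

for n in (1, 10, 20, 30, 40, 50, 60):
    poly = seconds(n ** 2)
    expo = seconds(2 ** n)
    print(f"n={n:2d}  n^2: {poly * 1e3:8.4f} msec   "
          f"2^n: {expo:.3e} sec = {expo / SECONDS_PER_YEAR:.3e} years")
```

Going from n = 50 to n = 60 multiplies the polynomial time by 1.44 and the exponential time by 1024.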

Problem transformations and reductions

Sometimes it is possible and convenient to reformulate an instance of a problem P into an instance of a problem Q and then to transform back the solution of the latter into a solution of the former.

Polynomial transformation from P to Q: given any instance of P

• a corresponding instance of Q is defined in polynomial time

• the instance of Q is solved by a suitable algorithm, providing a solution S_Q

• from S_Q a corresponding solution S_P is obtained in polynomial time

Examples: VCP to SCP, MCP to MISP and MISP to MCP.
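The three steps of a polynomial transformation can be sketched on the last example, reading MISP as the maximum independent set problem and MCP as the maximum clique problem (this reading of the acronyms is an assumption). The transformation used here is the standard complement-graph construction; the brute-force clique solver is only a placeholder for "a suitable algorithm" and is itself exponential.

```python
from itertools import combinations


def complement_edges(vertices, edges):
    """Instance of P -> instance of Q: complement graph, built in O(n^2) time."""
    present = {frozenset(e) for e in edges}
    return {frozenset(pair) for pair in combinations(vertices, 2)
            if frozenset(pair) not in present}


def max_clique_brute_force(vertices, edges):
    """Exact solver for Q: try vertex subsets from the largest to the smallest."""
    present = {frozenset(e) for e in edges}
    vertices = list(vertices)
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if all(frozenset(pair) in present
                   for pair in combinations(subset, 2)):
                return set(subset)
    return set()


def max_independent_set(vertices, edges):
    """Solve P through Q: transform, solve Q, map the solution back."""
    q_edges = complement_edges(vertices, edges)        # step 1
    s_q = max_clique_brute_force(vertices, q_edges)    # step 2
    return s_q                                         # step 3: S_P = S_Q here


# Star graph: vertex 0 is adjacent to 1, 2, 3, 4.
star_vertices = [0, 1, 2, 3, 4]
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(max_independent_set(star_vertices, star_edges))
```

Steps 1 and 3 are polynomial, so the overall algorithm inherits both the complexity and the exact or heuristic nature of the solver plugged in at step 2.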

Problem transformations and reductions

Polynomial reduction from P to Q: given any instance of P

• an algorithm A is executed a polynomial number of times to solve instances of a problem Q, each obtained in polynomial time from the instance of P and from the results of the previous runs;

• from the solutions computed, a solution of the instance of P is obtained.

Examples: BPP to PMSP and PMSP to BPP.

In both cases

• if A is polynomial/exponential, the overall algorithm turns out to be polynomial/exponential

• if A is exact/heuristic, the overall algorithm turns out to be exact/heuristic

Optimization vs. decision

A polynomial reduction links optimization and decision problems.

• Optimization problem: given a function f and a feasible region X, what is the minimum of f in X?

f* = min_{x ∈ X} f(x) = ?

• Decision problem: given a function f, a value k and a feasible region X, do solutions with a value not larger than k exist?

∃x ∈ X : f(x) ≤ k?

The two problems are polynomially equivalent:

• the decision problem can be solved by solving the optimization problem and then comparing the optimal value with k;

• the optimization problem can be solved by repeatedly solving the decision problem for different values of k, tuned by dichotomous (binary) search.
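The second direction can be sketched as a binary search over k. The sketch assumes that the objective takes integer values and that bounds on the optimum are known, so that the number of calls to the decision procedure is logarithmic in the width of the interval; the function names and the toy instance are hypothetical.

```python
def minimise_with_decision(decide, lower, upper):
    """Return (f_star, calls), with f_star the smallest integer k in
    [lower, upper] such that decide(k) is True.

    decide(k) answers the decision problem "does some x in X have
    f(x) <= k?", so it is False below the optimum and True from it on.
    Returns (None, 1) if even decide(upper) is False.
    """
    calls = 1
    if not decide(upper):
        return None, calls
    while lower < upper:
        mid = (lower + upper) // 2
        calls += 1
        if decide(mid):
            upper = mid          # a solution of value <= mid exists
        else:
            lower = mid + 1      # the optimum is larger than mid
    return lower, calls


# Toy instance: X is an explicit list and f is the identity.
X = [17, 42, 23, 8, 15]
f_star, calls = minimise_with_decision(
    lambda k: any(x <= k for x in X), lower=0, upper=100)
print(f_star, calls)
```

With an interval of width 100 the decision procedure is called only a handful of times, instead of once per candidate value of k.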

Drawbacks of worst-case analysis

The worst-case time complexity has some relevant drawbacks:

• it does not consider the performance of the algorithm on the easy/small instances; in practice the most difficult instances could be rare or unrealistic;

• it provides a rough estimate of the computing time growth, not of the computing time itself;

• the estimate can be very rough, up to the point it becomes useless;

• it may be misleading: algorithms with worse worst-case computational complexity can be very efficient in practice, even more than algorithms with better worst-case computational complexity.

Other complexity measures

To overcome these drawbacks one could employ different definitions of computational complexity:

• parameterized complexity expresses T as a function of some other relevant parameter k besides the size of the instance n: T(n, k)

• average-case complexity assumes a probability distribution on I and it evaluates the expected value of T(I) on I_n

T(n) = E[T(I) | I ∈ I_n]

If the distribution has some parameter k, the average-case complexity is also parameterized, i.e. it provides T(n, k).

Average-case complexity

Average-case complexity analysis and classification is more reliable when algorithms are efficient on almost all instances (e.g. the simplex algorithm for linear programming).

We would like to evaluate the expected value of T(I) on I_n for each n ∈ N

T(n) = E[T(I) | I ∈ I_n]

This requires defining the probability distribution of the instances.

• The most frequent hypothesis is equiprobability (when we do not have any other information);

• other assumptions must be based on some specific probabilistic model of the problem (often depending on some parameters).
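Under the equiprobability hypothesis the expected value is a plain average over the instances of size n. The sketch below uses sequential search as an example algorithm (an assumption, not taken from the slides): an instance of size n is a list of n distinct keys together with a target that is present, and every target position is equally likely.

```python
from fractions import Fraction


def sequential_search_comparisons(items, target):
    """T(I): comparisons performed until the target is found."""
    for count, item in enumerate(items, start=1):
        if item == target:
            return count
    return len(items)            # target absent: every item was compared


def average_and_worst(n):
    """Return (E[T(I)], max T(I)) over the n equiprobable instances of size n."""
    items = list(range(n))
    costs = [sequential_search_comparisons(items, t) for t in items]
    return Fraction(sum(costs), n), max(costs)


for n in (1, 9, 99):
    print(n, average_and_worst(n))
```

In this example the average is (n + 1)/2 while the worst case is n: both are in Θ(n), so here the two measures differ only by a constant factor.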

Random instances: binary matrices

Associating a probability with every instance of a problem is useful fortwo reasons:

• for a priori studying the average-case complexity of an algorithm;

• for a posteriori evaluating the efficiency of the algorithm.

In the case of heuristic algorithms we also want to evaluate their effectiveness (the value of the solutions obtained and the distance from the optimum).

Random binary matrices of given size (m, n):

1. model with uniform probability p:

Pr[a_ij = 1] = p    (i = 1, . . . , m; j = 1, . . . , n)

If p = 0.5 it provides equiprobability of all instances.

2. model with fixed density δ: given the mn entries of the matrix, δmn are randomly selected with uniform probability distribution and are set to 1.

The two models tend to be similar for p = δ.
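A minimal sketch of the two generators, using Python's standard `random` module. Rounding δmn to the nearest integer is an assumption of this example, since the product need not be an integer.

```python
import random


def uniform_probability_matrix(m, n, p, rng=random):
    """Model 1: each entry is 1 independently with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(n)]
            for _ in range(m)]


def fixed_density_matrix(m, n, delta, rng=random):
    """Model 2: exactly round(delta * m * n) entries, chosen uniformly, are 1."""
    ones = round(delta * m * n)
    chosen = set(rng.sample(range(m * n), ones))   # positions in row-major order
    return [[1 if i * n + j in chosen else 0 for j in range(n)]
            for i in range(m)]


rng = random.Random(0)   # fixed seed so that the experiment can be repeated
a = uniform_probability_matrix(5, 8, 0.25, rng)
b = fixed_density_matrix(5, 8, 0.25, rng)
print(sum(map(sum, a)), sum(map(sum, b)))
```

The number of ones varies from run to run in the first model, with expected value pmn, while it is always the same in the second.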

Random instances: graphs

Random graphs of size n can be generated as follows:

1. Gilbert model: G (n, p), i.e. uniform probability p:

Pr[(i, j) ∈ E] = p    (i ∈ V, j ∈ V \ {i})

Graphs with the same number of edges m have the same probability p^m (1 − p)^(n(n−1)/2 − m) (different for each m). If p = 0.5 it coincides with the model where all graphs have the same probability.

2. Erdős-Rényi model: G(n, m): given the number of edges m, m unordered vertex pairs are randomly selected with uniform probability distribution and an edge is generated for each of them.

The two models tend to be similar for p = 2m / (n(n − 1)).
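Both models can be sketched in a few lines for undirected graphs on the vertex set {0, ..., n − 1}, with edges stored as vertex pairs (i, j), i < j. The function names are chosen for this example only.

```python
import random
from itertools import combinations


def gilbert_graph(n, p, rng=random):
    """G(n, p): each of the n(n-1)/2 vertex pairs is an edge with probability p."""
    return {pair for pair in combinations(range(n), 2) if rng.random() < p}


def erdos_renyi_graph(n, m, rng=random):
    """G(n, m): m distinct vertex pairs chosen uniformly at random."""
    pairs = list(combinations(range(n), 2))
    return set(rng.sample(pairs, m))


rng = random.Random(1)   # fixed seed so that the experiment can be repeated
n, m = 10, 12
p = 2 * m / (n * (n - 1))            # matching value of p for G(n, m)
print(len(gilbert_graph(n, p, rng)), len(erdos_renyi_graph(n, m, rng)))
```

With this choice of p the expected number of edges of G(n, p) equals m, which is why the two models tend to behave similarly.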

Phase transitions

Different values of the parameters of the probability distributions correspond to different regions of the instance space.

For several problems we observe that the computing time of the algorithms is significantly different in different regions. In the case of heuristic algorithms the same holds for the quality of the solutions.

This has to do with the robustness of the algorithms.

In some cases the changes occur suddenly, for some critical values of the parameters, in a way reminiscent of phase transitions in physical systems.

Two things we can do

The design and analysis of heuristic algorithms proceeds in two directions:

• proving theoretical properties of the algorithms, such as:
    • worst-case time complexity (usually polynomial);
    • average-case time complexity or parameterized time complexity;
    • approximation guarantees;

• evaluating the practical usefulness of the algorithms:
    • computing time;
    • approximation;
    • robustness to instances and to parameters (phase transitions).

The termination is often (arbitrarily) decided on the basis of the number of iterations or the computing time elapsed or the lack of improvements for a certain time. It is used to calibrate the trade-off between approximation and computing time.
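The three termination rules can be combined in a single loop, as in the sketch below. It is a generic skeleton under stated assumptions: the problem is a minimisation, `step` is a hypothetical callable that proposes a new value given the best one found so far, and "lack of improvements for a certain time" is measured in iterations (`patience`).

```python
import time


def run_with_termination(step, initial_value, max_iterations, time_limit, patience):
    """Repeat `step` until one termination rule fires.

    Returns (best_value, iterations_performed, reason).
    """
    best = initial_value
    iterations = 0
    stalled = 0                          # iterations since the last improvement
    start = time.monotonic()
    while True:
        if iterations >= max_iterations:
            return best, iterations, "iteration limit"
        if time.monotonic() - start >= time_limit:
            return best, iterations, "time limit"
        if stalled >= patience:
            return best, iterations, "no improvement"
        candidate = step(best)
        iterations += 1
        if candidate < best:
            best, stalled = candidate, 0
        else:
            stalled += 1


# Toy step: improves by 1 until the value 0 is reached, then stalls.
def toy_step(value):
    return value - 1 if value > 0 else value


print(run_with_termination(toy_step, 10, max_iterations=1000,
                           time_limit=5.0, patience=5))
```

Larger limits give the heuristic more chances to improve the solution at the price of a longer computing time, which is the trade-off mentioned above.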
