NP-complete and NP-hard problems

  • Slide 1
  • NP-complete and NP-hard problems
  • Slide 2
  • Decision problems vs. optimization problems. The problems we are trying to solve are basically of two kinds. In decision problems we are trying to decide whether a statement is true or false. In optimization problems we are trying to find the solution with the best possible score according to some scoring scheme. Optimization problems can be either maximization problems, where we are trying to maximize a certain score, or minimization problems, where we are trying to minimize a cost function.
  • Slide 3
  • Example 1: Hamiltonian cycles. Given a directed graph, we want to decide whether or not there is a Hamiltonian cycle in this graph. This is a decision problem.
  • Slide 4
  • Example 2: TSP - The Traveling Salesman Problem. Given a complete graph and an assignment of weights to the edges, find a Hamiltonian cycle of minimum weight. This is the optimization version of the problem. In the decision version, we are given a weighted complete graph and a real number c, and we want to know whether or not there exists a Hamiltonian cycle whose combined weight of edges does not exceed c.
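As a minimal sketch of how the two versions of the TSP relate, the following Python code solves both by brute force over all tours. The function names and the matrix representation `weights[i][j]` are assumptions made for illustration, and the running time is exponential in the number of vertices, so this is only usable on tiny instances.

```python
from itertools import permutations


def tour_weight(weights, tour):
    """Total weight of the cycle that visits `tour` in order and returns to its start."""
    n = len(tour)
    return sum(weights[tour[i]][tour[(i + 1) % n]] for i in range(n))


def tsp_optimum(weights):
    """Optimization version: return (minimum weight, one tour attaining it)."""
    n = len(weights)
    best = None
    # Every cycle passes through vertex 0, so it is enough to fix 0 as the start.
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        w = tour_weight(weights, tour)
        if best is None or w < best[0]:
            best = (w, tour)
    return best


def tsp_decision(weights, c):
    """Decision version: is there a Hamiltonian cycle of weight at most c?"""
    n = len(weights)
    return any(
        tour_weight(weights, (0,) + rest) <= c
        for rest in permutations(range(1, n))
    )
```

The optimization version returns a tour and its weight, while the decision version only returns Yes or No for a given cutoff c.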
  • Slide 5
  • Example 3: Pairwise sequence alignment, optimization version. Given two sequences and a scoring scheme, find the highest-scoring (global or local) alignment of these sequences. This is an optimization problem. The same holds for multiple sequence alignment.
  • Slide 6
  • Example 3: Pairwise sequence alignment, decision version. Given two sequences, a scoring scheme, and a cutoff score c, decide whether there exists an alignment with a score that is higher than c. The same holds for multiple sequence alignment.
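To illustrate the two versions for global pairwise alignment, here is a small sketch using the standard dynamic-programming recurrence (Needleman-Wunsch). The scoring scheme (match +1, mismatch -1, linear gap penalty -2) and the function names are assumptions chosen for the example, not something prescribed by the slides.

```python
def global_alignment_score(s, t, match=1, mismatch=-1, gap=-2):
    """Optimization version: best global alignment score of s and t.

    Uses O(len(s) * len(t)) time and keeps only one row of the DP table.
    """
    prev = [j * gap for j in range(len(t) + 1)]  # aligning "" against t[:j]
    for i in range(1, len(s) + 1):
        curr = [i * gap]  # aligning s[:i] against ""
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            up = prev[j] + gap        # s[i-1] aligned with a gap
            left = curr[j - 1] + gap  # t[j-1] aligned with a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def alignment_decision(s, t, c, **scoring):
    """Decision version: is there an alignment with score higher than c?"""
    return global_alignment_score(s, t, **scoring) > c
```

Because the optimal score is computed in polynomial time, the decision version is answered by a single comparison with the cutoff.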
  • Slide 7
  • Important Observation: Each optimization problem has a corresponding decision problem.
  • Slide 8
  • Inputs. Each of the problems discussed above has its characteristic input. For example, for the optimization version of the TSP, the input consists of a weighted complete graph; for the decision version of the TSP, the input consists of a weighted complete graph and a real number. The input data of a problem are often called the instance of the problem. Each instance has a characteristic size, which is the amount of computer memory needed to describe the instance. If the instance is a graph of n vertices, then the size of this instance would typically be about n(n-1)/2, the number of pairs of vertices.
  • Slide 9
  • The class P. A decision problem D is solvable in polynomial time, or in the class P, if there exists an algorithm A such that:
      • A takes instances of D as inputs.
      • A always outputs the correct answer Yes or No.
      • There exists a polynomial p such that the execution of A on inputs of size n always terminates in p(n) or fewer steps.
  • Slide 10
  • The class P. Example: The local and global pairwise sequence alignment problems are in the class P for all scoring schemes used in practice. The class P is often considered synonymous with the class of computationally feasible problems, although in practice this is somewhat unrealistic, since a polynomial bound of high degree can still be too slow to be useful.
  • Slide 11
  • Witnesses for decision problems. Note that if the answer to a decision problem is yes, then there is usually some witness to this. For example, in the Hamiltonian cycle problem, any permutation v_1, v_2, ..., v_n of the vertices of the input graph is a potential witness. This potential witness is a true witness if v_1 is adjacent to v_2, v_2 is adjacent to v_3, ..., and v_n is adjacent to v_1.
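Checking a potential witness for the Hamiltonian cycle problem can be sketched as follows. The representation is an assumption made for the example: vertices are numbered 0..n-1 and the directed graph is given as a set of pairs (u, v).

```python
def is_hamiltonian_cycle(n, edges, order):
    """Return True if `order` is a true witness for the graph (n, edges).

    `edges` is a set of directed pairs (u, v); `order` is a sequence of vertices.
    """
    # A potential witness must list every vertex exactly once.
    if sorted(order) != list(range(n)):
        return False
    # Consecutive vertices, including the last back to the first, must be adjacent.
    return all((order[i], order[(i + 1) % n]) in edges for i in range(n))
```

The check takes time polynomial in n, even though the number of potential witnesses (permutations) grows factorially.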
  • Slide 12
  • Witnesses for decision problems. In the TSP, a potential witness would be a Hamiltonian cycle. This potential witness is a true witness if its cost is c or less.
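A corresponding check for the decision version of the TSP might look like the sketch below. It assumes, for illustration only, that the weights are given as a matrix `weights[i][j]` and that the potential witness is a tour listed as a permutation of the vertices 0..n-1.

```python
def is_tsp_witness(weights, c, tour):
    """Return True if `tour` is a Hamiltonian cycle of total cost at most c."""
    n = len(weights)
    # The tour must visit every vertex exactly once.
    if sorted(tour) != list(range(n)):
        return False
    # Sum the weights along the cycle, including the edge back to the start.
    total = sum(weights[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= c
```

Verifying a given tour needs only n additions and one comparison, which is very different from searching for a good tour.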
  • Slide 13
  • Example 4: factorization. Given an integer N and an integer M with 1 ≤ M ≤ N, does N have a factor d with 1 < d < M? In this problem, a witness for a yes answer is a factor as above, and a no answer can easily be computed from a complete factorization of N, which can also be considered a kind of witness.
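Both kinds of witness for the factorization problem can be checked with a few arithmetic operations, as in the sketch below. The function names are hypothetical, and the second function simply trusts that the supplied list really consists of primes; verifying primality is a separate step that is not shown here.

```python
from math import prod


def is_factor_witness(N, M, d):
    """Yes-witness check: d is a factor of N with 1 < d < M."""
    return 1 < d < M and N % d == 0


def no_answer_from_factorization(N, M, primes):
    """No-witness check, ASSUMING `primes` is a list of prime numbers.

    If the primes multiply to N and the smallest one is at least M,
    then N has no factor d with 1 < d < M.
    """
    if not primes:
        return N == 1  # 1 has no factor greater than 1
    return prod(primes) == N and min(primes) >= M
```

For example, 7 is a yes-witness for (N, M) = (91, 10), while the factorization 91 = 7 * 13 justifies the answer no for (N, M) = (91, 7).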
  • Slide 14
  • The class NP. A decision problem D is nondeterministically polynomial-time solvable, or in the class NP, if there exists an algorithm A such that:
      • A takes as inputs potential witnesses for yes answers to problem D (together with the corresponding instance).
      • A correctly distinguishes true witnesses from false witnesses.
      • There exists a polynomial p such that for each potential witness of each instance of size n of D, the execution of the algorithm A takes at most p(n) steps.
  • Slide 15
  • The class NP. Note that if a problem is in the class NP, then we are able to verify yes-answers in polynomial time, provided that we are lucky enough to guess true witnesses. Homework 1: Show that both the factorization problem of a previous slide and its negation are in the class NP.
  • Slide 16
  • The P=NP Problem. Are the classes P and NP identical? This is an open problem; it may well be the biggest open problem of mathematics at the beginning of the 21st century. It is not hard to show that every problem in P is also in NP, but it is unclear whether every problem in NP is also in P.
  • Slide 17
  • The P=NP Problem. The best we can say is that thousands of computer scientists have tried unsuccessfully for decades to design polynomial-time algorithms for some problems in the class NP. This constitutes overwhelming empirical evidence that the classes P and NP are indeed distinct, but no formal mathematical proof of this fact is known.
  • Slide 18
  • Polynomial-time reducibility. Let E and D be two decision problems. We say that D is polynomial-time reducible to E if there exists an algorithm A such that:
      • A takes instances of D as inputs and always outputs the correct answer Yes or No for each instance of D.
      • A uses as a subroutine a hypothetical algorithm B for solving E.
      • There exists a polynomial p such that for every instance of D of size n the algorithm A terminates in at most p(n) steps if each call of the subroutine B is counted as only m steps, where m is the size of the actual input of B.
  • Slide 19
  • Polynomial-time reducibility. The notion of polynomial-time reducibility can be extended to the case where only E is a decision problem. Homework 2: Give the corresponding definition and show that finding the factorization of a given positive integer is polynomial-time reducible to the decision problem about factorization from a previous slide. What does this imply for prime factorization if the latter decision problem is in fact in the class P?
  • Slide 20
  • An example of polynomial-time reducibility. Theorem: The Hamiltonian cycle problem is polynomial-time reducible to the decision version of TSP. Proof: Given an instance G with vertices v_1, ..., v_n of the Hamiltonian cycle problem, let H be the weighted complete graph on v_1, ..., v_n such that the weight of an edge {v_i, v_j} in H is 1 if {v_i, v_j} is an edge in G, and is 2 otherwise. Every Hamiltonian cycle in H has n edges, so its weight is exactly n if all of its edges belong to G and at least n+1 otherwise. Hence the correct answer for the instance G of the Hamiltonian cycle problem can be obtained by running an algorithm for the decision version of the TSP on the instance (H, n).
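The reduction can be sketched in Python as follows. The cutoff is taken to be n, so that under the "does not exceed c" convention of the decision version only cycles made entirely of weight-1 edges qualify. The brute-force function stands in for the hypothetical TSP subroutine B; all names are assumptions made for illustration, G is treated as an undirected graph on vertices 0..n-1, and n is assumed to be at least 3.

```python
from itertools import permutations


def hamiltonian_to_tsp(n, edges):
    """Build the TSP decision instance (H, c) from the graph G = (n, edges)."""
    edge_set = {frozenset(e) for e in edges}
    weights = [
        [0 if i == j else (1 if frozenset((i, j)) in edge_set else 2)
         for j in range(n)]
        for i in range(n)
    ]
    return weights, n  # cutoff c = n


def tsp_decision_bruteforce(weights, c):
    """Stand-in for subroutine B: exponential-time TSP decision procedure."""
    n = len(weights)
    return any(
        sum(weights[t[i]][t[(i + 1) % n]] for i in range(n)) <= c
        for t in ((0,) + rest for rest in permutations(range(1, n)))
    )


def has_hamiltonian_cycle(n, edges):
    """Algorithm A: answer the Hamiltonian cycle question with one call to B."""
    weights, c = hamiltonian_to_tsp(n, edges)
    return tsp_decision_bruteforce(weights, c)
```

Building H takes about n^2 steps, so apart from the single call to B the reduction itself runs in polynomial time.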
  • Slide 21
  • NP-complete problems. A decision problem E is NP-complete if E is in the class NP and every problem in the class NP is polynomial-time reducible to E. The Hamiltonian cycle problem, the decision versions of the TSP and the graph coloring problem, as well as literally hundreds of other problems are known to be NP-complete. It is not hard to show that if a problem D is polynomial-time reducible to a problem E and E is in the class P, then D is also in the class P. It follows that if there exists a polynomial-time algorithm for the solution of any of the NP-complete problems, then there exist polynomial-time algorithms for all of them, and P = NP.
  • Slide 22
  • NP-hard problems. Optimization problems whose decision versions are NP-complete are called NP-hard. Theorem: If there exists a polynomial-time algorithm for finding the optimum in any NP-hard problem, then P = NP.
  • Slide 23
  • NP-hard problems. Proof: Let E be an NP-hard optimization (let us say minimization) problem, and let A be a polynomial-time algorithm for solving it. Now an instance J of the corresponding decision problem D is of the form (I, c), where I is an instance of E, and c is a number. Then the answer to D for instance J can be obtained by running A on I and checking whether the cost of the optimal solution exceeds c; the answer is Yes exactly when it does not. Thus there exists a polynomial-time algorithm for D, and NP-completeness of the latter implies P = NP.
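The argument of the proof can be sketched in code: any solver for a minimization problem yields a solver for the corresponding decision problem with one extra comparison. In this sketch the brute-force TSP routine is only a stand-in for the hypothetical polynomial-time algorithm A, and the function names are assumptions made for the example.

```python
from itertools import permutations


def tsp_min_cost(weights):
    """Stand-in for algorithm A: minimum tour cost (brute force, exponential time)."""
    n = len(weights)
    return min(
        sum(weights[t[i]][t[(i + 1) % n]] for i in range(n))
        for t in ((0,) + rest for rest in permutations(range(1, n)))
    )


def decide_with_optimizer(min_cost, instance, c):
    """Answer the decision instance (instance, c) using an optimizer for E.

    The answer is Yes exactly when the optimal cost does not exceed c.
    """
    return min_cost(instance) <= c
```

If `min_cost` ran in polynomial time, so would `decide_with_optimizer`, which is the step that forces P = NP when the decision problem is NP-complete.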
  • Slide 24
  • Multiple Sequence Alignment Problem. Theorem (Wang and Jiang, 1994): There exists a scoring scheme for which the Multiple Sequence Alignment problem is NP-hard. Unfortunately, the scoring scheme used in this result is biologically unrealistic.
  • Slide 25
  • Multiple Sequence Alignment Problem. Theorem (Just, 2001): The Multiple Sequence Alignment problem is NP-hard for every biologically realistic scoring matrix with SP-score and affine gap penalties. This has been subsequently extended by another author to some scoring schemes different from SP-score.
  • Slide 26
  • Consequences for bioinformatics. In view of the overwhelming empirical evidence against the equality P = NP, it seems that no NP-hard optimization problem is solvable by an algorithm that is guaranteed to (1) run in polynomial time and (2) always produce a solution with optimal score/cost. Unfortunately, many, perhaps most, of the important optimization problems in bioinformatics are