NP-Completeness

Algorithms and Networks 2015/2016

Hans L. Bodlaender, Johan M. M. van Rooij

Today and next week

Complexity of computational problems

Formal notion of computations

NP-completeness: Why is it relevant? What is it exactly? How is it proven?

P vs. NP

Some animals from the complexity zoo

INTRODUCTION

Easy Problems / Hard Problems

Easy:

Finding the shortest simple path between vertices v and w in a given graph.

Determine if there is an Euler tour in a given graph.

Testing 2-colourability.

Satisfiability when each clause has two literals.

Hard:

Finding the longest simple path between vertices v and w in a given graph.

Determine if there is a Hamiltonian circuit in a given graph.

Testing 3-colourability.

Satisfiability when each clause has three literals.

Fast and Slow

Algorithms whose running time is polynomial in input size.

Algorithms whose running time is exponential in input size.

Or worse…

Or in between???

EXPTIME

Many problems appear not to have a polynomial time algorithm.

For a few, we can prove that every algorithm needs exponential time: EXPTIME-hardness, in particular for generalized games (Generalized Go with Japanese rules, Generalized Chess).

For most, including many important and interesting problems, we cannot.

NP-Completeness

Theory shows relations and explains behavior of many combinatorial problems

From many fields: logic; graphs, networks, logistics, scheduling; databases; compiler optimization; graphics; …

We need formalisation!

Formal notion of: Problem instance. Decision problem. Computation. Running time.

ABSTRACT AND FORMAL PROBLEMS


Different versions of problems

Decision problems. Answer is yes or no.

Optimization problems. Answer is a number representing an objective value.

Construction problems. Answer is some object (set of vertices, function, …).

Counting problems. How many objects of some kind exist?

Focus on decision problems

Abstract versus formal problems

Abstract: Talks about graphs, logic formulas, applications, …

Formal: Inputs are strings over a finite alphabet; a problem is a set of such strings.

Abstract and formal problem instances

Computers work with bit strings. Problems are described using objects:

G is a graph, … Given a logic formula F, … Is there a clique C of size at least k, ...

To get a formal instance, we must map objects to bit strings. Or another encoding like Gödel numbers...

Abstract and formal problem instances

[Figure: example graph on the vertices 1 to 8.]

Adjacency matrix of the graph, one possible encoding as bits:

0 1 1 0 0 0 1 0
1 0 1 0 0 0 0 0
1 1 0 1 1 1 0 0
0 0 1 0 1 0 0 0
0 0 1 1 0 0 0 0
0 0 1 0 0 0 1 0
1 0 0 0 0 1 0 1
0 0 0 0 0 0 1 0
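A minimal sketch of how an abstract instance could become a formal one, assuming the 8 by 8 matrix on this slide is the adjacency matrix of the example graph. The function names `encode_graph` and `decode_graph` are illustrative and not part of the lecture; other encodings work just as well.

```python
from math import isqrt

# Adjacency matrix of the example graph (rows and columns are vertices 1..8).
ADJ = [
    [0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 1, 0],
]


def encode_graph(adj):
    """Flatten a square 0/1 matrix into a bit string, row by row."""
    n = len(adj)
    assert all(len(row) == n for row in adj), "matrix must be square"
    return "".join(str(bit) for row in adj for bit in row)


def decode_graph(bits):
    """Inverse of encode_graph: rebuild the n x n matrix from n*n bits."""
    n = isqrt(len(bits))
    assert n * n == len(bits), "length must be a perfect square"
    return [[int(b) for b in bits[r * n:(r + 1) * n]] for r in range(n)]


encoded = encode_graph(ADJ)
```

The encoded instance has length n², which is polynomial in the number of vertices, so polynomial running time in the abstract sense and in the formal sense coincide for this encoding.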

Abstract decision problems

Abstract decision problem: Set of instances I.

Objects like graphs, integers, logic formulas, etc. Subset of I: instances where the answer to the problem is YES.

Formal problem encoding

Encoding: mapping of set of instances I to bitstrings in {0,1}*.

Formal (decision) problem: Subset of {0,1}*.

By using an encoding, an abstract decision problem maps to a concrete decision problem.

Language of a problem

Decision problem as a language: the set of all Yes-instances.

P AND NP

Solving a formal problem

An algorithm solves a formal problem: Decides if string in {0,1}* belongs to the subset.

Algorithm uses polynomial time, if there is a polynomial p such that on inputs of length n the algorithm uses at most p(n) computation steps. Size of input x is denoted |x|.

Time: Deterministic. Worst case.

Computation steps. Depend on model of computation. Turing machine. RAM-model with word size.

Complexity Class P

Formal: Class of languages L, for which there exists a deterministic Turing Machine deciding whether s ∈ L, using running time O( p(|s|) ) for some polynomial p.

Abstract: Class of decision problems that have polynomial time algorithms solving them.

Using P for abstract problems

Abstract problem (with encoding) is in P, if the resulting formal problem is in P.

In practice: the encoding and the corresponding concrete problem are assumed very implicitly.

When considering polynomial time algorithms, the encoding mostly does not matter, provided that we can transform the encodings into each other in polynomial time.

Details: see e.g., chapter 34 of Introduction to Algorithms.

Verification algorithm

Verification algorithm has two arguments: Problem input, Certificate (“solution”).

Answers “yes” or “no”.

The algorithm checks if the 2nd argument is a certificate for the first argument for the studied problem.

The language verified by the verification algorithm A is {s | there is a c with A(s,c)=Yes}.

Complexity Class NP

Two equivalent definitions of NP:

Class of languages L, for which there exists a Non-Deterministic Turing Machine deciding whether i ∈ L, using running time O(p(|i|)) for some polynomial p.

Class of languages L, for which there exists a Deterministic Turing Machine verifying whether i ∈ L, using a polynomial sized certificate c, and using running time O(p(|i|)). This is the definition we use: details follow.

NP

Problems with polynomial time verification algorithm and polynomial size certificates.

Problem L belongs to the class NP, if there exists a 2-argument algorithm A, with:

A runs in polynomial time.

There is a constant d such that for each i, i is a yes-instance to L, if and only if there is a certificate c with:
• A(i,c) = true.
• |c| = O(|i|^d).
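To make the definition concrete, here is a sketch of a 2-argument verification algorithm for Hamiltonian Circuit. The representation is an assumption made for this example: an instance is a pair (n, edges) with vertices 0..n-1, and the certificate is an ordering of the vertices.

```python
def verify_hamiltonian_circuit(instance, certificate):
    """A(i, c): True iff c lists the vertices of a Hamiltonian circuit of i."""
    n, edges = instance
    edge_set = {frozenset(e) for e in edges}
    # A simple cycle needs at least 3 vertices.
    if n < 3:
        return False
    # The certificate must contain every vertex exactly once.
    if sorted(certificate) != list(range(n)):
        return False
    # Consecutive vertices (cyclically) must be joined by an edge.
    return all(
        frozenset((certificate[k], certificate[(k + 1) % n])) in edge_set
        for k in range(n)
    )


# Example instance: a cycle on 4 vertices.
square = (4, [(0, 1), (1, 2), (2, 3), (3, 0)])
```

The certificate has n entries, so its size is linear in the input, and the check runs in polynomial time. Note that the verifier only checks a proposed circuit; it does not search for one.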

Many problems are in NP

Examples: Hamiltonian Path, Maximum Independent Set, Satisfiability, Vertex Cover, …

All of these have trivial certificates (set of vertices, truth assignment, …)

In NP (not trivial): Integer Linear Program.

P ⊆ NP

If A decides L in polynomial time, then as verification algorithm, compute: B(i,c) = A(i). “We do not need a certificate” !

Famous open problem: P = NP ?? Or not??

NP-COMPLETENESS AND THE COOK-LEVIN THEOREM


Reducibility

Language L1 is polynomial time reducible to language L2 (or: L1 ≤P L2), if there exists a polynomial time computable function f : {0,1}* → {0,1}* such that: for all x ∈ {0,1}*: x ∈ L1, if and only if, f(x) ∈ L2.

Lemma: If L1 ≤P L2 then: if L2 ∈ P, then L1 ∈ P.

Proof-idea: run an algorithm for L2 on f(i) for input i for problem L1.

Lemma: If L1 ≤P L2 then: if L2 ∈ NP, then L1 ∈ NP.

NP-completeness

A language L is NP-complete, if:
1. L ∈ NP.
2. For every L’ ∈ NP: L’ ≤P L.

A language L is NP-hard, if for every L’ ∈ NP: L’ ≤P L.

NP-hardness sometimes also used as term for problems that are not a decision problem, and for problems that are ‘harder than NP’.

What does it mean to be NP-Complete?

Evidence that it is (very probably) hard to find an algorithm that solves the problem. Always. Exactly. In polynomial time.

CNF-Satisfiability

Input: Expression over Boolean variables in conjunctive normal form.

Question: Is the expression satisfiable? I.e., can we give each variable a value (true or false) such that the expression becomes true?

CNF: “and” of clauses; each clause is an “or” of variables or negations, e.g. (xi or not(xj)).

Cook-Levin theorem

Satisfiability is NP-complete.

Most well known is Cook’s proof, using the Turing machine characterization of NP.

Brief sketch of ideas in proof of Cook-Levin theorem

Suppose problem Q is in NP. Build a Turing Machine that:

Gets on an input tape a string (i, c), with c a certificate of size O(|i|^d).

Runs the verification algorithm A in O(|i|^q) time. d, q constants.

Make a logic formula that expresses “A halts in an accepting state on (i,c)”. Variables:

• a(t,r,s): true if at time t, the rth symbol on the tape is symbol s.
• b(t,a): true if the machine is at time t in state a.
• c(t,r): true if at time t, the reading head is at position r on the tape.

Lots of clauses that express that the variables act like the Turing Machine tells them to.

PROVING THAT PROBLEMS ARE NP-COMPLETE


Proving problems NP-complete

Lemma
1. Let L’ ≤P L and let L’ be NP-complete. Then L is NP-hard.

2. Let L’ ≤P L and let L’ be NP-complete, and L ∈ NP. Then L is NP-complete.

3-SAT

3-Sat is the special case of CNF-Satisfiability where each clause has exactly three literals.

Lemma: CNF-Satisfiability ≤P 3-Sat

Clauses with one or two literals: Use two extra variables p and q. Replace (x or y) by (x or y or p) and (x or y or ¬p). Similarly, replace a 1-literal clause by 4 clauses.

Clauses with more than three literals: For (l1 or l2 or … or lr) add a new variable t and take as replacement clauses (l1 or l2 or t) and (¬t or l3 or … or lr). Repeat this until the last clause has size three.
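The replacement rules above can be sketched in code. The representation is an assumption for this example: a literal is a nonzero integer (k for xk, -k for ¬xk), a formula is a list of non-empty clauses, and, unlike the slide, fresh helper variables are created per clause rather than shared. The brute-force `satisfiable` function is only there to check equisatisfiability on tiny inputs.

```python
from itertools import product


def to_3sat(clauses, num_vars):
    """Return (clauses3, num_vars3) with exactly three literals per clause."""
    out = []
    next_var = num_vars

    def fresh():
        nonlocal next_var
        next_var += 1
        return next_var

    for clause in clauses:
        clause = list(clause)
        if len(clause) == 1:
            # One literal: pad with p and q in all four sign combinations.
            x, p, q = clause[0], fresh(), fresh()
            out += [[x, p, q], [x, p, -q], [x, -p, q], [x, -p, -q]]
        elif len(clause) == 2:
            # Two literals: pad with p and with not p.
            (x, y), p = clause, fresh()
            out += [[x, y, p], [x, y, -p]]
        else:
            # More than three literals: split off two literals at a time.
            while len(clause) > 3:
                t = fresh()
                out.append([clause[0], clause[1], t])
                clause = [-t] + clause[2:]
            out.append(clause)
    return out, next_var


def satisfiable(clauses, num_vars):
    """Exponential-time check, for testing only."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False
```

A clause with r > 3 literals becomes r - 2 clauses with r - 3 new variables, so the output size is linear in the input size and the transformation takes polynomial time.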

3-SAT is NP-complete

Proof:
1. Membership in NP (almost trivial).
2. Reduction (last slide).

3-Sat is important starting problem for many NP-completeness proofs.

Clique

Clique Given: graph G=(V,E), integer k. Question: does G have a clique with at least k vertices?

Clique is NP-complete. In NP … easy! NP-hardness: using 3-sat.

Reduction for Clique

One vertex per literal per clause.

Edges between vertices in different clauses, except edges between xi and ¬xi.

If m clauses, look for clique of size m.

[Figure: example with the clauses {x1, ¬x2, x3} and {x1, x2, ¬x3}, …; one vertex per literal in each clause.]

Correctness

Lemma: There is a satisfying truth assignment, if and only if, there is a clique with m vertices.

⇒ : Select from each clause a true literal. The corresponding vertices form a clique with m vertices.

⇐ : Set variable xi to true, if a vertex representing xi is in the clique, otherwise set it to false. This is a satisfying truth assignment:

The clique must contain one vertex from each 3 vertices representing a clause.

It cannot contain a vertex representing xi and a vertex representing ¬xi.
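A sketch of the construction, under the assumption that literals are nonzero integers (k for xk, -k for ¬xk) and that a vertex is a pair (clause index, literal). The helper `has_clique` is a brute-force check for small examples only, not a practical algorithm.

```python
from itertools import combinations


def sat_to_clique(clauses):
    """Map a CNF formula with m clauses to a Clique instance (vertices, edges, m)."""
    # One vertex per literal per clause.
    vertices = [(ci, lit) for ci, clause in enumerate(clauses) for lit in clause]
    # Edges join vertices of different clauses unless the literals are complementary.
    edges = {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if u[0] != v[0] and u[1] != -v[1]
    }
    return vertices, edges, len(clauses)


def has_clique(vertices, edges, k):
    """Exponential-time check whether some k vertices are pairwise adjacent."""
    return any(
        all(frozenset(pair) in edges for pair in combinations(group, 2))
        for group in combinations(vertices, k)
    )
```

The construction itself does not depend on clauses having exactly three literals; it yields at most (3m)² edges for a 3-Sat formula with m clauses, so it runs in polynomial time.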

Independent set

Independent set: set of vertices W ⊆ V, such that for all v,w ∈ W: {v,w} ∉ E.

Independent set problem: Given: graph G, integer k. Question: Does G have an independent set of size at least k?

Independent set is NP-complete.

Independent set is NP-complete

In NP.

NP-hard: transform from Clique.

W is a clique in G, if and only if, W is an independent set in the complement of G (there is an edge in Gc iff. there is no edge in G).

How do I write down this proof?

Theorem: Independent Set is NP-complete.

Proof: The problem belongs to NP: as certificates, we use subsets of vertices of the input graph; we can check in polynomial time for a given subset of vertices that it is an independent set and that its size is at least k.

To show NP-hardness, we use a reduction from Clique. Let (G,k) be an instance of the clique problem. Transform this to instance (Gc,k) of the independent set problem with Gc the complement of G. As G has a clique with k vertices, if and only if, Gc has an independent set with k vertices, this is a correct transformation.

The transformation can clearly be carried out in polynomial time.
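The transformation in this proof is short enough to write out. A minimal sketch, assuming vertices are numbered 0..n-1 and a graph is given as an edge list; the function names are chosen for this example.

```python
from itertools import combinations


def clique_to_independent_set(n, edges, k):
    """Map the Clique instance (G, k) to the Independent Set instance (Gc, k)."""
    present = {frozenset(e) for e in edges}
    complement = [
        (u, v)
        for u, v in combinations(range(n), 2)
        if frozenset((u, v)) not in present
    ]
    return n, complement, k


def is_clique(edges, subset):
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(subset, 2))


def is_independent_set(edges, subset):
    present = {frozenset(e) for e in edges}
    return not any(frozenset(p) in present for p in combinations(subset, 2))
```

Building the complement looks at each of the n(n-1)/2 vertex pairs once, which is where the polynomial time bound in the proof comes from.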

Writing an NP-Completeness proof

State the theorem.

Proof starts (or ends) with showing that the problem belongs to NP. Sometimes (e.g., in a journal paper) it suffices to write: trivial. Only if it is. Otherwise, explain the encoding of certificates and how / that they can be checked in polynomial time.

State which known NP-complete problem you transform from. Careful: do not go in the wrong direction.

Explain the transformation.

Give the proof: the instance of the known NP-complete problem is a Yes-instance, if and only if, the transformed instance is a Yes-instance of your problem. Remember: you need to prove this in two directions.

Phrase (or prove if not trivial): the transformation can be carried out in polynomial time, hence the problem is NP-complete.

QED.

Some comments on ‘proofs’ by former students

Things I do not want to see in your ‘proofs’:

If problem P would return true, then …

Input/output of a problem.

There is no polynomial time algorithm for P, thus there is no polynomial time algorithm for Q.

“Certificate to Hamilton cycle is a set of vertices.”

Terminology on reductions with problems/instances: A reduction reduces one problem to another problem. Yet the reduction transforms an instance of problem A into an instance of problem B.

After your reduction always prove that:

Your instance is a Yes-instance, if and only if, transformed instance is a Yes-instance.

Or if instance is a Yes-instance then transformed instance is a Yes-instance, and if instance is a No-instance then transformed instance is a No-instance.

Vertex Cover

Set of vertices W ⊆ V with for all {x,y} ∈ E: x ∈ W or y ∈ W.

Vertex Cover problem: Given G, find vertex cover of minimum size.

[Figure: example graph with the vertices of a vertex cover marked.]

Vertex cover is NP-Complete

In NP.

NP-hard: reduction from independent set.

W is a vertex cover in G, if and only if, V-W is an independent set in G.
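The complement property can be checked exhaustively on a small graph. This sketch assumes the decision version of Vertex Cover (is there a vertex cover of size at most k?), which the slide does not state explicitly; the function names are illustrative.

```python
from itertools import combinations


def is_vertex_cover(edges, w):
    return all(u in w or v in w for u, v in edges)


def is_independent_set(edges, w):
    return not any(u in w and v in w for u, v in edges)


def independent_set_to_vertex_cover(n, edges, k):
    """(G, k) for Independent Set becomes (G, n - k) for Vertex Cover."""
    return n, edges, n - k


# Check "W is a vertex cover iff V - W is independent" on every subset W.
n, edges = 5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
vertices = set(range(n))
complement_property_holds = all(
    is_vertex_cover(edges, set(w)) == is_independent_set(edges, vertices - set(w))
    for size in range(n + 1)
    for w in combinations(range(n), size)
)
```

The graph is left unchanged by the reduction; only the bound changes from k to n - k.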

3-Colouring

3-Colouring
Given: Graph G=(V,E)
Question: Can we colour the vertices with 3 colours, such that for all edges {v,w} in E, the colour of v differs from the colour of w?

3-colouring is NP-Complete.

[Figure: example graph with its vertices coloured 1, 2 and 3.]

Proof of NP-Completeness of 3-Colouring

In NP: should be clear.

Transformation from 3-sat: We build a graph in 3 steps:

1. Take a clique with 3 vertices True, False, C.
2. Take two adjacent vertices for each variable.

We name the colours: T, F, C

[Figure: the triangle on T, F, C and the adjacent vertex pairs x1, ¬x1 and x2, ¬x2.]

Continue

3. For each clause {l1,l2,l3}, take the following gadget:

[Figure: clause gadget connected to the literal vertices l1, l2, l3 and to the vertices T and C.]

NP-Completeness of 3-Colouring: Wrap-up

One can show:

If the formula is satisfiable, then there is a 3-colouring of G.

If there is a 3-colouring of G, then the formula is satisfiable.

In both cases, the intuition is that a literal vertex is coloured T (true), if and only if, we take it to be true in the formula.

The transformation takes polynomial time.

THREE APPROACHES TO PROVE NP-COMPLETENESS


Approaches to proving NP-Hardness

Three approaches to prove NP-hardness:
1) Restriction
2) Local replacement
3) Component Design

Approach 1: Restriction

Take the problem.

Add a restriction to the set of instances. NOT to the problem definition!

Show that this is a known NP-complete problem.

Restriction: Weighted Vertex Cover

Weighted vertex cover
Given: Graph G=(V,E), for each vertex v ∈ V, a positive integer weight w(v), integer k.
Question: Does G have a vertex cover of total weight at most k?

NP-Complete.
In NP.
NP-Hardness: reduction from Vertex Cover to Weighted Vertex Cover (set all weights to 1).

This is the easiest form of reduction.

Restriction: Knapsack

Subset-sum (assume we know it is NP-complete)
Given: set of positive integers S, integer t.
Question: Is there a subset of S with total sum t?

Knapsack
Given: Set S of items, each with integer value v and integer weight w, integers W and V.
Question: is there a subset of S of weight no more than W, with total value at least V?

NP-complete
In NP.
NP-hardness: restrict to instances where each weight equals its value and W = V (this is Subset sum with t = W).
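A sketch of this restriction argument as a transformation from Subset-sum to Knapsack. Items are represented as (value, weight) pairs, which is an assumption of this example, and the two brute-force solvers exist only to compare answers on small inputs.

```python
from itertools import combinations


def subset_sum_to_knapsack(numbers, t):
    """Each number x becomes an item with value x and weight x; W = V = t."""
    items = [(x, x) for x in numbers]  # (value, weight) pairs
    return items, t, t                 # items, capacity W, target value V


def subset_sum_brute(numbers, t):
    return any(
        sum(c) == t
        for r in range(len(numbers) + 1)
        for c in combinations(numbers, r)
    )


def knapsack_brute(items, W, V):
    return any(
        sum(w for _, w in c) <= W and sum(v for v, _ in c) >= V
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )
```

With weight equal to value, "weight at most t" and "value at least t" together force the chosen total to be exactly t, which is why the two answers agree.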

Approach 2: Local replacement

Form an instance of our problem by:

Taking an instance of a known NP-Complete problem.

Making some change “everywhere”.

Such that we get an equivalent instance, but now of the problem we want to show NP-hard.

Examples of Local Replacement

We saw: 3-Satisfiability (from Satisfiability). Clique (from 3-Satisfiability). Independent Set (from Clique). Vertex Cover (from Independent Set).

We will see: Travelling Salesman Problem (from Hamiltonian Cycle). Planar 3-Colouring (from 3-Colouring).

NP-Completeness of Travelling Salesman Problem by local replacement

In NP.

Reduction from Hamiltonian Circuit: Take a city for each vertex. Take cost(i,j) = 1 if {i,j} ∉ E. Take cost(i,j) = 0, if {i,j} ∈ E. G has HC, if and only if, there is a TSP-tour of length 0.

Remark: TSP variant with triangle inequality: Use weights 2 and 1. G has HC, if and only if, there is a TSP-tour of length n.
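Both variants of this reduction can be sketched with one function. The `metric` flag, the cost-matrix representation and the brute-force tour check are assumptions made for this example; the decision version of TSP used here asks for a tour of total cost at most a given budget.

```python
from itertools import permutations


def hc_to_tsp(n, edges, metric=False):
    """Return (cost matrix, budget) for the graph on vertices 0..n-1."""
    present = {frozenset(e) for e in edges}
    # Plain variant: costs 0 / 1, budget 0. Metric variant: costs 1 / 2, budget n.
    edge_cost, non_edge_cost = (1, 2) if metric else (0, 1)
    cost = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                is_edge = frozenset((i, j)) in present
                cost[i][j] = edge_cost if is_edge else non_edge_cost
    return cost, (n if metric else 0)


def has_tour_within(cost, budget):
    """Exponential-time check over all tours that start at city 0."""
    n = len(cost)
    if n < 3:
        return False
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        length = sum(cost[tour[k]][tour[(k + 1) % n]] for k in range(n))
        if length <= budget:
            return True
    return False
```

In the metric variant every cost is 1 or 2, so any two-step detour costs at least 2 and the triangle inequality holds automatically.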

NP-Completeness of Planar 3-Colouring by Local Replacement

Planar 3-colouring
Given: Planar graph G=(V,E).
Question: Can we colour the vertices of G with 3 colours, such that adjacent vertices have different colours?

Plan: transform from 3-colouring. Take arbitrary graph G. Draw G on the plane. So, we possibly get some crossings. Replace the crossings by something clever.

Clever reduction

Garey et al. found the following:

[Figure: a crossing and the planar gadget that replaces it, each drawn with the labelled vertices a, b, c, d.]

Property of the gadget

If a and b have the same colour, we cannot colour the gadget with 3 colours.

If c and d have the same colour, we cannot colour the gadget with 3 colours.

Otherwise, we can.

... Some additional details (basically, repeat the step) when an edge has more than one crossing …

Approach 3: Component design

Component design (gadgeteering).

Build (often complicated) parts of an instance with certain properties. Often we call these parts components, gadgets, or widgets.

Glue them together in such a way that the proof works.

Examples: 3-Colouring (from 3-Satisfiability). Hamiltonian Circuit (from Vertex Cover).

Hamiltonian circuit

Given: Graph G
Question: does G have a simple cycle that contains all vertices?

NP-Completeness of Hamiltonian Circuit

Hamiltonian Circuit is in NP.

Vertex Cover ≤P Hamiltonian Circuit: complicated proof (component design). Widgets. Selector vertices.

Given a graph G and an integer k, we construct a graph H, such that H has a HC, if and only if G has a VC of size k.

Widget

For each edge {u,v} we have a widget Wuv.

[Figure: widget Wuv with the vertices [u,v,1], [u,v,6], [v,u,1], [v,u,6] labelled, and the only possible ways to visit all vertices in the widget.]

Selector vertices

We have k selector vertices s1, …, sk. These will represent the vertices selected for the vertex cover.

Connecting the widgets

For each vertex v we connect the widgets of the edges {v,w}.

Suppose v has neighbors x1, …, xr: add edges {[v,x1,6],[v,x2,1]}, {[v,x2,6],[v,x3,1]}, …, {[v,xr-1,6],[v,xr,1]}.

Connecting the selector vertices to the widgets

Each selector vertex is attached to the first neighbor widget of each vertex, i.e. to vertex [v,x1,1], and to the last neighbor widget, i.e. to vertex [v,xr,6].

[Figure: selector vertex si connected to the widgets of a vertex; the vertex in the example has degree 2.]

Correctness of reduction

Lemma: G has a vertex cover of size (at most) k, if and only if H has a Hamiltonian circuit.

The reduction takes polynomial time.

So, we can conclude that Hamiltonian Circuit is NP-complete.

WEAK AND STRONG NP-COMPLETENESS


Problems with numbers

Strong NP-complete: Problem is NP-complete if numbers are given in unary.

Weak NP-complete: Problem is NP-complete if numbers are given in binary, but polynomial time solvable when numbers are given in unary.

Examples

Subset-sum Given: set of positive integers S, integer t. Question: Is there a subset of S with total sum t?

Subset-sum is Weak NP-complete. Solvable using dynamic programming in O(nt) time. That is, solvable in pseudo-polynomial time.
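A minimal sketch of the O(nt) dynamic programme mentioned above, assuming the numbers are positive integers given as a Python list. The table `reachable` is a name chosen for this example.

```python
def subset_sum(numbers, t):
    """Decide whether some subset of `numbers` sums to exactly t, in O(n*t) time."""
    if t < 0:
        return False
    # reachable[s] is True iff some subset of the numbers seen so far sums to s.
    reachable = [True] + [False] * t
    for x in numbers:
        # Go downwards so that each number is used at most once.
        for s in range(t, x - 1, -1):
            if reachable[s - x]:
                reachable[s] = True
    return reachable[t]
```

The table has t + 1 entries, so the running time is polynomial in the value of t but exponential in the number of bits needed to write t in binary. That gap is exactly the difference between pseudo-polynomial and polynomial time.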

3-Partition
Given: set of positive integers S.
Question: can we partition S into sets of exactly 3 elements each, such that each has the same sum (t)?

3-Partition is Strong NP-hard. t must be the sum of S divided by the number of groups (|S|/3). Starting point for many reductions.

Remark

Easily made mistake: reductions from subset sum that create exponentially large instances.

Subgraph Isomorphism for maximum degree 2 graphs.
Given: Graphs G and H, such that each vertex in G and H has degree at most 2.
Question: Is G a subgraph of H?

NP-hardness proof can be done with 3-Partition.

Subgraph isomorphism for forests of paths

Input: Forest F = (V,E), forest H = (W,F) max degree 2.
Question: Is there an injection f: W → V, such that for all {v,w} in F, {f(v),f(w)} is in E, i.e., is H isomorphic to a subgraph of F?

Theorem: Subgraph isomorphism for forests with maximum degree two is NP-complete.

Proof

In NP.

NP-hard: transform from 3-partition. Suppose we have numbers a1, …, a3m with total sum mt. F is forest with m paths, each with t vertices. H is forest with 3m paths, with a1, …, a3m vertices.

Only correct because 3-Partition is strongly NP-Complete. NP-Complete when input is given in unary!

Similar proof from 2-Partition would be WRONG! 2-Partition: Subset sum with t = total sum / 2.
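The construction in this proof can be sketched as follows. The representation (a forest as a vertex count plus an edge list) and the function names are assumptions of this example. The sketch also relies on the usual formulation of 3-Partition in which each number lies strictly between t/4 and t/2, so that every group filling a path of t vertices has exactly three elements; the slide does not state that condition.

```python
def path_edges(start, num_vertices):
    """Edges of a path on vertices start, start + 1, ..., start + num_vertices - 1."""
    return [(v, v + 1) for v in range(start, start + num_vertices - 1)]


def three_partition_to_forests(numbers):
    """Return (F, H): F has m paths of t vertices, H has one path per number."""
    m, remainder = divmod(len(numbers), 3)
    total = sum(numbers)
    assert remainder == 0 and m > 0 and total % m == 0, "not a valid instance"
    t = total // m

    big_edges, nxt = [], 0
    for _ in range(m):
        big_edges += path_edges(nxt, t)
        nxt += t

    small_edges, nxt = [], 0
    for a in numbers:
        small_edges += path_edges(nxt, a)
        nxt += a

    return (m * t, big_edges), (total, small_edges)
```

Both forests have mt vertices, so the output is polynomial in the input size only when the numbers are written in unary. With binary numbers the same construction can blow up exponentially, which is the mistake the remark warns about.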

SOME DISCUSSION

How to prove NP-completeness

Is there an easy proof by restriction?

Is there a problem that “looks like our problem”? Try to do local replacement.

Try component design.

Good starting problems are: 3-Sat; special cases of 3-Sat; 3-Partition; subset sum; simple graph problems (like clique, Independent Set, Vertex Cover, Colouring, 3-Colouring)

… puzzle …

Discussion

Is P ≠ NP? (who thinks so?) www.claymath.org/prizeproblems/pvsnp.htm

One of the millennium problems.

Why so hard to prove? What to do with problems that are NP-hard? Other complexity notions…

[Figure: diagram of the class NP, containing P and the NP-complete problems.]





P vs NP is hard to prove

P = NP? Hard to design poly algorithm...

Current mathematical knowledge does not suffice to prove P ≠ NP:

“Natural Proofs” cannot separate P from NP (Razborov & Rudich, 1993).

P^A = NP^A, but P^B ≠ NP^B for some oracles A and B, so diagonalisation cannot separate P from NP (Baker, Gill, & Solovay, 1975).

NP-Completeness is not the end

Almost every interesting real-life problem is NP-Complete. Scheduling, planning, ...

NP-Completeness is a starting point to think further, not an excuse to call a problem impossible to solve.

Theoretical approaches: Fixed parameter tractability, exact exponential-time algorithms, special graph classes / restricted instances, ...

Practical approaches: Integer Linear Programming, Constraint Programming, Local search, Heuristics, ....

Often: finding subproblems that are not NP-Hard helps a lot.

SOME ANIMALS FROM THE COMPLEXITY ZOO


Many more complexity classes

In Theoretical Computer Science, a large number of other complexity classes have been defined.

Here, we give an informal introduction to a few of the more important ones.

There is much, much, much more…

coNP

Complement of a class: switch “yes” and “no”.

Polynomial time verifiable certificates of no instances.

coNP: complement of problems in NP, e.g.:

NOT-HAMILTONIAN
Given: Graph G.
Question: Does G NOT have a Hamiltonian circuit?

UNSATISFIABLE
Given: Boolean formula in CNF.
Question: Do all truth assignments to the variables make the formula false?

All problems in P are in coNP. NP-Complete problems are most probably not in coNP.

A more natural example for NP and coNP

Integer factorisation:
Given: integer n, integer m.
Question: does n have a prime factor less than m?

In NP - Yes-certificate: Two numbers i, j with ij = n, 1 < i < m.

In coNP - No-certificate: List of all prime factors of n. Verify that these are primes (prime testing is in P – 2002). Verify that when multiplied we have n, and that none of them is less than m.

Problem in NP and coNP. Also in P? Maybe not – for cryptography’s sake we hope not!
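The two certificate checks can be sketched as follows. The function names are illustrative, and `is_prime` uses trial division purely as a stand-in: it is not polynomial in the number of bits, whereas the argument on the slide relies on a polynomial time primality test.

```python
from math import prod


def is_prime(p):
    """Stand-in primality test by trial division (NOT polynomial in the bit length)."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True


def verify_yes(n, m, certificate):
    """Yes-certificate: (i, j) with i * j == n and 1 < i < m."""
    i, j = certificate
    return 1 < i < m and i * j == n


def verify_no(n, m, certificate):
    """No-certificate: the prime factors of n (with multiplicity), all at least m."""
    return (
        prod(certificate) == n
        and all(is_prime(p) for p in certificate)
        and all(p >= m for p in certificate)
    )
```

A divisor i with 1 < i < m need not be prime itself, but any prime factor of i is then a prime factor of n that is smaller than m, so the yes-certificate is sound.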

PSPACE

All decision problems solvable in polynomial space.

Unknown: is P=PSPACE?

Savitch, 1970: PSPACE = NPSPACE. NPSPACE: solvable with non-deterministic program in polynomial space.

PSPACE-complete, e.g., Generalized Tic-Tac-Toe, generalized Reversi, some puzzles that need an exponential number of moves, Quantified Boolean formulas (QBF).

[Formula: example of a quantified Boolean formula over the variables x1, …, x4.]

EXPTIME

Decision problems that can be solved in exponential time.

P ≠ EXPTIME (Stearns, Hartmanis, 1965).

EXPTIME complete problems: Generalized chess, generalized checkers, generalized go (Japanese drawing rule).

Given a Turing Machine M and integer k, does M halt within k steps?

And a few more

NEXPTIME: non-deterministic exponential time. EXPSPACE = NEXPSPACE: exponential space.

Graph Isomorphism

Discussed in another lecture. Given two graphs, are they isomorphic? In NP, not known to be NP-complete; not known to be in P. Several problems are equally hard: Isomorphism-complete.

NC

NC: “Nick’s Class”, after Nick Pippenger.

Talks about the time to solve a problem with a parallel machine.

Model: we have a polynomial number of processors, that use the same memory. Variants depending on what happens when processors try to read or write the same memory location simultaneously.

NC: decision problems that can be solved with a PRAM (Parallel Random Access Machine) with polynomial number of processors in polylogarithmic time, i.e. O( (log n)^d ) for some constant d.

Unknown: P=NC?

Counting: #P

#P (“Sharp-P”). Problems that output a number.

The precise definition will not be given here. Think as: “what is the number of certificates for this instance”, with polynomial checking of certificates.

#P-complete e.g.: Number of satisfying truth assignments of 3SAT-formula. Number of Hamiltonian circuits in a graph. Number of perfect matchings in a given graph.

LSPACE or L

Problems that can be solved with only logarithmic extra space:

You can read the input as often as you want.

You may use only O(log n) extra memory.

• E.g.: Θ(1) pointers to your input

NL: non-deterministic logspace...

More

Oracles. The Polynomial Time Hierarchy. UP: unique solutions.

Questions?
