Page 1: Parameterized Algorithms Randomized Techniques

University of Bergen

Parameterized Algorithms: Randomized Techniques

Bart M. P. Jansen

August 18th 2014, Będlewo


Randomized computation

• For some tasks, finding a randomized algorithm is much easier than finding a deterministic one

• We consider algorithms that have access to a stream of uniformly random bits
– So we do not consider randomly generated inputs

• The actions of the algorithm depend on the values of the random bits
– Different runs of the algorithm may give different outcomes for the same input


Monte Carlo algorithms

• A Monte Carlo algorithm with false negatives and success probability $p \in (0,1)$ is an algorithm for a decision problem that
– given a NO-instance, always returns NO, and
– given a YES-instance, returns YES with probability $\ge p$

• Since the algorithm is always correct on NO-instances, but may fail on YES-instances, it has one-sided error

• If $p$ is a positive constant, we can repeat the algorithm a constant number of times
– This ensures that the probability of failure is smaller than the probability that cosmic radiation causes a bit to flip in memory
– (Which would invalidate even a deterministic algorithm)

• If $p$ is not a constant, we can also boost the success probability


Independent repetitions increase success probability

• Suppose we have a Monte Carlo algorithm with one-sided error and success probability $p$, which may depend on $k$
– For example, $p = 1/f(k)$

• If we repeat the algorithm $t$ times, the probability that all runs fail is at most $(1-p)^t \le (e^{-p})^t = e^{-pt}$, since $1 + x \le e^x$

• Probability $\ge 1 - e^{-pt}$ that the repeated algorithm is correct
– Using $t = \lceil 1/p \rceil$ trials gives success probability $\ge 1 - 1/e$
– For example, $t = f(k)$
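
The repetition argument can be sketched in Python. This is only an illustration under assumptions: `run_once` stands for a hypothetical Monte Carlo test with false negatives and per-run success probability at least `p`, and the helper names are not from the slides.

```python
import math


def repetitions_needed(p: float) -> int:
    """Number of independent runs t = ceil(1/p) used in the repetition lemma."""
    return math.ceil(1.0 / p)


def failure_bound(p: float, t: int) -> float:
    """Upper bound e^{-pt} on the probability that all t runs fail."""
    return math.exp(-p * t)


def amplify(run_once, instance, p: float) -> bool:
    """Answer YES as soon as one run says YES (hypothetical run_once callback).

    Because each run never says YES on a NO-instance, the repeated
    algorithm keeps the one-sided error.
    """
    return any(run_once(instance) for _ in range(repetitions_needed(p)))
```

With `p = 0.25`, four runs are used and the bound on the failure probability is $e^{-1}$.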


This lecture

• The LONGEST PATH problem

Color Coding

• The SUBGRAPH ISOMORPHISM problem

Random Separation

• $d$-CLUSTERING

Chromatic Coding

Derandomization

Monte Carlo algorithm for FEEDBACK VERTEX SET


COLOR CODING


Color coding

• Randomly assign colors to the input structure

• If there is a solution and we are lucky with the coloring, every element of the solution has received a different color

• Then find an algorithm to detect such colorful solutions
– Solutions of elements with pairwise different colors


The odds of getting lucky

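• Lemma.
– Let $U$ be a set of size $n$, and let $X \subseteq U$ have size $k$
– Let $\chi\colon U \to [k]$ be a coloring of the elements of $U$, chosen uniformly at random
  • Each element of $U$ is colored with one of $k$ colors, uniformly and independently at random
The probability that the elements of $X$ are colored with pairwise distinct colors is at least $e^{-k}$

• Proof.
– There are $k^n$ possible colorings
– In $k! \cdot k^{n-k}$ of them, all colors on $X$ are distinct
– Hence the probability is $\frac{k! \cdot k^{n-k}}{k^n} = \frac{k!}{k^k} > \frac{k^k}{e^k \cdot k^k} = e^{-k}$
– We used $k! > (k/e)^k$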
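
The counting argument in the lemma can be checked numerically for small values. The sketch below is an illustration only; the function names are assumptions, and the brute-force routine simply takes $X$ to be the first $k$ elements of an $n$-element universe.

```python
import math
from itertools import product


def colorful_probability(k: int) -> float:
    """Exact probability k!/k^k that k fixed elements get pairwise distinct colors."""
    return math.factorial(k) / k**k


def colorful_probability_bruteforce(n: int, k: int) -> float:
    """Enumerate all k^n colorings of {0..n-1}; X is assumed to be {0..k-1}."""
    good = sum(
        1
        for chi in product(range(k), repeat=n)
        if len(set(chi[:k])) == k  # all colors on X are distinct
    )
    return good / k**n
```

For every small `k`, `colorful_probability(k)` can be compared against `math.exp(-k)` to see the bound $k!/k^k > e^{-k}$ in action.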


The LONGEST PATH problem

Input: An undirected graph $G$ and an integer $k$
Parameter: $k$
Question: Is there a simple path on $k$ vertices in $G$?

• A solution is a $k$-path

• The LONGEST PATH problem is a restricted version of the problem of finding patterns in graphs


Color coding for LONGEST PATH

• Color the vertices of $G$ randomly with $k$ colors

• We want to detect a colorful $k$-path if one exists
– Use dynamic programming over subsets

• For every subset of colors $S \subseteq [k]$ and vertex $v \in V(G)$, define $T[S, v]$ = TRUE iff there is a colorful path whose colors are $S$ and that has $v$ as an endpoint


The dynamic programming table

• For every subset of colors $S \subseteq [k]$ and vertex $v \in V(G)$, define $T[S, v]$ = TRUE iff there is a colorful path whose colors are $S$ and that has $v$ as an endpoint

• Colorful $k$-path exists if $T[[k], v]$ = TRUE for some $v \in V(G)$


A recurrence to fill the table

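• If $S = \{c\}$ is a singleton set, containing some color $c$: $T[S, v]$ = TRUE if and only if $\chi(v) = c$

• If $|S| > 1$:
$T[S, v] = \bigvee_{u \in N(v)} T[S \setminus \{\chi(v)\}, u]$ if $\chi(v) \in S$,
$T[S, v]$ = FALSE otherwise

• Fill the table in time $2^k \cdot n^{O(1)}$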


Randomized algorithm for LONGEST PATH

• Algorithm LongPath(Graph $G$, integer $k$)
– repeat $e^k$ times:
  • Color the vertices of $G$ uniformly at random with $k$ colors
  • Fill the DP table $T$
  • if $\exists v \in V(G)$ such that $T[[k], v]$ = TRUE then return YES
– return NO

• By standard DP techniques we can construct the path as well
– For each cell, store a backlink to the earlier cell that determined its value
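
The color coding algorithm for LONGEST PATH can be sketched in Python as follows. This is a minimal illustration under assumptions, not the lecturer's implementation: the graph is given as an adjacency list `adj` over vertices `0..n-1`, color sets are encoded as bitmasks, and the names `has_colorful_path` and `long_path` are hypothetical.

```python
import math
import random


def has_colorful_path(adj, color, k):
    """DP over color subsets.

    table[S][v] is True iff some colorful path uses exactly the colors in
    bitmask S and has v as an endpoint.
    """
    n = len(adj)
    full = (1 << k) - 1
    table = [[False] * n for _ in range(1 << k)]
    for v in range(n):
        table[1 << color[v]][v] = True  # singleton color sets
    for S in range(1, 1 << k):  # S without one bit is numerically smaller
        for v in range(n):
            bit = 1 << color[v]
            if not (S & bit) or S == bit:
                continue  # FALSE by definition, or base case already set
            rest = S ^ bit
            table[S][v] = any(table[rest][u] for u in adj[v])
    return any(table[full][v] for v in range(n))


def long_path(adj, k, rng=random):
    """Monte Carlo test: may miss a k-path, never reports one that is absent."""
    for _ in range(math.ceil(math.e ** k)):
        color = [rng.randrange(k) for _ in adj]
        if has_colorful_path(adj, color, k):
            return True
    return False
```

Because the colors along a colorful path are pairwise distinct, the path found by the DP is automatically simple, which is why the table only needs to remember color sets.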


Analysis for the Longest Path algorithm

• Running time is $e^k \cdot 2^k \cdot n^{O(1)} = (2e)^k \cdot n^{O(1)}$
– By the get-lucky lemma, if there is a $k$-path, it becomes colorful with probability $p \ge e^{-k}$
– If the coloring produces a colorful $k$-path, the DP finds it
– By the independent repetition lemma, $e^k$ repetitions give constant success probability

Theorem. There is a Monte Carlo algorithm for LONGEST PATH with one-sided error that runs in time $(2e)^k \cdot n^{O(1)}$ and has constant success probability


Discussion of color coding

• When doing dynamic programming, color coding effectively allows us to reduce the number of states from
– keeping track of all vertices visited by the path, about $n^k$ possible sets, to
– keeping track of all colors visited by the path, $2^k$ possible sets

• The technique extends to finding size-$k$ occurrences of other “thin” patterns in graphs
– A size-$k$ pattern graph of treewidth $t$ can be found in time $2^{O(k)} \cdot n^{O(t)}$, with constant probability


RANDOM SEPARATION


The SUBGRAPH ISOMORPHISM problem

Input: A host graph $G$ and pattern graph $H$ (undirected)
Parameter: $k := |V(H)|$
Question: Does $G$ have a subgraph isomorphic to $H$?

Does $G$ contain $H$?


Background

• The traditional color coding technique gives FPT algorithms for LONGEST PATH
– Even for SUBGRAPH ISOMORPHISM when the pattern graph has constant treewidth

• If the pattern graph is unrestricted, we expect that no FPT algorithm exists for SUBGRAPH ISOMORPHISM
– It generalizes the $k$-CLIQUE problem
– Canonical W[1]-complete problem used to establish parameterized intractability (more later)

• If the host graph $G$ (and therefore the pattern graph $H$) has constant degree, there is a nice randomized FPT algorithm


Random 2-coloring of host graphs

• Suppose $G$ is a host graph that contains a subgraph $\hat{H}$ isomorphic to a connected $k$-vertex pattern graph $H$
– Color the edges of $G$ uniformly and independently at random with colors red ($R$) and blue ($B$)
– If all edges of $\hat{H}$ are colored red, and all other edges incident to $V(\hat{H})$ are colored blue, it is easy to identify $\hat{H}$
  • The pattern occurs as a connected component of $G[R]$
  • Isomorphism of two $k$-vertex graphs can be tested in time $k! \cdot k^{O(1)}$


Probability of isolating the pattern subgraph

• Let $\hat{H}$ be a $k$-vertex subgraph of graph $G$

• A 2-coloring of $E(G)$ isolates $\hat{H}$ if the following holds:
– All edges of $\hat{H}$ are red
– All other edges incident to $V(\hat{H})$ are blue

• Observation. If the maximum degree of $G$ is $d$, the probability that a random 2-coloring of $E(G)$ isolates a fixed $k$-vertex subgraph $\hat{H}$ is at least $2^{-dk}$
– There are at most $d \cdot k$ edges incident on $V(\hat{H})$
– Each such edge is colored correctly with probability $1/2$


Randomized algorithm for SUBGRAPH ISOMORPHISM

• Algorithm SubIso(Host graph $G$, connected pattern graph $H$)
– Let $d$ be the maximum degree of $G$
– Let $k$ be the number of vertices in $H$
– repeat $2^{dk}$ times:
  • Color the edges of $G$ uniformly at random with colors R, B
  • for each $k$-vertex connected component $C$ of $G[R]$:
    – if $C$ is isomorphic to $H$, then return YES
– return NO

• Easy to extend the algorithm to disconnected patterns
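
A small Python sketch of random separation for a connected pattern is given below. It is an illustration under assumptions: graphs are simple, the host has vertices `0..n-1`, the pattern has vertices `0..k-1`, the number of `trials` is passed in by the caller (the slide uses $2^{dk}$), and the brute-force $k!$ isomorphism test and all function names are hypothetical choices.

```python
import random
from itertools import permutations


def components(n, edges):
    """Connected components (as vertex lists) of the graph on 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, comps = set(), []
    for s in range(n):
        if s in seen:
            continue
        seen.add(s)
        comp, stack = [s], [s]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    comp.append(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def isomorphic(comp, comp_edges, k, pattern_edges):
    """Brute-force test over all k! bijections from comp to 0..k-1."""
    target = {frozenset(e) for e in pattern_edges}
    if len(comp_edges) != len(target):
        return False
    for perm in permutations(range(k)):
        relabel = dict(zip(comp, perm))
        if {frozenset((relabel[u], relabel[v])) for u, v in comp_edges} == target:
            return True
    return False


def sub_iso(n, host_edges, k, pattern_edges, trials, rng=random):
    """Monte Carlo test: each edge is red with probability 1/2, blue otherwise."""
    for _ in range(trials):
        red = [e for e in host_edges if rng.random() < 0.5]
        for comp in components(n, red):
            if len(comp) != k:
                continue
            inside = set(comp)
            comp_edges = [e for e in red if e[0] in inside]
            if isomorphic(comp, comp_edges, k, pattern_edges):
                return True
    return False
```

A YES answer is always correct, since a red component isomorphic to the pattern is itself a subgraph of the host.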


Theorem. There is a Monte Carlo algorithm for SUBGRAPH ISOMORPHISM with one-sided error and constant success probability. For $k$-vertex pattern graphs in a host graph of maximum degree $d$, the running time is $2^{dk} \cdot k! \cdot n^{O(1)}$


CHROMATIC CODING


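The $d$-CLUSTERING problem

Input: A graph $G$ and an integer $k$
Parameter: $k$
Question: Is there a set $A \subseteq \binom{V(G)}{2}$ of at most $k$ adjacencies such that $G \oplus A$ (the graph $G$ with the adjacencies in $A$ flipped) consists of $d$ disjoint cliques?

• Such a graph is called a $d$-cluster graph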


How to color

• $d$-CLUSTERING looks for a set of (non-)edges, instead of vertices

• We solve the problem on general graphs

• By randomly coloring the input, we again hope to highlight a solution with good probability, making it easier to find
– We color vertices of the graph


Proper colorings

• A set of adjacencies $A \subseteq \binom{V(G)}{2}$ is properly colored by a coloring $\chi$ of the vertices if:
– For all pairs $uv \in A$, the colors of $u$ and $v$ are different

• As before, two crucial ingredients:
1. What is the probability that a random coloring has the desired property?
2. How to exploit that property algorithmically?

• We assign colors to the vertices and hope to obtain a property for the (non-)edges in a solution
– This allows us to save on colors


Probability of finding a proper coloring

• Lemma. If the vertices of a simple graph $G$ with $k$ edges are colored independently and uniformly at random with $\lceil \sqrt{8k} \rceil$ colors, then the probability that $E(G)$ is properly colored is at least $2^{-\sqrt{k/2}}$

• Corollary. If a $d$-CLUSTERING instance $(G, k)$ has a solution set $A$ of $k$ adjacencies, the probability that $A$ is properly colored by a random coloring with $\lceil \sqrt{8k} \rceil$ colors is at least $2^{-\sqrt{k/2}}$

For constant success probability, $2^{O(\sqrt{k})}$ repetitions suffice


Detecting a properly colored solution (I)

• Suppose $\chi\colon V(G) \to [q]$ properly colors a solution $A$ of $(G, k)$
– The graph $G \oplus A$ is a $d$-cluster graph

• For $i \in [q]$, let $V_i$ be the vertices colored $i$
– As $A$ is properly colored, no (non-)edge of $A$ has both ends in $V_i$
– No changes are made to $G[V_i]$ by the solution
  • $G[V_i]$ is an induced subgraph of a $d$-cluster graph
  • For all $i \in [q]$, the graph $G[V_i]$ is a $(\le d)$-cluster graph
  • $G[V_i]$ consists of $\le d$ cliques that are not broken by the solution

Observation. The $q$-coloring partitions $V(G)$ into $\le dq$ cliques that are unbroken by the solution
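
The observation suggests a simple preprocessing step, sketched here in Python as an illustration only. The function names are assumptions, the graph is simple with vertices `0..n-1`, and the sketch does not check the bound on the number of cliques per color class. If some color class does not induce a disjoint union of cliques, no solution can be properly colored by that coloring, so the routine returns `None`.

```python
def properly_colored(adjacencies, color):
    """True iff every pair in the adjacency set gets two different colors."""
    return all(color[u] != color[v] for u, v in adjacencies)


def color_class_cliques(n, edges, color):
    """Partition the vertices into the cliques induced inside each color class.

    Returns a list of sorted vertex lists, or None if some color class is
    not a disjoint union of cliques.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    assigned = [False] * n
    cliques = []
    for v in range(n):
        if assigned[v]:
            continue
        # Candidate clique: v together with its neighbours of the same color.
        block = {v} | {u for u in adj[v] if color[u] == color[v]}
        for u in block:
            same = {u} | {w for w in adj[u] if color[w] == color[u]}
            if same != block:
                return None  # this color class is not a cluster graph
        for u in block:
            assigned[u] = True
        cliques.append(sorted(block))
    return cliques
```

The returned cliques are the units that the next step assigns to final clusters.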


Detecting a properly colored solution (II)

• For each of the $\le dq$ cliques into which $V(G)$ is partitioned, guess into which of the $d$ final clusters it belongs

• For each guess, compute the cost of this solution
– Count edges between subcliques in different clusters
– Count non-edges between subcliques in the same cluster

• Total of $d^{dq}$ guesses, polynomial cost computation for each
– Running time is $d^{dq} \cdot n^{O(1)}$ to detect a properly colored solution, if one exists

• Using dynamic programming (exercise), this can be improved to time


Randomized algorithm for $d$-CLUSTERING

• Algorithm $d$-Cluster(graph $G$, integer $k$)
– Define $q := \lceil \sqrt{8k} \rceil$
– repeat $\lceil 2^{\sqrt{k/2}} \rceil$ times:
  • Color the vertices of $G$ uniformly at random with $q$ colors
  • if there is a properly colored solution of size $\le k$ then return YES
– return NO


Theorem. There is a Monte Carlo algorithm for $d$-CLUSTERING with one-sided error and constant success probability that runs in time


DERANDOMIZATION


Why derandomize?

• Truly random bits are very hard to come by
– Usual approach is to track radioactive decay

• Standard pseudo-random generators might work
– When spending exponential time on an answer, we do not want to get it wrong

• Luckily, we can replace most applications of randomization by deterministic constructions
– Without significant increases in the running time


How to derandomize

• Different applications require different pseudorandom objects

• Main idea: instead of picking a random coloring $\chi\colon [n] \to [k]$, construct a family $\mathcal{F}$ of functions $f\colon [n] \to [k]$
– Ensure that at least one function in $\mathcal{F}$ has the property that we hope to achieve by the random choice

• Instead of independent repetitions of the Monte Carlo algorithm, run it once for every coloring in $\mathcal{F}$

• If the success probability of the random coloring is , we can often construct such a family of size


Splitting evenly

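• Consider an $\ell$-coloring $f\colon U \to [\ell]$ of a universe $U$

• A subset $S \subseteq U$ is split evenly by $f$ if the following holds:
– For every $j, j' \in [\ell]$ the sizes $|f^{-1}(j) \cap S|$ and $|f^{-1}(j') \cap S|$ differ by at most one
– All colors occur almost equally often within $S$

• If a set $S$ of size $|S| \le \ell$ is split evenly, then $S$ is colorful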


Splitters

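• For $n, k, \ell \ge 1$, an $(n, k, \ell)$-splitter $\mathcal{F}$ is a family of functions from $[n]$ to $[\ell]$ such that:
– For every set $S \subseteq [n]$ of size $k$, there is a function $f \in \mathcal{F}$ that splits $S$ evenly

Theorem. For any $n, k \ge 1$ one can construct an $(n, k, k^2)$-splitter of size $k^{O(1)} \log n$ in time $k^{O(1)} n \log n$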
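
The definitions of "split evenly" and "splitter" can be made concrete with a brute-force checker. The Python sketch below is illustrative only: it uses 0-based indices, represents a function as a list `f` with `f[x]` in `range(ell)`, and the names are assumptions. It verifies the property by enumeration; it does not construct small splitters as in the theorem.

```python
from collections import Counter
from itertools import combinations


def splits_evenly(f, S, ell):
    """True iff the color-class sizes of S under f differ by at most one."""
    counts = Counter(f[x] for x in S)
    sizes = [counts.get(j, 0) for j in range(ell)]
    return max(sizes) - min(sizes) <= 1


def is_splitter(family, n, k, ell):
    """Check the splitter property for every k-subset of {0..n-1}."""
    return all(
        any(splits_evenly(f, S, ell) for f in family)
        for S in combinations(range(n), k)
    )
```

When `ell == k`, a set of size `k` that is split evenly receives each color exactly once, which is the colorful case used for LONGEST PATH.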


Perfect hash families derandomize LONGEST PATH

• The special case of an $(n, k, k)$-splitter is called an $(n, k)$-perfect hash family

• Instead of trying random colorings in the LONGEST PATH algorithm, try all colorings in a perfect hash family
– If $X \subseteq V(G)$ is the vertex set of a $k$-path, then $|X| = k$, so some function splits $X$ evenly
– Since $|X| = k$ equals the number of colors, this causes $X$ to be colorful
  • The DP then finds a colorful path

Theorem. For any $n, k \ge 1$ one can construct an $(n, k)$-perfect hash family of size $e^k k^{O(\log k)} \log n$ in time $e^k k^{O(\log k)} n \log n$


Universal sets

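• For $n, k \ge 1$, an $(n, k)$-universal set is a family $\mathcal{U}$ of subsets of $[n]$ such that for any $S \subseteq [n]$ of size $k$, all $2^k$ subsets of $S$ are contained in the family $\{A \cap S : A \in \mathcal{U}\}$

• Universal sets can be used to derandomize the random separation algorithm for SUBGRAPH ISOMORPHISM (exercise)

Theorem. For any $n, k \ge 1$ one can construct an $(n, k)$-universal set of size $2^k k^{O(\log k)} \log n$ in time $2^k k^{O(\log k)} n \log n$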
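
The universal-set property can likewise be checked by brute force on tiny instances. This Python sketch is an illustration with assumed names and 0-based elements; it only verifies the definition and does not implement the construction from the theorem.

```python
from itertools import combinations


def is_universal(family, n, k):
    """True iff every k-subset S of {0..n-1} has all 2^k traces A ∩ S."""
    for S in combinations(range(n), k):
        s = set(S)
        traces = {frozenset(A & s) for A in family}
        if len(traces) < 2 ** k:
            return False
    return True
```

For example, the family containing only the empty set and `{0, 1}` is universal for `k = 1` on two elements, but not for `k = 2`.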


Coloring families

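• For $n, k, q \ge 1$, an $(n, k, q)$-coloring family is a family $\mathcal{F}$ of functions from $[n]$ to $[q]$ with the following property:
– For every graph $G$ on the vertex set $[n]$ with at most $k$ edges, there is a function $f \in \mathcal{F}$ that properly colors $E(G)$

• Coloring families can be used to derandomize the chromatic coding algorithm for $d$-CLUSTERING
– Instead of trying random colorings, try all colorings in an $(n, k, 2\lceil \sqrt{k} \rceil)$-coloring family

Theorem. For any $n, k \ge 1$ one can construct an $(n, k, 2\lceil \sqrt{k} \rceil)$-coloring family of size $2^{O(\sqrt{k} \log k)} \log n$ in time $2^{O(\sqrt{k} \log k)} n \log n$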


A RANDOMIZED ALGORITHM FOR FEEDBACK VERTEX SET


The FEEDBACK VERTEX SET problem

Input: A graph $G$ and an integer $k$
Parameter: $k$
Question: Is there a set $S$ of at most $k$ vertices in $G$, such that each cycle contains a vertex of $S$?


Reduction rules for FEEDBACK VERTEX SET

(R1) If there is a loop at vertex $v$, then delete $v$ and decrease $k$ by one

(R2) If there is an edge of multiplicity larger than $2$, then reduce its multiplicity to $2$

(R3) If there is a vertex $v$ of degree at most $1$, then delete $v$

(R4) If there is a vertex $v$ of degree two, then delete $v$ and add an edge between $v$’s neighbors

If (R1-R4) cannot be applied anymore, then the minimum degree is at least $3$

Observation. If $(G, k)$ is transformed into $(G', k')$, then:
1. FVS of size $\le k$ in $G$ $\Leftrightarrow$ FVS of size $\le k'$ in $G'$
2. Any feedback vertex set in $G'$ is a feedback vertex set in $G$ when combined with the vertices deleted by (R1)


How randomization helps

• We have seen a deterministic algorithm with runtime

• There is a simple randomized Monte Carlo algorithm

• In polynomial time, we can find a size-$k$ solution with probability at least $4^{-k}$, if one exists
– Repeating this $4^k$ times gives an algorithm with running time $4^k \cdot n^{O(1)}$ and constant success probability

• Key insight is a simple procedure to select a vertex that is contained in a solution with constant probability


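Feedback vertex sets in graphs of minimum degree at least 3

• Lemma. Let $G$ be an $n$-vertex multigraph with minimum degree at least 3. For every feedback vertex set $X$ of $G$, more than half the edges of $G$ have at least one endpoint in $X$.

• Proof. Consider the forest $H := G - X$
– We prove that $|E(G) \setminus E(H)| > |E(H)|$
– $|V(H)| > |E(H)|$ for any forest $H$
  • It suffices to prove $|E(G) \setminus E(H)| > |V(H)|$
– Let $J$ be the edges with one end in $X$ and the other in $V(H)$
– Let $V_{\le 1}$, $V_2$ and $V_{\ge 3}$ be the vertices of $H$ with $H$-degrees $\le 1$, $2$, and $\ge 3$
  • Every vertex of $V_{\le 1}$ contributes $\ge 2$ to $|J|$
  • Every vertex of $V_2$ contributes $\ge 1$ to $|J|$
– $|V_{\ge 3}| < |V_{\le 1}|$ in any forest
– Hence $|E(G) \setminus E(H)| \ge |J| \ge 2|V_{\le 1}| + |V_2| > |V_{\le 1}| + |V_2| + |V_{\ge 3}| = |V(H)|$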


Monte Carlo algorithm for FEEDBACK VERTEX SET

Theorem. There is a randomized polynomial-time algorithm that, given a FEEDBACK VERTEX SET instance $(G, k)$,
• either reports a failure, or
• finds a feedback vertex set in $G$ of size at most $k$.
If $G$ has an FVS of size $\le k$, it returns a solution with probability at least $4^{-k}$


Monte Carlo algorithm for FEEDBACK VERTEX SET

• Algorithm FVS(Graph $G$, integer $k$)
– Exhaustively apply (R1)-(R4) to obtain $(G', k')$
  • Let $X_0$ be the vertices with loops removed by (R1)
– if $k' < 0$ then FAILURE
– if $G'$ is a forest then return $X_0$
– Uniformly at random, pick an edge $e$ of $G'$
– Uniformly at random, pick an endpoint $v$ of $e$
– return $X_0 \cup \{v\} \cup$ FVS($G' - v$, $k' - 1$)
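
One run of this algorithm, including the reduction rules (R1)-(R4), can be sketched in Python. This is a minimal illustration under assumptions: the multigraph is a set of integer vertices plus a list of edge pairs (repeated pairs are parallel edges, `(v, v)` is a loop), `None` stands for FAILURE, and the function names are hypothetical. After exhaustive reduction the minimum degree is at least 3, so the reduced graph is a forest exactly when it has no edges left.

```python
import random
from collections import Counter


def reduce_instance(vertices, edges, k):
    """Exhaustively apply (R1)-(R4); returns (vertices, edges, k, loop_vertices)."""
    vertices, edges, removed = set(vertices), list(edges), set()
    changed = True
    while changed:
        changed = False
        loops = {u for u, v in edges if u == v}
        if loops:  # (R1) delete loop vertices, pay one unit of budget each
            removed |= loops
            vertices -= loops
            k -= len(loops)
            edges = [(u, v) for u, v in edges if u not in loops and v not in loops]
            changed = True
            continue
        count = Counter(tuple(sorted(e)) for e in edges)
        capped = [e for e, c in count.items() for _ in range(min(c, 2))]
        if len(capped) != len(edges):  # (R2) cap edge multiplicity at 2
            edges, changed = capped, True
            continue
        deg = Counter()
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        for x in list(vertices):
            if deg[x] <= 1:  # (R3) vertex of degree at most 1
                vertices.discard(x)
                edges = [e for e in edges if x not in e]
                changed = True
                break
            if deg[x] == 2:  # (R4) bypass a vertex of degree two
                ends = [u if v == x else v for u, v in edges if x in (u, v)]
                vertices.discard(x)
                edges = [e for e in edges if x not in e] + [(ends[0], ends[1])]
                changed = True
                break
    return vertices, edges, k, removed


def fvs(vertices, edges, k, rng=random):
    """One run: a feedback vertex set of size <= k, or None for FAILURE."""
    vertices, edges, k, x0 = reduce_instance(vertices, edges, k)
    if k < 0:
        return None
    if not edges:  # reduced graph is empty, hence a forest
        return x0
    v = rng.choice(rng.choice(edges))  # random edge, then random endpoint
    rest = fvs(vertices - {v}, [e for e in edges if v not in e], k - 1, rng)
    return None if rest is None else x0 | {v} | rest
```

A single run succeeds only with small probability on hard instances, so in practice it would be wrapped in the independent repetition scheme from the beginning of the lecture.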


Correctness (I)

• The algorithm outputs a feedback vertex set or FAILURE

• Claim: If $G$ has a size-$k$ FVS, then the algorithm finds a solution with probability at least $4^{-k}$

• Proof by induction on $k$
– Assume $G$ has a size-$k$ feedback vertex set $X$
– By safety of (R1)-(R4), $G'$ has a size-$k'$ FVS $X'$
  • We have $k' = k - |X_0|$
  • Since loops are in any FVS, we have $X_0 \subseteq X$
– If $k' = 0$, then $|X'| = 0$, so $G'$ is a forest
  • Algorithm outputs $X_0$, which is a valid solution
– If $k' \ge 1$, we will use the induction hypothesis


Correctness (II)

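• Case $k' \ge 1$:
– Probability that random $e$ has an endpoint in $X'$ is $\ge 1/2$
– Probability that $v \in X'$ is $\ge 1/4$
– If $v \in X'$, then $G' - v$ has an FVS of size $k' - 1$
  • Then, by induction, with probability $\ge 4^{-(k'-1)}$ recursion gives a size-$(k'-1)$ FVS $X^*$ of $G' - v$
  • So $X^* \cup \{v\}$ is a size-$k'$ FVS of $G'$
  • By reduction rules, output $X_0 \cup \{v\} \cup X^*$ is an FVS of $G$
– Size is at most $k$
– Probability of success is $\ge \frac{1}{4} \cdot 4^{-(k'-1)} = 4^{-k'} \ge 4^{-k}$

Theorem. There is a Monte Carlo algorithm for FEEDBACK VERTEX SET with one-sided error and constant success probability that runs in time $4^k \cdot n^{O(1)}$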


Discussion

• This simple, randomized algorithm is faster than the deterministic algorithm from the previous lecture

• The method generalizes to $\mathcal{F}$-MINOR-FREE DELETION problems: delete $k$ vertices from the graph to ensure the resulting graph contains no member from the fixed set $\mathcal{F}$ as a minor
– FEEDBACK VERTEX SET is $\{K_3\}$-MINOR-FREE DELETION


Exercises

• 5.1, 5.2, 5.8, 5.15, 5.19

Color coding


Summary

• Several variations of color coding give efficient FPT algorithms

• The general recipe is as follows:
– Randomly color the input, such that if a solution exists, one is highlighted with probability $1/f(k)$
– Show that a highlighted solution can be found in a colored instance in time $g(k) \cdot n^{O(1)}$

• For most problems we obtained single-exponential algorithms
– For $d$-CLUSTERING we obtained a subexponential algorithm


The end

