
Foundations of Computer Science
Release 0.3

Amir Kamil

Nov 22, 2021


CONTENTS

I Algorithms

1 Introduction
  1.1 Text Objectives
  1.2 Tools for Abstraction
  1.3 The First Algorithm

2 The Potential Method
  2.1 A Potential Function for Euclid's Algorithm

3 Divide and Conquer
  3.1 The Master Theorem
  3.2 Integer Multiplication
  3.3 The Closest-Pair Problem

4 Dynamic Programming
  4.1 Implementation Strategies
  4.2 Longest Common Subsequence
  4.3 Longest Increasing Subsequence
  4.4 All-Pairs Shortest Path

5 Greedy Algorithms

II Computability

6 Introduction to Computability
  6.1 Formal Languages
  6.2 Overview of Automata

7 Finite Automata
  7.1 Formal Definition

8 Turing Machines
  8.1 Equivalent Models
  8.2 The Language of a Turing Machine
  8.3 Proving a Language is Decidable

9 Diagonalization
  9.1 Countable Sets
  9.2 Uncountable Sets
  9.3 The Existence of an Undecidable Language

10 The Halting Problem

11 Reducibility
  11.1 Wang Tiling

12 Recognizability
  12.1 Dovetailing

13 Computable Functions and Kolmogorov Complexity

14 Rice's Theorem
  14.1 Applying Rice's Theorem
  14.2 Rice's Theorem and Program Analysis

III Complexity

15 Introduction to Complexity
  15.1 Efficiently Verifiable Languages
  15.2 P Versus NP

16 The Cook-Levin Theorem
  16.1 Starting Configuration
  16.2 Cell Consistency
  16.3 Accepting Configuration
  16.4 Transitions
  16.5 Conclusion

17 NP-Completeness
  17.1 3SAT
  17.2 Vertex Cover
  17.3 Set Cover
  17.4 Hamiltonian Cycle
  17.5 Subset Sum
  17.6 Concluding Remarks

18 Search and Approximation
  18.1 Approximation for Minimal Vertex Cover
  18.2 Maximal Cut
  18.3 Knapsack
  18.4 Other Approaches to NP-Hard Problems

IV Randomness

19 Randomized Algorithms
  19.1 Review of Probability
  19.2 Randomized Approximations
  19.3 Primality Testing
  19.4 Skip Lists
  19.5 Quick Sort

20 Probabilistic Complexity Classes

21 Monte Carlo Methods and Concentration Bounds
  21.1 Chernoff Bounds
  21.2 Hoeffding's Inequality
  21.3 Polling
  21.4 Amplification for Two-Sided-Error Algorithms
  21.5 Load Balancing

V Cryptography

22 Introduction to Cryptography
  22.1 Review of Modular Arithmetic
  22.2 One-time Pad

23 Diffie-Hellman Key Exchange

24 RSA
  24.1 RSA Signatures
  24.2 Quantum Computers and Cryptography

VI Appendix

25 Appendix
  25.1 Proof of the Simplified Chernoff Bounds
  25.2 Proof of the Upper-Tail Hoeffding's Inequality
  25.3 General Case of Hoeffding's Inequality

VII About

26 About

Index


Part I

Algorithms


CHAPTER ONE

INTRODUCTION

Every complex problem has a clear, simple and wrong solution. — H. L. Mencken

Welcome to Foundations of Computer Science! This text covers foundational aspects of Computer Science that will help you reason about any computing task. In particular, we are concerned with the following with respect to problem solving:

• What are common, effective approaches to designing an algorithm?

• Given an algorithm, how do we reason about whether it is correct and how efficient it is?

  • Are there limits to what problems we can solve with computers, and how do we identify whether a particular problem is solvable?

• What problems are efficiently solvable, and how do we determine whether a particular problem is?

• What techniques can we use to approximate solutions to problems that we don’t know how to solve efficiently?

• Can randomness help us in solving problems?

• How can we exploit difficult problems to build algorithms for cryptography and security?

The only way to answer these questions is to construct a model and apply a proof-based (i.e. math-like) methodology to the questions. Thus, this text will feel much like a math text, but we apply the techniques directly to widely applicable problems in Computer Science.

As an example, how can we demonstrate that there is no general algorithm for determining whether or not two programs have the same functionality? A simple but incorrect approach would be to try all possible algorithms to see if they work. However, there are infinitely many possible algorithms, so we have no hope of this approach working. Instead, we need to construct a model that captures the notion of what is computable by any algorithm and use that to demonstrate that no such algorithm exists.

1.1 Text Objectives

The main purpose of this text is to give you the tools to approach problems you've never seen before. Rather than being given a particular algorithm or data structure to implement, approaching a new problem requires reasoning about whether the problem is solvable, how to relate it to problems that you've seen before, what algorithmic techniques are applicable, whether the algorithm you come up with is correct, and how efficient the resulting algorithm is. These are all steps that must be taken prior to actually writing code to implement a solution, and these steps are independent of your choice of programming language or framework.

Thus, in a sense, the fact that you will not have to write code (though you are free to do so if you like) is a feature, not a bug. We focus on the prerequisite algorithmic reasoning required before writing any code, and this reasoning is independent of the implementation details. Were we to have you implement all the algorithms you design in this text, the workload would be far greater, and it would only replicate the coding practice you get in your programming courses. Instead, we focus on the aspects of problem solving that you have not had much experience in thus far.


The training we give you in this text will make you a better programmer, as algorithmic design is crucial to effective programming. This text also provides a solid framework for further exploration of theoretical Computer Science should you wish to pursue that path. However, you will find the material here useful regardless of which subfields of Computer Science you decide to study.

As an example, suppose your boss tells you that you need to make a business trip to visit several cities, and you must minimize the cost of visiting all those cities. This is an example of the classic traveling salesperson problem¹, and you may have heard that it is an intractable problem. What exactly does that mean? Does it mean that it cannot be solved at all? What if we change the problem so that you don't have to minimize the cost, but instead must fit the cost within a given budget (say $2000)? Does this make the problem any easier? Or if we require that the total cost be within a factor of two of the optimal rather than actually being minimized?

Before we can reason about the total cost of a trip, we need to know how much it costs to travel between two consecutive destinations in the trip. Suppose we are traveling by air. There may be no direct flight between those two cities, or it might be very expensive. To help us keep our budget down, we need to figure out what the cheapest itinerary between those two cities is, considering intermediate layover stops. And since we don't know a priori in what order we will visit all the cities, we need to know the cheapest itineraries between all pairs of cities. This is an instance of the all-pairs shortest path problem². Is this a "solvable" problem, and if so, what algorithmic techniques can we use to construct a solution?

We will consider both of the problems above in this text. We will learn what it means for a problem to be solvable or tractable, how to show that a particular problem is or is not solvable, and techniques for designing and analyzing algorithms.

1.2 Tools for Abstraction

Abstraction is a core principle in Computer Science, allowing us to reason about and use complex systems without paying attention to implementation details. As mentioned earlier, the focus of this text is on reasoning about problems and algorithms independently of implementation. To do so, we need appropriate abstract models that are applicable to any programming language or system architecture.

Our abstract model for expressing algorithms is pseudocode, which describes the steps in the algorithm at a high level without implementation details. As an example, the following is a pseudocode description of the Floyd-Warshall algorithm, which we will discuss later:

Algorithm 1 (Floyd-Warshall)

Algorithm Floyd-Warshall(G = (V, E)):
    Let d_0(u, v) = weight(u, v) for all u, v in V
    For k = 1 to |V|:
        For all u, v in V:
            d_k(u, v) := min(d_{k-1}(u, v), d_{k-1}(u, k) + d_{k-1}(k, v))
    Return d_n, where n = |V|

This description is independent of how the graph G is represented, or the two-dimensional matrices d_i, or the specific syntax of a loop. Yet it is still clear what each step of the algorithm is doing. Expressing this algorithm in an actual programming language such as C++ would only add unnecessary syntactic and implementation details that are an artifact of the chosen language and data structures and not intrinsic to the algorithm itself. Instead, pseudocode gives us a simple means of expressing an algorithm that facilitates understanding and reasoning about the core elements of the algorithm.
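To illustrate the contrast, the following is a rough Python rendering of the same algorithm (our sketch, not part of the original text), assuming the graph is given as an n-by-n weight matrix with float('inf') for missing edges and 0 on the diagonal. Note how much of it is bookkeeping for the chosen representation rather than the algorithm itself.

def floyd_warshall(weight):
    # All-pairs shortest paths. weight[u][v] is the edge weight from u to v
    # (float('inf') if there is no edge, 0 when u == v).
    n = len(weight)
    d = [row[:] for row in weight]   # d starts as the direct edge weights
    for k in range(n):               # allow vertex k as an intermediate stop
        for u in range(n):
            for v in range(n):
                d[u][v] = min(d[u][v], d[u][k] + d[k][v])
    return d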

¹ https://en.wikipedia.org/wiki/Travelling_salesman_problem
² https://en.wikipedia.org/wiki/Shortest_path_problem#All-pairs_shortest_paths


We also need abstraction in our methodology for reasoning about the efficiency of an algorithm. The actual running time and space usage of a program depend on many factors, including choice of language and data structures, available compiler optimizations, and characteristics of the underlying hardware. Again, these are not intrinsic to an algorithm itself. Instead, we focus not on absolute efficiency but on how the efficiency scales with respect to input size. Specifically, we look at time complexity in terms of:

1. the number of basic operations as a function of input size,

2. asymptotically, as the input size grows,

3. ignoring leading constants,

4. over worst-case inputs.

We measure space complexity in a similar manner, but with respect to the number of memory cells rather than the number of basic operations. These measures of time and space complexity allow us to evaluate algorithms in the abstract rather than with respect to a particular implementation. (This is not to say that absolute performance is irrelevant. Rather, the asymptotic complexity gives us a "higher-order" view of an algorithm. An algorithm with poor asymptotic time complexity will be inefficient when run on large inputs regardless of how good the implementation is.)

Later in the text, we will also reason about the intrinsic solvability of a problem. We will see how to express problems in the abstract (e.g. as languages and decision problems), and we will examine a simple model of computation (Turing machines) that encapsulates the essence of computation.

1.3 The First Algorithm

One of the oldest known algorithms is Euclid's algorithm for computing the greatest common divisor (GCD) of two natural numbers. The greatest common divisor of x, y ∈ ℕ is the largest number z ∈ ℕ such that z evenly divides both x and y. We often use the notation gcd(x, y) to refer to the GCD of x and y, and if gcd(x, y) = 1, we say that x and y are coprime. For example, gcd(21, 9) = 3, so 21 and 9 are not coprime, but gcd(121, 5) = 1, so 121 and 5 are coprime.

A naïve algorithm for computing the GCD of x and y can try every number starting from x down to 1 to see if it evenly divides both x and y. (For simplicity, we will assume that x ≥ y; otherwise we can swap their values without changing the answer.)

Algorithm 2 (Naïve GCD)

Algorithm Naive-GCD(x, y):
    For z = x to 1:
        If x mod z = 0 and y mod z = 0, return z

Here, the mod operation computes the remainder of the first operand divided by the second. For example, 9 mod 6 = 3 since 9 = 6 ⋅ 1 + 3, and 9 mod 3 = 0 since 9 = 3 ⋅ 3 + 0. The result of x mod z is always an integer in the range [0, z − 1], and the result is not defined when z is 0.

How efficient is the algorithm above? In the worst case, it performs two mod operations for every value in the range [1, x]. Using big-O notation, the number of operations is O(x).
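As a quick sanity check, here is a minimal Python sketch of the naïve algorithm (ours, not part of the original text), assuming x ≥ y ≥ 1:

def naive_gcd(x, y):
    # Try every candidate divisor from x down to 1; the first one that
    # divides both x and y is the greatest common divisor.
    for z in range(x, 0, -1):
        if x % z == 0 and y % z == 0:
            return z

print(naive_gcd(21, 9))   # 3
print(naive_gcd(121, 5))  # 1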

Can we do better?

Observe that if a number z divides both x and y, then it also divides x − my for any m ∈ ℕ: if x = z ⋅ a and y = z ⋅ b for some a, b ∈ ℕ, then x − my = za − mzb = z ⋅ (a − mb), so z evenly divides x − my as well. By the same reasoning, if a number z′ divides both x − my and y, it also divides x. Thus, the set of factors common to both x and y is the same as the set common to both x − my and y, so we can subtract off any multiple of y from x when computing the GCD without changing the answer:


Lemma 3

∀𝑥, 𝑦,𝑚 ∈ N. gcd(𝑥, 𝑦) = gcd(𝑥 −𝑚𝑦, 𝑦)

Since any m will do, let's pick m to minimize the value x − my. We do so by picking m = ⌊x/y⌋, the integer quotient of x and y. Then if we compute r = x − my = x − ⌊x/y⌋ ⋅ y, we obtain the remainder of x divided by y. Thus, we have r = x mod y, using our previous definition of the mod operation.

This results in another identity:

Theorem 4

∀𝑥, 𝑦 ∈ N. gcd(𝑥, 𝑦) = gcd(𝑥 mod 𝑦, 𝑦)

Or, since the GCD is symmetric:

Corollary 5

∀𝑥, 𝑦 ∈ N. gcd(𝑥, 𝑦) = gcd(𝑦, 𝑥 mod 𝑦)

The latter identity will maintain our invariant that the first argument of gcd is greater than or equal to the second.

We have determined a recurrence relation for gcd. We also need base cases. As mentioned previously, x mod y is not defined when y is 0. Thus, we need a base case for y = 0. In this case, x evenly divides both x and 0, so gcd(x, 0) = x. We can also observe that gcd(x, 1) = 1. (This latter base case is not technically necessary; some descriptions of Euclid's algorithm include it while others do not.)

This leads us to the Euclidean algorithm:

Algorithm 6 (Euclid)

Algorithm Euclid(x, y):
    If y = 0 return x
    If y = 1 return 1
    Return Euclid(y, x mod y)

The following are some examples of running this algorithm:

Example 7

Euclid(21, 9)
= Euclid(9, 3)
= Euclid(3, 0)
= 3


Example 8

Euclid(30, 19)
= Euclid(19, 11)
= Euclid(11, 8)
= Euclid(8, 3)
= Euclid(3, 2)
= Euclid(2, 1)
= 1

Example 9

Euclid(376281, 376280)
= Euclid(376280, 1)
= 1
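For readers who want to reproduce these traces, the pseudocode translates directly into Python (our sketch, not part of the original text):

def euclid(x, y):
    # Invariant: x >= y. Recurse on (y, x mod y) until a base case is reached.
    if y == 0:
        return x
    if y == 1:
        return 1
    return euclid(y, x % y)

print(euclid(21, 9), euclid(30, 19), euclid(376281, 376280))  # 3 1 1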

How efficient is this algorithm? It is no longer obvious how many operations it performs. For instance, the computation Euclid(30, 19) does six iterations, while Euclid(376281, 376280) does only two. There doesn't seem to be a clear relationship between the input size and the number of iterations.

This is an extremely simple algorithm, consisting of just a handful of lines of code. But that does not make it simple to reason about. We need new techniques to analyze code such as this to determine a measure of its time complexity.


CHAPTER TWO

THE POTENTIAL METHOD

Let's set aside Euclid's algorithm for the moment and examine a game instead. Consider a flipping game played on an 11 × 11 board covered with two-sided chips, say with UM on the front side and OSU on the back. Initially, the entirety of the board is covered with every chip face down.

The game pits a row player against a column player, and both take turns to flip an entire row or column, respectively. A row or column may be flipped only if it contains more face-down (OSU) than face-up (UM) chips. The game ends when a player can make no legal moves, and that player loses the game. For example, if the game reaches the following state and it is the column player's move, the column player has no possible moves and therefore loses.


We will set aside strategy and ask a simpler question: must the game end in a finite number of moves, or is there a way for the game to continue indefinitely?

An 11 × 11 board is rather large to reason about, so a good strategy is to simplify the problem by considering a 3 × 3 board instead. Let's consider the following game state.

Assume that it is the row player’s turn, and the player chooses to flip the bottom row.

Notice that in general, a move may flip some chips from UM to OSU and others vice versa. This move in particular flipped the first chip in the row from UM to OSU and the latter two chips from OSU to UM. The number of chips flipped in each direction depends on the state of a row or column, and it is not generally the case that every move flips three OSU chips to UM in the 3 × 3 board.

Continuing the game, the column player only has a single move available.


Again, one UM chip is flipped to OSU and two OSU chips are flipped to UM.

It is again the row player's turn, but unfortunately, no row flips are possible. The game ends, with the victory going to the column player.

We observe that each move we examined flipped both UM and OSU chips, but each move flipped more OSU chips to UM than vice versa. Is this always the case? Indeed it is: a move is only legal if the respective row or column has more OSU than UM chips. The move flips each OSU chip to a UM one, and each UM chip to an OSU chip. Since there are more OSU than UM chips, more OSU-to-UM swaps happen than vice versa. More formally, for an n × n board, an individual row or column has k OSU chips and n − k UM chips. A move is only legal if k > n − k. After the flip, the row or column will have k UM chips and n − k OSU chips. The number of UM chips gained is k − (n − k). Since we know that k > n − k, subtracting n − k from both sides gives us k − (n − k) > 0. Thus, at least one UM chip is added by each move, which implies that at least one OSU chip is removed by a move.

In the 3 × 3 case, we start with nine OSU chips. No board configuration has fewer than zero OSU chips. Then if each move removes at least one OSU chip, no more than nine moves are possible. By the same reasoning, no more than 121 moves are possible in the original 11 × 11 game.

Strictly speaking, it will always take fewer than 121 moves to reach a configuration where a player has no moves available. But we don't need an exact number to answer our original question. We have placed an upper bound of 121 on the number of moves, so we have established that a game is indeed bounded by a finite number of moves.

The core of our reasoning above is that we compute a measure of the board's state, e.g. the number of OSU chips. We observe that in each step, this measure decreases. There is a lower bound on the measure, so eventually that lower bound will be reached.
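To see this measure in action, the following short Python simulation (ours, not from the original text) plays random legal moves on a 3 × 3 board, alternating between the row and column player, and checks that the number of face-down (OSU) chips strictly decreases with every move; the game always ends within nine moves.

import random

def flippable(line):
    # A row or column may be flipped only if it has more face-down (0)
    # than face-up (1) chips.
    return line.count(0) > line.count(1)

def play(n=3):
    # 0 = face down (showing OSU), 1 = face up (showing UM).
    board = [[0] * n for _ in range(n)]
    row_turn, moves = True, 0
    while True:
        if row_turn:
            choices = [i for i in range(n) if flippable(board[i])]
        else:
            choices = [j for j in range(n) if flippable([board[i][j] for i in range(n)])]
        if not choices:
            return moves              # current player has no legal move and loses
        idx = random.choice(choices)
        before = sum(row.count(0) for row in board)
        for k in range(n):            # flip every chip in the chosen row or column
            if row_turn:
                board[idx][k] ^= 1
            else:
                board[k][idx] ^= 1
        after = sum(row.count(0) for row in board)
        assert after < before         # the potential (OSU count) strictly decreases
        row_turn, moves = not row_turn, moves + 1

print(max(play() for _ in range(1000)))  # never exceeds 9 on a 3x3 board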

This pattern of reasoning is called the potential method. Formally, if we have an algorithm A, we define a function s : A → ℝ that maps the state of the algorithm to a number. The function s is a potential function if:

1. s strictly decreases with every step of the algorithm³

2. 𝑠 has a lower bound

By defining a meaningful potential function and establishing both its lower bound and how it decreases, we can reason about the number of steps required to run a complex algorithm.

2.1 A Potential Function for Euclid’s Algorithm

Let's come back to Euclid's algorithm and try to come up with a potential function we can use to reason about its efficiency. Specifically, we want to determine an upper bound on the number of iterations for a given input x and y.

Observe that x and y both decrease from one step to the next. For instance, Euclid(30, 19) calls Euclid(19, 11), so x decreases from 30 to 19 and y decreases from 19 to 11. However, the amount each argument individually decreases can vary. In Euclid(376281, 376280), x decreases by only one in the next iteration while y decreases by 376279.

Given that the algorithm has two variables, both of which change as the algorithm proceeds, it seems reasonable to define a potential function that takes both into account. Let x_i and y_i be the values of the two variables in iteration i.

³ We need to take a bit more care to ensure that s converges to its lower bound in a finite number of steps. A sufficient condition is that s decreases by at least some fixed constant c in each step. In the game example, we established that s decreases by at least c = 1 in each step.


Initially, x_0 = x and y_0 = y, where x and y are the original inputs to the algorithm. We try a simple sum of the two variables as our potential function:

𝑠𝑖 = 𝑥𝑖 + 𝑦𝑖

Before we look at some examples, let's first establish that this is a valid potential function. Examining the algorithm, we see that:

s_{i+1} = x_{i+1} + y_{i+1} = y_i + r_i

where r_i = x_i mod y_i. Given the invariant maintained by the algorithm that x_i > y_i, and that x_i mod y_i ∈ [0, y_i − 1], we have that r_i < y_i < x_i. Then:

s_{i+1} = y_i + r_i < y_i + x_i = s_i

Thus, s always decreases from one iteration to the next, satisfying the first requirement of a potential function. We also observe that y_i ≥ 0 for all i; coupled with x_i > y_i, this implies that x_i ≥ 1, so s_i ≥ 1 + 0 = 1 for all i. Since we have established a lower bound on s, it meets the second requirement for a potential function.

As an example, we look at the values of the potential function in the case of 𝐸𝑢𝑐𝑙𝑖𝑑(21, 9):

𝑠0 = 21 + 9 = 30

𝑠1 = 9 + 3 = 12

𝑠2 = 3 + 0 = 3

The following are the values for 𝐸𝑢𝑐𝑙𝑖𝑑(8, 5):

𝑠0 = 8 + 5 = 13

𝑠1 = 5 + 3 = 8

𝑠2 = 3 + 2 = 5

𝑠3 = 2 + 1 = 3

While the values decrease rather quickly for Euclid(21, 9), they do so more slowly for Euclid(8, 5). In these examples, the ratio s_{i+1}/s_i is highest for s_2/s_1 in Euclid(8, 5), where it is 0.625. In fact, we will demonstrate an upper bound that is not far from that value.
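The following small Python check (ours, not from the original text) tabulates s_i for a run of the algorithm and verifies the bound proved next, namely s_{i+1} ≤ (2/3) s_i:

def potential_trace(x, y):
    # Record s_i = x_i + y_i for each call in the run of Euclid(x, y).
    trace = []
    while True:
        trace.append(x + y)
        if y in (0, 1):
            return trace
        x, y = y, x % y

for x, y in [(21, 9), (8, 5), (376281, 376280)]:
    s = potential_trace(x, y)
    print(s)
    assert all(s[i + 1] <= (2 / 3) * s[i] for i in range(len(s) - 1))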

Lemma 10 For all valid inputs x, y to Euclid's algorithm, s_{i+1} ≤ (2/3) s_i for all iterations i of Euclid(x, y).

The recursive case of Euclid(x_i, y_i) invokes Euclid(y_i, x_i mod y_i), so x_{i+1} = y_i and y_{i+1} = x_i mod y_i = r_i, the remainder of dividing x_i by y_i. By definition of remainder, we can express x_i as:

x_i = q_i ⋅ y_i + r_i

where q_i = ⌊x_i/y_i⌋ is the integer quotient of dividing x_i by y_i. Since x_i > y_i, we have that q_i ≥ 1. Then:

s_i = x_i + y_i
    = q_i ⋅ y_i + r_i + y_i        (substituting x_i = q_i ⋅ y_i + r_i)
    = (q_i + 1) ⋅ y_i + r_i
    ≥ 2y_i + r_i                   (since q_i ≥ 1)

We are close to what we need to relate s_i to s_{i+1} = y_i + r_i, but we need a common multiplier for both the y_i and r_i terms. Let's split the difference by adding r_i/2 and subtracting y_i/2:

s_i ≥ 2y_i + r_i
    ≥ 2y_i + r_i − (y_i − r_i)/2   (since r_i < y_i)


The latter step holds because r_i < y_i (since it is the remainder of dividing x_i by y_i), which means that y_i − r_i is positive. By subtracting a positive number, we only make the right-hand side of the inequality smaller. We started with the left-hand side being bigger than the right-hand side, and making the right-hand side even smaller maintains that inequality.

Continuing onward, we have:

s_i ≥ 2y_i + r_i − (y_i − r_i)/2
    = (3/2) y_i + (3/2) r_i
    = (3/2) s_{i+1}

Rearranging the inequality, we conclude that s_{i+1} ≤ (2/3) s_i.

By repeated applications of the lemma, starting with 𝑠0 = 𝑥0 + 𝑦0 = 𝑥 + 𝑦, we can conclude that:

Corollary 11 For all valid inputs x, y to Euclid's algorithm, s_i ≤ (2/3)^i (x + y) for all iterations i of Euclid(x, y).

We can now prove the following:

Theorem 12 (Time Complexity of Euclid's Algorithm) For inputs x, y, the number of iterations of Euclid's algorithm is O(log(x + y)).

We have previously shown that 1 ≤ s_i ≤ (2/3)^i (x + y) for any iteration i. Thus,

1 ≤ (2/3)^i (x + y)
(3/2)^i ≤ x + y           (multiplying both sides by (3/2)^i)
i ≤ log_{3/2}(x + y)      (taking the log base 3/2 of both sides)

We have established an upper bound on i, which means that the number of recursive calls cannot exceed log_{3/2}(x + y) = O(log(x + y)). (We can change bases by multiplying by a constant, and since O-notation ignores leading constants, the base in O(log(x + y)) does not matter. Unless otherwise specified, the base of a logarithm is assumed to be 2 in this text.) Since each iteration does at most one mod operation, the total number of operations is O(log(x + y)), completing the proof.

Under our assumption that x ≥ y, we have that O(log(x + y)) = O(log 2x) = O(log x). Recall that the naïve algorithm had complexity O(x). This means that Euclid's algorithm is exponentially faster than the naïve algorithm.

We see that the potential method gives us an important tool in reasoning about the complexity of algorithms, enabling us to establish an upper bound on the runtime of Euclid's algorithm.

Exercise 13 The bound of log_{3/2}(x + y) that we showed on the number of recursive calls of Euclid's algorithm can be made a little tighter. Show that the number of recursive calls made by the algorithm on (x, y) is at most log_φ(x + y), where φ = (1 + √5)/2 ≈ 1.618 is the golden ratio⁴.

⁴ https://en.wikipedia.org/wiki/Golden_ratio


CHAPTER THREE

DIVIDE AND CONQUER

The divide-and-conquer algorithmic pattern involves subdividing a large, complex problem instance into smaller versions of the same problem. The subproblems are solved recursively, and their solutions are combined in a "meaningful" way to construct the solution for the original instance.

Since divide and conquer is a recursive paradigm, the main tool for analyzing divide-and-conquer algorithms is induction. When it comes to complexity analysis, such algorithms generally give rise to recurrence relations expressing the time or space complexity. While these relations can be solved inductively, we usually rely on other tools to do so. We will see one such tool in the form of the master theorem.

As an example of a divide-and-conquer algorithm, the following is a description of the merge sort algorithm for sorting mutually comparable items:

Algorithm 14 (Merge Sort)

Algorithm MergeSort(A[1..n] : array of n integers):
    If n = 1 return A
    m := ⌊n/2⌋
    L := MergeSort(A[1..m])
    R := MergeSort(A[m+1..n])
    Return merge(L, R)

Subroutine merge(A[1..m], B[1..n]):
    If m = 0 return B
    If n = 0 return A
    If A[1] > B[1] return B[1] + merge(A[1..m], B[2..n])
    Return A[1] + merge(A[2..m], B[1..n])

The algorithm sorts an array by recursively sorting the two halves, then combining the sorted halves with the merge operation. Thus, it follows the pattern of a divide-and-conquer algorithm.
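As a concrete reference (ours, not part of the original text), the pseudocode above translates almost line for line into Python; the slicing below copies subarrays, which is fine for illustration but not the most memory-efficient implementation.

def merge(a, b):
    # Repeatedly take the smaller front element of the two sorted lists.
    if not a:
        return b
    if not b:
        return a
    if a[0] > b[0]:
        return [b[0]] + merge(a, b[1:])
    return [a[0]] + merge(a[1:], b)

def merge_sort(a):
    if len(a) <= 1:
        return a
    m = len(a) // 2
    return merge(merge_sort(a[:m]), merge_sort(a[m:]))

print(merge_sort([6, 14, 12, 1, 9, 4, 8, 0, 5, 13, 15, 10, 7, 2, 3, 11]))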

[Figure: the array 6 14 12 1 9 4 8 0 5 13 15 10 7 2 3 11 is split in half, each half is sorted recursively (0 1 4 6 8 9 12 14 and 2 3 5 7 10 11 13 15), and the two sorted halves are merged into 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15.]


A naïve algorithm such as insertion sort⁵ has a time complexity of O(n^2). How does merge sort compare?

Define T(n) to be the number of operations performed by merge sort on an array of n elements. Let S(n) be the number of operations performed by the merge step for an input array of size n. We can express the following recurrence for T(n):

T(n) = 2T(n/2) + S(n)

Observe that the merge step does ~n comparisons and ~n array concatenations. Assuming that each of these operations takes a constant amount of time⁶, we have that S(n) = O(n). This leaves us with the following for T(n):

T(n) = 2T(n/2) + O(n)

How do we solve this recurrence? While we can do so using induction or other tools for directly solving recurrence relations, in many cases we can take a shortcut and apply the master theorem.

3.1 The Master Theorem

Consider an arbitrary divide-and-conquer algorithm that breaks a problem of size 𝑛 into:

• 𝑘 smaller subproblems, 𝑘 ≥ 1

• each subproblem is of size 𝑛/𝑏, 𝑏 > 1

  • the cost of splitting the input and combining the results is O(n^d)

This results in a recurrence relation:

T(n) = kT(n/b) + O(n^d)

More generally, when b does not evenly divide n, we may end up with one of the following recurrences:

T(n) = kT(⌊n/b⌋) + O(n^d)
T(n) = kT(⌈n/b⌉) + O(n^d)

The master theorem provides the solution to all three of these recurrences.

Theorem 15 (Master Theorem) Let T(n) be a recurrence with one of the following forms, where k ≥ 1, b > 1, d ≥ 0 are constants:

T(n) = kT(n/b) + O(n^d)
T(n) = kT(⌊n/b⌋) + O(n^d)
T(n) = kT(⌈n/b⌉) + O(n^d)

⁵ https://en.wikipedia.org/wiki/Insertion_sort
⁶ Using a linked data structure, adding an element to the front does indeed take a constant amount of time. On the other hand, the assumption of a constant-time comparison only generally holds for fixed-size data types, such as a 32-bit integer or 64-bit floating-point number. For arbitrary-size numbers or other variable-length data types (e.g. strings), this assumption does not hold, and the cost of the comparison needs to be considered more carefully.


Then:

T(n) = O(n^d)            if k/b^d < 1
T(n) = O(n^d log n)      if k/b^d = 1
T(n) = O(n^(log_b k))    if k/b^d > 1

The relationship between 𝑘, 𝑏, and 𝑑 can alternatively be expressed in logarithmic form. For instance,

k/b^d < 1
k < b^d
log_b k < d

taking the log𝑏 of both sides in the last step. We end up with:

T(n) = O(n^d)            if log_b k < d
T(n) = O(n^d log n)      if log_b k = d
T(n) = O(n^(log_b k))    if log_b k > d

Proof 16 We prove the master theorem for the case where n is a power of b; when this is not the case, we can pad the input size to the smallest power of b that is larger than n, which increases the input size by at most the constant factor b (can you see why?).

First, we determine how many subproblems we have at each level of the recursion. Initially, we have a single problem of size n. We subdivide it into k subproblems, each of size n/b. Each of those gets subdivided into k subproblems of size n/b^2, for a total of k^2 problems of size n/b^2. Those k^2 problems in turn are subdivided into a total of k^3 problems of size n/b^3.

[Figure: recursion tree. Level i = 0 has one problem of size n; level i = 1 has k problems of size n/b; level i = 2 has k^2 problems of size n/b^2; the bottom level i = log_b n has k^(log_b n) problems of size 1.]

In general, at the i-th level of recursion (with i = 0 the initial problem), there are k^i subproblems of size n/b^i.


We assume that the base case is when the problem size is 1, which is when

1 = n/b^i
b^i = n
i = log_b n

Thus, there are 1 + log𝑏 𝑛 total levels in the recursion.

We now consider how much work is done in each subproblem, which corresponds to the O(n^d) term of the recurrence relation. For a subproblem of size n/b^i, this is

O((n/b^i)^d) = O(n^d / b^(id)) = b^(−id) ⋅ O(n^d)

With k^i subproblems at level i, the total work T_i at level i is

T_i = (k^i / b^(id)) ⋅ O(n^d) = (k/b^d)^i ⋅ O(n^d)

Summing over all the levels, we get a total work

T = ∑_{i=0}^{log_b n} T_i
  = ∑_{i=0}^{log_b n} (k/b^d)^i ⋅ O(n^d)
  = O(n^d) ⋅ ∑_{i=0}^{log_b n} (k/b^d)^i
  = O(n^d) ⋅ ∑_{i=0}^{log_b n} r^i
  = O(n^d ⋅ (1 + r + r^2 + ⋯ + r^(log_b n)))

where r = k/b^d.

Since we are working with asymptotics, we only care about the dominating term in the sum

∑_{i=0}^{log_b n} r^i = 1 + r + r^2 + ⋯ + r^(log_b n)

There are three cases:


  • r < 1: The terms have decreasing value, so the initial term 1 is the largest and dominates⁷. Thus, we have

    T = O(n^d ⋅ (1 + r + r^2 + ⋯ + r^(log_b n))) = O(n^d)

  • r = 1: The terms all have the same value 1. Then

    T = O(n^d ⋅ ∑_{i=0}^{log_b n} r^i)
      = O(n^d ⋅ ∑_{i=0}^{log_b n} 1)
      = O(n^d ⋅ (1 + log_b n))
      = O(n^d + n^d log_b n)
      = O(n^d log_b n)

    where we discard the lower-order term in the last step. Since logarithms of different bases differ only by a multiplicative constant factor (see the logarithmic identity for changing the base¹⁰), which does not affect asymptotics, we can elide the base:

    T = O(n^d log_b n) = O(n^d log n)

  • r > 1: The terms have increasing value, so the final term r^(log_b n) is the largest. Then we have

    T = O(n^d ⋅ (1 + r + r^2 + ⋯ + r^(log_b n)))
      = O(n^d ⋅ r^(log_b n))
      = O(n^d ⋅ (k/b^d)^(log_b n))
      = O(n^d ⋅ k^(log_b n) ⋅ b^(−d log_b n))
      = O(n^d ⋅ k^(log_b n) ⋅ (b^(log_b n))^(−d))

    To simplify this further, we observe the following:

    – b^(log_b n) = n, which we can see by taking log_b of both sides.
    – k^(log_b n) = n^(log_b k). We show this as follows:

      k^(log_b n) = (b^(log_b k))^(log_b n)

      Here, we applied the previous observation to substitute k = b^(log_b k). We proceed to get

      k^(log_b n) = (b^(log_b k))^(log_b n) = (b^(log_b n))^(log_b k) = n^(log_b k)


applying the prior observation once again.

Applying both observations, we get

    T = O(n^d ⋅ k^(log_b n) ⋅ (b^(log_b n))^(−d))
      = O(n^d ⋅ n^(log_b k) ⋅ n^(−d))
      = O(n^(log_b k))

    Here, we cannot discard the base of the logarithm, as it appears in the exponent and so affects the value by an exponential rather than a constant factor.

Thus, we have demonstrated all three cases of the master theorem.

⁷ We're being a bit sloppy here – the largest term in a sum doesn't necessarily dominate, as can be seen with a divergent series such as the harmonic series ∑_{n=1}^∞ 1/n. However, what we have here is a geometric series ∑_{i=0}^n r^i, which has the closed-form solution ∑_{i=0}^n r^i = (1 − r^(n+1))/(1 − r) for r ≠ 1. When 0 ≤ r < 1 (and therefore r^(n+1) < 1), this is bounded from above by the constant 1/(1 − r), so the sum is O(1).
¹⁰ https://en.wikipedia.org/wiki/Logarithm#Change_of_base

In the case of merge sort, we have d = 1 and k/b^d = 1, so the solution is T(n) = O(n log n). Thus, merge sort is more efficient than insertion sort.
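The case analysis is mechanical enough to express in a few lines of Python (our sketch, not from the text); given k, b, and d it reports which case of the theorem applies and the resulting bound. The exact floating-point comparison is fine for the small integer parameters used here.

import math

def master_theorem(k, b, d):
    # T(n) = k T(n/b) + O(n^d), with k >= 1, b > 1, d >= 0.
    r = k / (b ** d)
    if r < 1:
        return f"O(n^{d})"
    if r == 1:
        return f"O(n^{d} log n)"
    return f"O(n^{math.log(k, b):.3f})"

print(master_theorem(2, 2, 1))  # merge sort, T(n) = 2T(n/2) + O(n):  O(n^1 log n)
print(master_theorem(1, 2, 0))  # T(n) = T(n/2) + O(1):               O(n^0 log n)
print(master_theorem(4, 2, 1))  # T(n) = 4T(n/2) + O(n):              O(n^2.000)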

3.1.1 Master Theorem with Log Factors

A recurrence such as

T(n) = 2T(n/2) + O(n log n)

does not quite fit into the pattern of the master theorem above, since the combination cost O(n log n) is not of the form O(n^d) for some constant d. Such a recurrence requires a modified form of the theorem.

Theorem 17 Let T(n) be a recurrence with one of the following forms, where k ≥ 1, b > 1, d ≥ 0, w ≥ 0 are constants:

T(n) = kT(n/b) + O(n^d log^w n)
T(n) = kT(⌊n/b⌋) + O(n^d log^w n)
T(n) = kT(⌈n/b⌉) + O(n^d log^w n)

Then:

T(n) = O(n^d log^w n)        if k/b^d < 1
T(n) = O(n^d log^(w+1) n)    if k/b^d = 1
T(n) = O(n^(log_b k))        if k/b^d > 1

Proof 18 We need to modify the reasoning in Proof 16 to account for the amount of work done in each subproblem.


For a subproblem of size n/b^i, we now have work

O((n/b^i)^d ⋅ log^w(n/b^i))

We consider the cases k/b^d ≤ 1 and k/b^d > 1 separately.

  • k/b^d ≤ 1: We compute an upper bound on the work as follows:

    O((n/b^i)^d ⋅ log^w(n/b^i)) ≤ O((n/b^i)^d ⋅ log^w n)
                                = b^(−id) ⋅ O(n^d log^w n)

    since log(n/b^i) ≤ log n for b > 1, and we only need an upper bound for the asymptotics.

    Then the work T_i at level i is

    T_i = (k^i / b^(id)) ⋅ O(n^d log^w n) = (k/b^d)^i ⋅ O(n^d log^w n)

    Summing over all the levels, we get a total work

    T = ∑_{i=0}^{log_b n} T_i
      = O(n^d log^w n) ⋅ ∑_{i=0}^{log_b n} (k/b^d)^i
      = O(n^d log^w n ⋅ ∑_{i=0}^{log_b n} r^i)

    where r = k/b^d. When r < 1, the initial term of the summation dominates as before, so we have

    T = O(n^d log^w n ⋅ (1 + r + r^2 + ⋯ + r^(log_b n))) = O(n^d log^w n)

    When r = 1, the terms all have equal size, also as before. Then

    T = O(n^d log^w n ⋅ (1 + r + r^2 + ⋯ + r^(log_b n)))
      = O(n^d log^w n ⋅ (1 + log_b n))
      = O(n^d log^w n ⋅ log_b n)
      = O(n^d log^w n ⋅ log n)
      = O(n^d log^(w+1) n)


    As before, we drop the lower-order term, as well as the base in log_b n, since it only differs by a constant factor from log n.

  • k/b^d > 1: We start by observing that since log^w n is subpolynomial in n, it grows more slowly than any polynomial in n. We have

    k/b^d > 1

    or equivalently

    log_b k > d
    0 < log_b k − d

    We choose a small exponent ε such that

    0 < ε < log_b k − d

    For instance, we can choose ε = (log_b k − d)/2. Then

    log_b k > d + ε
    k > b^(d+ε)
    k/b^(d+ε) > 1

    We also have

    log^w n ≤ C n^ε

    for a sufficiently large constant C and n ≥ 1. For a subproblem of size n/b^i, we now have work

    O((n/b^i)^d ⋅ log^w(n/b^i)) = O((n/b^i)^d ⋅ C (n/b^i)^ε) = O(C (n/b^i)^(d+ε))

Then our work 𝑇𝑖 at level 𝑖 is

    T_i = k^i ⋅ O(C (n/b^i)^(d+ε))


The total work 𝑇 is

    T = ∑_{i=0}^{log_b n} (k^i ⋅ O(C (n/b^i)^(d+ε)))
      = O(∑_{i=0}^{log_b n} k^i ⋅ C (n/b^i)^(d+ε))
      = O(C n^(d+ε) ⋅ ∑_{i=0}^{log_b n} k^i ⋅ (b^(−i))^(d+ε))
      = O(C n^(d+ε) ⋅ ∑_{i=0}^{log_b n} k^i ⋅ (b^(d+ε))^(−i))
      = O(C n^(d+ε) ⋅ ∑_{i=0}^{log_b n} (k/b^(d+ε))^i)

    Since k/b^(d+ε) > 1, the last term in the summation dominates as in Proof 16, so we get

    T = O(C n^(d+ε) ⋅ ∑_{i=0}^{log_b n} (k/b^(d+ε))^i)
      = O(C n^(d+ε) ⋅ (k/b^(d+ε))^(log_b n))
      = O(C n^(d+ε) ⋅ k^(log_b n) ⋅ (b^(−(d+ε)))^(log_b n))
      = O(C n^(d+ε) ⋅ k^(log_b n) ⋅ (b^(log_b n))^(−(d+ε)))
      = O(C n^(d+ε) ⋅ k^(log_b n) ⋅ n^(−(d+ε)))
      = O(C k^(log_b n))
      = O(C n^(log_b k))

    In the last step, we applied the observation that k^(log_b n) = n^(log_b k). Folding the constant C into the big-O notation, we end up with

    T = O(C n^(log_b k)) = O(n^(log_b k))

as required.

Applying the modified master theorem to the recurrence

T(n) = 2T(n/2) + O(n log n)

we have k = 2, b = 2, d = 1, w = 1, and k/b^d = 1. Thus,

T(n) = O(n^d log^(w+1) n) = O(n log^2 n)


3.1.2 Non-master-theorem Recurrences

Recurrences that do not match the pattern required by the master theorem can often be manipulated to do so using substitutions. We take a look at two examples here.

Example 19 Consider the following recurrence:

T(n) = 2T(n/2) + O(n log n)

While we solved this recurrence above using the master theorem with log factors, we can also do so through substitution. We perform the substitution S(n) = T(n)/log n to get the following:

S(n) log n = 2S(n/2) log(n/2) + O(n log n)
           = 2S(n/2) (log n − log 2) + O(n log n)
           = 2S(n/2) (log n − 1) + O(n log n)

To get this, we substituted T(n) = S(n) log n and T(n/2) = S(n/2) log(n/2) in the first step. Since log n − 1 ≤ log n, we can turn the above into the inequality:

S(n) log n ≤ 2S(n/2) log n + O(n log n)

We can then divide out the log 𝑛 from each term to get:

S(n) ≤ 2S(n/2) + O(n)

We now have something that is in the right form for applying the master theorem. Even though it is an inequality rather than an equality, because we are computing an upper bound, we can still apply the master theorem. We have k/b^d = 2/2^1 = 1, so we get:

S(n) = O(n log n)

We can then undo the substitution S(n) = T(n)/log n to get:

T(n) = S(n) log n = O(n log n) ⋅ log n = O(n log^2 n)

Exercise 20 In Example 19, we assumed that if f(n) = O(log n), then f(n)/log n = O(1). Prove from the definition of big-O that this holds.

Example 21 Consider the following recurrence:

𝑇 (𝑛) = 𝑇 (𝑛 − 1) +O(𝑛)

We can demonstrate through other means that T(n) = O(n^2). However, we will proceed to do so using substitutions and the master theorem. We start with the substitution S(n) = T(log n), or equivalently, S(2^n) = T(n). We get:

S(2^n) = S(2^(n−1)) + O(n) = S(2^n / 2) + O(n)


We now perform a change of variables m = 2^n, which gives us:

S(m) = S(m/2) + O(log m)

We need another substitution, so we apply R(n) = S(n)/log n to get:

R(n) log n = R(n/2) log(n/2) + O(log n)
           = R(n/2) (log n − log 2) + O(log n)
           ≤ R(n/2) log n + O(log n)

Dividing by log 𝑛, we get:

R(n) ≤ R(n/2) + O(1)

Applying the master theorem, we conclude that R(n) = O(log n). Then:

S(n) = R(n) log n = O(log^2 n)

T(n) = S(2^n) = O(log^2 2^n) = O(n^2)

We have thus shown that T(n) = O(n^2).

3.2 Integer Multiplication

We now turn our attention to algorithms for integer multiplication. For fixed-size data types, such as 32-bit integers, multiplication can be done in a constant amount of time and is typically implemented as a hardware instruction for common sizes. However, if we are working with arbitrary n-bit numbers, we will have to implement multiplication ourselves in software.

Let's first take a look at the standard grade-school long-multiplication algorithm.

[Figure: long multiplication of 111011 (59) by 101010 (42) in binary – the shifted partial products 59<<1, 59<<3, and 59<<5 are added to give 2478.]

Here, the algorithm is illustrated for binary numbers, but it is the same as for decimal numbers. We first multiply the top number by the last digit in the bottom number. We then multiply the top number by the second-to-last digit in the bottom number, but we shift the result leftward by one digit. We repeat this for each digit in the bottom number, adding one more leftward shift as we proceed.


Once we have done all the multiplications for each digit of the bottom number, we add up the results to compute the final product.

How efficient is this algorithm? If the input numbers are each n bits long, each individual multiplication takes a linear amount of time – we have to multiply each digit in the top number by the single digit in the bottom number (plus a carry if we are working in decimal). Since we have to do n multiplications, computing the partial results takes O(n^2) total time. We then have n partial results that we need to add. The longest partial result is the last one, which is ~2n digits long. Thus, we add n numbers, each of which has O(n) digits. Adding two O(n)-digit numbers takes O(n) time, so adding n of them takes a total of O(n^2) time. Adding the time for the multiplications and additions leaves us with O(n^2) time total for the entire operation.

Can we do better? Let's try to make use of the divide-and-conquer paradigm. We need a way of splitting up a number into smaller pieces, but we can do that by splitting it into the first n/2 digits and the last n/2 digits. (For the rest of our discussion, we will work with decimal numbers, though the same reasoning applies to numbers in other bases.) Assume that n is even for simplicity.

[Figure: an n-digit number N split into its first n/2 digits a and its last n/2 digits b.]

As an example, consider the number 376280. Here, n = 6, and splitting it into two gives us 376 and 280. How are these pieces related to the original number? We have:

376280 = 376 ⋅ 1000 + 280 = 376 ⋅ 10^3 + 280 = 376 ⋅ 10^(n/2) + 280

In general, when we split an n-digit number N into two pieces a and b, we have that N = a ⋅ 10^(n/2) + b.
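In code, this split is just a division by a power of the base; a one-line Python check (ours, not from the text):

n = 6
N = 376280
a, b = divmod(N, 10 ** (n // 2))    # a = 376, b = 280
assert N == a * 10 ** (n // 2) + b
print(a, b)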

Let's now apply this splitting process to multiply two n-digit numbers N1 and N2. Split N1 into a and b and N2 into c and d such that:

N1 = a ⋅ 10^(n/2) + b
N2 = c ⋅ 10^(n/2) + d

[Figure: N1 split into its halves a and b, and N2 split into its halves c and d.]

We can then compute N1 × N2 as:

N1 × N2 = (a ⋅ 10^(n/2) + b) × (c ⋅ 10^(n/2) + d)
        = a × c ⋅ 10^n + a × d ⋅ 10^(n/2) + b × c ⋅ 10^(n/2) + b × d
        = a × c ⋅ 10^n + (a × d + b × c) ⋅ 10^(n/2) + b × d

How efficient is this computation? First, observe that multiplying a number by 10^k is the same as shifting it to the left by appending k zeros to its end, so it can be done in O(k) time. Then we have the following subcomputations:

• 4 recursive multiplications of 𝑛/2-digit numbers (𝑎 × 𝑐, 𝑎 × 𝑑, 𝑏 × 𝑐, 𝑏 × 𝑑)

• 2 left shifts, each of which takes O(𝑛) time


• 3 additions of O(𝑛)-digit numbers, which take O(𝑛) time

Let 𝑇 (𝑛) be the time it takes to multiply two 𝑛-digit numbers. Using the algorithm above, we have:

𝑇 (𝑛) = 4𝑇 (𝑛/2) +𝑂(𝑛)

Applying the master theorem, we have k = 4, b = 2, d = 1, so k/b^d = 2 > 1. Thus:

T(n) = O(n^(log_2 4)) = O(n^2)

This is the same as the long-multiplication algorithm! We did a lot of work to come up with a divide-and-conquer algorithm, and it doesn't do any better than the naïve algorithm. It seems that our division and combination weren't sufficiently "meaningful" to result in an improvement.

3.2.1 The Karatsuba Algorithm

Let's try again, but this time, let's try to rearrange the computation such that we have fewer than four subproblems. We have shown that

N1 × N2 = a × c ⋅ 10^n + (a × d + b × c) ⋅ 10^(n/2) + b × d

Rather than computing a × c, a × d, b × c, and b × d directly, we compute the following three terms:

m1 = (a + b) × (c + d)
m2 = a × c
m3 = b × d

Observe that m1 = a × c + a × d + b × c + b × d, and we can subtract m2 and m3 to obtain:

m1 − m2 − m3 = a × c + a × d + b × c + b × d − a × c − b × d = a × d + b × c

This is exactly the middle term in our expansion for N1 × N2. Thus:

N1 × N2 = a × c ⋅ 10^n + (a × d + b × c) ⋅ 10^(n/2) + b × d
        = m2 ⋅ 10^n + (m1 − m2 − m3) ⋅ 10^(n/2) + m3

This algorithm is known as the Karatsuba algorithm. How efficient is the computation? We have the following subcomputations:

  • Computing m1 requires two additions of n/2-digit numbers, resulting in two numbers that are ~n/2 digits each. This takes O(n) time. The two ~n/2-digit numbers get multiplied, which takes T(n/2) time.

• Computing 𝑚2 and 𝑚3 each take 𝑇 (𝑛/2) time.

  • Putting together the terms in the expansion for N1 × N2 requires two left shifts, two additions, and two subtractions, all of which take O(n) time.

This leads to the recurrence:

𝑇 (𝑛) = 3𝑇 (𝑛/2) +O(𝑛)

Applying the master theorem, we have k = 3, b = 2, d = 1, so k/b^d = 1.5 > 1. This gives us:

T(n) = O(n^(log_2 3)) ≈ O(n^1.585)

Thus, the Karatsuba algorithm gives us a runtime that is faster than the naïve algorithm, and it was the first algorithm discovered for integer multiplication that takes less than quadratic time.
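A compact Python sketch of the three-multiplication scheme (ours, not from the text), written for base 10 and nonnegative integers; the recursion bottoms out on single-digit operands, where Python's built-in multiplication stands in for the base case.

def karatsuba(x, y):
    if x < 10 or y < 10:               # base case: single-digit operand
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    a, b = divmod(x, 10 ** half)       # x = a * 10^half + b
    c, d = divmod(y, 10 ** half)       # y = c * 10^half + d
    m1 = karatsuba(a + b, c + d)
    m2 = karatsuba(a, c)
    m3 = karatsuba(b, d)
    return m2 * 10 ** (2 * half) + (m1 - m2 - m3) * 10 ** half + m3

print(karatsuba(376281, 376280) == 376281 * 376280)  # True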


3.3 The Closest-Pair Problem

In the closest-pair problem, we are given n points in d-dimensional space, and our task is to find the pair of points whose mutual distance is smallest among all pairs. This problem has several applications in computational geometry and data mining (e.g. clustering). The following is an example of this problem in two dimensions, where the closest pair is at the top left.

A naïve algorithm compares the distance between every pair of points; since there are O(n^2) pairs, the algorithm takes O(n^2) time. Can we do better?

Let's start with a simpler problem. Given a list of n real numbers x_1, x_2, . . . , x_n, we wish to find the pair of numbers that are closest together. In other words, find x_i, x_j that minimize |x_i − x_j|, where i ≠ j.

Rather than comparing every pair of numbers, we can first sort the list. Then it must be the case that the closest pair of numbers are adjacent in the sorted list. (Exercise: prove this by contradiction.) So we only need to compare each pair of adjacent points to find the pair with minimal distance. The following is a complete algorithm:

Algorithm 22 (Closest Numbers)

Algorithm Closest(A[1..n] : array of n ≥ 2 integers):
    S := MergeSort(A)
    i := 1
    For k = 2 to n − 1:
        If |S[k] − S[k+1]| < |S[i] − S[i+1]|:
            i := k
    Return S[i], S[i+1]

As we saw previously, merge sort takes O(n log n) time (assuming fixed-width numbers). The algorithm above also iterates over the sorted list, doing a constant amount of work in each iteration. This takes O(n) time. Putting the two steps together results in a total time of O(n log n), which is better than the naïve O(n^2).
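The same idea in a few lines of Python (our sketch, not part of the text), using the built-in sort in place of MergeSort:

def closest_numbers(a):
    # Sort, then the closest pair must be adjacent in the sorted order.
    s = sorted(a)
    best = min(range(len(s) - 1), key=lambda k: s[k + 1] - s[k])
    return s[best], s[best + 1]

print(closest_numbers([21, 5, 9, 30, 19, 8]))  # (8, 9)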

This algorithm works for points in one-dimensional space, which are just real numbers. Unfortunately, it is not clear how to generalize this algorithm to two-dimensional points. While there are various ways we can sort such points, there is no obvious ordering that provides the invariant that the pair of closest points must be adjacent in the resulting ordering.

Let's instead take another look at the single-dimensional problem, taking a more direct divide-and-conquer approach. We start by finding the median of all the points and partitioning them into two halves according to the median.


[Figure: the points S split at the median m into a left half L and a right half R; x_L is the largest point in L and x_R is the smallest point in R.]

We now have two subproblems, and we can recursively find the closest pairs that reside in just the left and just the right halves. However, the closest pair overall may cross the two halves, with one point in the left and one in the right. But we actually only need to consider the pair formed by the maximal point in the left half and the minimal point in the right – any other crossing pair will be further apart than these two.

Thus, we have three pairs of points, and we just need to compare them to determine the closest overall:

• the closest pair in the left half

• the closest pair in the right half

• the closest crossing pair, consisting of the maximal element on the left and the minimal element on the right

The full algorithm is as follows:

Algorithm 23 (1D Closest Pairs)

Algorithm Closest(A[1..n] : set of n ≥ 2 points):
    Let m be the median of the points in A
    Let L be the points in A that are ≤ m
    Let R be the points in A that are > m
    (l1, l2) := Closest(L)
    (r1, r2) := Closest(R)
    δ_L := |l1 − l2|
    δ_R := |r1 − r2|
    Let x_L be the maximal element in L
    Let x_R be the minimal element in R
    Return the pair among (l1, l2), (r1, r2), (x_L, x_R) that lies within distance min(δ_L, δ_R, |x_L − x_R|)

Analyzing this algorithm for its runtime, we can find the median of a set of points by sorting them and then examining the middle points. We can also obtain the maximal element in the left-hand side and the minimal in the right from the sorted list. Splitting the points by the median takes O(n) time – we just need to compare each point to the median. The combination step is dominated by the O(n log n) sort, so we end up with the recurrence:

𝑇 (𝑛) = 2𝑇 (𝑛/2) +O(𝑛 log 𝑛)

This doesn't quite fit into the master theorem since the combination is not of the form O(n^d). We showed using substitution in Example 19 that T(n) = O(n log^2 n). This means that this algorithm is less efficient than our previous one. However, there are two modifications we can make:

  • Use an O(n) median-finding algorithm¹¹ rather than sorting to find the median.

• Sort the points once at the beginning, so that we don’t need to re-sort in each step.

Either modification brings the combination step down to O(n), resulting in T(n) = O(n log n). (The second option adds an additional O(n log n) on top of this for the presorting, but that still results in a total time of O(n log n).)

¹¹ https://en.wikipedia.org/wiki/Median_of_medians


Exercise 24 In the one-dimensional closest-pair algorithm, we computed m as the median, δL as the distance between the closest pair on the left, and δR as the distance between the closest pair on the right. Let δ = min(δL, δR). How many points can lie in the interval [m, m + δ)? What about the interval (m − δ, m + δ)?

[Figure: the set S split at the median m into L and R, with an interval of width δ marked on each side of m.]

Now let's try to generalize this algorithm to two dimensions. It's not clear how to split the points according to a median point. So rather than doing that, let's compute a median line instead, as the median x-coordinate.

[Figure: a two-dimensional point set S split by a vertical line at the median x-coordinate into halves L and R, with a strip of width δ on each side of the line.]

As before, we can recursively find the closest pair solely on the left and the closest pair solely on the right. When it comes to crossing pairs, however, we no longer have just one pair to consider. It may be that the points closest to the median line on the left and right are very far apart in the y-dimension, whereas another crossing pair is slightly further apart in the x-dimension but much closer in the y-dimension. Thus, we need to consider all crossing pairs that are "close enough" to the median line.

How far from the median line do we need to consider? Let δL be the distance between the closest pair solely on the left and δR similarly on the right. Let δ = min(δL, δR) be the minimal distance between all pairs of points where both are solely on the left or solely on the right. Then when looking at crossing pairs, we can ignore any point that is δ or more away from the median – the distance between such a point and a point on the other side is at least δ in just the x-dimension. Then the overall distance is also at least δ, and they can't be closer together than the closer of the two non-crossing pairs we have already found.

Thus, we need only consider the points that lie within the δ-strip, the space that lies within a distance of δ of the median line in the x-dimension. This leads to the following algorithm:


Algorithm 25 (2D Closest Pairs – Attempt 1)

Algorithm Closest(A[1..n] : set of n ≥ 2 points):
    Let m be the median x-coordinate of the points in A
    Let L be the points in A whose x-coordinates are ≤ m
    Let R be the points in A whose x-coordinates are > m
    (l1, l2) := Closest(L)
    (r1, r2) := Closest(R)
    δL := |l1 − l2|
    δR := |r1 − r2|
    δ := min(δL, δR)
    Find the closest pair (xL, xR) ∈ L × R among the points whose x-coordinates are within δ of m
    Return the pair among (l1, l2), (r1, r2), (xL, xR) that lies within distance min(δL, δR, |xL − xR|)

Other than examining the δ-strip, the computations are the same as in the one-dimensional algorithm, and they take O(n) time. (We can presort the points by x-coordinate or use an O(n) median-finding algorithm, as before.) How long does it take to examine the δ-strip? A naïve examination would consider every pair of points where one lies within the left side of the δ-strip and the other on its right side. How many points can lie in the δ-strip? In the worst case, all of them! This can happen if the points are close together in the x-dimension but far apart in the y-dimension. So in the worst case, we have n/2 points in the left part of the δ-strip and n/2 points in the right, leaving us with n²/4 = O(n²) pairs to consider. So we end up with the recurrence:

T(n) = 2T(n/2) + O(n²) = O(n²)

This is no better than the naïve algorithm that unconditionally compares all pairs. As with our first attempt at divide-and-conquer multiplication, we have not come up with a "meaningful" enough division and combination.

Let's take another look at the δ-strip to see if we can find a more efficient method for examining it. Consider an arbitrary point pL = (xL, yL) in the left side of the δ-strip. How many points in the right side that are below pL (i.e. have a y-coordinate that is ≤ the y-coordinate of pL) can lie within distance δ of pL?

[Figure: a point pL on the left side of the δ-strip, with the δ × 2δ rectangle directly below it straddling the median line, and a candidate point pR on the right side of the strip.]

Observe that in order for a point pR on the right to be below pL but within distance δ, its y-coordinate must be in the range (yL − δ, yL] – otherwise, it would be at least δ away from pL in just the y-dimension, and thus at least δ away


overall. Thus, all such points pR that are on the right, below pL, and within a distance δ must lie within the δ × 2δ portion of the δ-strip that lies just below pL. Note that some points in this rectangle are on the left, and some may be further from pL than distance δ, but all points that are on the right, below pL, and within δ of pL must lie within the rectangle.

How many points can be in this δ × 2δ rectangle? We claim no more than eight, including pL, but we leave the proof as an exercise. Thus, for each point in the δ-strip, we need only pair it with the seven points that lie directly below it in the strip.

Algorithm 26 (2D Closest Pairs)

Algorithm Closest(A[1..n] : set of n ≥ 2 points):
    Let m be the median x-coordinate of the points in A
    Let L be the points in A whose x-coordinates are ≤ m
    Let R be the points in A whose x-coordinates are > m
    (l1, l2) := Closest(L)
    (r1, r2) := Closest(R)
    δL := |l1 − l2|
    δR := |r1 − r2|
    δ := min(δL, δR)
    Let D be the set of points that lie within the δ-strip, sorted by their y-coordinates
    For each point in D, pair it with the 7 preceding points in D
    Return the pair with the minimal distance among (l1, l2), (r1, r2), and the pairs from D above
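The following Python sketch follows the structure of Algorithm 26. It assumes the points are distinct (x, y) tuples, presorts them by x and by y once, and checks at most seven neighbors per strip point; math.dist is the standard-library Euclidean distance. The details are one plausible realization, not the text's official implementation.

```python
from math import dist

def closest_2d(points):
    """2D closest pair in O(n log n), a sketch following Algorithm 26."""
    by_x = sorted(points)
    by_y = sorted(points, key=lambda p: (p[1], p[0]))
    return _closest_2d(by_x, by_y)

def _closest_2d(by_x, by_y):
    n = len(by_x)
    if n <= 3:                                   # brute force small cases
        return min(((p, q) for i, p in enumerate(by_x) for q in by_x[i + 1:]),
                   key=lambda pq: dist(*pq))
    mid = n // 2
    m = by_x[mid - 1][0]                         # median x-coordinate
    Lx, Rx = by_x[:mid], by_x[mid:]
    left_set = set(Lx)                           # assumes distinct points
    Ly = [p for p in by_y if p in left_set]      # L and R stay sorted by y
    Ry = [p for p in by_y if p not in left_set]
    best = min(_closest_2d(Lx, Ly), _closest_2d(Rx, Ry),
               key=lambda pq: dist(*pq))
    delta = dist(*best)
    # The delta-strip, filtered from the y-sorted array, so it stays y-sorted.
    strip = [p for p in by_y if abs(p[0] - m) < delta]
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 8]:             # only 7 neighbors needed
            if dist(p, q) < dist(*best):
                best = (p, q)
    return best
```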

We have already seen that we can presort by x-coordinate to avoid sorting to find the median in each step. We can also separately presort by y-coordinate (into a different array) so that we do not have to sort the δ-strip in each iteration. Instead, we merely filter the points from the presorted array according to whether they lie within the strip. Thus, in the worst case when all n points are in the δ-strip, constructing a sorted δ-strip takes O(n) time to do the filtering, and we consider ~7n = O(n) pairs in the strip. The total combination time is then O(n), resulting in the recurrence:

𝑇 (𝑛) = 2𝑇 (𝑛/2) +O(𝑛) = O(𝑛 log 𝑛)

This matches the efficiency of the one-dimensional algorithm. The algorithm can be further generalized to higher dimensions, retaining the O(n log n) runtime for any fixed dimension.

Exercise 27 a) Prove that the left square area in the δ × 2δ rectangle from the δ-strip can contain at most 4 points from the left subset, and similarly that the right square area can contain at most 4 points from the right subset.

Hint: Partition each square area into four congruent square sub-areas, and show that each sub-area can have at most one point from the relevant subset.

b) Use the previous part to argue why it suffices to check only the seven points that lie directly below each point in the δ-strip, when considering only the points that are within the strip itself.

Note: the value seven here is not optimal, but for the asymptotics we only need a constant.


CHAPTER

FOUR

DYNAMIC PROGRAMMING

The idea of subdividing a large problem into smaller versions of the same problem lies at the core of the divide-and-conquer paradigm. However, for some problems, this recursive subdivision may result in encountering many instances of the exact same subproblem. This gives rise to the paradigm of dynamic programming, which is applicable to problems that have the following features:

1. The principle of optimality, also known as optimal substructure. This means that the optimal solution to a larger problem can be constructed from optimal solutions to its smaller subproblems. For example, shortest-path problems on graphs generally obey the principle of optimality. If the shortest path between vertices a and b in a graph goes through some other vertex c, so that the path has the form a, u1, . . . , uj, c, v1, . . . , vk, b, then the subpath a, u1, . . . , uj, c is the shortest path from a to c, and similarly for the subpath between c and b.

2. Overlapping subproblems. This means that the same subproblem appears many times when recursively decomposing the original problem down to the base cases. A classic example of this is a recursive computation of the Fibonacci sequence, which has the recurrence F(n) = F(n − 1) + F(n − 2). The same subproblem appears over and over again, such that a naïve computation takes time exponential in n.

[Figure: the recursion tree for F(4). F(4) calls F(3) and F(2); F(3) calls F(2) and F(1); each F(2) calls F(1) and F(0). The subproblem F(2) is computed twice and F(1) three times.]

The latter characteristic of overlapping subproblems is what distinguishes dynamic programming from divide and conquer.


4.1 Implementation Strategies

The first step in applying dynamic programming to a problem is to determine a recurrence relation, along with the base cases. In the case of the Fibonacci sequence, for example, the recurrence and base cases are:

F(n) = 1                          if n ≤ 1
F(n) = F(n − 1) + F(n − 2)        otherwise

Once we have established the base cases and recurrence relation, we can decide on an implementation strategy. There are three typical patterns:

1. Top-down recursive. The naïve implementation directly translates the recurrence relation into a recursive algorithm, as in the following:

Algorithm 28 (Top-down Fibonacci)

Algorithm Fib(n):
    If n ≤ 1 return 1
    Return Fib(n − 1) + Fib(n − 2)

As alluded to previously, the problem with this strategy is that it repeats the same computations many times, to the extent that the overall number of recursive calls is exponential in n. Thus, the naïve top-down strategy is time inefficient when there are overlapping subproblems. However, the algorithm does not use any auxiliary storage12, so it is space efficient.

2. Top-down memoized, or simply memoization. This approach also translates the recurrence relation into a recursive algorithm, but it saves results in a lookup table and queries that table before doing any computation. The following is an example:

Algorithm 29 (Memoized Fibonacci)

memo := an empty table (i.e. a map or dictionary)

Algorithm Fib(n):
    If n ≤ 1 return 1
    If n ∉ memo:
        memo(n) := Fib(n − 1) + Fib(n − 2)
    Return memo(n)
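As a concrete illustration, here is a direct Python rendering of Algorithm 29 using a dictionary as the memo table; Python's built-in functools.lru_cache decorator (mentioned later in this section) achieves the same effect without the explicit table.

```python
memo = {}  # maps n to the previously computed Fib(n)

def fib_memo(n):
    """Memoized Fibonacci (Algorithm 29); fib_memo(9) == 55."""
    if n <= 1:
        return 1
    if n not in memo:
        memo[n] = fib_memo(n - 1) + fib_memo(n - 2)
    return memo[n]
```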

This memoized algorithm avoids recomputing a previously encountered subproblem. Each call to Fib(n), where n is not a base case, first checks the memo table to see if Fib(n) has been computed before. If not, it goes ahead and computes it, installing the result in memo. If the subproblem was previously encountered, the algorithm just returns the previously computed result.

Memoization trades space for time. The computation of Fib(n) requires O(n) auxiliary space to store the result of each subproblem. On the other hand, since each subproblem is only computed once, the overall number of operations required is O(n), a significant improvement over the exponential naïve algorithm.

3. Bottom-up table13. Rather than starting with the desired input and working our way down to the base cases, we can invert the computation to start with the base cases and then work our way up to the desired input. In general, we need a table to store the results of the subproblems we have computed so far, as those results will be needed to compute larger subproblems.

12 Space is required to represent the recursion stack; for the Fibonacci computation, there are as many as O(n) active calls at any point, so the naïve approach actually requires linear space. For other algorithms, however, the naïve implementation can incur lower space costs than the table-based approaches.

13 In many contexts, the term "dynamic programming" is used to refer specifically to the bottom-up approach. In this text, however, we use the term to refer to both the top-down-memoized and the bottom-up-table strategies, and we will be explicit when we refer to a specific approach.

The following is a bottom-up implementation of computing the Fibonacci sequence:

Algorithm 30 (Bottom-up Fibonacci)

Algorithm Fib(n):
    table := an empty table (i.e. a map or dictionary)
    table(0) := 1
    table(1) := 1
    For i = 2 to n:
        table(i) := table(i − 1) + table(i − 2)
    Return table(n)

We start with an empty table and populate it with the results for the base cases. Then we work our way forward from there, computing the result of each new subproblem from the results previously computed for the smaller subproblems. We stop when we reach the desired input and return the result. The following is an illustration of the table for Fib(9).

 i:       0  1  2  3  4  5  6   7   8   9
 Fib(i):  1  1  2  3  5  8  13  21  34  55

When we get to computing Fib(i), we have already computed Fib(i − 1) and Fib(i − 2), so we can just take those results from the table and add them to get the result for Fib(i).

Like memoization, the bottom-up approach trades space for time. In the case of Fib(n), it too requires O(n) auxiliary space to store the result of each subproblem, and the overall number of operations required is O(n).

In the specific case of computing the Fibonacci sequence, we don't actually need to keep the entire table around – once we get to computing Fib(i), we never use the results for Fib(i − 3) or lower. So we only need to keep around the two previously computed results at any time. This lowers the storage overhead to O(1). However, this isn't the case in general for dynamic programming; other problems require maintaining a larger subset or all of the table throughout the computation.
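The following sketch illustrates this constant-space optimization in Python, keeping only the two most recent results rather than the whole table.

```python
def fib_constant_space(n):
    """Bottom-up Fibonacci with O(1) auxiliary space."""
    prev, curr = 1, 1                   # Fib(0) and Fib(1)
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr  # slide the two-entry window forward
    return curr                         # fib_constant_space(9) == 55
```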

The three implementation strategies produce different tradeoffs. The naïve top-down strategy often takes the least implementation effort, as it is a direct translation of the recurrence relation. It is also usually the most space efficient, but it typically is the least time efficient. Memoization adds some implementation effort in working with a lookup table (though some programming languages provide built-in facilities for converting a naïve recursive function into a memoized one, such as @functools.lru_cache in Python). Its main advantages over the bottom-up approach are:

• It maintains the same structure as the recurrence relation, so it typically is simpler to reason about and implement.

• It only computes the subproblems that are actually needed. If the recursive computation is sparse, meaning that it requires only a subset of the subproblems that are smaller than the input, the top-down memoization approach can be more time and space efficient than the bottom-up strategy.

However, memoization often suffers from higher constant-factor overheads than the bottom-up approach (e.g. function-call overheads and working with sparse lookup structures that are less time efficient than dense data structures). Thus, the bottom-up approach is preferable when a large fraction of the subproblems are actually needed for computing the desired result.

4.2 Longest Common Subsequence

As a more complex example of dynamic programming, we take a look at the problem of finding the longest common subsequence (LCS) of two strings. A subsequence is an ordered subset of the characters in a string, preserving the order in which the characters appear in the original string. However, the characters in the subsequence need not be adjacent in the original string14. The following are all subsequences of the string "Fibonacci sequence":

• "Fun"

• "seen"

• "cse"

The longest common subsequence LCS(S1, S2) of two strings S1 and S2 is the longest string that is both a subsequence of S1 and of S2. For instance, the longest common subsequence of "Go blue!" and "Wolverine" is "ole".

Finding the longest common subsequence is useful in many applications, including DNA sequencing and computing the diff of two files. As such, we wish to devise an efficient algorithm for determining the LCS of two arbitrary strings.

Let's assume that we are given two strings S1 and S2 as input, and let their lengths be N and M, respectively. For now, let's set aside the problem of finding the actual longest common subsequence and focus on just determining the length of this subsequence. To apply dynamic programming, we first need to come up with a recurrence relation that relates the length of the LCS of S1 and S2 to smaller subproblems. Let's consider just the last15 character in each string. There are two possibilities:

• The last character is the same in both strings, i.e. S1[N] = S2[M]; call this character x. Then x must be the last character of the LCS of S1 and S2. We can prove this by contradiction. Let C = LCS(S1, S2), and suppose that C does not end with x. Then C must consist of characters that appear in the first N − 1 characters of S1 and the first M − 1 characters of S2. But C + x is a valid subsequence of both S1 and S2, since x appears in S1 and S2 after all the characters in C. Since C + x is a longer common subsequence than C, this contradicts the assumption that C is a longest common subsequence.

Let k be the length of C = LCS(S1, S2). By similar reasoning to the above, we can demonstrate that C[1..k − 1] = LCS(S1[1..N − 1], S2[1..M − 1]) – that is, C with its last character removed is the LCS of the strings consisting of the first N − 1 characters of S1 and the first M − 1 characters of S2. This is an example of the principle of optimality: we can construct an optimal solution to the larger problem from the optimal solutions to the smaller subproblems.

14 Contrast this with a substring, where the characters in the substring must all be adjacent in the original string.
15 It is equally valid to start with the first character in each string, but we have chosen to start with the last character as it simplifies some implementation details when working with strings in real programming languages.


[Figure: two strings whose last characters are the same symbol x; the remaining prefixes contain N − 1 and M − 1 symbols, respectively.]

Let L(S1, S2) be the length of the LCS of S1 and S2. We have demonstrated that when S1[N] = S2[M], we have:

𝐿(𝑆1, 𝑆2) = 𝐿(𝑆1[1..𝑁 − 1], 𝑆2[1..𝑀 − 1]) + 1.

• The last character differs between the two strings, i.e. S1[N] ≠ S2[M]. Then at least one of these characters must not be in LCS(S1, S2). We do not know a priori which of the two (or both) are not in the LCS, but we observe the following:

– If S1[N] is not in LCS(S1, S2), then all the characters in LCS(S1, S2) must appear in the first N − 1 characters of S1. Then LCS(S1, S2) = LCS(S1[1..N − 1], S2).

[Figure: S1 ends in a character that is not in the LCS, so the LCS draws only from the first N − 1 symbols of S1 and all M symbols of S2.]

– If S2[M] is not in LCS(S1, S2), then all the characters in LCS(S1, S2) must appear in the first M − 1 characters of S2. Then LCS(S1, S2) = LCS(S1, S2[1..M − 1]).

[Figure: S2 ends in a character that is not in the LCS, so the LCS draws from all N symbols of S1 and only the first M − 1 symbols of S2.]

Since at least one of these cases must hold, we just compute both LCS(S1[1..N − 1], S2) and LCS(S1, S2[1..M − 1]) and see which one is longer. In terms of length alone, we have demonstrated that


when 𝑆1[𝑁] ≠ 𝑆2[𝑀], we have that:

𝐿(𝑆1, 𝑆2) = max(𝐿(𝑆1[1..𝑁 − 1], 𝑆2), 𝐿(𝑆1, 𝑆2[1..𝑀 − 1]))

Finally, we note that when either S1 or S2 is empty (i.e. N = 0 or M = 0, respectively), then LCS(S1, S2) is also empty. Putting everything together, we have:

L(S1[1..N], S2[1..M]) =
    0                                                  if N = 0 or M = 0
    L(S1[1..N − 1], S2[1..M − 1]) + 1                  if S1[N] = S2[M]
    max(L(S1[1..N − 1], S2), L(S1, S2[1..M − 1]))      if S1[N] ≠ S2[M]

Now that we have a recurrence relation, we can proceed to construct an algorithm. Let's take the bottom-up approach. We need a table to store the solutions to the subproblems, which consist of L(S1[1..i], S2[1..j]) for all (i, j) ∈ [0, N] × [0, M] (we consider S1[1..0] to denote an empty string). Thus, we need a table of size (N + 1) × (M + 1). The following is the table for S1 = "Go blue!" and S2 = "Wolverine".

             W   o   l   v   e   r   i   n   e   s
         0   0   0   0   0   0   0   0   0   0   0
  G      0   0   0   0   0   0   0   0   0   0   0
  o      0   0   1   1   1   1   1   1   1   1   1
 (space) 0   0   1   1   1   1   1   1   1   1   1
  b      0   0   1   1   1   1   1   1   1   1   1
  l      0   0   1   2   2   2   2   2   2   2   2
  u      0   0   1   2   2   2   2   2   2   2   2
  e      0   0   1   2   2   3   3   3   3   3   3
  !      0   0   1   2   2   3   3   3   3   3   3

The (i, j)th entry in the table denotes the length of the longest common subsequence of S1[1..i] and S2[1..j]. Using the recurrence relation, we can compute the value of the (i, j)th entry from the entries at locations (i − 1, j − 1), (i − 1, j), and (i, j − 1), with the latter two required when S1[i] ≠ S2[j] and the former when S1[i] = S2[j]. The entry at (N, M) is the length of the LCS of the full strings S1[1..N] and S2[1..M].

The last thing we need before proceeding to the algorithm is to determine an order in which to compute the table entries. There are multiple valid orders, but we will compute the entries row by row from top to bottom, moving left to right within a row. With this order, the entries required for (i, j) have already been computed by the time we get to (i, j).

We can now write the algorithm for computing the table:


Algorithm 31 (LCS Table)

Algorithm LCSTable(S1[1..N], S2[1..M]):
    table := an empty (N + 1) × (M + 1) table
    For i = 0 to N: table(i, 0) := 0
    For j = 0 to M: table(0, j) := 0
    For i = 1 to N:
        For j = 1 to M:
            If S1[i] = S2[j]:
                table(i, j) := table(i − 1, j − 1) + 1
            Else:
                table(i, j) := max(table(i − 1, j), table(i, j − 1))
    Return table
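The following is a Python rendering of Algorithm 31. Python strings are 0-indexed, so s1[i - 1] plays the role of the text's S1[i]; the function name is an illustrative choice.

```python
def lcs_table(s1, s2):
    """Build the (N+1) x (M+1) table of LCS lengths, bottom-up."""
    N, M = len(s1), len(s2)
    table = [[0] * (M + 1) for _ in range(N + 1)]  # row 0 and column 0 stay 0
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            if s1[i - 1] == s2[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table
```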

As stated before, the length of 𝐿𝐶𝑆(𝑆1, 𝑆2) is in the last table entry at position (𝑁,𝑀).

Now that we have determined how to compute the length of the longest common subsequence, let's return to the problem of computing the subsequence itself. It turns out that we can backtrack through the table to recover the actual characters. We start with the last entry and determine the path to that entry from a base case. For each entry on the path, we check to see if the characters corresponding to that location match. If so, the matching character contributes to the LCS. If the characters do not match, we check to see if we came from the left or from above – the larger of the two neighboring entries is the source, since we took the max of the two in computing the current entry. If both neighbors have the same value, we can arbitrarily choose one or the other. The following demonstrates the path that results from backtracking. The solid, red path arbitrarily chooses to go up when both neighbors have the same value, while the dashed, blue path chooses to go left. Both paths result in a valid LCS (and in this case, the same set of characters, though that isn't necessarily always the case).

[Figure: the same LCS table as above, with two backtracking paths traced from the entry at (N, M) back to the base cases; the solid path goes up on ties and the dashed path goes left, and both recover the subsequence "ole".]


The algorithm for backtracking is as follows:

Algorithm 32 (LCS Backtrack)

Algorithm LCSBacktrack(S1[1..N], S2[1..M]):
    table := LCSTable(S1, S2)
    s := ε (the empty string)
    i := N
    j := M
    While i > 0 and j > 0:
        If S1[i] = S2[j]:
            s := S1[i] + s
            i := i − 1
            j := j − 1
        Else if table(i, j − 1) > table(i − 1, j):
            j := j − 1
        Else:
            i := i − 1
    Return s
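A matching Python sketch of the backtracking step, built on the lcs_table function above (again with 0-indexed strings):

```python
def lcs_backtrack(s1, s2):
    """Recover an actual LCS by walking the table from (N, M) to a base case."""
    table = lcs_table(s1, s2)
    s = ""
    i, j = len(s1), len(s2)
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            s = s1[i - 1] + s          # matching character is part of the LCS
            i, j = i - 1, j - 1
        elif table[i][j - 1] > table[i - 1][j]:
            j -= 1                     # the value came from the left
        else:
            i -= 1                     # the value came from above (or a tie)
    return s

# Example: lcs_backtrack("Go blue!", "Wolverine") returns "ole".
```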

How efficient is this algorithm? Computing a single table entry requires a constant number of operations. Since there are (N + 1) ⋅ (M + 1) entries, constructing the table takes O(NM) time and requires O(NM) space. Backtracking also does a constant number of operations per entry on the path, and the path length is at most N + M + 1, so backtracking takes O(N + M) time. Thus, the total complexity of this algorithm is O(NM) in both time and space.

4.3 Longest Increasing Subsequence

Given a sequence of numbers S, an increasing subsequence of S is a subsequence such that the elements are in strictly increasing order. As with a subsequence of a string, the elements do have to appear in the same relative order in the subsequence as they do in the original sequence S. Suppose we want to find a longest increasing subsequence (LIS) of S. For example, the sequence S = (0, 8, 7, 12, 5, 10, 4, 14, 3, 6) has several longest increasing subsequences:

• (0, 8, 12, 14)
• (0, 8, 10, 14)
• (0, 7, 12, 14)
• (0, 7, 10, 14)
• (0, 5, 10, 14)

As in the previous problem, we focus first on finding the length of an LIS before concerning ourselves with the actual elements. Let N be the length of S. For our first attempt, we take a look at the subproblems of computing the length of the LIS for each sequence S[1..i] for i ∈ [1, N]. In the case of S = (0, 8, 7, 12, 5, 10, 4, 14, 3, 6), we can determine these lengths by hand.

 i:           1  2  3  4  5  6  7  8  9  10
 LIS length:  1  2  2  3  3  3  3  4  4  4


However, it is not clear how we can relate the length of the LIS for a sequence S of N elements to a smaller sequence. In the example above, S[1..9] and S[1..10] both have the same length LIS, but that is not the case for S[1..7] and S[1..8]. Yet, in both cases, the additional element is larger than the previous one (i.e. S[10] > S[9] and S[8] > S[7]). Without knowing the contents of the LIS itself, it is not obvious how the result for the larger problem is related to that of smaller ones.

For dynamic programming to be applicable, we must identify a problem formulation that satisfies the principle of optimality. So rather than tackling the original problem directly, let's constrain ourselves to subsequences of S[1..N] that actually contain the last element S[N]. Define CLIS(S[1..N]) to be the longest of the increasing subsequences of S[1..N] that contain S[N]. Then CLIS(S[1..N]) has the form (S[i1], S[i2], . . . , S[ik], S[N]). Observe that by the definition of an increasing subsequence, we must have ik < N and S[ik] < S[N]. In addition, (S[i1], S[i2], . . . , S[ik]) is itself an increasing subsequence that contains S[ik]. Moreover, we can show via contradiction that CLIS(S[1..ik]) = (S[i1], S[i2], . . . , S[ik]). Thus, CLIS(S[1..ik]) is a subproblem of CLIS(S[1..N]) when S[ik] < S[N].

We do not know a priori which index ik produces a result that is a subsequence of CLIS(S[1..N]). But only the indices ik such that ik < N and S[ik] < S[N] may do so, so we can just try all such indices and take the maximum. Let L(i) be the length of CLIS(S[1..i]). Then we have:

L(i) = 1 + max{L(j) : 0 < j < i and S[j] < S[i]}

The following illustrates this computation for the concrete sequence S above, specifically showing which values need to be compared when computing L(10).

 i:  1  2  3  4   5  6   7  8   9  10
 S:  0  8  7  12  5  10  4  14  3  6
 L:  1  2  2  3   2  3   2  4   2  3

We also have to account for the base cases, where no such j exists. In that case, L(i) = 1, since CLIS(S[1..i]) consists of just S[i] itself. The full recurrence is as follows:

L(i) = 1                                              if i = 1 or S[j] ≥ S[i] for all 0 < j < i
L(i) = 1 + max{L(j) : 0 < j < i and S[j] < S[i]}      otherwise

We can then construct an algorithm for computing L(i). Once we have L(i) for all i ∈ [1, N], we just have to take the maximum value of L(i) as the result for the length of the unconstrained LIS. As with LCS, we can also backtrack through the table of values for L(i) to find the actual elements belonging to the longest increasing subsequence.
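The following is a bottom-up Python sketch of this approach; it returns only the length of an LIS, with backtracking over the L values omitted.

```python
def lis_length(S):
    """Length of a longest increasing subsequence, O(N^2) time, O(N) space."""
    N = len(S)
    L = [1] * N                        # L[i] = length of CLIS ending at S[i]
    for i in range(N):
        for j in range(i):
            if S[j] < S[i]:            # S[j] may precede S[i] in an LIS
                L[i] = max(L[i], L[j] + 1)
    return max(L)

# Example: lis_length([0, 8, 7, 12, 5, 10, 4, 14, 3, 6]) returns 4.
```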

How efficient is the algorithm, assuming a bottom-up approach? We must compute N values of L(i), and each value requires scanning over all the previous elements of S, as well as all the previous values of L(i), in the worst case. Thus, it takes O(N) time to compute L(i) for a single i, and O(N²) to compute them all. Finding the maximum L(i) takes linear time, as does backtracking. So the algorithm as a whole takes O(N²) time, and it uses O(N) space to store the values of L(i).


4.4 All-Pairs Shortest Path

Suppose you are building a flight-aggregator website. Each day, you receive a list of flights from several airlines with their associated costs, and some flight segments may actually have negative cost if the airline wants to incentivize a particular route (see hidden-city ticketing16 for an example of how this can be exploited in practice, and what the perils of doing so are). You'd like your users to be able to find the cheapest itinerary from point A to point B. To provide this service efficiently, you determine in advance the cheapest itineraries between all possible origin and destination locations, so that you need only look up an already computed result when the user puts in a query.

This situation is an example of the all-pairs shortest path problem. The set of cities and flights can be represented as a graph G = (V, E), with the cities represented as vertices and the flight segments as edges in the graph. There is also a weight function weight : E → ℝ that maps each edge to a cost. While an individual edge may have a negative cost, no negative cycles are allowed. (Otherwise a traveler could just fly around that cycle to make as much money as they want, which would be very bad for the airlines!) Our task is to find the lowest-cost path between all pairs of vertices in the graph.

How can we apply dynamic programming to this problem? We need to formulate it such that there are self-similar subproblems. To gain some insight, we observe that many aggregators allow the selection of layover airports, which are intermediate stops between the origin and destination, when searching for flights. The following is an example from kayak.com17.

[Figure: a screenshot of kayak.com flight search results, showing a filter for selecting layover airports between the origin and destination.]

We take the set of allowed layover airports as one of the key characteristics of a subproblem – computing shortest paths with a smaller set of allowed layover airports is a subproblem of computing shortest paths with a larger set of allowed layover airports. Then the base case is allowing only direct flights, with no layover airports.

16 https://en.wikipedia.org/wiki/Airline_booking_ploys#Hidden-city_ticketing
17 https://kayak.com


Coming back to the graph representation of this problem, we formalize the notion of a layover airport as an intermediate vertex of a simple path, which is a path without cycles. Let p = v1, v2, . . . , vm be a path from origin v1 to destination vm. Then v2, . . . , vm−1 are intermediate vertices.

Assume that the vertices are labeled as numbers in the set {1, 2, . . . , |V|}. We parameterize a subproblem by k, which signifies that the allowed set of intermediate vertices (layover airports) is restricted to {1, 2, . . . , k}. Then we define dk(i, j) to be the length of the shortest path between vertices i and j, where the path is only allowed to go through intermediate vertices in the set {1, 2, . . . , k}.

We have already determined that when no intermediate vertices are allowed, which is when k = 0, the shortest path between i and j is just the direct edge (flight) between them. Thus, our base case is

𝑑0(𝑖, 𝑗) = 𝑤𝑒𝑖𝑔ℎ𝑡(𝑖, 𝑗)

where 𝑤𝑒𝑖𝑔ℎ𝑡(𝑖, 𝑗) is the weight of the edge between 𝑖 and 𝑗.

We proceed to the recursive case. We have at our disposal the value of dk−1(i′, j′) for all i′, j′ ∈ V, and we want to somehow relate dk(i, j) to those values. The latter represents adding vertex k to our set of permitted intermediate vertices. There are two possible cases for the shortest path between i and j that is allowed to use any of the intermediate vertices {1, 2, . . . , k}:

• Case 1: The path does not go through k. Then the length of the shortest path that is allowed to go through {1, 2, . . . , k} is the same as that of the shortest path that is only allowed to go through {1, 2, . . . , k − 1}, so dk(i, j) = dk−1(i, j).

• Case 2: The path does go through k. Then this path is composed of two segments, one that goes from i to k and another that goes from k to j. We minimize the cost of the total path by minimizing the cost of each of the segments – the costs respect the principle of optimality.

Neither of the two segments may have k as an intermediate vertex – otherwise we would have a cycle. The only way for a path with a cycle to have lower cost than one without is for the cycle as a whole to have negative weight, which was explicitly prohibited in our problem statement. Since k is not an intermediate vertex in the segment between i and k, the shortest path between them that is allowed to go through intermediate vertices {1, 2, . . . , k} is the same as the shortest path that is only permitted to go through {1, 2, . . . , k − 1}. In other words, the length of this segment is dk−1(i, k). By the same reasoning, the length of the segment between k and j is dk−1(k, j).

Thus, we have that in this case, dk(i, j) = dk−1(i, k) + dk−1(k, j).

We don't know a priori which of these two cases holds, but we can just compute them both and take the minimum. This gives us the recursive case:

dk(i, j) = min(dk−1(i, j), dk−1(i, k) + dk−1(k, j))

Combining this with the base case, we have our complete recurrence relation:

dk(i, j) = weight(i, j)                                       if k = 0
dk(i, j) = min(dk−1(i, j), dk−1(i, k) + dk−1(k, j))           if k ≠ 0

We can now construct a bottom-up algorithm to compute the shortest paths:


Algorithm 33 (Floyd-Warshall)

Algorithm Floyd-Warshall(G = (V, E)):
    Let d0(u, v) := weight(u, v) for all u, v ∈ V
    For k = 1 to |V|:
        For all u, v ∈ V:
            dk(u, v) := min(dk−1(u, v), dk−1(u, k) + dk−1(k, v))
    Return d|V|
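The following Python sketch implements the recurrence, storing the current layer of distances in a single dictionary so that only O(|V|²) space is used. The graph representation (a dict of edge weights keyed by vertex pairs, with missing edges treated as infinite and d(i, i) = 0) is an assumption made for this example.

```python
from math import inf

def floyd_warshall(weight, n):
    """All-pairs shortest paths over vertices 1..n; assumes no negative cycles."""
    d = {(u, v): 0 if u == v else weight.get((u, v), inf)
         for u in range(1, n + 1) for v in range(1, n + 1)}
    for k in range(1, n + 1):              # allow vertex k as an intermediate
        for u in range(1, n + 1):
            for v in range(1, n + 1):
                d[u, v] = min(d[u, v], d[u, k] + d[k, v])
    return d
```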

This is known as the Floyd-Warshall algorithm, and it runs in time O(|V|³). The space usage is O(|V|²) if we only keep around the computed values of dm(i, j) for iterations m and m + 1. Once we have computed these shortest paths, we need only look up the already computed result to find the shortest path between a particular origin and destination.


CHAPTER

FIVE

GREEDY ALGORITHMS

A greedy algorithm computes a solution to a problem by making a locally optimal choice at each step of the algorithm. In general, there is no guarantee that locally optimal choices lead to a globally optimal result. However, for some specific greedy algorithms, we can demonstrate that the result is globally optimal.

As an example, consider the problem of finding a minimal spanning tree (MST) of a weighted, undirected graph. Given such a graph, we would like to find a subset of the edges such that the subgraph induced by those edges is connected, touches every vertex, and has the least total edge cost. This is an important problem in designing networks, including transportation and communication networks, where we want to ensure that there is a path between any two vertices while minimizing the cost of constructing the network.

Before we proceed, let’s review the definition of a tree. There are three equivalent definitions:

Definition 34 (Tree #1) Let G = (V, E) be an undirected graph. Then G is a tree if it is connected without cycles.

A graph is connected when there is a path between any two vertices. A cycle is a sequence of adjacent edges that starts and ends at the same vertex.

Definition 35 (Tree #2) Let 𝐺 = (𝑉,𝐸) be an undirected graph. Then 𝐺 is a tree if it is minimally connected.

A graph is minimally connected if removing any single edge causes it to be disconnected.

Definition 36 (Tree #3) Let 𝐺 = (𝑉,𝐸) be an undirected graph. Then 𝐺 is a tree if it is maximal without cycles.

A graph G is maximal without cycles if adding any edge that is not in G but connects two vertices in G causes the graph to contain a cycle.

Exercise 37 Show that the three definitions of a tree are all equivalent.

In the MST problem, our goal is to find a subset of the edges of a connected input graph such that:

• The subset touches all vertices in the graph, i.e. it spans the graph.

• The subset has no cycles.

• The subset has the least total edge cost over all subsets that meet the first two requirements.

The first requirement requires the subset to be connected and span the original graph. Combined with the second requirement, the end result is a tree by Definition 34. Since the tree spans the original graph, we call it a spanning tree. A minimal spanning tree is then a spanning tree that has the minimal cost over all spanning trees of the original graph.

The following illustrates three of the spanning trees of a sample graph. The middle one is the MST, as its total weight of 8 is lower than that of any other spanning tree.


[Figure: a sample graph and three of its spanning trees; the middle spanning tree, with total weight 8, is the MST.]

A graph may have multiple minimal spanning trees. In the following graph, any two edges form an MST.

[Figure: a triangle graph whose three edges all have weight 4; any two of the edges form an MST.]

Now that we understand what a minimal spanning tree is, we take a look at Kruskal's algorithm, a greedy algorithm for computing an MST. The algorithm simply examines the edges in order of weight, taking any edge that does not induce a cycle in the partially computed result.

Algorithm 38 (Kruskal)

Algorithm Kruskal(G = (V, E)):
    S := ∅ (an empty set of edges)
    For each edge e in E sorted by weight:
        If S ∪ {e} does not have any cycles:
            S := S ∪ {e}
    Return S
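The following Python sketch fills in one possible cycle check using a simple union-find (disjoint-set) structure; the pseudocode above leaves that data structure unspecified, so this is an illustrative choice rather than the text's official implementation.

```python
def kruskal(vertices, edges):
    """Kruskal's MST; edges is a list of (weight, u, v) triples."""
    parent = {v: v for v in vertices}

    def find(v):                            # root of v's component
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    S = set()
    for w, u, v in sorted(edges):           # examine edges in order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                        # adding (u, v) creates no cycle
            parent[ru] = rv                 # union the two components
            S.add((w, u, v))
    return S
```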

As an example, we use Kruskal's algorithm to compute the MST of the following graph. We start by ordering the edges according to their weights.


[Figure: the example graph with weighted edges, and its edges listed in order of increasing weight.]

Then we look at the edges in order, inserting an edge into our partial result if adding it does not result in a cycle.


[Figure: the same graph with the edges selected by Kruskal's algorithm highlighted and numbered in the order they were added; the edge of weight 8 is skipped because it would create a cycle.]

Observe that when the algorithm looks at the edge with weight 8, the two adjacent vertices are already connected in the partially computed tree, so adding that edge would introduce a cycle. Thus, the algorithm skips it and continues on to the next edge. In this graph, there are two edges of weight 9, so the algorithm arbitrarily picks one of them to examine first.

The algorithm terminates when all edges have been examined. For this graph, the resulting spanning tree has a total weight of 66.

Kruskal's algorithm is a greedy algorithm – in each step of the algorithm, it examines the minimal-weight edge remaining, so it is making locally optimal choices. As stated previously, it isn't always the case that locally optimal choices lead to a globally optimal result. However, as we will now show, it is indeed the case for Kruskal's algorithm that the result is a minimal spanning tree.

Claim 39 The output of Kruskal’s algorithm is a tree.

To prove this, we will assume that the input graph G is fully connected, meaning that there is an edge between every pair of vertices in the graph. We can turn a graph into a fully connected one by adding the missing edges with some very large (or infinite) weight, so that the new edges do not change the MST.

Proof 40 Let T be the output of Kruskal's algorithm on a fully connected input graph G. Recall from Definition 36 that a tree is a maximal graph without cycles. Clearly T does not have any cycles, as Kruskal's algorithm only adds an edge to T when it does not induce a cycle. Then we need only show that T is maximal. Suppose for the purpose of contradiction that it is not. By definition, this means that we can add some edge e to T without introducing a cycle. However, the algorithm examines every edge in the input graph, so it must have encountered e at some point, and the algorithm adds an edge to T if it would not create a cycle. Since it did not add e to T, adding e to T must induce a cycle, which contradicts our assumption that it does not. Thus, no such edge exists, so T is maximal. Since T is a maximal graph without cycles, it is a tree.


We can similarly demonstrate that the output of Kruskal's algorithm is a spanning tree, but we leave that as an exercise.

Exercise 41 Show that if the input graph G is connected, the output T of Kruskal's algorithm spans all the vertices in G.

Next, we show that the result of Kruskal's algorithm is a minimal spanning tree. To do so, we actually prove a stronger claim:

Claim 42 If S is a set of edges chosen at any stage of Kruskal's algorithm on an input graph G, then there is some minimal spanning tree of G that contains all the edges in S.

If this claim is true for all intermediate sets of edges in Kruskal's algorithm, it is true for the final set of edges. Since we have already shown that the final set of edges is itself a tree, and a tree is maximal without cycles, the MST containing that final set must be the final set itself.

To prove Claim 42, we make use of the exchange property of minimal spanning trees:

Claim 43 (Exchange Property of MSTs) Let T be a minimal spanning tree of a graph G, and let e ∉ T be an edge that is not in T but is in the original graph G. Then:

• T ∪ {e} has a set of edges C that comprise a cycle.

• For any edge 𝑓 in 𝐶, the weight of 𝑓 is no more than the weight of 𝑒, i.e. ∀𝑓 ∈ 𝐶. 𝑤(𝑓) ≤ 𝑤(𝑒).

Proof 44 Recall by Definition 36 that a tree is a maximal graph without cycles. Then the first part of the exchange property immediately follows, since T is a tree.

To show the second part, observe that all the edges in C other than e are contained in T. Pick any such edge f ∈ C, f ≠ e. Consider the tree that exchanges the edge f for e, i.e. the tree T′ = T ∪ {e} ∖ {f}.

[Figure: the tree T; the graph T ∪ {e}, which contains a cycle through e and f; and the exchanged tree T′ = T ∪ {e} ∖ {f}.]

Since T is a minimal spanning tree, the total weight of T must be no more than the total weight of T′. Then we have:

𝑤(𝑇 ) ≤ 𝑤(𝑇 ′) = 𝑤(𝑇 ) + 𝑤(𝑒) − 𝑤(𝑓)

Rearranging the terms in the inequality w(T) ≤ w(T) + w(e) − w(f), we obtain w(f) ≤ w(e), the second part of the exchange property.

We can now prove Claim 42.

Proof 45 We prove Claim 42 by induction over the size of 𝑆.

• Base case: 𝑆 is empty, so every MST contains 𝑆.


• Inductive step: Let T be an MST that contains S. Suppose the algorithm chooses an edge e in the next step. We need to show that there is some MST T′ that contains the edges in S ∪ {e}.

There are two possibilities: either 𝑒 is contained in 𝑇 , or it is not.

– Case 1: e ∈ T. Since T is an MST that contains S, and T also contains e, T is an MST that contains all of S ∪ {e}. (In other words, T′ = T in this case.)

– Case 2: e ∉ T. Then by Definition 36, T ∪ {e} has a cycle C.

From the logic in Kruskal's algorithm we know that S ∪ {e} does not contain a cycle. Thus, there must be some edge f that is in C but is not in S ∪ {e}. Since C is composed of edges from T plus the additional edge e, f is in T.

Observe that S ∪ {f} ⊆ T, since all edges in S are also contained in T. Since T does not contain a cycle, neither does S ∪ {f}.

Since adding f to S does not induce a cycle, the algorithm could have chosen to add f to S. The only circumstance in which it chooses e first is when the weight of e is less than or equal to that of f, i.e. w(e) ≤ w(f).

On the other hand, since T is an MST and T ∪ {e} has the cycle C that contains both e and f, we have w(f) ≤ w(e) by the exchange property of MSTs.

Thus, we have w(e) = w(f). Then the exchanged tree T′ = T ∪ {e} ∖ {f} has weight w(T′) = w(T) + w(e) − w(f) = w(T). Since it has the same weight as T and T is an MST, T′ is also an MST. Since S ⊆ T and f ∉ S, T′ = T ∪ {e} ∖ {f} contains all the edges in S ∪ {e}. Thus, T′ is an MST that contains all the edges in S ∪ {e}.

As we have shown that the set of edges chosen at any point by Kruskal's algorithm is contained within an MST, the final set of edges is also contained within some MST. Since the final set of edges is itself a tree and a tree is maximal, that final set must be an MST.

We have demonstrated that Kruskal's algorithm does indeed produce an MST. We will also state without proof that the running time on an input graph G = (V, E) is O(|E| log |E|) with the appropriate choice of data structure for the algorithm.

Observe that the analysis of Kruskal's algorithm was nontrivial. Unfortunately, this is often the case for greedy algorithms. Again, it is not necessarily the case that locally optimal choices lead to a globally optimal solution, so we typically need to do significant work to demonstrate that this is the case for a particular problem and algorithm.


Part II

Computability


CHAPTER

SIX

INTRODUCTION TO COMPUTABILITY

We now turn our attention to fundamental questions about computation. While we have seen several algorithmic approaches to solving problems, the first question we should answer when faced with a new problem is whether or not it is solvable at all. Otherwise, we can waste a lot of time on a fruitless effort to find a solution where none exists.

Before we can reason about what problems are solvable on a computer, we first need abstract definitions of what a problem is and what a computer is. Rather than getting bogged down in implementation details, we want to focus on abstract representations that strip away all unnecessary details. These abstractions will be easier to reason about than real computers and programs. Moreover, our abstract model will capture all possible implementations, so if we prove something holds in our model, it holds for all real computers.

6.1 Formal Languages

We start by defining an abstraction to represent problems. The simplest problems are those that have a yes or no answer, such as:

• Is there a flight between Detroit and New York for less than $100?

Such a question is a decision problem – we want a decision on whether the answer is yes or no.

We can generalize the question above into a predicate that takes an input value:

• Is there a flight between Detroit and 𝑥 for less than $100?

Given any particular destination 𝑥, the answer to this question is still either yes or no.

We can further generalize this predicate to work with multiple input values:

• Is there a flight between 𝑥 and 𝑦 for less than 𝑧?

While the predicate is expressed as taking three inputs x, y, and z, we can equivalently consider it to be a predicate that takes a single input w = (x, y, z) that is a tuple of those three components. Thus, we lose no generality in restricting ourselves to predicates with a single input.

A predicate is applicable to some universe of inputs, such as all possible airports or cities, for instance. (In programming languages, a type is used to represent a universe of values.) The predicate is true (a yes answer) for some set of inputs and is false (a no answer) for the other inputs. We call the set of yes instances a language. The following is the language defined by our second question above:

Lgetaways = {x : there is a flight between Detroit and x for less than $100}

The complement language is the set of inputs for which the answer is no. We represent a complement with an overbar:

L̄getaways = {x : there is no flight between Detroit and x for less than $100}

Formally, a language is defined over some alphabet that encodes the input. An alphabet is just a nonempty, finite set of symbols, and we use the Greek letter Σ to represent an alphabet.


• Σbinary = {0, 1} is an alphabet consisting of just the symbols 0 and 1.

• Σlowercase = {a, b, . . . , z} is an alphabet consisting of lowercase English letters.

A string over an alphabet Σ is a sequence of symbols from Σ. For example, a binary string consists of any number of symbols from Σbinary, while an English word is a string over the alphabet Σlowercase. The concept of a string appears in most programming languages, which usually specify a string data type that consists of a sequence of elements over a character set (e.g. ASCII or UTF) – the character set is the string's alphabet. Programming languages often specify notation for string literals, such as enclosing a sequence of characters within matching double quotes. This notation itself is not part of the string data; for instance, the string literal "abcd" in C++ and Python represents the four-character18 sequence abcd, and the quotes are not part of this sequence.

An empty sequence of symbols is also a valid string – for instance, the notation "" represents an empty string in C++ and Python. We use a different notation, ε (called a varepsilon in some formatting systems such as LaTeX), which is the standard notation in formal languages for an empty string. The empty string is a valid string over any alphabet Σ, as it consists of zero symbols from that alphabet.

We typically denote a nonempty string by just writing out its symbols in order. For instance, 1011 is a string over the alphabet Σbinary, consisting of the symbol 1, followed by the symbol 0, followed by 1 and 1. The notation |w| refers to the length (number of symbols) of the string w; for instance, |1011| = 4.

We can use powers of a set to specify all strings of a specific length over the characters in that set. In general, the notation S² for a set S is shorthand for S × S, the set product of S with itself. (The set product A × B is the set of all pairs whose first component is chosen from A and second component from B.) Similarly, S³ is shorthand for S × S × S. For example:

{0, 1}⁰ = {ε}   (ε is the only string of length 0)
{0, 1}¹ = {0, 1}
{0, 1}² = {0, 1} × {0, 1} = {00, 01, 10, 11}
{0, 1}³ = {000, 001, 010, 011, 100, 101, 110, 111}

We denote Σ∗ as the set of all finite-length strings over an alphabet Σ. Here, the superscript ∗ is the Kleene star, and it represents taking zero or more elements from the preceding set. Thus,

Σ∗ = Σ⁰ ∪ Σ¹ ∪ Σ² ∪ . . .

In other words, Σ∗ consists of all strings of length 0 over Σ, plus all strings of length 1 over Σ, all strings of length 2, and so on for all finite lengths. As an example,

Σ∗𝑏𝑖𝑛𝑎𝑟𝑦 = {0, 1}∗ = {𝜀, 0, 1, 01, 10, 000, 001, 010, . . .}

is the set of all finite-length binary strings. Observe that while each element of Σ∗ has finite length for any alphabet Σ, the set of all finite-length strings Σ∗ is infinite for any alphabet Σ.

A language 𝐿 is just a set of finite-length strings over some alphabet Σ. Since Σ∗ is the set of all finite-length strings over Σ, we have that 𝐿 ⊆ Σ∗.

18 In C++, the character-array string representation includes a null terminator as part of its implementation, but that is an implementation detail, and other implementations (e.g. the std::string type in C++ itself) do not use this same strategy.


[Figure: the universe Σ∗ drawn as a rectangle, partitioned into the language 𝑳 = {𝑥 ∶ 𝑥 ∈ 𝐿} and its complement {𝑥 ∶ 𝑥 ∉ 𝐿}.]

The following are examples of languages:

• 𝐿1 = {11𝑦 ∶ 𝑦 ∈ {0, 1}∗} is a language over Σ𝑏𝑖𝑛𝑎𝑟𝑦, and it consists of all finite-length binary strings that begin with two ones.

• 𝐿2 = {ℎ𝑒𝑙𝑙𝑜, 𝑤𝑜𝑟𝑙𝑑} is a language over Σ𝑙𝑜𝑤𝑒𝑟𝑐𝑎𝑠𝑒, and it consists of just the two strings ℎ𝑒𝑙𝑙𝑜 and 𝑤𝑜𝑟𝑙𝑑.

• The empty set ∅ is a language over any alphabet Σ, consisting of no strings. Note that ∅ is distinct from 𝜀 – the former is a set of strings, while the latter is an individual string. (The difference is akin to that between the values produced by set() and str() with no arguments in Python, or std::set<std::string> and std::string in C++.)

As illustrated above, some languages are finite while others consist of infinitely many strings.

Under the formal definition of a language, we can think of the English language itself as a subset of the set of all possible finite-length strings over the English alphabet (including lowercase and uppercase letters, space and punctuation markers, and decimal digits). More specifically, it is just the subset of strings that obey the syntactic and semantic rules for English.

In computing, we often restrict ourselves to the binary alphabet {0, 1}. Any data value can be represented as a binary string – in fact, real computers store all data in binary, from simple text files to images, audio files, and this very document itself. Thus, the languages we will work with generally consist of binary strings. When we express a language such as

𝐿𝑔𝑒𝑡𝑎𝑤𝑎𝑦𝑠 = {𝑥 ∶ there is a flight between Detroit and 𝑥 for less than $100},

what we actually mean can be more formally expressed as:

𝐿𝑔𝑒𝑡𝑎𝑤𝑎𝑦𝑠 = {𝑦 ∶ 𝑦 is the binary representation for an airport 𝑥 such that there is a flight between Detroit and 𝑥 for less than $100}

We sometimes use the notation ⟨𝑥⟩ to represent the binary encoding of the value 𝑥. We can then express the above more concisely as:

𝐿𝑔𝑒𝑡𝑎𝑤𝑎𝑦𝑠 = {⟨𝑥⟩ ∶ there is a flight between Detroit and 𝑥 for less than $100}

However, we often elide the angle brackets as they are cumbersome, and the binary encoding of the inputs is implicitly assumed rather than explicitly denoted.

Now that we have a formal concept of languages (the yes instances of a problem), solving a problem entails determining whether a particular input is a member of that language (whether it is a yes instance). Thus, a program 𝑀 that solves a problem 𝐿 is one that:


1. takes an input 𝑥

2. performs some computations

3. halts and returns 1 if 𝑥 ∈ 𝐿 – the program accepts

4. halts and returns 0 if 𝑥 ∉ 𝐿 – the program rejects

We say that 𝑀 decides 𝐿 – given an input 𝑥, 𝑀 performs computations to decide whether 𝑥 ∈ 𝐿 or not.

Is it possible to decide every language 𝐿? In other words, is it the case that for any language 𝐿, there exists some program 𝑀 that decides 𝐿? To answer this question, we need a formal definition of what a program is, or more generally, what a computer is.

6.2 Overview of Automata

In computability theory, an abstract computing device is known as an automaton (plural: automata). There are numerous different abstract models of computation, such as state machines, recursive functions, lambda calculus, von Neumann machines, cellular automata, and so on. We will primarily concern ourselves with the Turing-machine model, though we will first briefly examine a more restricted variant, finite automata.

Both the finite-automata and Turing-machine models are forms of state machines. A machine consists of a finite set of states, each representing a discrete point in a computation, and the machine can transition between different states. An analogous concept in a real program is a line of code (or program counter in a compiled executable) – a program has a finite number of lines, execution of a program is at a single line at any point in time, and the program can transition to a different line.

In a state machine, computation transitions between states based on what is read from the input, what is contained in memory (if the machine has memory), or both. We represent states and transitions graphically as follows:

[Figure: example state diagrams – states such as 𝑞1, 𝑞2, 𝑞3 are drawn as vertices, and directed edges labeled with input symbols (e.g. 0, 1 or 𝑎, 𝑏) represent transitions between them.]

States are drawn as vertices in the graph, labeled by the name of the state. A directed edge represents a transition, and the label denotes the conditions under which the transition is taken, as well as external side effects (e.g. writing a symbol to memory) of taking the transition. Mathematically, we encode the states as a set and transitions between states using a transition function, with states and other conditions as part of the domain, and states and side effects as part of the codomain.

The primary difference between the two models we will discuss is whether the model has access to memory cells. This has a significant effect on the computational power of a state machine. More generally, the Chomsky hierarchy is a containment hierarchy of various classes of languages, where each class corresponds to what can be computed on a particular kind of state machine:

• Regular languages correspond to the computational power of finite automata (FA in the diagram below), which have no memory cells.

• Context-free languages correspond to pushdown automata (PDA in the diagram below), which have a single stack as memory and can only interact with the top of the stack (pushing to and popping from it).

• Context-sensitive languages are computable by linear-bounded automata (LBA in the diagram below), which have access to storage that is linear in size with respect to the input, but the storage allows access to any of the memory cells.

• Recursively-enumerable languages, also called recognizable languages, correspond to the power of Turing machines (TM in the diagram below), which have access to unbounded memory that allows access to any cell.

Each of these classes of languages is strictly contained within the next, demonstrating the effect that storage has on computational power.

[Figure: the Chomsky hierarchy drawn as nested sets – Regular (FA) inside Context-free (PDA) inside Context-sensitive (LBA) inside Recursively-enumerable/Recognizable (TM), all inside the set of all languages 𝒫(Σ∗).]

We proceed to examine the finite-automata model in more detail before turning our attention to Turing machines.


CHAPTER SEVEN

FINITE AUTOMATA

The simplest state-machine model is that of finite automata, which consist of a finite number of states and no additional storage. The machine is in only one state at a time, and a computational step involves reading a symbol from the input string and transitioning to another state (or the same one) based on what symbol is read. This model and slight variations of it are known by many different names, including finite automata, deterministic finite automata (DFA), finite-state automata, finite-state machine, finite acceptor, and others. We will use the term finite automata to refer to the model.

The following is a graphical representation of a finite automaton over the input alphabet Σ = {𝑎, 𝑏}:

[Figure: the automaton's state diagram – states 𝑞0, 𝑞𝑎, 𝑞𝑎𝑏, 𝑞𝑎𝑏𝑏, 𝑞𝑎𝑏𝑏𝑎 (drawn with a double circle), and 𝑞𝑜𝑡ℎ𝑒𝑟, with directed edges labeled 𝑎 and 𝑏 for the transitions between them.]

This automaton has six states: 𝑞0, 𝑞𝑎, 𝑞𝑎𝑏, 𝑞𝑎𝑏𝑏, 𝑞𝑎𝑏𝑏𝑎, 𝑞𝑜𝑡ℎ𝑒𝑟. One of the states, 𝑞0, is the special initial state, which is the state in which computation starts. Graphically, this is denoted by an incoming edge with no label and no start vertex. This particular automaton also has a single accept state, 𝑞𝑎𝑏𝑏𝑎, depicted with a double circle – if the computation terminates in this state, the machine accepts the input, otherwise the input is rejected. Finally, transitions between states are depicted as directed edges, with label corresponding to the input symbol(s) that trigger the transition.

A finite automaton works on an input string, performing a computational step for each element of that string. The machine only does a single pass over the input – computation terminates exactly when the input has been exhausted. The result is acceptance if the final state is an accept state, rejection otherwise.

As an example, we trace the execution of the finite automaton above on the input string 𝑎𝑏𝑏𝑎. We keep track of the input in the top-left of the diagrams below, and the current state is represented by a shaded vertex. Initially, the machine is in the start state 𝑞0:


[Figure: the automaton with the input 𝑎𝑏𝑏𝑎 shown at the top left; the shaded current state is 𝑞0.]

The first input symbol is an 𝑎, so the machine makes the transition specified for reading an 𝑎 in the state 𝑞0 – move to the state 𝑞𝑎.

[Figure: the automaton with the input 𝑎𝑏𝑏𝑎; the shaded current state is 𝑞𝑎.]

The second step reads the symbol 𝑏 from the input, so the machine transitions from 𝑞𝑎 to 𝑞𝑎𝑏.

[Figure: the automaton with the input 𝑎𝑏𝑏𝑎; the shaded current state is 𝑞𝑎𝑏.]

In the third step, a 𝑏 is read, and the corresponding transition is from the state 𝑞𝑎𝑏 to 𝑞𝑎𝑏𝑏.


[Figure: the automaton with the input 𝑎𝑏𝑏𝑎; the shaded current state is 𝑞𝑎𝑏𝑏.]

The fourth step reads an 𝑎, causing a transition from 𝑞𝑎𝑏𝑏 to 𝑞𝑎𝑏𝑏𝑎.

[Figure: the automaton with the input 𝑎𝑏𝑏𝑎; the shaded current state is 𝑞𝑎𝑏𝑏𝑎.]

The input has now been exhausted, so no more transitions are possible. The machine terminates in the state 𝑞𝑎𝑏𝑏𝑎; since this is an accepting state, the machine has accepted the input 𝑎𝑏𝑏𝑎.

The following is an example of running the machine on input 𝑎𝑏𝑎:

[Figure: the automaton with the input 𝑎𝑏𝑎 shown at the top left; the shaded current state is 𝑞0.]

As before, the first symbol is an 𝑎, so the machine transitions from the start state 𝑞0 to the state 𝑞𝑎.


[Figure: the automaton with the input 𝑎𝑏𝑎; the shaded current state is 𝑞𝑎.]

The second symbol is again a 𝑏, so the machine goes from state 𝑞𝑎 to 𝑞𝑎𝑏.

[Figure: the automaton with the input 𝑎𝑏𝑎; the shaded current state is 𝑞𝑎𝑏.]

The third step reads the symbol 𝑎, so the machine transitions from 𝑞𝑎𝑏 to 𝑞𝑜𝑡ℎ𝑒𝑟.

[Figure: the automaton with the input 𝑎𝑏𝑎; the shaded current state is 𝑞𝑜𝑡ℎ𝑒𝑟.]

The machine has reached the end of the input, so it terminates. Since the final state 𝑞𝑜𝑡ℎ𝑒𝑟 is not an accepting state, the machine rejects the input 𝑎𝑏𝑎.

The only sequence of input symbols that leads to termination in an accept state is 𝑎𝑏𝑏𝑎, so this finite automaton


computes the language

𝐿 = {𝑎𝑏𝑏𝑎}

consisting only of the string 𝑎𝑏𝑏𝑎. Observe that an input such as 𝑎𝑏𝑏𝑎𝑏 passes through the accept state as part of the computation, but it terminates in a non-accept state (𝑞𝑜𝑡ℎ𝑒𝑟), so the machine rejects such an input.

7.1 Formal Definition

Formally, a finite automaton is defined as a five-tuple:

𝑀 = (𝑄,Σ, 𝛿, 𝑞0, 𝐹 )

The components of this five-tuple are as follows:

• 𝑄 is the finite set of states. In the machine above,

𝑄 = {𝑞0, 𝑞𝑎, 𝑞𝑎𝑏, 𝑞𝑎𝑏𝑏, 𝑞𝑎𝑏𝑏𝑎, 𝑞𝑜𝑡ℎ𝑒𝑟}

• Σ is the input alphabet; an input string is a finite sequence of symbols from this alphabet (i.e. the string is an element of Σ∗). In the automaton above, Σ = {𝑎, 𝑏}.

• 𝛿 is the transition function, which maps a state and input symbol to a new state:

𝛿 ∶ 𝑄 × Σ→ 𝑄

The transition function is depicted above as the edges between state vertices. We can also define the function in list form:

𝛿(𝑞0, 𝑎) = 𝑞𝑎

𝛿(𝑞0, 𝑏) = 𝑞𝑜𝑡ℎ𝑒𝑟

𝛿(𝑞𝑎, 𝑎) = 𝑞𝑜𝑡ℎ𝑒𝑟

𝛿(𝑞𝑎, 𝑏) = 𝑞𝑎𝑏

𝛿(𝑞𝑎𝑏, 𝑎) = 𝑞𝑜𝑡ℎ𝑒𝑟

𝛿(𝑞𝑎𝑏, 𝑏) = 𝑞𝑎𝑏𝑏

𝛿(𝑞𝑎𝑏𝑏, 𝑎) = 𝑞𝑎𝑏𝑏𝑎

𝛿(𝑞𝑎𝑏𝑏, 𝑏) = 𝑞𝑜𝑡ℎ𝑒𝑟

𝛿(𝑞𝑎𝑏𝑏𝑎, 𝑎) = 𝑞𝑜𝑡ℎ𝑒𝑟

𝛿(𝑞𝑎𝑏𝑏𝑎, 𝑏) = 𝑞𝑜𝑡ℎ𝑒𝑟

𝛿(𝑞𝑜𝑡ℎ𝑒𝑟, 𝑎) = 𝑞𝑜𝑡ℎ𝑒𝑟

𝛿(𝑞𝑜𝑡ℎ𝑒𝑟, 𝑏) = 𝑞𝑜𝑡ℎ𝑒𝑟

Alternatively we can express the function in tabular form:

old state 𝑞    𝛿(𝑞, 𝑎)    𝛿(𝑞, 𝑏)
𝑞0            𝑞𝑎         𝑞𝑜𝑡ℎ𝑒𝑟
𝑞𝑎            𝑞𝑜𝑡ℎ𝑒𝑟      𝑞𝑎𝑏
𝑞𝑎𝑏           𝑞𝑜𝑡ℎ𝑒𝑟      𝑞𝑎𝑏𝑏
𝑞𝑎𝑏𝑏          𝑞𝑎𝑏𝑏𝑎       𝑞𝑜𝑡ℎ𝑒𝑟
𝑞𝑎𝑏𝑏𝑎         𝑞𝑜𝑡ℎ𝑒𝑟      𝑞𝑜𝑡ℎ𝑒𝑟
𝑞𝑜𝑡ℎ𝑒𝑟        𝑞𝑜𝑡ℎ𝑒𝑟      𝑞𝑜𝑡ℎ𝑒𝑟


• 𝑞0 ∈ 𝑄 is the initial state.

• 𝐹 ⊆ 𝑄 is the set of accept states. In the machine above, 𝐹 only has a single element:

𝐹 = {𝑞𝑎𝑏𝑏𝑎}

The language of a finite automaton is the set of strings that the automaton accepts. We denote this language by 𝐿(𝑀) for a machine 𝑀. For the automaton 𝑀 above, we have

𝐿(𝑀) = {𝑎𝑏𝑏𝑎}
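Since the transition function is just a finite table, simulating a finite automaton in code is straightforward. The following C++ sketch hard-codes the 𝛿 table listed above (the numeric state encoding and the assumption that the input contains only 𝑎 and 𝑏 are implementation choices for this illustration, not part of the formal definition) and checks whether an input ends in the accept state 𝑞𝑎𝑏𝑏𝑎:

#include <iostream>
#include <string>

// States: 0=q0, 1=qa, 2=qab, 3=qabb, 4=qabba, 5=qother.
// delta[state][symbol], where symbol index 0 means 'a' and 1 means 'b'.
const int delta[6][2] = {
    {1, 5},  // q0:    a -> qa,     b -> qother
    {5, 2},  // qa:    a -> qother, b -> qab
    {5, 3},  // qab:   a -> qother, b -> qabb
    {4, 5},  // qabb:  a -> qabba,  b -> qother
    {5, 5},  // qabba: a -> qother, b -> qother
    {5, 5},  // qother: stays in qother
};

bool accepts(const std::string& input) {
    int state = 0;                        // start in q0
    for (char c : input)
        state = delta[state][c == 'b'];   // one computational step per symbol
    return state == 4;                    // accept iff we end in qabba
}

int main() {
    std::cout << accepts("abba") << "\n";   // 1
    std::cout << accepts("aba") << "\n";    // 0
    std::cout << accepts("abbab") << "\n";  // 0
}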

Example 46 Consider the finite automaton 𝑀 = (𝑄,Σ, 𝛿, 𝑞0, 𝐹 ) where:

𝑄 = {𝑞0, 𝑞1, 𝑞2}
Σ = {𝑎, 𝑏}

𝛿(𝑞0, 𝑎) = 𝑞0

𝛿(𝑞0, 𝑏) = 𝑞1

𝛿(𝑞1, 𝑎) = 𝑞2

𝛿(𝑞1, 𝑏) = 𝑞2

𝛿(𝑞2, 𝑎) = 𝑞2

𝛿(𝑞2, 𝑏) = 𝑞2

𝐹 = {𝑞1}

This machine has three states, of which 𝑞1 is the lone accept state. The following is a graphical representation:

[Figure: state diagram for Example 46 – 𝑞0 loops on 𝑎 and moves to 𝑞1 (double circle) on 𝑏; from 𝑞1, either symbol leads to 𝑞2, which loops on both 𝑎 and 𝑏.]

The machine starts in state 𝑞0 and stays there as long as it reads just 𝑎 symbols. When it sees a 𝑏, the machine transitions to 𝑞1. Any subsequent symbol moves the machine to 𝑞2, where it stays as it consumes the rest of the input. Since 𝑞1 is the only accepting state, the machine only accepts strings that have a single 𝑏, which must be the last element of the string. Thus, the language of the machine is

𝐿(𝑀) = 𝑎∗𝑏 = {𝑏, 𝑎𝑏, 𝑎𝑎𝑏, 𝑎𝑎𝑎𝑏, . . .}

which consists of strings that start with any number of 𝑎’s followed by a single 𝑏.

Finite automata have many applications, including software for designing and checking digital circuits, lexical analyzers of most compilers (usually the first phase of compilation, which chops a source file into individual words or tokens), scanning large bodies of text (pattern matching), and reasoning about many kinds of systems that have a finite number of states (e.g. financial transactions, network protocols, and so on). However, they do not model all possible computations. For instance, finite automata cannot compute the language

𝐿 = {0𝑛1𝑛 ∶ 𝑛 ∈ N} = {𝜀, 01, 0011, 000111, . . .}

consisting of strings composed of an arbitrary number of 0’s followed by an equal number of 1’s – since a finite automaton does not have storage, it cannot maintain a counter of the number of 0’s or 1’s it has seen. Yet we can easily write a program in most any programming language that computes 𝐿. The following is an example in C++:


#include <iostream>

int main() {
    int count = 0;
    char ch;
    while ((std::cin >> ch) && ch == '0')  // count the leading 0's
        ++count;
    if (!std::cin || ch != '1')            // no 1 follows the 0's
        return !std::cin && count == 0;    // accept only the empty string
    --count;                               // already read one 1
    while ((std::cin >> ch) && ch == '1')  // cancel a 0 for each further 1
        --count;
    return !std::cin && count == 0;        // accept iff counts match at EOF
}
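Assuming the program above is compiled to an executable a.out, one way to try it under a POSIX shell is to run echo -n 0011 | ./a.out; echo $?, which should print 1 (accept), while echo -n 0010 | ./a.out; echo $? should print 0 (reject), following the convention above that returning 1 means the program accepts.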

Thus, we need a more powerful model of computation to characterize the capabilities of real-world programs.


CHAPTER EIGHT

TURING MACHINES

A Turing machine is a powerful model for an abstract computer, invented by Alan Turing in 1936. The abstraction was designed to model pencil-and-paper computations. The core assumptions in the design of the model are as follows:

• The paper is divided into discrete cells.

• Each cell contains only a single symbol out of some finite set of possible symbols.

• The paper is infinite – if we run out of space, we can always buy more paper.

• We can only look at one cell at a time.

• Though there may be an arbitrary amount of information on the paper, we can only keep a fixed amount in our brain at any time.

We model the latter using a finite set of states, with each state corresponding to a different set of information in our brain. The paper itself is represented by a tape that is bounded to the left but unbounded to the right, and the tape is divided into individual cells. Lastly, there is a read/write head that is positioned over a single cell, and it can read the contents of the cell, write a new symbol to that cell, and move one position to the left or right. The following is a pictorial representation of this model:

[Figure: an infinite tape containing 0 1 1 1 0 0 1 1 1 followed by blank (⊥) cells, with a read/write head connected to a finite state control.]

We formalize the definition of a Turing machine 𝑀 as a seven-tuple:

𝑀 = ⟨𝑄,Γ,Σ, 𝛿, 𝑞𝑠𝑡𝑎𝑟𝑡, 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, 𝑞𝑟𝑒𝑗𝑒𝑐𝑡⟩

The seven components of the tuple are as follows:

• 𝑄 is the set of states, and it must have a finite number of elements.

• Σ is the input alphabet. Typically, we use binary as the input alphabet, in which case Σ = {0, 1}. However, we may use other input alphabets on occasion.

• Γ is the tape alphabet. Usually, Γ = Σ ∪ {⊥}, where ⊥ is a special blank symbol that is distinct from any symbol in Σ. On occasion, Γ will have additional elements as well, but it will always be the case that Σ ∪ {⊥} ⊆ Γ.

• 𝑞𝑠𝑡𝑎𝑟𝑡 is the initial state, and it is an element of 𝑄, i.e. 𝑞𝑠𝑡𝑎𝑟𝑡 ∈ 𝑄.


• 𝑞𝑎𝑐𝑐𝑒𝑝𝑡 and 𝑞𝑟𝑒𝑗𝑒𝑐𝑡 are the accept state and reject state respectively, and they are also elements of 𝑄. Together, they comprise the set of final states 𝐹 = {𝑞𝑎𝑐𝑐𝑒𝑝𝑡, 𝑞𝑟𝑒𝑗𝑒𝑐𝑡}.

• 𝛿 is the transition function. It takes a non-final state and a tape symbol as input, and its output is a new state, a new tape symbol, and a direction (left or right). Formally, we have

𝛿 ∶ (𝑄 \ 𝐹 ) × Γ → 𝑄 × Γ × {𝐿, 𝑅}

where 𝐿 and 𝑅 represent “left” and “right,” respectively. The details of each component of the transition function are as follows:

𝛿 ∶ (𝑄 \ 𝐹 ) [non-final state] × Γ [what is on the tape] → 𝑄 [new state] × Γ [symbol to write] × {𝐿, 𝑅} [how to move the TM head]

A Turing machine is in exactly one state at any time. Initially, this is 𝑞𝑠𝑡𝑎𝑟𝑡. The input to the machine is written on the tape, starting at the leftmost cell, with one symbol per cell. To the right of the input, every cell is initially set to the blank symbol ⊥. The head is positioned over the leftmost cell at the start of computation. The following illustrates the initial configuration of a machine when the input is the binary string 111010111:

[Figure: the tape contains 1 1 1 0 1 0 1 1 1 followed by blank cells; the head is over the leftmost cell and the machine is in state 𝑞𝑠𝑡𝑎𝑟𝑡.]

A computational step involves an application of the transition function to the current configuration of a machine to produce a new configuration. The machine must be in a non-final state – otherwise, the machine has halted and no further computation is done. To perform a computational step, the machine:

1. is in some non-final state 𝑞 ∈ 𝑄 \ 𝐹 ;

2. reads the symbol 𝑠 under the head;

3. transitions to some state 𝑞′∈ 𝑄;

4. writes a symbol 𝑠′ under the head;

5. moves the head left or right.

The transition function determines what happens in steps 3-5, based on what the situation is in steps 1-2. Given the current state 𝑞 and tape symbol 𝑠, 𝛿(𝑞, 𝑠) = (𝑞′, 𝑠′, 𝑑) gives us the new state (which may be the same as the original state 𝑞), the new symbol (which may be the same as the original symbol 𝑠), and the direction in which to move the head (𝐿 or 𝑅).

A Turing machine computes by repeatedly applying the transition function to perform computational steps, halting only when it reaches a final state 𝑞𝑎𝑐𝑐𝑒𝑝𝑡 or 𝑞𝑟𝑒𝑗𝑒𝑐𝑡. If the machine reaches the state 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, the machine accepts the input. If the machine reaches 𝑞𝑟𝑒𝑗𝑒𝑐𝑡, it rejects the input. As we will see later, there is a third possibility, that the machine never reaches a final state. In this case, we say that the machine loops.

The main differences between finite automata and Turing machines are as follows:


Behavior                                        Finite Automata    Turing Machines
Tape alphabet a superset of input alphabet      No                 Yes
Can have no or more than one accept state       Yes                No
Has a reject state                              No                 Yes
Terminates when final state is reached          No                 Yes
Terminates when input is exhausted              Yes                No
Direction head can move                         Right              Right or Left
Can write to the tape                           No                 Yes

As an example of a Turing machine, we consider a simple machine that accepts any binary string composed of just ones (including the empty string), rejecting any string that contains a zero. The components of the seven-tuple are as follows:

• 𝑄 = {𝑞𝑠𝑡𝑎𝑟𝑡, 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, 𝑞𝑟𝑒𝑗𝑒𝑐𝑡}
• Σ = {0, 1}
• Γ = {0, 1, ⊥}
• We describe 𝛿 ∶ {𝑞𝑠𝑡𝑎𝑟𝑡} × {0, 1, ⊥} → {𝑞𝑠𝑡𝑎𝑟𝑡, 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, 𝑞𝑟𝑒𝑗𝑒𝑐𝑡} × {0, 1, ⊥} × {𝐿, 𝑅} by listing the result of all possible inputs:

𝛿(𝑞𝑠𝑡𝑎𝑟𝑡, 0) = (𝑞𝑟𝑒𝑗𝑒𝑐𝑡, 0, 𝑅)
𝛿(𝑞𝑠𝑡𝑎𝑟𝑡, 1) = (𝑞𝑠𝑡𝑎𝑟𝑡, 1, 𝑅)
𝛿(𝑞𝑠𝑡𝑎𝑟𝑡, ⊥) = (𝑞𝑎𝑐𝑐𝑒𝑝𝑡, ⊥, 𝑅)

We can also list the results in tabular format:

old state    old symbol    new state    new symbol    direction
𝑞𝑠𝑡𝑎𝑟𝑡       0             𝑞𝑟𝑒𝑗𝑒𝑐𝑡      0             𝑅
𝑞𝑠𝑡𝑎𝑟𝑡       1             𝑞𝑠𝑡𝑎𝑟𝑡       1             𝑅
𝑞𝑠𝑡𝑎𝑟𝑡       ⊥             𝑞𝑎𝑐𝑐𝑒𝑝𝑡      ⊥             𝑅

The machine has just three states, the initial state and the two final states. In each step, it reads the current symbol, transitioning to the reject state if it is a zero, staying in 𝑞𝑠𝑡𝑎𝑟𝑡 if it is a one, and transitioning to the accept state if it is a ⊥. In all three cases, the machine leaves the contents of the cell unchanged and moves the head to the right, on to the next input symbol.
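The step-by-step behavior just described can also be expressed directly in code. The following C++ sketch simulates this particular machine, growing the tape with blank cells on demand; using an underscore to stand in for ⊥ and returning a boolean (true for 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, false for 𝑞𝑟𝑒𝑗𝑒𝑐𝑡) are implementation choices made only for this illustration:

#include <iostream>
#include <string>

// Simulate the three-state machine above on a binary input string.
bool run(const std::string& input) {
    std::string tape = input;      // cells to the right of the input are blank
    const char blank = '_';        // stands in for the blank symbol ⊥
    std::size_t head = 0;
    // The only non-final state is q_start, so the loop body is just its delta.
    while (true) {
        if (head >= tape.size()) tape.push_back(blank);   // extend with blanks
        char symbol = tape[head];
        if (symbol == '0') return false;          // delta(q_start, 0) = (q_reject, 0, R)
        if (symbol == '1') { ++head; continue; }  // delta(q_start, 1) = (q_start, 1, R)
        return true;                              // delta(q_start, ⊥) = (q_accept, ⊥, R)
    }
}

int main() {
    std::cout << run("111010111") << "\n";  // 0: rejected, the input contains a 0
    std::cout << run("1111") << "\n";       // 1: accepted
    std::cout << run("") << "\n";           // 1: the empty string is accepted
}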

We have described the machine above by explicitly writing out its seven-tuple. More commonly, we describe a machine with a state diagram instead. Such a diagram is a labeled graph, with a vertex for each state and edges representing transitions between states. The following is an example of a transition:

[Figure: a single transition – an edge from 𝑞1 to 𝑞2 labeled 1 → 0, R.]

The edge goes from state 𝑞1 to state 𝑞2, and these are the state components of the input and output of the transition function. The remaining components are noted on the edge label. Thus, the diagram above means that 𝛿(𝑞1, 1) = (𝑞2, 0, 𝑅) – when the machine is in state 𝑞1 and it sees a 1 under the head, it transitions to state 𝑞2, writes a 0 under the head, and moves the head to the right.

The state diagram for the machine we described with the seven-tuple above is as follows:


[Figure: state diagram of the machine – 𝑞𝑠𝑡𝑎𝑟𝑡 loops on 1 → 1, R, moves to 𝑞𝑟𝑒𝑗𝑒𝑐𝑡 on 0 → 0, R, and moves to 𝑞𝑎𝑐𝑐𝑒𝑝𝑡 on ⊥ → ⊥, R.]

Let us proceed to run this machine on the input 111010111. The machine starts in the initial state, with the input written on the tape and the head in the leftmost position:

[Figure: tape 1 1 1 0 1 0 1 1 1 ⊥ ⋯ with the head over the first cell; the machine is in state 𝑞𝑠𝑡𝑎𝑟𝑡.]

The cell underneath the head contains a one. The transition function (looking at either the seven-tuple or the state diagram) tells us that 𝛿(𝑞𝑠𝑡𝑎𝑟𝑡, 1) = (𝑞𝑠𝑡𝑎𝑟𝑡, 1, 𝑅), so the machine stays in the same state, leaves the symbol 1 under the head, and moves the head to the right.

[Figure: tape 1 1 1 0 1 0 1 1 1 ⊥ ⋯ with the head over the second cell; the machine is still in state 𝑞𝑠𝑡𝑎𝑟𝑡.]

Again, we have a one underneath the head. Again, the machine stays in the same state, leaves the symbol unchanged, and moves to the right.

[Figure: tape 1 1 1 0 1 0 1 1 1 ⊥ ⋯ with the head over the third cell; state 𝑞𝑠𝑡𝑎𝑟𝑡.]

Once again, there is a one underneath the head, resulting in the same actions as the previous two steps.

[Figure: tape 1 1 1 0 1 0 1 1 1 ⊥ ⋯ with the head over the fourth cell, which contains a 0; state 𝑞𝑠𝑡𝑎𝑟𝑡.]

The machine now has a zero under the head. The transition function has that 𝛿(𝑞𝑠𝑡𝑎𝑟𝑡, 0) = (𝑞𝑟𝑒𝑗𝑒𝑐𝑡, 0, 𝑅), so the machine transitions to the state 𝑞𝑟𝑒𝑗𝑒𝑐𝑡, leaves the 0 unchanged, and moves the head to the right.

[Figure: tape 1 1 1 0 1 0 1 1 1 ⊥ ⋯; the machine is now in state 𝑞𝑟𝑒𝑗𝑒𝑐𝑡.]

We are now in a final state, so no further steps are possible. Since the machine reached the reject state, it rejects the input 111010111.

Had the input been composed solely of ones, the machine would have stayed in 𝑞𝑠𝑡𝑎𝑟𝑡, moving the head one cell to the right in each step. After examining all the input symbols, it would reach a cell that contains the blank symbol ⊥. The transition function tells us that the machine would move to the accept state 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, leave the blank symbol unchanged, and move the head to the right. Thus, the machine would accept any input consisting only of ones.

This machine always reaches a final state. In the worst case, it examines every input symbol, so it runs in time linear with respect to the size of the input. While this machine always halts, we will soon see that this is not the case for all machines.

As a second example, consider the following machine:


[Figure: state diagram with states 𝑞𝑠𝑡𝑎𝑟𝑡, 𝑞𝑠𝑐𝑎𝑛, 𝑞𝑒𝑛𝑑, 𝑞𝑏𝑎𝑐𝑘, 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, and 𝑞𝑟𝑒𝑗𝑒𝑐𝑡, with edges for the transitions listed in the table below.]

This machine has the input alphabet Σ = {0, 1}, the tape alphabet Γ = {0, 1, ⊥}, the six states 𝑄 = {𝑞𝑠𝑡𝑎𝑟𝑡, 𝑞𝑠𝑐𝑎𝑛, 𝑞𝑒𝑛𝑑, 𝑞𝑏𝑎𝑐𝑘, 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, 𝑞𝑟𝑒𝑗𝑒𝑐𝑡}, and the following transition function:

old state    old symbol    new state    new symbol    direction
𝑞𝑠𝑡𝑎𝑟𝑡       0             𝑞𝑠𝑐𝑎𝑛        ⊥             𝑅
𝑞𝑠𝑡𝑎𝑟𝑡       1             𝑞𝑟𝑒𝑗𝑒𝑐𝑡      1             𝑅
𝑞𝑠𝑡𝑎𝑟𝑡       ⊥             𝑞𝑎𝑐𝑐𝑒𝑝𝑡      ⊥             𝑅
𝑞𝑠𝑐𝑎𝑛        0             𝑞𝑠𝑐𝑎𝑛        0             𝑅
𝑞𝑠𝑐𝑎𝑛        1             𝑞𝑠𝑐𝑎𝑛        1             𝑅
𝑞𝑠𝑐𝑎𝑛        ⊥             𝑞𝑒𝑛𝑑         ⊥             𝐿
𝑞𝑒𝑛𝑑         0             𝑞𝑟𝑒𝑗𝑒𝑐𝑡      0             𝑅
𝑞𝑒𝑛𝑑         1             𝑞𝑏𝑎𝑐𝑘        ⊥             𝐿
𝑞𝑒𝑛𝑑         ⊥             𝑞𝑟𝑒𝑗𝑒𝑐𝑡      ⊥             𝑅
𝑞𝑏𝑎𝑐𝑘        0             𝑞𝑏𝑎𝑐𝑘        0             𝐿
𝑞𝑏𝑎𝑐𝑘        1             𝑞𝑏𝑎𝑐𝑘        1             𝐿
𝑞𝑏𝑎𝑐𝑘        ⊥             𝑞𝑠𝑡𝑎𝑟𝑡       ⊥             𝑅

Let us trace the execution of this machine on the input 0011. The machine starts in the initial state, with the input written on the tape and the head in the leftmost position:

[Figure: tape 0 0 1 1 ⊥ ⋯ with the head over the first cell; state 𝑞𝑠𝑡𝑎𝑟𝑡. The state diagram is repeated alongside each step of the trace below.]

The cell underneath the head contains a zero. The transition function tells us that 𝛿(𝑞𝑠𝑡𝑎𝑟𝑡, 0) = (𝑞𝑠𝑐𝑎𝑛, ⊥, 𝑅), so the machine transitions to state 𝑞𝑠𝑐𝑎𝑛, writes a blank symbol under the head, and moves the head to the right.

[Figure: tape ⊥ 0 1 1 ⊥ ⋯; state 𝑞𝑠𝑐𝑎𝑛.]

The cell underneath the head now contains a zero. The transition function tells us that 𝛿(𝑞𝑠𝑐𝑎𝑛, 0) = (𝑞𝑠𝑐𝑎𝑛, 0, 𝑅), so the machine stays in 𝑞𝑠𝑐𝑎𝑛, leaves the symbol 0 under the head, and moves the head to the right.

[Figure: tape ⊥ 0 1 1 ⊥ ⋯; state 𝑞𝑠𝑐𝑎𝑛.]

The cell underneath the head now contains a one. The transition function tells us that 𝛿(𝑞𝑠𝑐𝑎𝑛, 1) = (𝑞𝑠𝑐𝑎𝑛, 1, 𝑅), so the machine stays in 𝑞𝑠𝑐𝑎𝑛, leaves the symbol 1 under the head, and moves the head to the right.

[Figure: tape ⊥ 0 1 1 ⊥ ⋯; state 𝑞𝑠𝑐𝑎𝑛.]

Again, the cell underneath the head now contains a one, so the machine stays in the same state, leaves the cell unchanged, and moves the head to the right.

[Figure: tape ⊥ 0 1 1 ⊥ ⋯; state 𝑞𝑠𝑐𝑎𝑛, with the head now over the first blank cell.]

Now the cell underneath the head contains a blank symbol, so the transition function tells us that the machine transitions to 𝑞𝑒𝑛𝑑, leaves the blank in the cell, and moves the head to the left.

[Figure: tape ⊥ 0 1 1 ⊥ ⋯; state 𝑞𝑒𝑛𝑑.]

The cell underneath the head contains a one, so the transition function tells us that the machine transitions to 𝑞𝑏𝑎𝑐𝑘, writes a blank to the cell, and moves the head to the left.

[Figure: tape ⊥ 0 1 ⊥ ⊥ ⋯; state 𝑞𝑏𝑎𝑐𝑘.]

The current cell contains a one, so the transition function tells us that the machine stays in 𝑞𝑏𝑎𝑐𝑘, leaves the one in the cell, and moves the head to the left.

[Figure: tape ⊥ 0 1 ⊥ ⊥ ⋯; state 𝑞𝑏𝑎𝑐𝑘.]

The current cell now contains a zero, so the transition function tells us that the machine stays in 𝑞𝑏𝑎𝑐𝑘, leaves the zero in the cell, and moves the head to the left.

[Figure: tape ⊥ 0 1 ⊥ ⊥ ⋯; state 𝑞𝑏𝑎𝑐𝑘.]

The current cell contains a blank, so the transition function tells us that the machine transitions to 𝑞𝑠𝑡𝑎𝑟𝑡, leaves the blank in the cell, and moves the head to the right.

[Figure: tape ⊥ 0 1 ⊥ ⊥ ⋯; state 𝑞𝑠𝑡𝑎𝑟𝑡.]

We have a zero in the current cell, so the machine transitions to 𝑞𝑠𝑐𝑎𝑛, writes a blank to the cell, and moves the head to the right.

[Figure: tape ⊥ ⊥ 1 ⊥ ⊥ ⋯; state 𝑞𝑠𝑐𝑎𝑛.]

We now have a one in the current cell, so the machine stays in 𝑞𝑠𝑐𝑎𝑛, leaves the one in the cell, and moves the head to the right.

[Figure: tape ⊥ ⊥ 1 ⊥ ⊥ ⋯; state 𝑞𝑠𝑐𝑎𝑛.]

We now have a blank, so the machine transitions to 𝑞𝑒𝑛𝑑, leaves the blank in the cell, and moves the head to the left.

[Figure: tape ⊥ ⊥ 1 ⊥ ⊥ ⋯; state 𝑞𝑒𝑛𝑑.]

We have a one in the current cell, so the machine transitions to 𝑞𝑏𝑎𝑐𝑘, writes a blank to the cell, and moves the head to the left.

[Figure: tape ⊥ ⊥ ⊥ ⊥ ⊥ ⋯ (all blanks); state 𝑞𝑏𝑎𝑐𝑘.]

The current cell contains a blank, so the machine transitions to 𝑞𝑠𝑡𝑎𝑟𝑡, leaves a blank in the cell, and moves the head to the right.

[Figure: tape ⊥ ⊥ ⊥ ⊥ ⊥ ⋯; state 𝑞𝑠𝑡𝑎𝑟𝑡.]

The current cell contains a blank, so the machine transitions to 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, leaves a blank in the cell, and moves the head to the right.

[Figure: tape ⊥ ⊥ ⊥ ⊥ ⊥ ⋯; state 𝑞𝑎𝑐𝑐𝑒𝑝𝑡.]

The machine has reached the accept state 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, so it halts and accepts.

The language of this machine is the set of strings that contain a number of zeros followed by the same number of ones:

𝐿 = {0𝑛1𝑛 ∶ 𝑛 ∈ N} = {𝜀, 01, 0011, 000111, . . .}

Thus, even though this language cannot be computed by a finite automaton, it can be solved by a Turing machine.

How powerful is the Turing-machine model? The Church-Turing thesis states the following:

Theorem 47 (Church-Turing Thesis) If a function is computable by any mechanical device (i.e. a computer), it is computable by a Turing machine.

Thus, in spite of its simplicity, the Turing-machine model encompasses all possible computation.

8.1 Equivalent Models

As an illustration of the power of the simple Turing-machine model above, we will demonstrate that a modified model, that of the two-tape Turing machine, is no more powerful than the simple model. Similar to the one-tape version, the two-tape model consists of a seven-tuple:

𝑀 = ⟨𝑄,Γ,Σ, 𝛿, 𝑞𝑠𝑡𝑎𝑟𝑡, 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, 𝑞𝑟𝑒𝑗𝑒𝑐𝑡⟩

The components 𝑄, Γ, Σ, 𝑞𝑠𝑡𝑎𝑟𝑡, 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, and 𝑞𝑟𝑒𝑗𝑒𝑐𝑡 are as in the one-tape model. The transition function 𝛿, however, now takes two tape symbols as input (one under each head), writes two tape symbols as output, and moves each of the two heads independently. Formally, we have

𝛿 ∶ (𝑄 \ 𝐹 ) × Γ² → 𝑄 × Γ² × {𝐿, 𝑅}²

The notation Γ² is shorthand for Γ × Γ. In more detail, the components of the transition function are as follows:

𝛿 ∶ (𝑄 \ 𝐹 ) [non-final state] × Γ [what is on tape 1] × Γ [what is on tape 2] → 𝑄 [new state] × Γ [symbol to write on tape 1] × Γ [symbol to write on tape 2] × {𝐿, 𝑅} [how to move head 1] × {𝐿, 𝑅} [how to move head 2]

The initial state of the machine has the input written on the first tape, followed by blank symbols, and only blank symbols on the second tape. Both heads begin at the leftmost cell of the respective tape.

[Figure: the initial two-tape configuration – tape 1 contains 1 1 1 0 1 0 1 1 1 followed by blanks, tape 2 is all blanks, and the machine is in state 𝑞𝑠𝑡𝑎𝑟𝑡 with both heads over the leftmost cells.]

In each step, the machine is in some state 𝑞 and reads one symbol from each tape, 𝛾1 and 𝛾2 for example. The transition function maps (𝑞, 𝛾1, 𝛾2) to a new state, new symbols to write on each tape, and a direction to move each head:

𝛿(𝑞, 𝛾1, 𝛾2) = (𝑞′, 𝛾 ′1, 𝛾 ′2, 𝑑1, 𝑑2)

The machine transitions to the new state 𝑞′, writes 𝛾′1 to the first tape and moves its head in the direction 𝑑1, and writes 𝛾′2 to the second tape and moves its head in the direction 𝑑2.

We show that the one-tape and two-tape models are equivalent by proving that anything that can be computed on one model can also be computed by the other, and vice versa. In other words, for every one-tape Turing machine, there is an equivalent two-tape machine that has the same behavior on each input, and for every two-tape machine, there is a one-tape machine with the same behavior.

The first direction is trivial: given an arbitrary one-tape Turing machine, we can construct an equivalent two-tape machine that just ignores its second tape. Formally, the two machines have the same set of states 𝑄, tape and input alphabets Γ and Σ, and start and final states 𝑞𝑠𝑡𝑎𝑟𝑡, 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, and 𝑞𝑟𝑒𝑗𝑒𝑐𝑡. For each mapping of the transition function of the one-tape machine

𝛿(𝑞, 𝛾) = (𝑞′, 𝛾 ′, 𝑑)

the two-tape machine has the corresponding mapping

𝛿2(𝑞, 𝛾,⊥) = (𝑞′, 𝛾 ′,⊥, 𝑑, 𝑅)

In other words, it just moves the second head to the right in each step, while maintaining the blank symbols on the second tape. Since the behavior is solely dependent on the first tape, the two-tape machine makes the exact same transitions as the one-tape machine on a given input, and it ends up accepting, rejecting, or looping on the input exactly when the one-tape machine does so.


The other direction is a bit more complicated: given an arbitrary two-tape machine, we need to construct a one-tape machine that does the same computation on any given input. There are many ways to do this, and we describe one such construction at a high level.

First, we consider how a one-tape machine can encode the contents of two tapes, as well as two head positions. A key observation is that the amount of actual data (i.e. excluding trailing blank cells) on the two tapes is finite at any point in a computation. Thus, we can encode that data on a single tape just as well. We use the following format:

# [contents of tape 1: ⋯ • 𝛾1 ⋯] # [contents of tape 2: ⋯ • 𝛾2 ⋯] #

We use a special # marker to separate the (finite) contents of each tape, as well as to indicate the ends. Between these markers, the contents of each tape are represented using the same tape symbols, except that we denote the position of each head by placing a dot symbol to the left of the cell at which the head is located. Thus, if Γ is the tape alphabet of the two-tape machine, then the tape alphabet of the one-tape machine is

Γ1 = Γ ∪ {#, •}
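As a hypothetical illustration of this format: if tape 1 contains 101 with its head over the middle cell and tape 2 contains a single 0 with its head over that cell, the one-tape encoding of that configuration would be the string #1•01#•0#.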

We won’t give a formal definition of the states or transition function, but we will describe how the machine works. Given an input 𝑤1𝑤2 . . . 𝑤𝑛, the one-tape machine does the following:

1. Rewrite the tape to be in the correct format:

# • 𝑤1 ⋯ 𝑤𝑛 # • ⊥ #

This entails shifting each input symbol by two spaces, placing a # marker in the first cell and a dot in the second, placing a # marker after the last input symbol (at the first blank cell), writing a dot symbol in the next cell, followed by a blank symbol, followed by one more # marker.

2. Simulate each step of the two-tape machine:

a) Scan the tape to find the dot symbols, representing the positions of each head.

b) Transition to a new state according to the transition function of the two-tape machine (depending on the current state and the symbol immediately to the right of each dot).

c) Replace the symbols to the right of each dot with the symbols to be written to each tape, as specified in the transition function of the two-tape machine.

d) Move the dots in the appropriate direction. If the dot is to move to the left but there is a # marker there, then it stays in the same position, corresponding to what happens when a tape head tries to move left when at the leftmost cell. On the other hand, if the dot is to move to the right but there is a # marker one more cell to the right, the machine needs to shift the marker and all following symbols (up to the last # marker) one cell to the right, placing a blank symbol in the space created by the shift. This ensures that there is a valid symbol to the right of the dot (i.e. underneath the corresponding head) that can be read in the next simulated step.

Constructing this one-tape machine is a mechanical process from the definition of the corresponding two-tape machine, though we won’t provide all the details. The important result is that this one-tape machine does simulate the operation of the original two-tape machine, even if it takes many steps to carry out what the original machine does in a single step. If the two-tape machine accepts an input, then the one-tape simulation will also do so eventually, and similarly if the original machine rejects or loops, so will the one-tape machine.

We have demonstrated that a two-tape Turing-machine model is equivalent to the simple, one-tape model previously described. In general, a computational model or programming language that is equivalent to the model of one-tape Turing machines is said to be Turing-complete. Most real-world programming languages, such as C++, Python, and even LaTeX, are Turing-complete. Thus, the results we prove about computation using the Turing-machine model also apply to real-world models, and the simplicity of Turing machines makes those results easier to prove than if we were to reason about C++ or other complicated languages.


8.2 The Language of a Turing Machine

Now that we have formal definitions of both languages and Turing machines, which formalize the notions of a problem and a program, we can see how the two can be related. A Turing machine 𝑀 has an input alphabet Σ, and Σ∗ is the set of all possible inputs to the machine. There are three possible behaviors for a machine 𝑀 on input 𝑥:

• 𝑀 accepts 𝑥 – when it receives 𝑥 as the input, 𝑀 eventually reaches 𝑞𝑎𝑐𝑐𝑒𝑝𝑡, halting execution in the accept state

• 𝑀 rejects 𝑥 – when it receives 𝑥 as the input, 𝑀 eventually reaches 𝑞𝑟𝑒𝑗𝑒𝑐𝑡, halting execution in the reject state

• 𝑀 loops on 𝑥 – when it receives 𝑥 as the input, 𝑀 never reaches a final state, and execution continues forever

Thus, a machine accepts some subset of inputs, rejecting or looping on inputs outside of that set. Since a language is a subset of Σ∗, we can define the language of a machine to be exactly the set of strings that the machine accepts:

𝐿(𝑀) = {𝑥 ∶ 𝑀 accepts 𝑥}

[Figure: Σ∗ partitioned into {𝑥 ∶ 𝑀 accepts 𝑥}, which is 𝑳(𝑴), {𝑥 ∶ 𝑀 rejects 𝑥}, and {𝑥 ∶ 𝑀 loops on 𝑥}.]

As a result, every machine 𝑀 has a corresponding language 𝐿(𝑀). If some input 𝑥 is not in the language of 𝑀, i.e. 𝑥 ∉ 𝐿(𝑀), then 𝑀 either rejects 𝑥 or loops on it. In other words, if 𝐴 = 𝐿(𝑀) is the language of 𝑀, then:

• If 𝑥 ∈ 𝐴, 𝑀 accepts 𝑥.

• If 𝑥 ∉ 𝐴, 𝑀 rejects or loops on 𝑥.

For the machine above that accepts strings consisting only of ones, we have 𝐿(𝑀) = 1∗.

The most useful machines are those that halt on every input. Such a machine is a decider, and it must either accept or reject every input. A decider is not allowed to loop on any inputs.


[Figure: for a decider 𝑀, Σ∗ is partitioned into just {𝑥 ∶ 𝑀 accepts 𝑥}, which is 𝑳(𝑴), and {𝑥 ∶ 𝑀 rejects 𝑥}.]

If a machine 𝑀 never loops, we say that 𝑀 decides its language 𝐿(𝑀). In other words, if 𝑀 decides some language 𝐴, then:

• If 𝑥 ∈ 𝐴, 𝑀 accepts 𝑥.

• If 𝑥 ∉ 𝐴, 𝑀 rejects 𝑥.

Observe that this is more restrictive than a machine 𝑀 having 𝐴 as its language (𝐿(𝑀) = 𝐴). The latter allows 𝑀 to either reject or loop on an 𝑥 ∉ 𝐴. But if 𝑀 decides 𝐴, then it is only allowed to reject 𝑥 ∉ 𝐴. In either case, 𝑀 must accept 𝑥 ∈ 𝐴. Thus, we can conclude that 𝑀 decides 𝐴 if 𝐿(𝑀) = 𝐴 and 𝑀 halts on every input.

We now have the following formalizations:

• A language is a formalization of a decision problem.

• A Turing machine is an abstraction of a program.

• A machine deciding a language is a formalization of a program solving a decision problem.

A language 𝐴 is decidable if there exists some machine 𝑀 that decides it. Thus, our original question about whether a problem is solvable boils down to whether its corresponding language is decidable.

8.3 Proving a Language is Decidable

Must we construct a Turing machine to demonstrate that a language is decidable? No! The Church-Turing thesis (page 83) allows us to write an algorithm in any language, including pseudocode, and if we can do that, we could construct a corresponding Turing machine. As an example, consider the following language:

𝐿GCD = {(𝑎 ∈ N, 𝑏 ∈ N) ∶ gcd(𝑎, 𝑏) = 1}

An algorithm to decide this language is as follows:

𝐷GCD on input (𝑎 ∈ N, 𝑏 ∈ N):
    Compute 𝐸𝑢𝑐𝑙𝑖𝑑(max(𝑎, 𝑏), min(𝑎, 𝑏))
    If 𝐸𝑢𝑐𝑙𝑖𝑑(max(𝑎, 𝑏), min(𝑎, 𝑏)) = 1, accept
    Else reject


This machine always halts, since 𝐸𝑢𝑐𝑙𝑖𝑑(max(𝑎, 𝑏), min(𝑎, 𝑏)) takes O(log(𝑎 + 𝑏)) steps to compute. Furthermore, the machine has the correct behavior for inputs that are in and are not in 𝐿GCD:

• If (𝑎, 𝑏) ∈ 𝐿GCD, gcd(𝑎, 𝑏) = 1, so 𝐸𝑢𝑐𝑙𝑖𝑑(max(𝑎, 𝑏),min(𝑎, 𝑏)) = 1 and 𝐷GCD accepts (𝑎, 𝑏).

• If (𝑎, 𝑏) ∉ 𝐿GCD, gcd(𝑎, 𝑏) > 1, so 𝐸𝑢𝑐𝑙𝑖𝑑(max(𝑎, 𝑏),min(𝑎, 𝑏)) > 1 and 𝐷GCD rejects (𝑎, 𝑏).

Thus, 𝐷GCD decides 𝐿GCD, so 𝐿GCD is a decidable language.
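For concreteness, here is a runnable C++ version of 𝐷GCD; the euclid helper implements Euclid's algorithm, and the value returned from main follows the convention used earlier (1 for accept, 0 for reject). Reading the pair (𝑎, 𝑏) as two whitespace-separated numbers is an input-encoding choice made only for this sketch:

#include <iostream>

// Euclid's algorithm: the greatest common divisor of x and y.
unsigned euclid(unsigned x, unsigned y) {
    while (y != 0) {
        unsigned r = x % y;
        x = y;
        y = r;
    }
    return x;
}

int main() {
    unsigned a, b;
    if (!(std::cin >> a >> b)) return 0;        // malformed input: reject
    unsigned big = a > b ? a : b;
    unsigned small = a > b ? b : a;
    return euclid(big, small) == 1;             // 1 = accept, 0 = reject
}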

In general, to demonstrate a language is decidable, we write an algorithm to decide it and then analyze the algorithm to prove that it is a proper decider for the language:

• The algorithm halts on (accepts or rejects) all inputs.

• The algorithm accepts all strings in the language.

• The algorithm rejects all strings not in the language.

Example 48 Suppose language 𝐿 is decidable. We show that 𝐿′ = 𝐿 ∪ {𝜀} is also decidable.

Since 𝐿 is decidable, there exists some Turing machine 𝐷 that decides 𝐿. We can use it to define another machine 𝐸 as follows:

𝐸 on input 𝑥:
    If 𝑥 = 𝜀, accept
    Run 𝐷 on 𝑥 and answer as 𝐷 does

This machine halts on all inputs since the only nontrivial thing it does is call 𝐷, and 𝐷 halts since it is a decider. We analyze 𝐸’s behavior as follows:

• If 𝑥 = 𝜀, the first line of 𝐸 accepts 𝑥.

• If 𝑥 ≠ 𝜀 and 𝑥 ∈ 𝐿, then 𝐷 accepts 𝑥, so 𝐸 accepts 𝑥.

• If 𝑥 ≠ 𝜀 and 𝑥 ∉ 𝐿, then 𝐷 rejects 𝑥, so 𝐸 rejects 𝑥.

Since 𝐸 accepts all strings in 𝐿′ = 𝐿 ∪ {𝜀} and rejects all strings not in 𝐿′, it is a decider for 𝐿′, and 𝐿′ is decidable.
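The construction in Example 48 translates directly into code. In the C++ sketch below, D is a stand-in decider for some language 𝐿 (here, purely as an example, the strings that contain at least one 0), and E simply adds the check for 𝜀 in front of it:

#include <iostream>
#include <string>

// A concrete stand-in decider for some language L; as an example only,
// L here is the set of strings containing at least one 0.
bool D(const std::string& x) {
    return x.find('0') != std::string::npos;
}

// E decides L' = L ∪ {ε}, exactly as in Example 48.
bool E(const std::string& x) {
    if (x.empty()) return true;   // if x = ε, accept
    return D(x);                  // otherwise run D on x and answer as D does
}

int main() {
    std::cout << E("") << E("110") << E("111") << "\n";  // prints 110
}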

Are all languages decidable? We have seen that every machine is associated with a language, but does every language have a corresponding machine? We will consider this question next.


CHAPTER NINE

DIAGONALIZATION

One way of answering the question of whether or not there exists an undecidable language is to compare the number of languages with the number of machines. Let ℒ be the set of all languages, and let ℳ be the set of all machines. If we can demonstrate that ℒ is a larger set than ℳ, i.e. ∣ℳ∣ < ∣ℒ∣, then there are more languages than machines, and it follows that there must be some language that does not have a corresponding machine. However, both ℳ and ℒ are infinite sets, so we need the appropriate tools to reason about the size of an infinite set.

9.1 Countable Sets

For a finite set, its size is the number of elements it contains, and this number is an element of the natural numbers N. The number of elements in an infinite set, on the other hand, cannot be represented as a natural number19. For our purposes, however, we need only to reason about the relative sizes of infinite sets to answer questions such as whether ∣ℳ∣ < ∣ℒ∣. The key is to determine whether or not an infinite set is the same size as N. We start by defining what it means for a set to be countable.

Definition 49 (Countable) A set 𝑆 is countable if and only if there exists a one-to-one function 𝑓 ∶ 𝑆 → N from 𝑆 to the natural numbers N.

A one-to-one or injective function is one that maps each element of the domain to a distinct element of the codomain. An onto or surjective function is one that for each element of its codomain, the function maps at least one element of the domain to that element. A function may also be both one-to-one and onto, in which case it is bijective.

Observe that the definition of countable above only requires that the function be one-to-one, and it need not be onto. For example, the set 𝑆 = {𝑎, 𝑏, 𝑐} is countable, since the following function is one-to-one:

𝑓(𝑎) = 0

𝑓(𝑏) = 1

𝑓(𝑐) = 2

In fact, any finite set 𝑆 is countable – we can just list all the elements of 𝑆, assigning them natural numbers in the order that we listed them. However, this doesn’t just apply to finite sets – the same logic applies to an infinite set if we can list its elements in some order. We can do so exactly when we can construct a one-to-one function 𝑓 ∶ 𝑆 → N. If we have a list, we can construct such a function using the same process as above, and if we have a one-to-one function 𝑓, we can list the elements by the ordering determined by 𝑓: start with the element 𝑠 with the minimal value for 𝑓(𝑠), then the one with the minimal value from the remaining elements, and so on. Since the natural numbers are well-ordered by the less-than operator <, there is always some minimal value in any nonempty subset of N.

Observation 50 A set 𝑆 is countable if and only if we can construct a list of all the elements of 𝑆.

19 There are other kinds of numbers20 that can represent the sizes of infinite sets, but they are beyond the scope of this text.
20 https://en.wikipedia.org/wiki/Cardinal_number


If 𝑆 is an infinite set and is countable, we can actually construct a bijection between 𝑆 and N: list the elements of 𝑆 in order (e.g. according to a one-to-one function 𝑓), then number the elements in the list with successive naturals starting from 0. Since the list is infinitely long, there will be some element 𝑠𝑛 ∈ 𝑆 associated with each element 𝑛 ∈ N, namely the 𝑛th item from our list of the elements of 𝑆.

We take it by definition that two sets have the same size or cardinality if there exists a bijection between them, whether the sets are finite or infinite. Since an infinite and countable (or countably infinite) set 𝑆 can be put into a bijection with N, we have that ∣𝑆∣ = ∣N∣ for such a set.

In practice, we do not always define an explicit one-to-one function 𝑓 ∶ 𝑆 → N to list the elements of 𝑆. Instead, we can list a set 𝑆 by describing how the list is constructed and showing that an arbitrary element 𝑠 ∈ 𝑆 appears at some finite position in that list.

Example 51 We show that the set of integers Z is countable. Observe that listing the nonnegative integers first followed by the negative integers does not work – there are infinitely many nonnegative integers, so we never get to the negative ones:

0, 1, 2, 3, . . . ,−1,−2, . . .

Since an element like −1 does not appear at some finite position in this list, it is not a valid listing of the integers.

What does work is to order the integers by their absolute value, which interleaves the positive and negative elements:

0, 1,−1, 2,−2, 3,−3, . . .

An arbitrary positive integer 𝑖 appears at (zero-indexed) position 2𝑖 − 1, and an arbitrary negative integer 𝑖 appears at position 2∣𝑖∣. Thus, any arbitrary integer appears at some finite position on the list.

Formally, we can define a one-to-one function 𝑓 ∶ Z→ N as follows:

𝑓(𝑖) = 2𝑖 − 1 if 𝑖 > 0
𝑓(𝑖) = −2𝑖 otherwise

Thus, Z is countably infinite.
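The function 𝑓 from this example can be transcribed directly into code; in the C++ sketch below, long long stands in for Z and unsigned long long for N:

#include <iostream>

// f : Z -> N from Example 51: positive i -> 2i - 1, non-positive i -> -2i.
unsigned long long f(long long i) {
    return i > 0 ? 2 * i - 1 : -2 * i;
}

int main() {
    const long long sample[] = {0, 1, -1, 2, -2, 3, -3};
    for (long long i : sample)
        std::cout << i << " -> " << f(i) << "\n";   // positions 0, 1, 2, ..., 6
}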

Example 52 We show that the set of positive rational numbers Q+ is countable. A rational number 𝑞 ∈ Q+ can be represented by a pair of integers 𝑥/𝑦, where 𝑥, 𝑦 ∈ N+. So we start by showing that the set of pairs of natural numbers N × N is countable.

We use a similar idea as what we did to show that Z is countable, i.e. demonstrating an interleaving of the elements that results in a valid list. Since we are working with pairs, we have two degrees of freedom from the elements of the pairs. Thus, we can write them out in two dimensions:


𝟏 𝟐 𝟑 𝟒 𝟓

𝟏 (1,1) (1,2) (1,3) (1,4) (1,5)

𝟐 (2,1) (2,2) (2,3) (2,4) (2,5)

𝟑 (3,1) (3,2) (3,3) (3,4) (3,5)

𝟒 (4,1) (4,2) (4,3) (4,4) (4,5)

𝟓 (5,1) (5,2) (5,3) (5,4) (5,5)

To demonstrate a one-to-one mapping to N, however, we need to describe a one-dimensional list. Observe that we can do so by taking elements in order of the diagonals – each diagonal has a finite number of elements, and we eventually reach an arbitrary element after a finite number of diagonals.

[Figure: the same grid, with the pairs visited diagonal by diagonal – (1,1), then (1,2), (2,1), then (1,3), (2,2), (3,1), and so on.]

We are ordering elements (𝑥, 𝑦) ∈ N × N according to the value of 𝑥 + 𝑦, and an arbitrary element (𝑥, 𝑦) appears at a position that is O((𝑥 + 𝑦)²) from the beginning.
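The diagonal listing itself is easy to generate; the following C++ sketch prints the first few diagonals of pairs of positive naturals in exactly this order (only finitely many diagonals are printed here, since the full list is infinite):

#include <iostream>

int main() {
    // List pairs (x, y) with x, y >= 1 in order of increasing x + y.
    // Every pair appears after finitely many steps.
    for (int sum = 2; sum <= 6; ++sum)        // first few diagonals only
        for (int x = 1; x < sum; ++x)
            std::cout << "(" << x << "," << sum - x << ") ";
    std::cout << "\n";  // (1,1) (1,2) (2,1) (1,3) (2,2) (3,1) ...
}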

We can proceed the same way with Q+, but observe that there are duplicate elements; for example, 2/2 is the same element as 1/1. However, we can avoid duplicates by simply ignoring them when we encounter them; in other words, when we construct our list, we skip over an item if it is a duplicate of an element already in the list.


𝟏 𝟐 𝟑 𝟒 𝟓

𝟏 1/1 1/2 1/3 1/4 1/5

𝟐 2/1 2/2 2/3 2/4 2/5

𝟑 3/1 3/2 3/3 3/4 3/5

𝟒 4/1 4/2 4/3 4/4 4/5

𝟓 5/1 5/2 5/3 5/4 5/5

[Figure: the same grid of fractions, with duplicate values such as 2/2 and 3/3 skipped as the diagonals are traversed.]

Since we can construct a list of the elements in Q+, the set is countable.

Exercise 53 Show that the set of all rationals Q is countable.

9.2 Uncountable Sets

So far, we have demonstrated that several infinite sets are countable. How do we show that a set 𝑆 is uncountable? We need to show that there is no one-to-one function 𝑓 ∶ 𝑆 → N. Equivalently, we can demonstrate that there is no list that enumerates all the elements of 𝑆.

As an example, consider the set of real numbers in the interval [0, 1). Assume for the purposes of contradiction that this set is countable. We can then list all the elements in order, and the result looks something like the following:

𝑟0 : 0.121566. . .
𝑟1 : 0.233997. . .
𝑟2 : 0.456711. . .
𝑟3 : 0.328945. . .
𝑟4 : 0.341775. . .
𝑟5 : 0.424323. . .
⋮

Here, we have represented each element in decimal form. Most real numbers do not have a finite representation in decimal, so the digits continue indefinitely to the right. If a number does have a finite decimal representation, we pad it to the right with infinitely many zeros.

We now use this list to construct a real number 𝑟∗ ∈ [0, 1) as follows: we choose the 𝑖th digit (zero-indexed, after the decimal) of 𝑟∗ such that it differs from the 𝑖th digit of item 𝑟𝑖 in the list. For example:


𝑟0 : 0.121566. . .
𝑟1 : 0.233997. . .
𝑟2 : 0.456711. . .
𝑟3 : 0.328945. . .
𝑟4 : 0.341775. . .
𝑟5 : 0.424323. . .
⋮
𝑟∗ : 0.247194. . .

Here, 𝑟0 has 1 as its 0th digit, so we choose 2 as the 0th digit of 𝑟∗. Similarly, 𝑟1 has 3 as its 1st digit so we choose 4 as the 1st digit of 𝑟∗, 𝑟2 has 6 as its 2nd digit so we choose 7 for the 2nd digit of 𝑟∗, and so on21. Thus, 𝑟∗ differs from 𝑟𝑖 in the 𝑖th digit.

Claim 54 𝑟∗ does not appear on the list of the elements of [0, 1).

Proof 55 Suppose 𝑟∗ appears on the list at some concrete position 𝑖. However, 𝑟∗ was constructed such that its 𝑖th digit differs from the 𝑖th digit of 𝑟𝑖, the element at the 𝑖th position on the list. Since 𝑟∗ and 𝑟𝑖 differ in the 𝑖th digit, 𝑟∗ ≠ 𝑟𝑖, which contradicts our assumption that 𝑟∗ appears on the list at position 𝑖. Since we made no assumptions about the position 𝑖, this reasoning applies to all possible positions, which means that 𝑟∗ does not appear at any position on the list.

Though 𝑟∗ differs from any number in our list, it is a valid element of [0, 1). We have arrived at a contradiction, since we previously stated that the list contains all the elements of the set. Since we have a contradiction, our original assumption that [0, 1) is countable must be false.

This technique is called diagonalization, and it is a general technique for showing that a set is uncountable. Since we cannot list the reals in [0, 1), we cannot list the full set R. Thus, R is uncountable – no bijection exists between N and R, and we can conclude that ∣N∣ < ∣R∣.

21 Observe that 𝑟3 has 9 as its 3rd digit, but we choose 1 rather than 0 as the 3rd digit of 𝑟∗. This is to avoid the technicality that when the digits after the 𝑘th digit are all 9, the number has an equivalent representation where the 𝑘th digit is one larger and the remaining digits are all 0; for example, 0.4232999… = 0.4233. Choosing a 1 rather than a 0 ensures that the number we construct is actually a different number than the one in the list.


9.3 The Existence of an Undecidable Language

Returning to our original question of how the set of machines ℳ is related to the set of languages ℒ, we can show using the techniques above that ℳ is countable while ℒ is uncountable. This implies that ∣ℳ∣ < ∣ℒ∣, and it immediately follows that there are languages that are not associated with any machine.

We leave the countability argument as an exercise and instead demonstrate a direct diagonalization proof that there is some language 𝐴 that is not associated with any machine in ℳ. We first note that the set of finite binary strings {0, 1}∗ is countable, since we can list its elements in order of increasing length:

{0, 1}∗ = {𝜀, 0, 1, 00, 01, 10, 11, 000, 001, . . .}

We also use the fact that ℳ is countable and therefore its elements can be listed as well. Without loss of generality, we assume that each machine takes binary strings as input (the input values can be re-encoded in binary if this is not the case). We can then construct an infinite, two-dimensional table with machines enumerated along one dimension and inputs along the other:

       𝜀    0    1    00   01   10   11   000  ⋯
𝑀0    yes  no   yes  yes  no   no   no   no
𝑀1    yes  yes  no   no   no   no   no   yes
𝑀2    yes  no   no   no   yes  no   no   yes
𝑀3    yes  no   yes  no   yes  yes  no   no
𝑀4    yes  no   yes  no   yes  no   no   yes
𝑀5    yes  yes  no   no   no   no   yes  yes
𝑀6    no   yes  no   no   yes  no   no   no
𝑀7    yes  no   no   yes  no   yes  no   yes
⋮                                              ⋱

The entries of this table are whether the machine corresponding to the given row accepts the input corresponding to the given column. For example, we have here that 𝑀0 accepts the empty string but rejects or loops on the string 0, while 𝑀1 accepts both of those strings but rejects or loops on the string 1.

Consider the diagonal of this table, which contains the result of machine 𝑀𝑖 applied to the input 𝑥𝑖 corresponding to column 𝑖. By definition, we have that 𝑥𝑖 ∈ 𝐿(𝑀𝑖) if 𝑀𝑖 accepts 𝑥𝑖. In other words, 𝑥𝑖 is in the language of 𝑀𝑖 if the corresponding entry in the table above is “yes”.

Now construct language 𝐴 as follows:

𝐴 = {𝑥𝑖 ∶ 𝑥𝑖 ∉ 𝐿(𝑀𝑖)}

For the example table above, this gives 𝐴 = {1, 00, 10, 11, ⋯}.


This means that 𝐴 contains 𝑥𝑖 exactly when 𝐿(𝑀𝑖) does not contain 𝑥𝑖. In other words, for each 𝑖, we have that 𝑥𝑖 ∈ 𝐴 if 𝑀𝑖 rejects or loops on 𝑥𝑖. But if 𝑀𝑖 accepts 𝑥𝑖, then 𝑥𝑖 ∉ 𝐴.

We now have a language that differs from the language of any machine in our list. In particular, 𝐴 differs from 𝐿(𝑀𝑖) with respect to element 𝑥𝑖 – if 𝑥𝑖 ∈ 𝐿(𝑀𝑖), then 𝑥𝑖 ∉ 𝐴, but if 𝑥𝑖 ∉ 𝐿(𝑀𝑖), then 𝑥𝑖 ∈ 𝐴. Since our list exhaustively enumerates all machines, there is no machine whose language is 𝐴. If there is no machine whose language is 𝐴, there is no machine that decides 𝐴, so 𝐴 is undecidable22. This means that no computer in the universe can solve it, no matter how many resources we throw at it.

We can go further – we know that ℳ is countable but ℒ is uncountable, so there are far more (in fact, uncountably infinitely more) undecidable languages than decidable languages. Thus, most problems are actually unsolvable!

Observe that our proof above was nonconstructive – we showed that there exists some undecidable language, but we did not show that a specific language is undecidable. We will do so next, demonstrating the undecidability of a specific language of interest.

Exercise 56 a) Prove that the set of all machines ℳ is countable.

Hint: Use the fact that {0, 1}∗ is countable and what we know about binary strings.

b) Use diagonalization to show that the set of all languages ℒ is uncountable.

22 Not all machines are deciders, and we have shown that 𝐴 is not the language of any machine, not just deciders. Thus, 𝐴 is not just undecidable but unrecognizable, a concept we will see later (page 120).


CHAPTER

TEN

THE HALTING PROBLEM

I don’t care to belong to any club that will have me as a member. — Groucho Marx

Suppose upon visiting a new town, you come across a barbershop with the following sign in its window:

Barber 𝐵 is the best barber in town! 𝐵 cuts the hair of those, and only those, who do not cut their own hair.

In other words, if person 𝑋 lives in town, there are two possibilities:

1. 𝑋 cuts their own hair. Then 𝐵 does not cut 𝑋’s hair.

2. 𝑋 does not cut their own hair. Then 𝐵 cuts 𝑋’s hair.

Assuming the sign is true and that 𝐵 is also a resident of the town, does 𝐵 cut their own hair? Substituting 𝑋 = 𝐵 into the two cases above, we have:

1. 𝐵 cuts their own hair. Then 𝐵 does not cut 𝐵’s hair.

2. 𝐵 does not cut their own hair. Then 𝐵 cuts 𝐵’s hair.

Both cases result in a contradiction! Thus, our assumption that the sign is true and that 𝐵 lives in town must be incorrect.

This is known as the barber paradox, and we will see that we can construct a similar situation out of Turing machines. In particular, we will define a specific language 𝐿ACC, and we will assume that there exists a decider 𝐶 for 𝐿ACC. We will then be able to construct another decider 𝐷 such that:

𝐷 accepts the source code of those, and only those, programs that do not accept their own source code as input.

If we consider whether 𝐷 accepts its own source code, we end up in a similar contradiction as that of the barber paradox. Thus, our assumption that there exists a decider for 𝐿ACC is incorrect.

Before we proceed with this construction, we first review some relevant details about data representations and Turing machines. Every data value can be represented as a binary string, and in fact, real computers store all data in binary format. A Turing machine itself can be represented as a binary string, e.g. by taking its seven-tuple description and encoding it into binary. We refer to this string representation of a machine as its source code, and we typically denote it with angle brackets (e.g. ⟨𝑀⟩ is the source code of machine 𝑀). Since the individual components of a seven-tuple (the set of states, input and tape alphabets, transition function, etc.) are all finite, the source code of a Turing machine is a binary string of finite length.

We have also seen the Church-Turing thesis, which states that any computable function can be computed by a Turing machine. A direct implication of the thesis is that if we can write a program in some other model, such as C++ or even pseudocode, we can write a Turing machine that is equivalent to that program. Thus, when reasoning about Turing machines, we often give pseudocode descriptions rather than doing all the work of defining an equivalent seven-tuple Turing machine.


Similarly, as discussed previously, if all algorithms that can be implemented on a Turing machine can also be implemented in some other programming model 𝑃𝐿, 𝑃𝐿 is said to be Turing-complete (page 85). Real-world programming languages, such as C++, Python, and even LaTeX, are generally Turing-complete.

Since the source code of a program is just a binary string, we can pass it as the input to another program. We can even pass a program its own source code as input! Examples of doing so in C++ are as follows:

$ g++ prog.cpp -o prog.exe
$ ./prog.exe prog.cpp          # pass the source filename as a command-line argument
$ ./prog.exe < prog.cpp        # pass the source code to standard input
$ ./prog.exe "`cat prog.cpp`"  # pass the source code as a single command-line argument

A compiler or interpreter is itself a program that operates on the source code of other programs. In the example above, we passed the source code prog.cpp as a command-line argument to the g++ compiler. The g++ compiler itself is self-hosting23: it is compiled by passing its own source code to itself24.

Recall that a language 𝐴 is decidable if there exists a program 𝑀 that decides it:

• If 𝑥 ∈ 𝐴, then 𝑀 accepts 𝑥.

• If 𝑥 ∉ 𝐴, then 𝑀 rejects 𝑥.

We now consider whether the union of two decidable languages is decidable.

Example 57 Let 𝐴1 and 𝐴2 be decidable languages, and let 𝐵 = 𝐴1 ∪ 𝐴2. In other words, 𝐵 is the set of strings that are in either 𝐴1 or 𝐴2 or both: 𝐵 = {𝑥 ∶ 𝑥 ∈ 𝐴1 or 𝑥 ∈ 𝐴2}. Is 𝐵 a decidable language?

Since 𝐴1 and 𝐴2 are decidable, there exist machines 𝑀1 and 𝑀2, respectively, that decide them. Then 𝑀1 accepts exactly those strings that are in 𝐴1, rejecting every other string, and similarly for 𝑀2 and 𝐴2. We can then construct a new machine 𝑀 as follows:

𝑀 on input 𝑥:
    Run 𝑀1 on 𝑥, accept if 𝑀1 accepts
    Run 𝑀2 on 𝑥, accept if 𝑀2 accepts
    Reject

We analyze this machine as follows:

• If 𝑥 ∈ 𝐴1, then 𝑀1 accepts 𝑥 since 𝑀1 is a decider for 𝐴1. Since 𝑀1 accepts, so does 𝑀 .

• If 𝑥 ∉ 𝐴1, then 𝑀1 rejects 𝑥 since 𝑀1 is a decider (a decider cannot loop). So running 𝑀1 on 𝑥 eventually terminates, and 𝑀 proceeds to invoke 𝑀2 on 𝑥. Then:

– If 𝑥 ∈ 𝐴2, 𝑀2 accepts 𝑥, and so does 𝑀 .

– If 𝑥 ∉ 𝐴2, 𝑀2 rejects 𝑥 since 𝑀2 is a decider. Thus, the invocation of 𝑀2 on 𝑥 terminates, and 𝑀 proceeds to reject 𝑥.

Thus, 𝑀 accepts all inputs that are in 𝐴1 or 𝐴2, and it rejects an input 𝑥 if 𝑥 ∉ 𝐴1 ∧ 𝑥 ∉ 𝐴2. Since 𝐵 = 𝐴1 ∪ 𝐴2, this means that 𝑀 accepts all inputs in 𝐵 and rejects all inputs not in 𝐵, so 𝑀 is a decider for 𝐵. Since there exists a machine that decides 𝐵, 𝐵 is a decidable language.
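Expressed in C++ — a sketch of our own, using the convention adopted later in this chapter of representing a machine as a function from the input string to 1 (accept) or 0 (reject), and assuming deciders M1 and M2 are available — the construction is just two sequential calls:

#include <string>
using std::string;

int M1(string x);  // decider for A1 (assumed to exist)
int M2(string x);  // decider for A2 (assumed to exist)

// Decider for B = A1 union A2. Both calls terminate because M1 and M2
// are deciders, so M halts on every input.
int M(string x) {
    if (M1(x) == 1) return 1;
    if (M2(x) == 1) return 1;
    return 0;
}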

Example 57 demonstrates that the class of decidable languages is closed under union, meaning that if we apply the union operation to two members of the class, the result is also a member of the class, i.e. a decidable language.

23 https://en.wikipedia.org/wiki/Self-hosting_(compilers)
24 A quine25 is an analogous concept of a program that produces its own source code as its output.
25 https://en.wikipedia.org/wiki/Quine_(computing)


Exercise 58 Show that the class of decidable languages is closed under:

a) intersection (i.e. if 𝐴 and 𝐵 are decidable, so is 𝐴 ∩ 𝐵)

b) complement (i.e. if 𝐴 is decidable, so is its complement 𝐴)

The reasoning in Example 57 assumed that a program 𝑀 can invoke a separate program 𝑀′ on an input. There are two ways we can accomplish this:

1. Incorporate 𝑀′ as a subroutine directly in 𝑀 itself. For a Turing machine, this can be done by inserting the state diagram for 𝑀′ into 𝑀, with the addition of a transition from some state of 𝑀 to the start state of 𝑀′, and likewise from each of the final states of 𝑀′ to states in 𝑀. In programming languages such as C++, we can do similarly with a #include directive26:

#include "Mprime.hpp" // include declaration of Mprime

int M(string x) Mprime(x); // invoke Mprime...

Other languages include analogous import directives.

2. Pass the source code of 𝑀′ to 𝑀 and have 𝑀 simulate the execution of 𝑀′. In other words, 𝑀 examines the source code for 𝑀′ and performs exactly the steps that are specified in that source code. An interpreter is a real-world incarnation of this concept. For example, we can pass the source code of a Python program to a Python interpreter:

$ python prog.py

The Python interpreter reads the source code in prog.py, executing each statement contained therein one after the other.

When it comes to Turing machines, this second concept of an interpreter is that of a universal Turing machine. Such a machine 𝑈 is given a pair of inputs (⟨𝑀⟩, 𝑥), where the first input is the source code of some machine 𝑀 and the second input is the actual input on which we want to run 𝑀. Then 𝑈 simulates the execution of 𝑀 on 𝑥, so that:

• If 𝑀 accepts 𝑥, then 𝑈 accepts the pair (⟨𝑀⟩, 𝑥).

• If 𝑀 rejects 𝑥, then 𝑈 rejects the pair (⟨𝑀⟩, 𝑥).

• If 𝑀 loops on 𝑥, then 𝑈 loops on the pair (⟨𝑀⟩, 𝑥).

There are many examples of universal Turing machines in the literature, and their descriptions can be found elsewhere. For our purposes, what is important is the existence of universal machines, not the details of their implementation.

Abstraction of a Turing Machine in C++

Since C++ is Turing-complete, for any Turing machine 𝑀, we can write an equivalent C++ function M. If we represent acceptance as returning 1 and rejection as returning 0, the signature of this function would look something like the following:

int M(string x);

The function argument is the input to the machine 𝑀 , suitably encoded in C++ string format.

The signature of a universal Turing machine would then be as follows:

26 Typically, we #include header files in C and C++ to pull in declarations from another translation unit and then invoke the compiler/linker on both translation units. But we can also #include source .cpp files directly if we wish.


int U(string M_source, string x);

Here, the source code of some function M is provided as the first argument, and the second argument is the intended input to that function. Then:

• If M(x) returns 1, so does U(M_source, x), where M_source is a string representation of M's source code.

• If M(x) returns 0, then U(M_source, x) returns 0.

• If M(x) loops (it never returns), then U(M_source, x) also loops.

Since 𝑈 is itself a Turing machine, it has a corresponding language 𝐿(𝑈):

𝐿(𝑈) = {(⟨𝑀⟩, 𝑥) ∶ 𝑀 is a program and 𝑀 accepts 𝑥}

Thus, (⟨𝑀⟩, 𝑥) ∈ 𝐿(𝑈) exactly when 𝑀 accepts 𝑥. If 𝑀 rejects or loops on 𝑥, then (⟨𝑀⟩, 𝑥) ∉ 𝐿(𝑈).

The language of a universal Turing machine consists of pairs of machines and inputs, where the machine accepts the corresponding input. We use the notation 𝐿ACC to represent this language, the language of acceptance (ACC is shorthand for “acceptance”).

𝐿ACC = {(⟨𝑀⟩, 𝑥) ∶ 𝑀 is a program and 𝑀 accepts 𝑥}

Now that we have a definition for this interesting language 𝐿ACC, the obvious next question is whether or not 𝐿ACC is decidable. For it to be decidable, there must exist some decider 𝐶 that takes a pair of inputs (⟨𝑀⟩, 𝑥) and has the following behavior:

• If 𝑀 accepts 𝑥, then 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 does not accept 𝑥 (i.e. 𝑀 rejects or loops on 𝑥), then 𝐶 rejects (⟨𝑀⟩, 𝑥).

Observe that a universal Turing machine 𝑈 does not meet these requirements. In particular:

• If 𝑀 loops on 𝑥, then 𝑈 loops on (⟨𝑀⟩, 𝑥).

• If 𝑀 loops on 𝑥, then 𝐶 rejects (⟨𝑀⟩, 𝑥).

Thus, even though 𝐿(𝑈) = 𝐿(𝐶) = 𝐿ACC, 𝑈 loops on some inputs and therefore is not a decider, while 𝐶 must halt on all inputs. We know that 𝑈 exists, but we do not yet know whether or not 𝐶 does.

Claim 59 There is no decider for the language 𝐿ACC.

Proof 60 Assume for the purposes of contradiction that there exists a decider 𝐶 for the language 𝐿ACC. We can then define a new program 𝐷 as follows:

𝐷 on input ⟨𝑀⟩:
    Run 𝐶 on (⟨𝑀⟩, ⟨𝑀⟩)
    If 𝐶 accepts (⟨𝑀⟩, ⟨𝑀⟩), reject
    Else accept

Given that 𝐶 exists, we can trivially implement 𝐷 as a program that invokes 𝐶 as a subroutine. Since 𝐶 always halts, so does 𝐷.

Observe that 𝐶 takes the pair of inputs (⟨𝑀⟩, 𝑥), while 𝐷 takes only a single input ⟨𝑀⟩. 𝐷 then invokes 𝐶 on the pair (⟨𝑀⟩, ⟨𝑀⟩).

By assumption, we have that:


• If 𝑀 accepts ⟨𝑀⟩, then 𝐶 accepts (⟨𝑀⟩, ⟨𝑀⟩).

• If 𝑀 rejects or loops on ⟨𝑀⟩, then 𝐶 rejects (⟨𝑀⟩, ⟨𝑀⟩).

As we discussed previously, it is perfectly valid for a program to take its own source code as input – the source code is just a binary string like any other.

Given the behavior of 𝐶 above, we have by the construction of 𝐷 that:

• If 𝑀 accepts ⟨𝑀⟩, then 𝐷 rejects ⟨𝑀⟩.
• If 𝑀 rejects or loops on ⟨𝑀⟩, then 𝐷 accepts ⟨𝑀⟩.

In other words, we have the following two possibilities for a program 𝑀 :

1. 𝑀 accepts its own source code. Then 𝐷 does not accept 𝑀 ’s source code.

2. 𝑀 does not accept its own source code. Then 𝐷 accepts 𝑀 ’s source code.

Thus, 𝐷 accepts the source code of those, and only those, programs that do not accept their own source code.

We now have a situation like the barber paradox: what does 𝐷 do when given its own source code as its input? Substituting 𝑀 = 𝐷 gives us the two possibilities:

1. 𝐷 accepts its own source code. Then 𝐷 does not accept 𝐷’s source code.

2. 𝐷 does not accept its own source code. Then 𝐷 accepts 𝐷’s source code.

Both cases lead to a contradiction! Since we arrive at a contradiction in all cases, our original assumption that the decider 𝐶 exists must be incorrect. Thus, no decider exists for 𝐿ACC, and 𝐿ACC is undecidable.

Constructing the Barber in C++

Recall our prior signature of a universal Turing machine 𝑈 in C++:

int U(string M_source, string x);

Since 𝐶 is a decider for the same language 𝐿ACC, it would have a similar signature:

int C(string M_source, string x);

We can then construct 𝐷 as follows:

int D(string M_source) {
    return 1 - C(M_source, M_source);
}

Then if C(M_source, M_source) returns 1 (accepts), D(M_source) returns 0 (rejects). Conversely, if C(M_source, M_source) returns 0, D(M_source) returns 1.

We can run D() on its own source:

string D_source =
    "int D(string M_source) {\n"
    "    return 1 - C(M_source, M_source);\n"
    "}";

D(D_source); // does this return 1 or 0?

This demonstrates how trivial it is to construct 𝐷 given the existence of 𝐶, and how we can run it on its own source code. The only assumption we make is that 𝐶 exists, and since this leads to a contradiction, that assumption must be false.

Previously, we demonstrated with diagonalization that some undecidable language exists. We now have shown that 𝐿ACC is an explicit example of an undecidable language. We will use that result to show that other, specific languages are undecidable as well.


Recall that the only difference between a universal Turing machine 𝑈, which we know to exist, and the nonexistent decider 𝐶 is how they behave when given a pair (⟨𝑀⟩, 𝑥) such that 𝑀 loops on 𝑥. For inputs 𝑥 on which 𝑀 halts, 𝑈 and 𝐶 have the exact same behavior. If we could just predetermine whether 𝑀 halts on a given input 𝑥, we could actually construct 𝐶 from 𝑈. Determining whether a machine 𝑀 halts on an input 𝑥 is the halting problem. Since we just argued that if we could solve the halting problem, we could also decide 𝐿ACC, this implies that the halting problem is undecidable as well!

We proceed to formalize this reasoning. Define 𝐿HALT, the language corresponding to the halting problem, as follows:

𝐿HALT = {(⟨𝑀⟩, 𝑥) ∶ 𝑀 is a program that halts on input 𝑥}

We have:

• If 𝑀 accepts 𝑥, then (⟨𝑀⟩, 𝑥) ∈ 𝐿HALT.

• If 𝑀 rejects 𝑥, then (⟨𝑀⟩, 𝑥) ∈ 𝐿HALT.

• If 𝑀 loops on 𝑥, then (⟨𝑀⟩, 𝑥) ∉ 𝐿HALT.

Claim 61 There is no decider for the language 𝐿HALT.

Proof 62 Assume for the purposes of contradiction that there exists a decider 𝐻 for the language 𝐿HALT. We can then construct a decider 𝐶 for 𝐿ACC as follows:

𝐶 on input (⟨𝑀⟩, 𝑥):
    Run 𝐻 on (⟨𝑀⟩, 𝑥)
    If 𝐻 rejects (⟨𝑀⟩, 𝑥), reject
    Else:
        Run 𝑀 on 𝑥
        If 𝑀 accepts 𝑥, accept
        Else reject

Since this program does little more than run 𝐻 as a subroutine and simulate the execution of 𝑀, we can trivially construct it if we have a decider 𝐻 and an interpreter 𝑈. We know that 𝑈 exists, so the only assumption we are making here is that 𝐻 exists as well.
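In the C++ abstraction above, this construction is short. A sketch, assuming the hypothetical decider H for 𝐿HALT and the interpreter U with the signatures given earlier:

#include <string>
using std::string;

int H(string M_source, string x);  // hypothetical decider for L_HALT (assumed)
int U(string M_source, string x);  // universal machine / interpreter (exists)

// Decider for L_ACC built from H and U: if H says M loops on x, reject
// without running M; otherwise the simulation is guaranteed to terminate,
// and its answer is exactly whether M accepts x.
int C(string M_source, string x) {
    if (H(M_source, x) == 0) return 0;  // M loops on x, so M does not accept x
    return U(M_source, x);              // M halts on x; safe to simulate
}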

We now analyze the behavior of 𝐶 on (⟨𝑀⟩, 𝑥):

• If 𝑀 accepts 𝑥, then (⟨𝑀⟩, 𝑥) ∈ 𝐿HALT, so 𝐻 accepts (⟨𝑀⟩, 𝑥). The program 𝐶 proceeds to run 𝑀 on 𝑥. 𝑀 accepts 𝑥, so 𝐶 accepts as well.

• If 𝑀 rejects 𝑥, then (⟨𝑀⟩, 𝑥) ∈ 𝐿HALT, so 𝐻 accepts (⟨𝑀⟩, 𝑥). The program 𝐶 proceeds to run 𝑀 on 𝑥. 𝑀 rejects 𝑥, so 𝐶 rejects as well.

• If 𝑀 loops on 𝑥, then (⟨𝑀⟩, 𝑥) ∉ 𝐿HALT, so 𝐻 rejects (⟨𝑀⟩, 𝑥). Then 𝐶 rejects as well.

Observe that 𝐶 accepts (⟨𝑀⟩, 𝑥) exactly when 𝑀 accepts 𝑥, and that it rejects (⟨𝑀⟩, 𝑥) when 𝑀 either rejects or loops on 𝑥. Thus, 𝐶 accepts exactly those inputs that are in 𝐿ACC and rejects all other inputs, so 𝐶 is a decider for 𝐿ACC. This contradicts our known fact that 𝐿ACC has no decider. Since we have obtained a contradiction, our original assumption that a decider 𝐻 exists for 𝐿HALT is incorrect, and 𝐿HALT is undecidable.

We have demonstrated that 𝐿HALT is undecidable; this is quite unfortunate, since it is a fundamental problem in software and hardware design. Given a program, one of the most basic questions we can ask about its correctness is whether it halts on all inputs. However, since 𝐿HALT is undecidable, we cannot answer this question in general even for a single input, much less the space of all possible inputs.

Our proof of the undecidability of the halting problem follows a common pattern for showing that a language 𝐵 is undecidable:


1. Assume that 𝐵 is decidable. Then it has some decider 𝑀𝐵 .

2. Use 𝑀𝐵 to construct a decider 𝑀𝐴 for a known undecidable language 𝐴.

3. Since 𝑀𝐴 is a decider for 𝐴 but we know that 𝐴 is undecidable, we have a contradiction. So the assumption that 𝐵 is decidable must be false.

This pattern is an instance of reducing problem 𝐴 to problem 𝐵: we can reduce the task of solving 𝐴 to that of solving 𝐵, so that if we have a solver for 𝐵, we can construct a solver for 𝐴. We proceed to formalize this reasoning and apply it to more examples.

Exercise 63 Construct an elementary proof that 𝐿HALT is undecidable by following the steps below. This was the first “existence” proof for an undecidable language, and it dates back to Alan Turing.

a) Suppose there exists a Turing Machine 𝑀 that decides 𝐿HALT. Using 𝑀, design a Turing Machine 𝑀′ such that for any Turing Machine 𝑇, 𝑀′(⟨𝑇⟩) halts if and only if 𝑀(⟨𝑇⟩, ⟨𝑇⟩) rejects.

b) Design an input 𝑥 for 𝑀 ′ that creates paradoxical behavior, and conclude that 𝐿HALT is undecidable.


CHAPTER

ELEVEN

REDUCIBILITY

In the real world, almost all programs make use of existing components or libraries to accomplish their intended task. As an example, we take a look at a “hello world” program in C:

#include <stdio.h>

int main() {
    printf("Hello world!");
    return 0;
}

This program uses the standard stdio.h library, invoking the printf() function located in that library. Observe that we do not need to know how printf() works to write this program – we need only know what it does. As long as the function does what it claims to do, our program will work correctly, even if the implementation of printf() were to be swapped out with a completely different one. Thus, we treat printf() as a black box that prints the given arguments according to the format specification in the C language standard.

In a sense, our program reduces the task of printing Hello world! to that of a generic printf() function. As long as we have access to the latter, we can use it to rather trivially accomplish our task. A reduction then is a transformation of one task to another, and if a reduction exists from task 𝐴 to task 𝐵, then 𝐴 is no harder than 𝐵. We define this notion formally for languages as follows:

Definition 64 (Turing Reduction) Language 𝐴 is Turing reducible to language 𝐵, written 𝐴 ≤T 𝐵, if there exists a program 𝑀𝐴 that decides 𝐴 using a black box 𝑀𝐵 that decides 𝐵. The definition of 𝑀𝐴 in terms of 𝑀𝐵 is called a Turing reduction from language 𝐴 to language 𝐵.

The program 𝑀𝐴 can query the black box 𝑀𝐵 any number of times, including zero, one, or multiple times. It can send any input to 𝑀𝐵 – the input to 𝑀𝐵 need not be the same as the input to 𝑀𝐴.

(Figure: 𝑀𝐴 receives an input 𝑤, may send inputs 𝑤′ to the black box 𝑀𝐵 and receive its accept/reject answers, and ultimately answers accept or reject itself.)

The intuition behind the notation 𝐴 ≤T 𝐵 is that 𝐴 is no harder to solve than 𝐵, since we know that we can construct a solver for 𝐴 from a solver for 𝐵. We demonstrated previously that 𝐿ACC ≤T 𝐿HALT, since we were able to construct a decider 𝐶 for 𝐿ACC from a black box 𝐻 that decides 𝐿HALT.

Note that 𝐴 ≤T 𝐵 does not require that a decider for 𝐵 actually exist. It is merely a conditional statement that if we do have access to a decider for 𝐵, we can go ahead and construct a decider for 𝐴 using the decider for 𝐵 as a black box. In the case of 𝐿ACC ≤T 𝐿HALT, the decider 𝐻 for 𝐿HALT does not actually exist.


Theorem 65 Suppose 𝐴 ≤T 𝐵. Then if 𝐵 is decidable, 𝐴 is also decidable.

Proof 66 Since 𝐴 ≤T 𝐵, we have by definition that we can construct a decider 𝑀𝐴 for 𝐴 that uses a decider for 𝐵 as a black box. Since 𝐵 is decidable, a decider 𝑀𝐵 does indeed exist for 𝐵. Thus, we can plug 𝑀𝐵 in as the black box in our definition for 𝑀𝐴, resulting in a concrete decider for 𝐴. Since we now have an actual decider for 𝐴, 𝐴 is decidable.

The contraposition27 of Theorem 65 gives us the following:

Corollary 67 Suppose 𝐴 ≤T 𝐵. Then if 𝐴 is undecidable, 𝐵 is also undecidable.

This gives us a powerful tool to demonstrate that other languages are undecidable, once we have a known undecidable language. Given that a language 𝐴 is known to be undecidable, we just have to show that 𝐴 ≤T 𝐵 to prove that 𝐵 is also undecidable. We did exactly that for the halting problem, demonstrating that 𝐿ACC ≤T 𝐿HALT.

If we know that a particular language is decidable or undecidable, we can proceed to perform a reduction with that language as either 𝐴 or 𝐵 in the reduction 𝐴 ≤T 𝐵. Observe that there are four possibilities, but only two give us information about the decidability of the other language:

Known Fact         Conclusion
𝐴 decidable        ?
𝐴 undecidable      𝐵 undecidable
𝐵 decidable        𝐴 decidable
𝐵 undecidable      ?

Thus, if we know that a language 𝐿 is decidable, we can use it as the right-hand side of a reduction 𝐴 ≤T 𝐿 to show that 𝐴 is decidable. On the other hand, if we know that a language 𝐿 is undecidable, we can use it as the left-hand side of a reduction 𝐿 ≤T 𝐵 to show that 𝐵 is undecidable. The other two possibilities tell us nothing about the decidability of the other language. In fact, if 𝐴 is decidable, we can show that 𝐴 ≤T 𝐵 for any language 𝐵28 – the decider for 𝐴 simply ignores the black-box decider for 𝐵 and solves 𝐴 directly.

The Barber and 𝐿ACC Revisited

Now that we have a formal notion of Turing reducibility, we revisit the proof that 𝐿ACC is undecidable using diagonalization and a reduction. In our original proof, we were able to demonstrate a contradiction by constructing a decider 𝐷 such that

𝐷 accepts the source code of those, and only those, programs that do not accept their own source code as input.

The language of 𝐷 is the set of source codes that it accepts, which is as follows:

𝐿BARBER = {⟨𝑀⟩ ∶ 𝑀 does not accept ⟨𝑀⟩}

We can use diagonalization to demonstrate that this language is not the language of any machine and is thus undecidable. Construct an infinite table with row 𝑖 corresponding to machine 𝑀𝑖, and column 𝑗 corresponding to the source code of machine 𝑀𝑗. Since the set of machines is countable, we can indeed list machines along the rows

27 We can also demonstrate this through contradiction, since contraposition and contradiction are closely related. We have that 𝐴 ≤T 𝐵 and that 𝐴 is undecidable. Suppose for the purposes of contradiction that 𝐵 is decidable. This means that there is some decider 𝑀𝐵 for 𝐵. Since 𝐴 ≤T 𝐵, we can use 𝑀𝐵 as a black box to construct a decider 𝑀𝐴 for 𝐴. This contradicts the fact that 𝐴 is undecidable, so our assumption that 𝐵 is decidable must be false.

28 The same is not true for the fourth case, when 𝐵 is undecidable – it is not the case that if 𝐵 is undecidable, any language 𝐴 is reducible to 𝐵. There are in fact hierarchies29 of undecidable languages, but they are beyond the scope of this text.

29 https://en.wikipedia.org/wiki/Arithmetical_hierarchy


and their source codes along the columns. Entry (𝑖, 𝑗) of the table is whether or not machine 𝑀𝑖 accepts when run on the source code ⟨𝑀𝑗⟩:

       ⟨𝑀0⟩  ⟨𝑀1⟩  ⟨𝑀2⟩  ⟨𝑀3⟩  ⟨𝑀4⟩  ⟨𝑀5⟩  ⟨𝑀6⟩  ⟨𝑀7⟩  ⋯
𝑀0     yes   yes   no    yes   no    no    yes   no
𝑀1     no    no    yes   yes   yes   no    yes   no
𝑀2     no    no    yes   no    no    yes   no    no
𝑀3     no    no    yes   yes   yes   no    yes   no
𝑀4     no    yes   no    no    no    yes   no    no
𝑀5     yes   no    no    yes   yes   no    yes   yes
𝑀6     no    no    yes   no    no    no    no    no
𝑀7     no    yes   no    yes   no    no    yes   yes
⋮                                                      ⋱

𝐿BARBER = {⟨𝑀1⟩, ⟨𝑀4⟩, ⟨𝑀5⟩, ⟨𝑀6⟩, ⋯}

Entry (𝑖, 𝑖) is whether or not 𝑀𝑖 accepts its own source code. If we retain only the source codes of those machines that do not accept their own source code, we end up with the inverse of the diagonal. The resulting language 𝐿BARBER is not the language of any machine 𝑀𝑖 in the table:

• If 𝑀𝑖 accepts ⟨𝑀𝑖⟩, then ⟨𝑀𝑖⟩ ∈ 𝐿(𝑀𝑖), but ⟨𝑀𝑖⟩ ∉ 𝐿BARBER. Thus, 𝐿(𝑀𝑖) ≠ 𝐿BARBER.

• If 𝑀𝑖 does not accept ⟨𝑀𝑖⟩, then ⟨𝑀𝑖⟩ ∉ 𝐿(𝑀𝑖), but ⟨𝑀𝑖⟩ ∈ 𝐿BARBER. Again, 𝐿(𝑀𝑖) ≠ 𝐿BARBER.

Since 𝐿BARBER is not the language of any machine, it is undecidable.

We can then proceed to show 𝐿BARBER ≤T 𝐿ACC using the same construction as before. Given a black box 𝐶 that decides 𝐿ACC, we can construct a decider 𝐷 for 𝐿BARBER as follows:

𝐷 on input ⟨𝑀⟩:
    Run 𝐶 on (⟨𝑀⟩, ⟨𝑀⟩)
    If 𝐶 accepts (⟨𝑀⟩, ⟨𝑀⟩), reject
    Else accept

𝐷 halts since it just calls 𝐶, which is a decider and halts. We then have:

• If ⟨𝑀⟩ ∈ 𝐿BARBER, then 𝑀 does not accept ⟨𝑀⟩. Thus, 𝐶 rejects (⟨𝑀⟩, ⟨𝑀⟩), so 𝐷 accepts ⟨𝑀⟩.
• If ⟨𝑀⟩ ∉ 𝐿BARBER, then 𝑀 accepts ⟨𝑀⟩. Thus, 𝐶 accepts (⟨𝑀⟩, ⟨𝑀⟩), so 𝐷 rejects ⟨𝑀⟩.

Thus, 𝐷 decides 𝐿BARBER, and we have shown that 𝐿BARBER ≤T 𝐿ACC. Since 𝐿BARBER is undecidable, we conclude that 𝐿ACC is also undecidable.


We saw previously that 𝐿HALT is undecidable. We proceed to show that a more restricted form of this language, known as 𝐿𝜀-HALT, is also undecidable. Define 𝐿𝜀-HALT as follows:

𝐿𝜀-HALT = {⟨𝑀⟩ ∶ 𝑀 is a program that halts on the empty string 𝜀}

Clearly we can decide 𝐿𝜀-HALT given a decider for 𝐿HALT:

𝐸 on input ⟨𝑀⟩:
    Run 𝐻 on (⟨𝑀⟩, 𝜀) and answer as 𝐻
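In the C++ abstraction, this direction of the reduction is a one-liner (a sketch assuming the hypothetical decider H):

#include <string>
using std::string;

int H(string M_source, string x);  // hypothetical decider for L_HALT (assumed)

// Decider for L_eps-HALT: ask H whether M halts on the empty string.
int E(string M_source) {
    return H(M_source, "");
}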

This demonstrates that 𝐿𝜀-HALT ≤T 𝐿HALT, but as we can see from the table above, that tells us nothing about the decidability of 𝐿𝜀-HALT. Instead, we must show that an undecidable language is reducible to 𝐿𝜀-HALT, not the other way around.

We show that 𝐿𝜀-HALT is undecidable by performing the Turing reduction 𝐿HALT ≤T 𝐿𝜀-HALT.

Suppose that 𝐸 is a black box that decides 𝐿𝜀-HALT. Then 𝐸 is given the single input ⟨𝑀⟩, the source code of a program 𝑀. We have:

• If 𝑀 halts on (accepts or rejects) 𝜀, then 𝐸 accepts ⟨𝑀⟩.
• If 𝑀 loops on 𝜀, then 𝐸 rejects ⟨𝑀⟩.

To show that 𝐿HALT ≤T 𝐿𝜀-HALT, we need to use 𝐸 as a black box to construct 𝐻, a decider for 𝐿HALT. Specifically, 𝐻 is given the pair of inputs (⟨𝑀⟩, 𝑥), and we require that:

• If 𝑀 halts on 𝑥, then 𝐻 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 loops on 𝑥, then 𝐻 rejects (⟨𝑀⟩, 𝑥).

We need to write a program that works on two inputs, but the black box we have at our disposal only takes one input. We need to somehow transform our two original inputs into a single input that can be passed to 𝐸. A typical strategy is to construct a new program on the fly that hardcodes the two original inputs, transforming them into a program that is suitable to be passed to the black box. The result is as follows:

𝐻 on input (⟨𝑀⟩, 𝑥):
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤:
            Run 𝑀 on 𝑥 and answer as 𝑀”
    Run 𝐸 on ⟨𝑀′⟩
    If 𝐸 accepts ⟨𝑀′⟩, accept
    Else reject

Constructing a Program on the Fly in C++

Assuming we have access to a C++ interpreter 𝑈, we can trivially construct the source code of a program that runs 𝑀 on 𝑥, given the source code for 𝑀. The full definition of 𝐻, a decider for 𝐿HALT, in terms of 𝐸, a decider for 𝐿𝜀-HALT, is as follows:

int H(string M_source, string x) {
    string Mp =
        "#include \"U.hpp\"\n"
        "int Mprime(string w) {\n"
        "    return U(\"" + M_source + "\", \"" + x + "\");\n"
        "}";
    return E(Mp);
}


We hardcode the inputs M_source and x as part of the source code for Mprime()30. Since M_source is source code in string format, we cannot invoke it directly, but we can pass it to a C++ interpreter to simulate its execution on x. We know that an interpreter 𝑈 for C++ exists, so we know that the code for Mprime() works as intended.

The only novel assumption we make in this construction is that the program 𝐸 exists, and we use it as a black box in writing 𝐻.

The newly constructed program 𝑀′ takes an input 𝑤, but it ignores that input and just runs 𝑀 on the original input 𝑥. Thus:

• If 𝑀 accepts 𝑥, then 𝑀′ accepts all inputs 𝑤.

• If 𝑀 rejects 𝑥, then 𝑀′ rejects all inputs 𝑤.

• If 𝑀 loops on 𝑥, then 𝑀′ loops on all inputs 𝑤.

If 𝑀 accepts or rejects 𝑥, then 𝑀′ halts on all inputs, which means that it halts on the specific input 𝜀. On the other hand, if 𝑀 loops on 𝑥, then 𝑀′ loops on all inputs, including the specific input 𝜀.

Observe that 𝐻 merely constructs 𝑀′ and passes it as an argument to 𝐸. 𝐻 does not actually run 𝑀′. The result from 𝐸 is as follows:

• If 𝑀 ′ halts on 𝜀, then 𝐸 accepts ⟨𝑀 ′⟩, so 𝐻 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 ′ loops on 𝜀, then 𝐸 rejects ⟨𝑀 ′⟩, so 𝐻 rejects (⟨𝑀⟩, 𝑥).

As we pointed out previously, 𝑀′ halts on 𝜀 exactly when 𝑀 halts on 𝑥, and 𝑀′ loops on 𝜀 exactly when 𝑀 loops on 𝑥. We thus have:

• If 𝑀 halts on 𝑥, 𝑀 ′ halts on 𝜀. Then 𝐸 accepts ⟨𝑀 ′⟩, so 𝐻 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 loops on 𝑥, 𝑀 ′ loops on 𝜀. Then 𝐸 rejects ⟨𝑀 ′⟩, so 𝐻 rejects (⟨𝑀⟩, 𝑥).

Thus, 𝐻 accepts (⟨𝑀⟩, 𝑥) when 𝑀 halts on 𝑥, and 𝐻 rejects (⟨𝑀⟩, 𝑥) when 𝑀 loops on 𝑥. This makes 𝐻 a decider for the language 𝐿HALT.

Since we constructed a decider 𝐻 for 𝐿HALT from a decider 𝐸 for 𝐿𝜀-HALT, we have demonstrated that 𝐿HALT ≤T 𝐿𝜀-HALT. Since 𝐿HALT is undecidable, it follows that 𝐿𝜀-HALT is also undecidable.

Example 68 Let

𝐿FOO = {⟨𝑀⟩ ∶ 𝑓𝑜𝑜 ∈ 𝐿(𝑀)}

We show that 𝐿FOO is undecidable via the Turing reduction 𝐿ACC ≤T 𝐿FOO. Suppose that 𝐹 is a black box that decides 𝐿FOO. It takes a single input ⟨𝑀⟩, and we have:

• if 𝑀 accepts 𝑓𝑜𝑜, then 𝑓𝑜𝑜 ∈ 𝐿(𝑀) and 𝐹 accepts ⟨𝑀⟩
• if 𝑀 rejects or loops on 𝑓𝑜𝑜, then 𝑓𝑜𝑜 ∉ 𝐿(𝑀) and 𝐹 rejects ⟨𝑀⟩

We need to construct a decider 𝐶 for 𝐿ACC that takes two inputs ⟨𝑀⟩ and 𝑥 and:

• if 𝑀 accepts 𝑥, then 𝐶 accepts (⟨𝑀⟩, 𝑥)

30 Since M_source and x may contain newlines and other characters that are not allowed in a C++ string literal, we would need to either process the strings to replace such characters with their respective escape sequences or use raw string literals31. The following is the relevant line of code using raw literals:

+ " return U(R\"@" + M_source + "@\", R\"@" + x + "@\");\n"

Here, @ represents a sequence of characters that does not appear in M_source or x.
31 http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2442.htm


• if 𝑀 rejects or loops on 𝑥, then 𝐶 rejects (⟨𝑀⟩, 𝑥)

Passing ⟨𝑀⟩ directly to 𝐹 does not tell us anything about whether 𝑀 accepts a given input 𝑥. Instead, we construct a new program 𝑀′ such that:

• if 𝑀 accepts 𝑥, then 𝑀′ accepts 𝑓𝑜𝑜

• if 𝑀 rejects or loops on 𝑥, then 𝑀′ rejects or loops on 𝑓𝑜𝑜

The following is one such program:

𝑀′ on input 𝑤:
    If 𝑤 = 𝑓𝑜𝑜, run 𝑀 on 𝑥 and answer as 𝑀
    Else reject

Here, by “answer as 𝑀”, we mean accept if 𝑀 accepts and reject if 𝑀 rejects. Observe that both 𝑀 and 𝑥 are hardcoded into the source code for 𝑀′.

The full construction of the decider 𝐶 is as follows:

𝐶 on input (⟨𝑀⟩, 𝑥):
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤:
            If 𝑤 = 𝑓𝑜𝑜, run 𝑀 on 𝑥 and answer as 𝑀
            Else reject”
    Run 𝐹 on ⟨𝑀′⟩ and answer as 𝐹

We analyze 𝐶:

• If 𝑀 accepts 𝑥, then 𝑀′ accepts 𝑓𝑜𝑜 and 𝑓𝑜𝑜 ∈ 𝐿(𝑀 ′). Thus, 𝐹 accepts ⟨𝑀 ′⟩, and 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 does not accept 𝑥, then 𝑀′ does not accept 𝑓𝑜𝑜 and 𝑓𝑜𝑜 ∉ 𝐿(𝑀′). Thus, 𝐹 rejects ⟨𝑀′⟩, and 𝐶 rejects (⟨𝑀⟩, 𝑥).

Since 𝐶 accepts (⟨𝑀⟩, 𝑥) exactly when 𝑀 accepts 𝑥, 𝐶 is a decider for 𝐿ACC, and we have shown 𝐿ACC ≤T 𝐿FOO. We conclude that 𝐿FOO is undecidable.
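Following the same pattern as the earlier “Constructing a Program on the Fly in C++” box, here is a sketch of this decider — assuming the hypothetical black box F, and ignoring the string-escaping issue discussed in the footnote there:

#include <string>
using std::string;

int F(string M_source);  // hypothetical black-box decider for L_FOO (assumed)

// Decider for L_ACC built from F: construct the source code of M', which
// accepts the string "foo" exactly when M accepts x, and hand it to F.
int C(string M_source, string x) {
    string Mp =
        "#include \"U.hpp\"\n"
        "int Mprime(string w) {\n"
        "    if (w == \"foo\")\n"
        "        return U(\"" + M_source + "\", \"" + x + "\");\n"
        "    return 0;\n"
        "}";
    return F(Mp);
}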

Example 69 Define

𝐿EQ = {(⟨𝑀1⟩, ⟨𝑀2⟩) ∶ 𝐿(𝑀1) = 𝐿(𝑀2)}

This corresponds to the problem of determining whether two programs accept the same set of inputs. We show that 𝐿EQ is undecidable using the reduction 𝐿ACC ≤T 𝐿EQ. Suppose 𝑄 is a black-box decider for 𝐿EQ. Then it takes two inputs ⟨𝑀1⟩ and ⟨𝑀2⟩ and:

• if 𝐿(𝑀1) = 𝐿(𝑀2), then 𝑄 accepts (⟨𝑀1⟩, ⟨𝑀2⟩)
• if 𝐿(𝑀1) ≠ 𝐿(𝑀2), then 𝑄 rejects (⟨𝑀1⟩, ⟨𝑀2⟩)

A decider 𝐶 for 𝐿ACC takes two inputs ⟨𝑀⟩ and 𝑥 and accepts exactly when 𝑀 accepts 𝑥. We do not care how 𝑀 behaves on inputs other than 𝑥, so we likely do not want to pass ⟨𝑀⟩ directly to 𝑄. Instead, we construct a new program 𝑀1 such that it has a different language depending on whether 𝑀 accepts 𝑥. In particular, we will arrange for 𝐿(𝑀1) = Σ∗ if 𝑀 accepts 𝑥, and 𝐿(𝑀1) = ∅ if 𝑀 does not accept 𝑥. The following definition of


𝑀1 does the trick:

𝑀1 on input 𝑤:
    Run 𝑀 on 𝑥 and answer as 𝑀

We still need a second argument 𝑀2 to pass to 𝑄 – we fix 𝑀2 such that it accepts everything, so that 𝐿(𝑀1) = Σ∗ = 𝐿(𝑀2) exactly when 𝑀 accepts 𝑥. Then the full definition of 𝐶 is as follows:

𝐶 on input (⟨𝑀⟩, 𝑥):
    Construct a new program 𝑀1:
        “𝑀1 on input 𝑤:
            Run 𝑀 on 𝑥 and answer as 𝑀”
    Construct a new program 𝑀2:
        “𝑀2 on input 𝑤:
            Accept”
    Run 𝑄 on (⟨𝑀1⟩, ⟨𝑀2⟩) and answer as 𝑄

We analyze 𝐶:

• If 𝑀 accepts 𝑥, then 𝑀1 accepts all inputs, so 𝐿(𝑀1) = Σ∗ = 𝐿(𝑀2). Thus, 𝑄 accepts (⟨𝑀1⟩, ⟨𝑀2⟩), and 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 does not accept 𝑥, then 𝑀1 does not accept any input, so 𝐿(𝑀1) = ∅ ≠ 𝐿(𝑀2). Thus, 𝑄 rejects (⟨𝑀1⟩, ⟨𝑀2⟩), and 𝐶 rejects (⟨𝑀⟩, 𝑥).

Thus, 𝐶 is a decider for 𝐿ACC, and we have shown 𝐿ACC ≤T 𝐿EQ. We conclude that 𝐿EQ is undecidable.

Example 70 We have seen that 𝐿ACC ≤T 𝐿HALT. We now show that 𝐿HALT ≤T 𝐿ACC. Assume that we have a black-box decider 𝐶 for 𝐿ACC. We have:

• if 𝑀 accepts 𝑥, then 𝐶 accepts (⟨𝑀⟩, 𝑥)
• if 𝑀 rejects 𝑥, then 𝐶 rejects (⟨𝑀⟩, 𝑥)
• if 𝑀 loops on 𝑥, then 𝐶 rejects (⟨𝑀⟩, 𝑥)

We need to construct a decider 𝐻 for 𝐿HALT such that:

• if 𝑀 accepts 𝑥, then 𝐻 accepts (⟨𝑀⟩, 𝑥)
• if 𝑀 rejects 𝑥, then 𝐻 accepts (⟨𝑀⟩, 𝑥)
• if 𝑀 loops on 𝑥, then 𝐻 rejects (⟨𝑀⟩, 𝑥)

The only case where the behavior of 𝐶 and 𝐻 differs is when 𝑀 rejects 𝑥; if we were to invoke 𝐶 on (⟨𝑀⟩, 𝑥), we would get the right answer for when 𝑀 accepts or loops on 𝑥. To use 𝐶 to get the correct answer when 𝑀 rejects 𝑥, we can construct a new program 𝑀′ that accepts 𝑥 when 𝑀 rejects 𝑥. There are several ways to do so; here is one:

𝑀′ on input 𝑤 ∶

Run 𝑀 on 𝑥 and answer the opposite of 𝑀


We can then invoke 𝐶 twice, once on 𝑀 and once on 𝑀′. We define 𝐻 as follows:

𝐻 on input (⟨𝑀⟩, 𝑥):
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤:
            Run 𝑀 on 𝑥 and answer the opposite of 𝑀”
    Run 𝐶 on (⟨𝑀⟩, 𝑥) and (⟨𝑀′⟩, 𝑥)
    If 𝐶 accepts either (⟨𝑀⟩, 𝑥) or (⟨𝑀′⟩, 𝑥), accept
    Else reject

We analyze 𝐻 as follows:

• If 𝑀 accepts 𝑥, then 𝐶 accepts (⟨𝑀⟩, 𝑥), so 𝐻 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 rejects 𝑥, then 𝑀′ accepts all inputs, so 𝐶 accepts (⟨𝑀 ′⟩, 𝑥) and 𝐻 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 loops on 𝑥, then 𝑀′ loops on all inputs, so 𝐶 rejects both (⟨𝑀⟩, 𝑥) and (⟨𝑀′⟩, 𝑥). Then 𝐻 rejects (⟨𝑀⟩, 𝑥).

Thus, 𝐻 decides 𝐿HALT, and we have demonstrated that 𝐿HALT ≤T 𝐿ACC.

Observe that we invoked the black box 𝐶 twice. In general, for the Turing reduction 𝐴 ≤T 𝐵, the decider for 𝐴 may call the black-box decider for 𝐵 any number of times, including zero or more than once.
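A sketch of this construction in the C++ abstraction, assuming the hypothetical black box C, the interpreter U, and again ignoring string escaping:

#include <string>
using std::string;

int C(string M_source, string x);  // hypothetical black-box decider for L_ACC (assumed)

// Decider for L_HALT: accept if either M accepts x, or the "opposite"
// program M' accepts x (which happens exactly when M rejects x).
int H(string M_source, string x) {
    string Mp =
        "#include \"U.hpp\"\n"
        "int Mprime(string w) {\n"
        "    return 1 - U(\"" + M_source + "\", \"" + x + "\");\n"
        "}";
    if (C(M_source, x) == 1) return 1;  // M accepts x
    if (C(Mp, x) == 1) return 1;        // M' accepts x, i.e. M rejects x
    return 0;                           // neither accepts, so M loops on x
}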

Exercise 71 Prove that 𝐿ACC ≤T 𝐿𝜀-HALT.

Exercise 72 Let

𝐿𝜀-ACC = {⟨𝑀⟩ ∶ 𝑀 accepts 𝜀}

Prove that 𝐿𝜀-ACC is undecidable.

Exercise 73 Determine, with proof, whether or not the following language is decidable:

𝐿3 = {⟨𝑀⟩ ∶ ∣𝐿(𝑀)∣ = 3}

Exercise 74 Determine, with proof, whether or not the following language is decidable:

𝐿EVEN = {⟨𝑀⟩ ∶ ∣𝐿(𝑀)∣ is finite and even}


11.1 Wang Tiling

The examples of undecidable problems we have seen thus far have been concerned with the behavior of Turing machines or programs in general. However, undecidable problems are everywhere – as we discussed previously, there are more undecidable problems than decidable ones. We take a look at undecidability in the context of a geometric problem, that of tiling a plane32.

Given a finite set of shapes 𝑆, is it possible to tile a plane using only tiles from 𝑆? We are allowed to use as many of each tile as we want, but we are not allowed to overlap them or leave any gaps between the tiles we use. For the purposes of this discussion, we will also disallow rotation of tiles – typically, we can just increase the tile set with rotated copies instead.

An example of a tiling is the following from Study of Regular Division of the Plane with Reptiles by M. C. Escher33:

Here, the plane is tiled with a regular pattern using three reptile-shaped tiles.

Another example more akin to a jigsaw puzzle is using the following set of tiles:

We can construct a regular grouping of tiles that fit together:

We can then repeat this pattern to tile the whole plane:

32 https://en.wikipedia.org/wiki/Tessellation
33 https://en.wikipedia.org/wiki/M._C._Escher


While the two examples above admit tilings that are periodic, it is also possible to tile the plane in a non-periodic manner34. The following is an example of a Penrose tiling35 that does so:

Not all tile sets 𝑆 admit tilings of the entire plane. Given a set of tiles 𝑆, we would like to use an algorithm to determine whether it is possible to tile the plane with that set. We define the language 𝐿TILE as follows:

𝐿TILE = {⟨𝑆⟩ ∶ 𝑆 is a set of tiles that admits a tiling of the entire plane}

Is 𝐿TILE decidable? It turns out the answer is no. We can use tiling to simulate a Turing machine, and we can construct a tile set 𝑆𝑀 from a Turing machine 𝑀 such that deciding whether 𝑆𝑀 tiles the plane is equivalent to deciding whether ⟨𝑀⟩ ∈ 𝐿𝜀-HALT. This means that 𝐿𝜀-HALT ≤T 𝐿TILE, so that 𝐿TILE is undecidable.

For the rest of our discussion, we will restrict ourselves to the slightly simpler problem of determining whether a tile set 𝑆 can tile a quadrant of the plane rather than a whole plane. Thus, the language we examine will be as follows:

𝐿QTILE = {⟨𝑆⟩ ∶ 𝑆 is a set of tiles that admits a tiling of a quadrant of the plane}
34 https://en.wikipedia.org/wiki/Aperiodic_tiling
35 https://en.wikipedia.org/wiki/Penrose_tiling


We consider tiles with a particular structure, namely squares with a color on each edge. These are known as Wang tiles.

(Figure: three example Wang tiles, numbered 1–3. Each is a square with a color — green, red, blue, or black — on each of its four edges.)

Two tiles may only be adjacent if the colors match on their shared border. Tiles may not be rotated or flipped. We then pick a color for the boundary of the quadrant; here, we have chosen black for the boundary.

(Figure: the corner of the quadrant with a black boundary, showing tile 2 in the bottom-left corner, tile 3 to its right, and copies of tile 1 stacked above tile 2.)

The question then is whether or not a given set of tiles can tile the whole quadrant. For the tile set above, we can see that it cannot. Only tile 2 may be placed in the bottom-left corner, then only tile 3 may be to its right and only tile 1 above. While we can continue placing more copies of tile 1 in the upward direction, no tile has a left boundary that is colored red, so we cannot fill out the rest of the quadrant to the right.

While we were able to reason that in this specific case, the quadrant cannot be tiled with the given set, we cannot do so in general with any computational process. To demonstrate this, we will construct a tile set that simulates the execution of a Turing machine. We will illustrate this on a single, simple machine, but the same process can be done for any machine.

Before proceeding to our construction, we define a notation for representing the execution state, or configuration, of a Turing machine. We need to capture several details that uniquely describe the configuration of the machine:

• the state 𝑞 ∈ 𝑄 in which the Turing machine is

• the contents of the tape

• the position of the head

We represent these details with an infinite string over the alphabet Γ ∪ 𝑄. Specifically, the string contains the full contents of the tape, plus the state 𝑞 ∈ 𝑄 immediately to the left of the symbol for the cell at which the head is located. Thus, if the input to a machine is 𝑠1𝑠2 . . . 𝑠𝑛 and the initial state is 𝑞0, the following encodes the starting configuration of the machine:

𝑞0𝑠1𝑠2 . . . 𝑠𝑛⊥⊥ . . .


Since the head is at the leftmost cell and the state is 𝑞0, the string has 𝑞0 to the left of the leftmost tape symbol. Then if the transition function gives 𝛿(𝑞0, 𝑠1) = (𝑞′, 𝑠′, 𝑅), the new configuration is:

𝑠′𝑞′𝑠2 . . . 𝑠𝑛⊥⊥ . . .

The first cell has been modified to 𝑠′, the machine is in state 𝑞′, and the head is over the second cell, represented here by writing 𝑞′ to the left of that cell's symbol.

As a more concrete example, the configuration of the machine we saw previously that accepts strings containing only ones (page 63), when run on the input 111010111, is as follows at each step:

𝑞𝑠𝑡𝑎𝑟𝑡111010111⊥⊥⊥⊥ . . .

1𝑞𝑠𝑡𝑎𝑟𝑡11010111⊥⊥⊥⊥ . . .

11𝑞𝑠𝑡𝑎𝑟𝑡1010111⊥⊥⊥⊥ . . .

111𝑞𝑠𝑡𝑎𝑟𝑡010111⊥⊥⊥⊥ . . .

1110𝑞𝑟𝑒𝑗𝑒𝑐𝑡10111⊥⊥⊥⊥ . . .

We will encode such a configuration string using an infinite sequence of tiles, where the symbols along the top edges of the tiles comprise the configuration.

We consider a simple machine that just alternately writes two different symbols to the tape. Given an initial blank tape (i.e. 𝜀 as the input), the machine will write symbols indefinitely, never terminating.

(State diagram: the machine has two states 𝑞0 and 𝑞1 and two transitions, 𝛿(𝑞0, ⊥) = (𝑞1, 𝑎, 𝑅) and 𝛿(𝑞1, ⊥) = (𝑞0, 𝑏, 𝑅).)

We start constructing our tile set by introducing a tile for each symbol in the tape alphabet. Each tile is “colored” by the symbol at the top and bottom, and the left and right edges are colored blank.

(Figure: the symbol tiles for 𝑎 and 𝑏 — each has its symbol on its top and bottom edges, with blank left and right edges; the remaining tape symbol ⊥ gets an analogous tile.)

We then examine the transition function. For each leftward transition

𝛿(𝑝, 𝑖) = (𝑞, 𝑗, 𝐿)

and each element of the tape alphabet 𝑠 ∈ Γ, we introduce the following pair of tiles:

(Figure: the pair of tiles for a leftward transition. The left tile has (𝑞, 𝑠) on its top edge, 𝑠 on its bottom edge, and 𝑞 on its right edge; the right tile has 𝑗 on its top edge, (𝑝, 𝑖) on its bottom edge, and 𝑞 on its left edge.)

This represents part of the configuration when applying the transition. The bottom of the right tile indicates that we are originally in state 𝑝 with the head over the symbol 𝑖, and the bottom of the left tile indicates there is a symbol 𝑠 in


the cell to the left. The top of the tiles represents the result of the transition. The new state is 𝑞 and the head moves to the left, which we indicate by the top of the left tile. The symbol in the right tile is overwritten with 𝑗, which we indicate at the top of the right tile. Finally, we label the adjacent edges with 𝑞 to enable the two tiles to be adjacent in our tiling.

Similarly, for each rightward transition

𝛿(𝑝, 𝑖) = (𝑞, 𝑗, 𝑅)

and tape symbol 𝑠, we introduce the following pair of tiles:

(Figure: the pair of tiles for a rightward transition. The left tile has 𝑗 on its top edge, (𝑝, 𝑖) on its bottom edge, and 𝑞 on its right edge; the right tile has (𝑞, 𝑠) on its top edge, 𝑠 on its bottom edge, and 𝑞 on its left edge.)

Here, the head moves to the right, so we see that the original state 𝑝 appears at the bottom in the left tile and the new state 𝑞 appears at the top in the right tile.

For our simple Turing machine above, we only have two rightward transitions and three tape symbols, so the resulting transition tiles are those below:

(Figure: the resulting transition tiles — one pair of tiles for the transition 𝛿(𝑞0, ⊥) = (𝑞1, 𝑎, 𝑅) and one pair for 𝛿(𝑞1, ⊥) = (𝑞0, 𝑏, 𝑅), for each tape symbol 𝑠 ∈ {⊥, 𝑎, 𝑏}.)

We have repeated tiles here, but since we are constructing a set, the duplicates are consolidated into a single copy.

Finally, we introduce start tiles to encode the initial state of the machine. Since we are interested in the empty string as input, only blank symbols appear in our start tiles. We have a corner tile that encodes the start state and the initial head position all the way at the left, and we have a bottom tile that encodes the remaining blank cells:

(Figure: the two start tiles — a corner tile whose top edge is labeled (𝑞0, ⊥) and a bottom tile whose top edge is labeled ⊥; the tiles use a special marking ∗ on the edges they share with each other, and blank edges along the quadrant boundary.)

The full set of tiles for the simple Turing machine is as follows, with duplicates removed:


(Figure: the full tile set for the simple machine — a symbol tile for each tape symbol, the two start tiles, and the transition tiles for 𝛿(𝑞0, ⊥) = (𝑞1, 𝑎, 𝑅) and 𝛿(𝑞1, ⊥) = (𝑞0, 𝑏, 𝑅), with duplicates removed.)

Now that we have fleshed out our tile set, we can proceed to attempt to tile the quadrant. The quadrant boundary matches with an empty tile edge. The only tile that can go in the corner is our corner start tile.

(Figure: the corner start tile placed in the bottom-left corner of the quadrant. The horizontal direction corresponds to the Turing machine tape and the vertical direction to time.)

Once we place the corner tile, the only tile that can go to the right of it is a bottom start tile. We can place copies of this tile indefinitely to the right.


(Figure: the bottom row of the quadrant — the corner start tile followed by copies of the bottom start tile extending indefinitely to the right.)

The resulting bottom row encodes the initial state of the machine: every cell contains the blank symbol (since the input is 𝜀), the initial state is 𝑞0, and the head is on the leftmost cell. The position of the head is indicated by which tile has both a state and a symbol in its top edge.

Now that we have the bottom row, we look for tiles that can go above that row. Only one of our tiles has (𝑞0, ⊥) at its bottom, and it is one of the tiles for the transition 𝛿(𝑞0, ⊥) = (𝑞1, 𝑎, 𝑅). This forces us to use one of the other tiles for that transition as well, and since the symbol to the right is ⊥, we use the corresponding transition tile that has ⊥ at the bottom. The remaining tiles in the row can be symbol tiles for ⊥.


(Figure: the first two rows of the tiling. The top edges of the second row read 𝑎, (𝑞1, ⊥), ⊥, ⊥, . . ., encoding the configuration after one step.)

The configuration represented in the second row above is the machine having written 𝑎 in the first cell, being in the state 𝑞1, and with the head on the second cell. This is exactly the result of the transition.

For the next row, we have to use a symbol tile for 𝑎 at the left. Then we have to use transition tiles for 𝛿(𝑞1, ⊥) = (𝑞0, 𝑏, 𝑅). Finally, we have to fill out the rest of the row with symbol tiles for ⊥.

(Figure: the first three rows of the tiling. The top edges of the third row read 𝑎, 𝑏, (𝑞0, ⊥), ⊥, . . ., the configuration after two steps.)

Again, the configuration encoded here corresponds exactly with the state of the machine after two steps. We can proceed to do the next step as well.


(Figure: the first four rows of the tiling, each successive row encoding the machine's configuration after one more step.)

Since each row corresponds to a step of the Turing machine, the question of whether or not the quadrant can be tiled is the same as whether the machine performs infinitely many steps, i.e. whether it loops. Since the input we are working on here is 𝜀, we can tile the quadrant exactly when the machine 𝑀 loops on 𝜀.

The following construction formalizes our proof of undecidability of 𝐿QTILE by performing the reduction 𝐿𝜀-HALT ≤T 𝐿QTILE. Assume we have a decider 𝑇 for 𝐿QTILE. We can then construct a decider 𝐸 for 𝐿𝜀-HALT:

𝐸 on input ⟨𝑀⟩:
    Construct tile set 𝑆 from 𝑀 as outlined above
    Run 𝑇 on ⟨𝑆⟩
    If 𝑇 accepts ⟨𝑆⟩, reject
    Else accept

We then have the following cases:

• If 𝑀 halts on 𝜀, 𝑇 rejects ⟨𝑆⟩, so 𝐸 accepts ⟨𝑀⟩.
• If 𝑀 loops on 𝜀, 𝑇 accepts ⟨𝑆⟩, so 𝐸 rejects ⟨𝑀⟩.

Thus, 𝐸 is a decider for 𝐿𝜀-HALT, and we have demonstrated that 𝐿𝜀-HALT ≤T 𝐿QTILE. Since 𝐿𝜀-HALT is undecidable, so is 𝐿QTILE.

Exercise 75 Modify the construction we used in the reduction 𝐿𝜀-HALT ≤T 𝐿QTILE to show instead that 𝐿HALT ≤T 𝐿QTILE.

Hint: Only the set of start tiles needs to be changed, and the set will be different for different inputs.


CHAPTER

TWELVE

RECOGNIZABILITY

We have seen several examples of undecidable languages, including 𝐿ACC, 𝐿HALT, and 𝐿𝜀-HALT. As we discussed before, for a machine 𝑀 to decide a language 𝐴, it must halt on every input:

• If 𝑥 ∈ 𝐴, 𝑀 accepts 𝑥.

• If 𝑥 ∉ 𝐴, 𝑀 rejects 𝑥.

We also saw that not all machines are deciders. If a machine 𝑀 is not a decider, it can still be associated with a language 𝐴 if it accepts all inputs in 𝐴 and does not accept any inputs not in 𝐴:

• If 𝑥 ∈ 𝐴, 𝑀 accepts 𝑥.

• If 𝑥 ∉ 𝐴, 𝑀 rejects or loops on 𝑥.

If this is the case, then 𝐿(𝑀) = 𝐴 and we say that 𝑀 recognizes the language 𝐴. Since every machine 𝑀 has an associated language 𝐿(𝑀), every machine 𝑀 is a recognizer for its own language 𝐿(𝑀). But a machine is only a decider if it halts on every input.

A language 𝐴 is recognizable if there exists some machine that recognizes 𝐴, i.e. there exists a machine 𝑀 such that 𝐿(𝑀) = 𝐴. A language may be recognizable but undecidable. In fact, we saw that the language of a universal Turing machine (an interpreter) 𝑈 is 𝐿(𝑈) = 𝐿ACC. Thus, 𝐿ACC is recognizable.

Are all undecidable languages recognizable? In other words, if we give up the requirement of always halting, is there a machine associated with every possible language? Unfortunately, we have already seen (page 94) this is not the case – we constructed a language 𝐴 that is not the language of any machine. Thus, we proved that there exist unrecognizable languages. However, our proof was not constructive, so we have yet to see a concrete example of an unrecognizable language.

How do we show that a language 𝐿 is recognizable? Similar to how we show that a language is decidable, we need only write a recognizer for 𝐿. We have already seen that the universal Turing machine is a recognizer for 𝐿ACC. We proceed to construct a recognizer for 𝐿HALT:

𝐻𝑟 on input (⟨𝑀⟩, 𝑥):
    Run 𝑀 on 𝑥
    Accept

We need to show that if (⟨𝑀⟩, 𝑥) ∈ 𝐿HALT, then 𝐻𝑟 accepts (⟨𝑀⟩, 𝑥), otherwise 𝐻𝑟 rejects or loops on (⟨𝑀⟩, 𝑥).

• If (⟨𝑀⟩, 𝑥) ∈ 𝐿HALT, then 𝑀 halts on 𝑥. In this case, the simulation of 𝑀 on 𝑥 in 𝐻𝑟 terminates, and 𝐻𝑟 proceeds to accept (⟨𝑀⟩, 𝑥).
• If (⟨𝑀⟩, 𝑥) ∉ 𝐿HALT, then 𝑀 loops on 𝑥. In this case, the simulation of 𝑀 on 𝑥 does not terminate, so 𝐻𝑟 loops on (⟨𝑀⟩, 𝑥).

Thus, 𝐻𝑟 correctly recognizes 𝐿HALT.
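In the C++ abstraction from the previous chapter, the recognizer is just a simulation followed by an unconditional accept — a sketch assuming the interpreter U:

#include <string>
using std::string;

int U(string M_source, string x);  // universal machine / interpreter (exists)

// Recognizer for L_HALT: simulate M on x; if the simulation ever returns,
// M halted, so accept. If M loops on x, the call to U never returns and
// Hr loops as well -- which is allowed for a recognizer.
int Hr(string M_source, string x) {
    U(M_source, x);  // may never return
    return 1;        // reached only if M halted on x
}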

As another example, suppose that 𝐴1 and 𝐴2 are recognizable languages. Is 𝐵 = 𝐴1 ∪ 𝐴2 = {𝑥 ∶ 𝑥 ∈ 𝐴1 or 𝑥 ∈ 𝐴2} recognizable? In other words, is the class of recognizable languages closed under the union operation?


We previously demonstrated (page 97) that the class of decidable languages is closed under union. We proceed to follow a similar strategy here. Since 𝐴1 and 𝐴2 are recognizable, there exist machines 𝑀1 and 𝑀2 that recognize 𝐴1 and 𝐴2, respectively. We construct a machine 𝑀 to recognize 𝐴1 ∪ 𝐴2:

𝑀 on input 𝑥:
    Run 𝑀1 on 𝑥, accept if 𝑀1 accepts
    Run 𝑀2 on 𝑥, accept if 𝑀2 accepts
    Reject

Unfortunately, this construction does not work when 𝑀1 and 𝑀2 are not deciders. For instance, if 𝑀1 were to loop on 𝑥, we would never get to running 𝑀2 on 𝑥. If it were the case that 𝑀2 accepts 𝑥, 𝑀 would never get around to accepting 𝑥 since it would be stuck running the non-terminating execution of 𝑀1 on 𝑥.

The problem here is similar to what we encountered in showing that the set of integers is countable (page 90). Our attempt to do so by listing all the nonnegative integers first failed because we would never exhaust the nonnegative integers to get to the negative integers. The solution was to interleave the nonnegative integers with the negative integers, so that we would eventually reach any specific element. We can follow a similar strategy here of alternation, interleaving the execution of 𝑀1 and 𝑀2:

𝑀 on input 𝑥 ∶
    Alternate between the executions of 𝑀1 and 𝑀2 on 𝑥
    Accept if any of 𝑀1 and 𝑀2 accepts
    Reject if both 𝑀1 and 𝑀2 reject

How can we alternate between the execution of two programs 𝑀1 and 𝑀2? If we are simulating their execution, this means running one step of 𝑀1 (e.g. applying its transition function once), then one step of 𝑀2, then another step of 𝑀1, and so on. On an actual computer, we can run a program step-by-step through a debugger. Similarly, real computers simulate parallel execution on sequential processing units by rapidly switching between programs. Thus, we can alternate between programs both in theory and in practice.
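To make this concrete, the following is a small C++ sketch of the interleaving (a sketch only, under the assumption that each machine is modeled as a hypothetical single-step callback; the names StepFn and recognize_union are illustrative and not part of the construction above):

#include <functional>
#include <optional>

// Hypothetical model of a machine that can be advanced one step at a time:
// step() returns nullopt while the machine is still running, true once it
// has accepted, and false once it has rejected.
using StepFn = std::function<std::optional<bool>()>;

// Recognizer for A1 ∪ A2: interleave single steps of M1 and M2 on the same
// input. Accept as soon as either accepts; reject only if both have rejected.
// If neither accepts and at least one loops, this never returns, mirroring
// the fact that M may loop on inputs outside A1 ∪ A2.
bool recognize_union(StepFn step_m1, StepFn step_m2) {
  std::optional<bool> r1, r2;    // results of M1 and M2, once they halt
  while (true) {
    if (!r1) r1 = step_m1();     // one step of M1, if it has not halted yet
    if (r1 && *r1) return true;
    if (!r2) r2 = step_m2();     // one step of M2, if it has not halted yet
    if (r2 && *r2) return true;
    if (r1 && r2) return false;  // both halted without accepting
  }
}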

We proceed to analyze the behavior of 𝑀 :

• If 𝑥 ∈ 𝐴1, then 𝑀1 accepts 𝑥 since 𝑀1 is a recognizer for 𝐴1. Since 𝑀 alternates between the executions of 𝑀1 and 𝑀2 on 𝑥, it will eventually get to the point where 𝑀1 accepts (or 𝑀2 accepts first), so 𝑀 accepts 𝑥.

• If 𝑥 ∉ 𝐴1, then 𝑀1 rejects or loops on 𝑥. There are two cases depending on whether 𝑥 is or isn’t in 𝐴2:

– If 𝑥 ∈ 𝐴2, 𝑀2 accepts 𝑥. Since 𝑀 alternates between executing 𝑀1 and 𝑀2, it will eventually reach a point where 𝑀2 accepts 𝑥, and 𝑀 will also accept 𝑥.

– If 𝑥 ∉ 𝐴2, 𝑀2 rejects or loops on 𝑥. If both 𝑀1 and 𝑀2 reject 𝑥, then 𝑀 proceeds to reject 𝑥. If either of 𝑀1 or 𝑀2 loops on 𝑥, then 𝑀 will never finish simulating execution of that machine, so 𝑀 will loop on 𝑥. In either case, 𝑀 rejects or loops on 𝑥.

Thus, 𝑀 accepts all inputs that are in 𝐴1 or 𝐴2, and it rejects or loops on an input 𝑥 if 𝑥 ∉ 𝐴1 ∧ 𝑥 ∉ 𝐴2. Since 𝐵 = 𝐴1 ∪ 𝐴2, this means that 𝑀 accepts all inputs in 𝐵 and rejects or loops on all inputs not in 𝐵, so 𝑀 is a recognizer for 𝐵. Since there exists a machine that recognizes 𝐵, 𝐵 is a recognizable language.

We conclude that the class of recognizable languages is closed under union, and we can similarly show that it is closed under intersection.

Exercise 76 Show that the class of recognizable languages is closed under intersection.

Is the class of recognizable languages closed under complement? In other words, if 𝐴 is a recognizable language, is its complement 𝐴 also recognizable?

Claim 77 If a language 𝐴 and its complement 𝐴 are both recognizable, then 𝐴 is decidable.


Proof 78 Suppose both 𝐴 and 𝐴 are recognizable. Then there exist machines 𝑀1 and 𝑀2 that recognize 𝐴 and 𝐴, respectively. Since 𝑀1 recognizes 𝐴, we have:

• If 𝑥 ∈ 𝐴, then 𝑀1 accepts 𝑥.

• If 𝑥 ∉ 𝐴, then 𝑀1 rejects or loops on 𝑥.

Similarly, since 𝑀2 recognizes 𝐴, we have:

• If 𝑥 ∈ 𝐴, then 𝑀2 rejects or loops on 𝑥.

• If 𝑥 ∉ 𝐴, then 𝑀2 accepts 𝑥.

This follows from the fact that 𝐴 and its complement 𝐴 partition the universe Σ∗, so 𝑥 ∈ 𝐴 means that 𝑥 is not in the complement, and 𝑥 ∉ 𝐴 means that 𝑥 is in the complement.

(Figure: the universe Σ∗ partitioned into 𝑨 = {𝑥 ∶ 𝑥 ∈ 𝐴} and its complement {𝑥 ∶ 𝑥 ∉ 𝐴}.)

We proceed to construct a decider 𝑀 for 𝐴:

𝑀 on input 𝑥 ∶
    Alternate between the executions of 𝑀1 and 𝑀2 on 𝑥
    If 𝑀1 accepts, accept
    If 𝑀2 accepts, reject

Then there are the following cases:

• If 𝑥 ∈ 𝐴, then 𝑀1 accepts 𝑥, so the simulation of 𝑀1 on 𝑥 will eventually accept. On the other hand, 𝑀2 must reject or loop on 𝑥, so 𝑀2 will not accept 𝑥. Thus, only the first conditional in 𝑀 applies, and 𝑀 accepts 𝑥.

• If 𝑥 ∉ 𝐴, then 𝑀2 accepts 𝑥, so the simulation of 𝑀2 on 𝑥 will eventually accept. On the other hand, 𝑀1 must reject or loop on 𝑥, so 𝑀1 will not accept 𝑥. Thus, only the second conditional in 𝑀 applies, and 𝑀 rejects 𝑥.

Since 𝑀 accepts all inputs 𝑥 ∈ 𝐴 and rejects all inputs 𝑥 ∉ 𝐴, 𝑀 is a decider for 𝐴, and 𝐴 is a decidable language.

The contrapositive of Claim 77 is that if a language 𝐴 is undecidable, then at least one of 𝐴 and its complement 𝐴 is unrecognizable. This gives us a tool to demonstrate that a language is unrecognizable.


Corollary 79 If a language 𝐴 is undecidable but recognizable, then its complement 𝐴 is unrecognizable.

We can immediately conclude that the complements of 𝐿ACC and 𝐿HALT,

{(⟨𝑀⟩, 𝑥) ∶ 𝑀 is a program and 𝑀 does not accept 𝑥}

and

{(⟨𝑀⟩, 𝑥) ∶ 𝑀 is a program that loops on input 𝑥},

are unrecognizable. Since their complements are recognizable, we refer to these languages as co-recognizable.

Definition 80 A language 𝐿 is co-recognizable if its complement 𝐿 is recognizable.

Since the class of decidable languages is closed under complement and all decidable languages are recognizable, a decidable language is both recognizable and co-recognizable. In fact, Claim 77 demonstrates that the class of decidable languages is exactly the intersection of the recognizable and co-recognizable languages36.

Since every program 𝑀 recognizes exactly one language 𝐿(𝑀), it “co-recognizes” exactly one language, the complement of 𝐿(𝑀). As we discussed previously (page 94), the set of programs is countable, and since each program corresponds to only a single recognizable and co-recognizable language, the set of recognizable and co-recognizable languages is also countable. As the set of languages is uncountable, most languages are neither recognizable nor co-recognizable.

Example 81 Define the language

𝐿NO-FOO = {⟨𝑀⟩ ∶ 𝑓𝑜𝑜 ∉ 𝐿(𝑀)}

We demonstrate that 𝐿NO-FOO is unrecognizable.

We start by showing that it is undecidable via the reduction 𝐿ACC ≤T 𝐿NO-FOO. Assume we have a decider 𝐹 for 𝐿NO-FOO. We construct a decider 𝐶 for 𝐿ACC as follows:

𝐶 on input (⟨𝑀⟩, 𝑥) ∶
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤 ∶
            Run 𝑀 on 𝑥 and answer as 𝑀”
    Run 𝐹 on ⟨𝑀′⟩ and answer the opposite of 𝐹

We analyze 𝐶 as follows:

• If 𝑀 accepts 𝑥, then 𝑀′ accepts all inputs, including 𝑓𝑜𝑜, so 𝑓𝑜𝑜 ∈ 𝐿(𝑀′) and 𝐹 rejects ⟨𝑀′⟩. Then 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 rejects 𝑥, then 𝑀′ rejects all inputs, including 𝑓𝑜𝑜, so 𝑓𝑜𝑜 ∉ 𝐿(𝑀′) and 𝐹 accepts ⟨𝑀′⟩. Then 𝐶 rejects (⟨𝑀⟩, 𝑥).

• If 𝑀 loops on 𝑥, then 𝑀′ loops on all inputs, including 𝑓𝑜𝑜, so 𝑓𝑜𝑜 ∉ 𝐿(𝑀′) and 𝐹 accepts ⟨𝑀′⟩. Then 𝐶 rejects (⟨𝑀⟩, 𝑥).

Thus, 𝐶 is a decider for 𝐿ACC. We have demonstrated that 𝐿ACC ≤T 𝐿NO-FOO, and we conclude that 𝐿NO-FOO is undecidable.

36 The class of recognizable languages is denoted RE (short for recursively enumerable, an equivalent definition of recognizable), and the class of co-recognizable languages is denoted coRE. The class of decidable languages is denoted R (short for recursive). We have RE ∩ coRE = R. It is possible for a language to be neither in RE nor in coRE, but how to prove this is beyond the scope of this text.


We now demonstrate that the complement of 𝐿NO-FOO,

{⟨𝑀⟩ ∶ 𝑓𝑜𝑜 ∈ 𝐿(𝑀)},

is recognizable. The following program recognizes it:

𝑅 on input ⟨𝑀⟩ ∶
    Run 𝑀 on 𝑓𝑜𝑜 and answer as 𝑀

We have:

• If 𝑀 accepts 𝑓𝑜𝑜, then 𝑅 accepts ⟨𝑀⟩.
• If 𝑀 rejects 𝑓𝑜𝑜, then 𝑅 rejects ⟨𝑀⟩.
• If 𝑀 loops on 𝑓𝑜𝑜, then 𝑅 loops on ⟨𝑀⟩.

We thus have that 𝑅 accepts ⟨𝑀⟩ when ⟨𝑀⟩ is in the complement of 𝐿NO-FOO, and 𝑅 rejects or loops on ⟨𝑀⟩ when it is not. Thus, 𝑅 is a recognizer for the complement of 𝐿NO-FOO, which is therefore recognizable.

Since 𝐿NO-FOO is undecidable, its complement is also undecidable, as the class of decidable languages is closed under complement. Then the complement of 𝐿NO-FOO is undecidable but recognizable, so we conclude from Corollary 79 that 𝐿NO-FOO itself is unrecognizable.

Example 82 Define the language

𝐿NO-FOO-BAR = {⟨𝑀⟩ ∶ 𝑓𝑜𝑜 ∉ 𝐿(𝑀) and 𝑏𝑎𝑟 ∉ 𝐿(𝑀)}

The same construction as for 𝐿NO-FOO shows that 𝐿ACC ≤T 𝐿NO-FOO-BAR, so 𝐿NO-FOO-BAR is undecidable.

We demonstrate that 𝐿NO-FOO-BAR is unrecognizable by showing that its complement

{⟨𝑀⟩ ∶ 𝑓𝑜𝑜 ∈ 𝐿(𝑀) or 𝑏𝑎𝑟 ∈ 𝐿(𝑀)}

is recognizable. The following is a recognizer for it:

𝑅 on input ⟨𝑀⟩ ∶
    Alternate between the executions of 𝑀 on 𝑓𝑜𝑜 and 𝑏𝑎𝑟
    If at least one execution accepts, accept
    Else reject

We have:

• If 𝑀 accepts 𝑓𝑜𝑜 or accepts 𝑏𝑎𝑟, then 𝑅 accepts ⟨𝑀⟩.
• If 𝑀 rejects 𝑓𝑜𝑜 and rejects 𝑏𝑎𝑟, then 𝑅 rejects ⟨𝑀⟩.
• If 𝑀 does not accept either 𝑓𝑜𝑜 or 𝑏𝑎𝑟 and loops on at least one of the two, then 𝑅 loops on ⟨𝑀⟩.

We thus have that 𝑅 accepts ⟨𝑀⟩ when ⟨𝑀⟩ is in the complement of 𝐿NO-FOO-BAR, and 𝑅 rejects or loops on ⟨𝑀⟩ when it is not. Thus, 𝑅 is a recognizer for the complement of 𝐿NO-FOO-BAR, which is therefore recognizable. We conclude from Corollary 79 that 𝐿NO-FOO-BAR is unrecognizable.


12.1 Dovetailing

As another example of an unrecognizable language, define 𝐿∅ as:

𝐿∅ = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) = ∅}

In other words, 𝐿∅ is the set of source codes for programs that do not accept any input.

Note that 𝐿∅ is not the same language as ∅ = {}. The latter is the language that contains no elements, and the following is a decider for it:

𝐷∅ on input 𝑥 ∶
    Reject

𝐿∅ on the other hand is a language that contains the source codes of programs 𝑀 such that 𝐿(𝑀) = ∅, i.e. 𝑀 does not accept any inputs. There are infinitely many such programs; the following is another example distinct from 𝐷∅:

𝑀′ on input 𝑥 ∶
    If ∣𝑥∣ < 5, reject
    Else loop

𝑀′ is not a decider since it may loop, but it is still the case that 𝐿(𝑀′) = ∅ since 𝑀′ does not accept any inputs. Thus, ⟨𝑀′⟩ ∈ 𝐿∅.

We proceed to show that 𝐿∅ is unrecognizable. First, we must demonstrate that 𝐿∅ is undecidable. We do so with the reduction 𝐿ACC ≤T 𝐿∅. Given a decider 𝑁 for 𝐿∅, we construct a decider 𝐶 for 𝐿ACC:

𝐶 on input (⟨𝑀⟩, 𝑥) ∶
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤 ∶
            Run 𝑀 on 𝑥 and answer as 𝑀”
    Run 𝑁 on ⟨𝑀′⟩
    If 𝑁 accepts ⟨𝑀′⟩, reject
    Else accept

We analyze this program as follows:

• If 𝑀 accepts 𝑥, then 𝑀′ accepts all inputs 𝑤, so 𝐿(𝑀′) ≠ ∅ and ⟨𝑀′⟩ ∉ 𝐿∅. Thus, 𝑁 rejects ⟨𝑀′⟩, in which case 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 does not accept 𝑥, then 𝑀′ does not accept any inputs 𝑤, so 𝐿(𝑀′) = ∅ and ⟨𝑀′⟩ ∈ 𝐿∅. Then 𝑁 accepts ⟨𝑀′⟩, in which case 𝐶 rejects (⟨𝑀⟩, 𝑥).

Observe that 𝐶 always halts – it just constructs the new program 𝑀′ and runs 𝑁 on it, and the latter halts because it is a decider. Thus, 𝐶 is a decider, and the language it decides is 𝐿ACC. This demonstrates that 𝐿ACC ≤T 𝐿∅, and since 𝐿ACC is undecidable, 𝐿∅ is also undecidable.

Since the class of decidable languages is closed under complement, we can conclude that the complement of 𝐿∅,

{⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) ≠ ∅},

is also undecidable.

We now define a recognizer 𝑅 for this complement. We require the following behavior:

• If 𝑀 accepts at least one input 𝑥, then ⟨𝑀⟩ is in the complement of 𝐿∅, so 𝑅 must accept ⟨𝑀⟩.


• If 𝑀 does not accept any inputs 𝑥, then ⟨𝑀⟩ is not in the complement of 𝐿∅, so 𝑅 must reject or loop on ⟨𝑀⟩.

We don’t know a priori which inputs 𝑀 accepts. If we want to determine whether 𝑀 accepts at least one input in some finite set 𝑆 = {𝑤1, . . . , 𝑤𝑛}, we can alternate between executing 𝑀 on each element of 𝑆 until one of those executions accepts, similar to how we showed that the class of recognizable languages is closed under union. Unfortunately, we do not have a finite set of possible inputs here; rather, we need to determine whether 𝑀 accepts an input in the countably infinite set Σ∗.

More explicitly, we need to be able to simulate infinitely many inputs 𝑤 ∈ Σ∗ on infinitely many steps 𝑠 ∈ Z+. We have the product Σ∗ × Z+ of two countably infinite sets, and we need to come up with an ordering such that we will eventually reach each element (𝑤, 𝑠) ∈ Σ∗ × Z+ of this product. We can follow a similar ordering as we did when listing the pairs of natural numbers (page 90), as illustrated below:

                     step
           1        2        3        4        5     ⋯
input
   𝜀     (𝜀,1)    (𝜀,2)    (𝜀,3)    (𝜀,4)    (𝜀,5)
   0     (0,1)    (0,2)    (0,3)    (0,4)    (0,5)
   1     (1,1)    (1,2)    (1,3)    (1,4)    (1,5)
   00    (00,1)   (00,2)   (00,3)   (00,4)   (00,5)
   01    (01,1)   (01,2)   (01,3)   (01,4)   (01,5)
   10    (10,1)   (10,2)   (10,3)   (10,4)   (10,5)
   11    (11,1)   (11,2)   (11,3)   (11,4)   (11,5)
   000   (000,1)  (000,2)  (000,3)  (000,4)  (000,5)
   ⋮

This process is called dovetailing, and it allows us to alternate between a countably infinite set of executions. Our simulation proceeds in rounds, and in round 𝑖, we perform a single step of each execution whose index is at most 𝑖. To make this concrete, we define the recognizer 𝑅 as:

𝑅 on input ⟨𝑀⟩ ∶
    For 𝑖 = 1, 2, . . . ∶
        For 𝑗 = 1, 2, . . . , 𝑖 ∶
            Run one step of the execution of 𝑀 on 𝑤𝑗
            If 𝑀 accepts 𝑤𝑗, accept

In the first round, 𝑅 simulates a step of the execution of 𝑀 on the first input 𝑤1. In the second round, 𝑅 simulates a step on 𝑤1 and another on 𝑤2. In the third round, a step on each of 𝑤1, 𝑤2, and 𝑤3 gets simulated, and so on. A round 𝑖 executes a finite number of steps 𝑖. An arbitrary input 𝑤𝑘 begins execution in round 𝑘, i.e. after a finite number of steps, and its execution will continue in subsequent rounds until it halts, or indefinitely if it loops. Thus, this dovetailing process will consider all inputs 𝑤. 𝑅 only terminates when the simulation of 𝑀 on a particular input 𝑤 terminates in the accept state.
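The following C++ sketch illustrates the same round structure, assuming a hypothetical make_execution callback that starts a fresh single-step simulation of 𝑀 on the 𝑗th input 𝑤𝑗 (the types and names here are illustrative, not a real simulation API):

#include <functional>
#include <optional>
#include <vector>

// A single-step simulation: nullopt while running, true on accept, false on
// reject (same convention as the earlier alternation sketch).
using StepFn = std::function<std::optional<bool>()>;

// Dovetailing: in round i, advance executions 1..i by one step each, so every
// execution eventually receives arbitrarily many steps. Returns only if some
// execution accepts; otherwise it runs forever, matching the recognizer R.
void dovetail(std::function<StepFn(int)> make_execution) {
  std::vector<StepFn> running;           // executions started so far
  std::vector<std::optional<bool>> res;  // their results, once halted
  for (int i = 1; ; ++i) {               // round i
    running.push_back(make_execution(i));
    res.emplace_back();
    for (int j = 0; j < i; ++j) {
      if (res[j]) continue;              // this execution already halted
      res[j] = running[j]();             // one more step of M on w_{j+1}
      if (res[j] && *res[j]) return;     // M accepted some input: accept
    }
  }
}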


We analyze 𝑅 as follows:

• If ⟨𝑀⟩ is in the complement of 𝐿∅, then 𝑀 accepts some input 𝑤𝑗 in a finite number of steps 𝑠. 𝑅 simulates execution of 𝑀 on 𝑤𝑗 in rounds [𝑗, 𝑗 + 𝑠), and each round takes a finite amount of time. Thus, 𝑅 completes its simulation of 𝑀 on 𝑤𝑗 in finite time and subsequently accepts ⟨𝑀⟩.

• If ⟨𝑀⟩ is not in the complement of 𝐿∅, then 𝑀 does not accept any inputs. Then 𝑅 never terminates – it simulates 𝑀 on infinitely many inputs 𝑤, none of which accept, so 𝑅 does not accept. Thus, 𝑅 loops on ⟨𝑀⟩.

Since 𝑅 accepts all ⟨𝑀⟩ in the complement of 𝐿∅ and loops on all ⟨𝑀⟩ not in it, 𝑅 is a recognizer for the complement of 𝐿∅. Then because this complement is undecidable but recognizable, it follows from Corollary 79 that 𝐿∅ is unrecognizable.

Exercise 83 Prove whether the following language is recognizable, co-recognizable, or both:

𝐿REJ = {(⟨𝑀⟩, 𝑥) ∶ 𝑀 rejects 𝑥}

Exercise 84 Prove whether the following language is recognizable, co-recognizable, or both:

𝐿HATES-EVENS = {⟨𝑀⟩ ∶ 𝑀 does not accept any even number}


CHAPTER

THIRTEEN

COMPUTABLE FUNCTIONS AND KOLMOGOROV COMPLEXITY

Thus far, we have only considered decision problems, for which the answer is either yes or no. We formalized such a problem as a language 𝐿 over an alphabet Σ, where 𝐿 is the set of all yes instances. Then solving a decision problem means being able to determine whether 𝑥 ∈ 𝐿 for all inputs 𝑥 ∈ Σ∗.

We can define a function 𝑓𝐿 ∶ Σ∗ → {0, 1} corresponding to the language 𝐿:

𝑓𝐿(𝑥) = 1 if 𝑥 ∈ 𝐿
𝑓𝐿(𝑥) = 0 if 𝑥 ∉ 𝐿

A program that decides 𝐿 also computes the function 𝑓𝐿, meaning that it determines the value 𝑓𝐿(𝑥) for any input 𝑥. For an arbitrary function 𝑓 ∶ Σ∗ → Σ∗, we say that 𝑓 is computable if there exists a program that outputs 𝑓(𝑥) given the input 𝑥.

Computability is a generalization of decidability for arbitrary functions. Decidability only applies to functions whose codomain is {0, 1}, while computability applies to any function whose codomain is comprised of finite-length strings.

In the Turing-machine model, we represent a result in {0, 1} with specific reject and accept states. For a function whose codomain is Σ∗, we cannot define a new state for each element in Σ∗ – there are infinitely many elements in Σ∗, but the set of states 𝑄 for a Turing machine must be finite. Instead, we consider a Turing machine to output the string 𝑓(𝑥) if it reaches a final state with 𝑓(𝑥) written on its tape.

For a concrete programming model, we rely on whatever convention is used in that model for output, such as returning a value from a function or writing something to a standard output stream. In pseudocode, we typically just state “output 𝑤” or “return 𝑤” as an abstraction of outputting the value 𝑤.

Equivalence of Functional and Decision Models

We can actually turn a functional problem into a decision one by recasting it as a series of yes/no questions. There are many ways to do so, and the following is one concrete way.

Let 𝑓 ∶ Σ∗ → Σ∗ be a computable function. We can define a language corresponding to 𝑓 as follows:

𝐿𝑓 = {⟨𝑥, 𝑖, 𝜎⟩ ∈ Σ∗ × N × Σ ∶ ∣𝑓(𝑥)∣ ≥ 𝑖 and the 𝑖th symbol of 𝑓(𝑥) is 𝜎}

In other words, ⟨𝑥, 𝑖, 𝜎⟩ ∈ 𝐿𝑓 when 𝑓(𝑥) has at least 𝑖 symbols, and the 𝑖th symbol is 𝜎.

Given a decider 𝐷 for 𝐿𝑓 , we can construct a program 𝐶 that computes 𝑓 as follows:

𝐶 on input 𝑥 ∶
    For 𝑖 = 1, 2, . . . ∶
        For each 𝜎 ∈ Σ ∶
            Run 𝐷 on ⟨𝑥, 𝑖, 𝜎⟩
            If 𝐷 accepts ⟨𝑥, 𝑖, 𝜎⟩, output 𝜎
        If 𝐷 rejects ⟨𝑥, 𝑖, 𝜎⟩ for all 𝜎 ∈ Σ, halt


This machine queries 𝐷 for each possible symbol at position 𝑖 of 𝑓(𝑥), outputting the correct symbols one by one. It halts when it reaches an 𝑖 where there is no symbol.
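The following C++ sketch mirrors this construction, assuming the decider for 𝐿𝑓 is supplied as a hypothetical callback (the names Decider and compute_f are illustrative only):

#include <functional>
#include <string>

// Hypothetical decider for L_f: given (x, i, sigma), returns true exactly
// when |f(x)| >= i and the i-th symbol of f(x) is sigma.
using Decider = std::function<bool(const std::string& x, int i, char sigma)>;

// Compute f(x) symbol by symbol by querying the decider, following the
// construction of C in the text. The string sigma holds the finite alphabet.
std::string compute_f(const std::string& x, const std::string& sigma,
                      const Decider& d) {
  std::string result;
  for (int i = 1; ; ++i) {          // position i of f(x), 1-indexed
    bool found = false;
    for (char s : sigma) {
      if (d(x, i, s)) {             // D accepts <x, i, s>
        result.push_back(s);
        found = true;
        break;
      }
    }
    if (!found) return result;      // no symbol at position i: f(x) is done
  }
}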

To complete our demonstration of equivalence, we show how to construct 𝐷 given a program 𝐶 that computes 𝑓 :

𝐷 on input ⟨𝑥, 𝑖, 𝜎⟩ ∶
    Run 𝐶 on 𝑥 to compute 𝑓(𝑥)
    If ∣𝑓(𝑥)∣ ≥ 𝑖 and the 𝑖th symbol of 𝑓(𝑥) is 𝜎, accept
    Else reject

This equivalence shows that we do not lose any power by restricting ourselves to decision problems, as we have done until this point.

Given any alphabet Σ, there are uncountably many functions 𝑓 ∶ Σ∗ → Σ∗. Since a Turing machine 𝑀 can compute at most one function and the set of Turing machines is countable, there are uncountably many uncomputable functions over any alphabet Σ.

Exercise 85 Use diagonalization to show that the set of functions 𝑓 ∶ Σ∗ → Σ∗ is uncountable when Σ is the unary alphabet Σ = {1}.

As a specific example of an uncomputable function, consider the problem of data compression. Given a string 𝑤, we would like to encode its data as a shorter string 𝑤𝑐 that still allows us to recover the original string 𝑤. This is known as lossless compression, and many algorithms for lossless compression are in use, including the GIF format for images, ZIP and GZIP for arbitrary files, and so on. (There are lossy formats as well such as JPEG and MP3, where some information is lost as part of the compression process.) It can be demonstrated by a counting argument that any lossless compression algorithm has uncompressible strings for all lengths 𝑛 ≥ 1, where the result of compressing the string is no shorter than the original string.

Exercise 86 Let 𝐶 be an arbitrary compression algorithm for binary strings. Define 𝐶(𝑤) to be the result of compressing the string 𝑤, where 𝑤 ∈ {0, 1}∗ and 𝐶(𝑤) ∈ {0, 1}∗.

a) Prove that for all 𝑛 ≥ 1, there are at most 2ⁿ − 1 binary strings 𝑤 such that the corresponding compressed string 𝐶(𝑤) has length strictly less than 𝑛.

b) Conclude that for all 𝑛 ≥ 1, there exists some uncompressible string 𝑤 of length 𝑛 such that 𝐶(𝑤) has length at least 𝑛.

We specifically consider “compression by program,” where we compress a string 𝑤 by writing a program that, given an empty input, outputs the string 𝑤. Assume we are working in a concrete programming language 𝑈. We define the Kolmogorov complexity of 𝑤 in language 𝑈 as the length of the shortest program in 𝑈 that outputs 𝑤, given an empty string. We denote this by 𝐾𝑈(𝑤), and we have:

𝐾𝑈(𝑤) = min{∣⟨𝑀⟩∣ ∶ 𝑀 is a program in language 𝑈 that outputs 𝑤, given an empty input}

Observe that 𝐾𝑈(𝑤) = O(∣𝑤∣) for any 𝑤 ∈ {0, 1}∗ – we can just hardcode the string 𝑤 in the program itself. For instance, the following is a C++ program that outputs 𝑤 = 0101100101:

#include <iostream>

int main() {
  std::cout << "0101100101";
}

To output a different value 𝑤′, we need only swap out the hardcoded string 𝑤 for 𝑤′. The total length of the program that has a string hardcoded is a constant number of symbols, plus the length of the desired output string itself. Thus, there exists a program of length O(∣𝑤∣) that outputs 𝑤, so the shortest program that does so has length no more than O(∣𝑤∣), and 𝐾C++(𝑤) = O(∣𝑤∣). The same is true for any other programming language 𝑈.


For some strings 𝑤, we can define a much shorter program that outputs 𝑤. Consider 𝑤 = 0ᵐ for some large number 𝑚, meaning that 𝑤 consists of 𝑚 zeros. We can define a short program to output 𝑤 as follows:

#include <iostream>

int main() {
  for (int i = 0; i < m; ++i)
    std::cout << "0";
}

As a concrete example, let 𝑚 = 1000. The program that hardcodes 𝑤 = 0ᵐ would have a length on the order of 1000. But the program that follows the looping pattern above is:

#include <iostream>

int main() {
  for (int i = 0; i < 1000; ++i)
    std::cout << "0";
}

Here, we hardcode 𝑚 rather than 0ᵐ. The former has length O(log 𝑚) (for 𝑚 = 1000, four digits in decimal or ten in binary), as opposed to the latter, which has length 𝑚. Thus, we achieve a significant compression for 𝑤 = 0ᵐ, and we have 𝐾C++(0ᵐ) = O(log 𝑚).

We now demonstrate that the function 𝐾𝑈(𝑤) is uncomputable for any programming language 𝑈. Assume for the purposes of contradiction that 𝐾𝑈(𝑤) is computable. For a given length 𝑛, we define the program 𝑄𝑛 as follows:

Algorithm 𝑄𝑛 ∶
    For all 𝑥 ∈ {0, 1}ⁿ ∶
        Compute 𝐾𝑈(𝑥)
        If 𝐾𝑈(𝑥) ≥ 𝑛, output 𝑥 and halt

As mentioned previously, any compression algorithm has uncompressible strings for every length 𝑛 ≥ 1. Thus, 𝑄𝑛 will find a string 𝑤𝑛 such that 𝐾𝑈(𝑤𝑛) ≥ 𝑛, output it, and halt.
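The following C++ sketch shows the structure of 𝑄𝑛. The callback 𝐾𝑈 is assumed computable only for the sake of the contradiction – no such implementation can actually exist – and the names are illustrative:

#include <functional>
#include <optional>
#include <string>

// Hypothetical computable Kolmogorov complexity; cannot actually exist.
using Complexity = std::function<int(const std::string& w)>;

// Q_n: search all binary strings of length n for one whose complexity is at
// least n. By the counting argument such a string exists, so the search
// succeeds; the optional return is only for completeness of the sketch.
std::optional<std::string> q_n(int n, const Complexity& k_u) {
  std::string x(n, '0');                  // current candidate of length n
  while (true) {
    if (k_u(x) >= n) return x;            // found an uncompressible string
    int i = n - 1;                        // advance x to the next string
    while (i >= 0 && x[i] == '1') x[i--] = '0';
    if (i < 0) return std::nullopt;       // exhausted all 2^n strings
    x[i] = '1';
  }
}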

Since 𝑄𝑛 outputs 𝑤𝑛 given an empty input, by definition, we have 𝐾𝑈(𝑤𝑛) ≤ ∣𝑄𝑛∣; that is, the Kolmogorov complexity of 𝑤𝑛 in language 𝑈 is no more than the length of 𝑄𝑛, since 𝑄𝑛 itself is a program that outputs 𝑤𝑛. How long is 𝑄𝑛? The only part of 𝑄𝑛 that depends on 𝑛 is the hardcoded value 𝑛 in the loop that iterates over {0, 1}ⁿ. As we saw before, it takes only O(log 𝑛) symbols to encode a value 𝑛. Thus, we have ∣𝑄𝑛∣ = O(log 𝑛).

We have arrived at a contradiction: we have demonstrated that 𝐾𝑈(𝑤𝑛) ≥ 𝑛 and also that 𝐾𝑈(𝑤𝑛) ≤ ∣𝑄𝑛∣ = O(log 𝑛). These two conditions cannot be simultaneously satisfied for sufficiently large 𝑛, since O(log 𝑛) grows more slowly than 𝑛. Since we have reached a contradiction, our original assumption that 𝐾𝑈(𝑤) is computable must be false, and 𝐾𝑈(𝑤) is an uncomputable function.


CHAPTER

FOURTEEN

RICE’S THEOREM

We previously demonstrated (page 125) that 𝐿∅ is undecidable, where

𝐿∅ = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) = ∅}

Let’s take a look at a similar language and determine whether it too is undecidable. Define 𝐿𝜀 as

𝐿𝜀 = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) = {𝜀}}

As we saw with 𝐿∅ and ∅ = {}, the languages 𝐿𝜀 and {𝜀} are distinct. The latter is a language containing only the single element 𝜀, and it is decidable:

𝐷𝜀 on input 𝑥 ∶
    If 𝑥 = 𝜀, accept
    Reject

On the other hand, 𝐿𝜀 is a language containing source codes of programs, and it is a countably infinite set. We demonstrate that 𝐿𝜀 is undecidable with the reduction 𝐿ACC ≤T 𝐿𝜀. Assume we have a decider 𝐸 for 𝐿𝜀. Then we construct a decider 𝐶 for 𝐿ACC as follows:

𝐶 on input (⟨𝑀⟩, 𝑥) ∶
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤 ∶
            If 𝑤 = 𝜀 ∶
                Run 𝑀 on 𝑥 and answer as 𝑀
            Else reject”
    Run 𝐸 on ⟨𝑀′⟩
    If 𝐸 accepts ⟨𝑀′⟩, accept
    Else reject

We analyze this program as follows:

• If 𝑀 accepts 𝑥, then 𝑀′ accepts the input 𝜀 and rejects all other inputs, so 𝐿(𝑀′) = {𝜀} and ⟨𝑀′⟩ ∈ 𝐿𝜀. Thus, 𝐸 accepts ⟨𝑀′⟩, in which case 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 does not accept 𝑥, then 𝑀′ does not accept any inputs 𝑤, so 𝐿(𝑀′) = ∅ ≠ {𝜀} and ⟨𝑀′⟩ ∉ 𝐿𝜀. Then 𝐸 rejects ⟨𝑀′⟩, in which case 𝐶 rejects (⟨𝑀⟩, 𝑥).

Thus, 𝐿ACC ≤T 𝐿𝜀, and 𝐿𝜀 is undecidable.

We can generalize this reasoning as follows.


Claim 87 Let 𝐴 be a recognizable language. Define the language

𝐿𝐴 = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) = 𝐴}

comprised of source codes of programs 𝑀 such that 𝐿(𝑀) = 𝐴. Then 𝐿𝐴 is undecidable.

Proof 88 We first observe that since 𝐴 is recognizable, there exist programs 𝑀 such that 𝐿(𝑀) = 𝐴. In fact, there are infinitely many such programs – given a Turing machine 𝑀 that recognizes 𝐴, we can construct a distinct Turing machine 𝑀′ that also recognizes 𝐴. If 𝑄 is the set of states of 𝑀, we can construct 𝑀′ by adding a new state 𝑞′ to 𝑄 without any incoming transitions. Since there are no incoming transitions to 𝑞′, the addition of 𝑞′ does not affect the behavior of the machine.

Since 𝐿𝐴 has infinitely many elements, we cannot decide it by just hardcoding its elements. In fact, we demonstrate that 𝐿𝐴 is undecidable by reducing 𝐿ACC to it. Assume we have a decider 𝐷 for 𝐿𝐴. Then we construct a decider 𝐶 for 𝐿ACC. Since 𝐴 is recognizable, we know that there exists a recognizer 𝑅 for it, which we use in our construction of 𝐶. For the purposes of our construction, we assume that 𝐴 ≠ ∅, as we have already shown (page 125) that 𝐿∅ is undecidable.

𝐶 on input (⟨𝑀⟩, 𝑥) ∶
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤 ∶
            Run 𝑀 on 𝑥
            If 𝑀 accepts 𝑥 ∶
                Run 𝑅 on 𝑤 and answer as 𝑅
            Else reject”
    Run 𝐷 on ⟨𝑀′⟩
    If 𝐷 accepts ⟨𝑀′⟩, accept
    Else reject

We first analyze the behavior of 𝑀 ′:

• If 𝑀 accepts 𝑥, we have:

– If 𝑤 ∈ 𝐴, then 𝑅 accepts 𝑤, so 𝑀′ accepts 𝑤.

– If 𝑤 ∉ 𝐴, then 𝑅 either rejects or loops on 𝑤, so 𝑀′ rejects or loops on 𝑤.

• If 𝑀 does not accept 𝑥, then 𝑀′ does not accept any input 𝑤 – 𝑀′ loops on all inputs if 𝑀 loops on 𝑥, otherwise 𝑀′ rejects all inputs.

Thus, we see that 𝐿(𝑀′) = 𝐴 exactly when 𝑀 accepts 𝑥; otherwise 𝐿(𝑀′) = ∅. (Again, we are assuming that 𝐴 ≠ ∅.) Then the behavior of 𝐶 is as follows:

• If 𝑀 accepts 𝑥, then 𝐿(𝑀 ′) = 𝐴 so 𝐷 accepts ⟨𝑀 ′⟩. Then 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 does not accept 𝑥, then 𝐿(𝑀 ′) = ∅ ≠ 𝐴 so 𝐷 rejects ⟨𝑀 ′⟩. Then 𝐶 rejects (⟨𝑀⟩, 𝑥).

Thus, 𝐶 is a decider for 𝐿ACC, and we have shown that 𝐿ACC ≤T 𝐿𝐴.

The construction above only works for languages 𝐴 that are recognizable. If 𝐴 is unrecognizable, then 𝐿𝐴 is decidable – since 𝐴 is unrecognizable, there is no program 𝑀 such that 𝐿(𝑀) = 𝐴. Thus, 𝐿𝐴 = ∅, which is decided by a program that rejects all inputs.

We can further generalize Claim 87 by considering sets of languages rather than just a single language in isolation.


Definition 89 (Semantic Property) A semantic property is a set of languages.

We often use the symbol P to represent a semantic property. The following is an example of a semantic property:

P∞ = {𝐿 ⊆ Σ∗ ∶ ∣𝐿∣ is infinite}

P∞ is the set of languages that have infinitely many elements. Examples of languages contained in P∞ include Σ∗, {𝑥 ∈ Σ∗ ∶ 𝑥 contains no 1’s} (assuming Σ is not the unary alphabet {1}), and all undecidable languages, and examples that are not in P∞ include ∅ and {𝑥 ∈ Σ∗ ∶ ∣𝑥∣ < 100}.

Definition 90 (Trivial Semantic Property) A semantic property P is trivial if it either contains all recognizable languages, or if it contains no recognizable languages.

Examples of trivial properties include the empty set, the set of all recognizable languages, the set of all unrecognizable languages, and the set of all languages. The set of decidable languages is nontrivial, since there are both recognizable languages that are decidable as well as recognizable languages that are undecidable.

We now generalize our notion of a language that contains source codes of programs, from programs that recognize a specific language 𝐴 to programs that recognize a language contained in a semantic property P. Define 𝐿P as follows:

𝐿P = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) ∈ P}

Whether or not 𝐿P is decidable depends on whether P is trivial.

Claim 91 If P is a trivial semantic property, then 𝐿P is decidable.

Proof 92 If P is trivial, then it either contains all recognizable languages or no recognizable languages.

• Case 1: P contains all recognizable languages. Then every program 𝑀 is contained in 𝐿P – since 𝐿(𝑀) is recognizable by definition, 𝐿(𝑀) ∈ P. We can then write a decider for 𝐿P that accepts all programs as input.

• Case 2: P contains no recognizable languages. Then no program 𝑀 is contained in 𝐿P, since 𝐿(𝑀) is recognizable and therefore not in P. Thus, we can decide 𝐿P by rejecting all inputs.

We now take a look at Rice’s theorem, which states that 𝐿P is undecidable for a nontrivial semantic property P.

Theorem 93 (Rice’s Theorem) If P is a nontrivial semantic property, then 𝐿P is undecidable.


Proof 94 We show that 𝐿ACC ≤T 𝐿P: given a decider 𝐷 for 𝐿P, we will construct a decider 𝐶 for 𝐿ACC.

First, we consider the case where ∅ ∉ P. Since P is nontrivial, there must be some recognizable language 𝐴 ∈ P. Observe that 𝐴 ≠ ∅ since 𝐴 ∈ P while ∅ ∉ P. Since 𝐴 is recognizable, there is a recognizer 𝑅𝐴 for it. Then we construct 𝐶 as follows:

𝐶 on input (⟨𝑀⟩, 𝑥) ∶
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤 ∶
            Run 𝑀 on 𝑥
            If 𝑀 accepts 𝑥 ∶
                Run 𝑅𝐴 on 𝑤 and answer as 𝑅𝐴
            Else reject”
    Run 𝐷 on ⟨𝑀′⟩
    If 𝐷 accepts ⟨𝑀′⟩, accept
    Else reject

Similar to our analysis in Proof 88, we have that 𝐿(𝑀′) = 𝐴 when 𝑀 accepts 𝑥, and 𝐿(𝑀′) = ∅ when 𝑀 does not accept 𝑥. Then the behavior of 𝐶 is as follows:

• If 𝑀 accepts 𝑥, then 𝐿(𝑀′) = 𝐴. Since 𝐴 ∈ P, we have 𝐿(𝑀′) ∈ P and ⟨𝑀′⟩ ∈ 𝐿P. 𝐷 is a decider for 𝐿P, so it accepts ⟨𝑀′⟩. Then 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 does not accept 𝑥, then 𝐿(𝑀 ′) = ∅ ∉ P. Then 𝐷 rejects ⟨𝑀 ′⟩, so 𝐶 rejects (⟨𝑀⟩, 𝑥).

We now consider the case where ∅ ∈ P. Since P is nontrivial, there must be some recognizable language 𝐵 ∉ P. Observe that 𝐵 ≠ ∅ since 𝐵 ∉ P while ∅ ∈ P. Since 𝐵 is recognizable, there is a recognizer 𝑅𝐵 for it. Then we construct 𝐶 as follows:

𝐶 on input (⟨𝑀⟩, 𝑥) ∶
    Construct a new program 𝑀′:
        “𝑀′ on input 𝑤 ∶
            Run 𝑀 on 𝑥
            If 𝑀 accepts 𝑥 ∶
                Run 𝑅𝐵 on 𝑤 and answer as 𝑅𝐵
            Else reject”
    Run 𝐷 on ⟨𝑀′⟩
    If 𝐷 accepts ⟨𝑀′⟩, reject
    Else accept

We now have that 𝐿(𝑀′) = 𝐵 when 𝑀 accepts 𝑥, and 𝐿(𝑀′) = ∅ when 𝑀 does not accept 𝑥. Then the behavior of 𝐶 is as follows:

• If 𝑀 accepts 𝑥, then 𝐿(𝑀′) = 𝐵. Since 𝐵 ∉ P, we have 𝐿(𝑀′) ∉ P and ⟨𝑀′⟩ ∉ 𝐿P. 𝐷 is a decider for 𝐿P, so it rejects ⟨𝑀′⟩. Then 𝐶 accepts (⟨𝑀⟩, 𝑥).

• If 𝑀 does not accept 𝑥, then 𝐿(𝑀 ′) = ∅ ∈ P. Then 𝐷 accepts ⟨𝑀 ′⟩, so 𝐶 rejects (⟨𝑀⟩, 𝑥).

In both cases, 𝐶 is a decider for 𝐿ACC. We have shown that 𝐿ACC ≤T 𝐿P, so 𝐿P is undecidable.


14.1 Applying Rice’s Theorem

Rice’s theorem gives us a third tool for proving the undecidability of a language, in addition to the direct proof we saw for 𝐿ACC and Turing reduction from a known undecidable language. Given a language 𝐿, we need to do the following to apply Rice’s theorem:

1. Define a semantic property P, i.e. a set of languages.

2. Show that 𝐿P = 𝐿. In other words, we show that the language

𝐿P = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) ∈ P}

consists of the same set of elements as 𝐿.

3. Show that P is nontrivial. This requires demonstrating that there is some recognizable language 𝐴 ∈ P and another recognizable language 𝐵 ∉ P.

We can then conclude by Rice’s theorem that 𝐿 is undecidable.

Example 95 Define 𝐿Σ∗ as follows:

𝐿Σ∗ = {⟨𝑀⟩ ∶ 𝑀 is a program that accepts all inputs}

We use Rice’s theorem to show that 𝐿Σ∗ is undecidable. Let P be the semantic property:

P = {Σ∗}

The property contains just a single language. Observe that 𝑀 accepts all inputs exactly when 𝐿(𝑀) = Σ∗. Thus, we can express 𝐿Σ∗ as:

𝐿Σ∗ = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) = Σ∗}

This is the exact definition of 𝐿P, so we have 𝐿Σ∗ = 𝐿P.

We proceed to show that there is a recognizable language in P and another not in P. Σ∗ is a recognizable language in P – we can recognize Σ∗ with a program that accepts all inputs. ∅ is a recognizable language not in P – we can recognize ∅ with a program that rejects all inputs. Thus, P is nontrivial.

Since P is nontrivial, by Rice’s theorem, 𝐿P is undecidable. Since 𝐿P = 𝐿Σ∗ , 𝐿Σ∗ is undecidable.

Example 96 Define 𝐿A376 as follows:

𝐿A376 = {⟨𝑀⟩ ∶ 𝑀 is a program that accepts all strings of length less than 376}

We use Rice’s theorem to show that 𝐿A376 is undecidable. Define 𝑆376 to be the set of all strings whose length is less than 376:

𝑆376 = {𝑥 ∈ Σ∗ ∶ ∣𝑥∣ < 376}

Then let P be the semantic property:

P = {𝐿 ⊆ Σ∗ ∶ 𝑆376 ⊆ 𝐿}


The property contains all languages that themselves contain all of 𝑆376. If 𝑀 accepts all inputs of length less than 376, then 𝑆376 ⊆ 𝐿(𝑀). We thus have:

𝐿A376 = {⟨𝑀⟩ ∶ 𝑀 is a program that accepts all strings of length less than 376}
      = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝑆376 ⊆ 𝐿(𝑀)}
      = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) ∈ P}
      = 𝐿P

We proceed to show that there is a recognizable language in P and another not in P. Σ∗ is a recognizable language in P – a program that recognizes Σ∗ accepts all inputs, including all those of length less than 376. ∅ is a recognizable language not in P – a program that recognizes ∅ does not accept any inputs, including those of length less than 376. Thus, P is nontrivial.

Since P is nontrivial, by Rice’s theorem, 𝐿P is undecidable. Since 𝐿P = 𝐿A376, 𝐿A376 is undecidable.

Example 97 Define 𝐿DEC as follows:

𝐿DEC = {⟨𝑀⟩ ∶ 𝑀 is a program and 𝐿(𝑀) is decidable}

We use Rice’s theorem to show that 𝐿DEC is undecidable. Let P be the semantic property:

P = {𝐿 ⊆ Σ∗ ∶ 𝐿 is decidable}

The property contains all decidable languages, and 𝐿DEC = 𝐿P.

We proceed to show that there is a recognizable language in P and another not in P. Σ∗ is a recognizable language in P – we can both decide and recognize Σ∗ with a program that accepts all inputs. 𝐿ACC is a recognizable language not in P – we know that 𝐿ACC is undecidable, and it is recognized by the universal Turing machine 𝑈. Thus, P is nontrivial.

Since P is nontrivial, by Rice’s theorem, 𝐿P is undecidable. Since 𝐿P = 𝐿DEC, 𝐿DEC is undecidable.

Rice’s theorem is not applicable to all undecidable languages. Some examples of languages where Rice’s theorem cannot be applied are:

• The language

𝐿REJ = {(⟨𝑀⟩, 𝑥) ∶ 𝑀 is a program and 𝑀 rejects 𝑥}

does not consist of source codes of programs on their own, so we cannot define a property P such that 𝐿P = 𝐿REJ. We must show undecidability either directly or through a Turing reduction.

• The language

𝐿SmallTM = {⟨𝑀⟩ ∶ 𝑀 has fewer than 100 states}

is concerned with the structure of 𝑀 rather than its language 𝐿(𝑀). This is a syntactic property rather than a semantic property, and it is decidable by examining the source code of 𝑀.

• The languages

𝐿R376 = {⟨𝑀⟩ ∶ 𝑀 is a program that rejects all strings of length less than 376}
𝐿L376 = {⟨𝑀⟩ ∶ 𝑀 is a program that loops on all strings of length less than 376}


do not have a direct match with a semantic property. In both cases, 𝐿(𝑀) ∩ 𝑆376 = ∅, but the two languages 𝐿R376 and 𝐿L376 are clearly distinct and non-overlapping. In fact, a program 𝑀1 that rejects all strings is in 𝐿R376 while a program 𝑀2 that loops on all strings is in 𝐿L376, but both programs have the same language 𝐿(𝑀1) = 𝐿(𝑀2) = ∅. Thus, both 𝐿R376 and 𝐿L376 contain some programs and exclude others with the same language, so it is impossible to define a property P such that 𝐿P = 𝐿R376 or 𝐿P = 𝐿L376.

However, both 𝐿R376 and 𝐿L376 are actually undecidable. While we cannot apply Rice’s theorem, we can show undecidability via a Turing reduction.

Thus, Rice’s theorem allows us to take a shortcut for languages of the right format, but we still need the previous tools we saw for languages that do not follow the structure required by the theorem.

14.2 Rice’s Theorem and Program Analysis

Rice’s theorem has profound implications for compilers and program analysis. While the formulation above is with respect to the language of a program, the typical expression of Rice’s theorem in the field of programming languages is more expansive:

Any nontrivial question about the behavior of a program at runtime is undecidable.

Here, “nontrivial” means there are some programs that have the behavior in question and others that do not37.

As an example, let’s consider the question of whether a Turing machine ever writes the symbol 1 to its tape, when run on a given input 𝑥. The language we want to decide is as follows:

𝐿WritesOne = {(⟨𝑀⟩, 𝑥) ∶ 𝑀 is a Turing machine that writes a 1 to its tape when run on 𝑥}

This language is not in the form required by our formulation of Rice’s theorem. However, we can show it is undecidable by the reduction 𝐿HALT ≤T 𝐿WritesOne. Given a decider 𝑊 for 𝐿WritesOne, we can construct a decider 𝐻 for 𝐿HALT.

First, we observe that we can construct a new program 𝑀′ such that 𝑀′ writes a 1 to its tape exactly when 𝑀 halts. Start by replacing the symbol 1 with a new symbol 1̂ everywhere in 𝑀 – in its input alphabet, tape alphabet, and all occurrences in its transition function. Then replace 𝑞𝑎𝑐𝑐𝑒𝑝𝑡 and 𝑞𝑟𝑒𝑗𝑒𝑐𝑡 with new states 𝑞′𝑎 and 𝑞′𝑟 such that 𝛿(𝑞′𝑎, 𝛾) = (𝑞′𝑎, 1, 𝑅) and 𝛿(𝑞′𝑟, 𝛾) = (𝑞′𝑟, 1, 𝑅) for all 𝛾 in the modified tape alphabet. Then 𝑀′ reaches 𝑞′𝑎 or 𝑞′𝑟 exactly when 𝑀 halts, and 𝑀′ proceeds to write a 1 in the next step. Thus, 𝑀′ writes a 1 exactly when 𝑀 halts38.

Using the above transformation, we can write 𝐻 as follows:

𝐻 on input (⟨𝑀⟩, 𝑥) ∶
    Construct 𝑀′ from 𝑀 as described above
    Run 𝑊 on (⟨𝑀′⟩, 𝑥) and answer as 𝑊

Since 𝑀′ writes a 1 exactly when 𝑀 halts, 𝑊 accepts (⟨𝑀′⟩, 𝑥) exactly when 𝑀 halts on 𝑥. Thus, 𝐻 is a decider for 𝐿HALT, and we have shown that 𝐿HALT ≤T 𝐿WritesOne.

A similar construction can be done for any nontrivial question about program behavior, with respect to any Turing-complete language. The implication is that there is no such thing as a perfect program analysis, and any real program analysis produces the wrong result for some nonempty set of inputs.

37 The behavior in question must be concerned with program meaning, or semantics, and it must be robust with respect to semantics-preserving transformations. For instance, a question such as, “Does the program halt in less than a thousand steps?” is not about the program’s meaning, and the answer is different for different programs that have identical semantics.

38 This is an example of a mapping reduction, where we transform an input 𝑎 to an input 𝑏 such that 𝑎 ∈ 𝐿1 ⟺ 𝑏 ∈ 𝐿2. In this case, we have (⟨𝑀⟩, 𝑥) ∈ 𝐿HALT ⟺ (⟨𝑀′⟩, 𝑥) ∈ 𝐿WritesOne. We will see more details about mapping reductions later (page 155).


Soundness and Completeness

Suppose we wish to analyze some arbitrary nontrivial program behavior 𝐵. Define the language 𝐿𝐵 as:

𝐿𝐵 = {⟨𝑀⟩ ∶ 𝑀 has behavior 𝐵}

We would like an analysis 𝐴 such that:

• if ⟨𝑀⟩ ∈ 𝐿𝐵, then 𝐴 accepts ⟨𝑀⟩
• if ⟨𝑀⟩ ∉ 𝐿𝐵, then 𝐴 rejects ⟨𝑀⟩

Unfortunately, we have seen that such an analysis is impossible since 𝐿𝐵 is undecidable. We must settle for an imperfect analysis. We have several choices:

• A sound analysis rejects all incorrect inputs, but also rejects some correct inputs. For a sound analysis 𝐴𝑆, we have:

– if ⟨𝑀⟩ ∈ 𝐿𝐵, then 𝐴𝑆 may either accept or reject ⟨𝑀⟩; the set {𝑥 ∈ 𝐿𝐵 ∶ 𝐴𝑆 rejects 𝑥} is nonempty

– if ⟨𝑀⟩ ∉ 𝐿𝐵, then 𝐴𝑆 rejects ⟨𝑀⟩; the set {𝑥 ∉ 𝐿𝐵 ∶ 𝐴𝑆 accepts 𝑥} is empty

If a program is accepted by a sound analysis, the program is guaranteed to have the desired behavior 𝐵. If the program is rejected by the analysis, it may or may not have behavior 𝐵.

• A complete analysis accepts all correct inputs, but also accepts some incorrect inputs. For a complete analysis 𝐴𝐶, we have:

– if ⟨𝑀⟩ ∈ 𝐿𝐵, then 𝐴𝐶 accepts ⟨𝑀⟩; the set {𝑥 ∈ 𝐿𝐵 ∶ 𝐴𝐶 rejects 𝑥} is empty

– if ⟨𝑀⟩ ∉ 𝐿𝐵, then 𝐴𝐶 may accept or reject ⟨𝑀⟩; the set {𝑥 ∉ 𝐿𝐵 ∶ 𝐴𝐶 accepts 𝑥} is nonempty

If a program is accepted by a complete analysis, it may or may not have behavior 𝐵. If the program is rejected, it definitely does not have behavior 𝐵.

• An analysis is both incomplete and unsound if it may answer incorrectly on both correct and incorrect inputs. For an incomplete and unsound analysis 𝐴𝐼, we have:

– if ⟨𝑀⟩ ∈ 𝐿𝐵, then 𝐴𝐼 may either accept or reject ⟨𝑀⟩; the set {𝑥 ∈ 𝐿𝐵 ∶ 𝐴𝐼 rejects 𝑥} is nonempty

– if ⟨𝑀⟩ ∉ 𝐿𝐵, then 𝐴𝐼 may accept or reject ⟨𝑀⟩; the set {𝑥 ∉ 𝐿𝐵 ∶ 𝐴𝐼 accepts 𝑥} is nonempty

A program may or may not have behavior 𝐵 whether it is accepted or rejected by an incomplete and unsound analysis.

• An analysis may also loop on some inputs.

An analysis that loops is not useful in practice, so we reject the last choice. This leaves us the choice of giving up soundness or completeness. Typically, we choose one to give up rather than giving up both, though in practice some analyses are indeed both incomplete and unsound.

A type system is a concrete example of a program analysis, and the usual choice for a static type system is for it to be sound rather than complete. This means the compiler rejects some correct programs that would not have a type error at runtime. However, if the compiler accepts a program, we can be assured that it will not have a type error when it is run, as long as the type system is sound.


Part III

Complexity


CHAPTER

FIFTEEN

INTRODUCTION TO COMPLEXITY

In the previous unit, we examined what problems are solvable without any constraints on the time and space resources used by an algorithm. We saw that even with unlimited resources, most problems are not solvable by an algorithm. In this unit, we focus on those problems that are actually solvable and consider how efficiently they can be solved. We start with decision problems (languages) and defer consideration of functional (i.e. search) problems until later.

For a Turing machine, its time complexity is the number of steps until it halts, and its space complexity is the number of tape cells used before halting. We focus primarily on time complexity, and as mentioned previously (page 3), we are interested in the asymptotic complexity with respect to the input size, ignoring leading constants, over worst-case inputs. For a particular asymptotic bound O(𝑡(𝑛)), we can define a class of languages that are decidable by a Turing machine with time complexity O(𝑡(𝑛)), where 𝑛 is the size of the input:

Definition 98 (DTIME) DTIME(𝑡(𝑛)) is the class of all languages decidable by a Turing machine with time complexity O(𝑡(𝑛)).

DTIME(𝑡(𝑛)) is a complexity class, as it consists of languages whose time complexity is within a particular bound. Concrete examples include DTIME(𝑛), the class of languages decidable by a Turing machine in linear time, and DTIME(𝑛²), the class of languages decidable in quadratic time by a Turing machine.

When we discussed computability, we defined two classes of languages, the decidable languages (also called recursive and denoted by R) and the recognizable languages (also called recursively enumerable and denoted by RE). The definitions of these two classes are actually model-independent – recall that the Church-Turing thesis states that any language that is decidable in a different model is also decidable in the Turing-machine model. Similarly, any language decidable on a Turing machine is also decidable in another Turing-complete model.

Unfortunately, this model-independence does not extend to DTIME(𝑡(𝑛)). Consider the concrete language39:

PALINDROME = {(𝑥, 𝑦) ∶ 𝑦 = 𝑥ᴿ (i.e. 𝑦 is the reverse of the string 𝑥)}

In a programming model with random access, or even an extended Turing-machine model with two tapes, PALINDROME can be decided in O(𝑛) time, where 𝑛 = ∣(𝑥, 𝑦)∣ – walk pointers inward from both ends of the input, comparing character-by-character. In the standard Turing-machine model where all computation is local, an algorithm to decide this language would need to examine the first character of 𝑥, move the head sequentially to the last character of 𝑦, then move the head back to examine the second character of 𝑥, and so on. Each pair of characters takes O(𝑛) steps just to move the head, for a total running time of O(𝑛²). Thus, PALINDROME ∈ DTIME(𝑛²) and PALINDROME ∉ DTIME(𝑛), but in other models, it requires only O(𝑛) time to decide.
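As an illustration of the random-access strategy, the following C++ function checks whether 𝑦 is the reverse of 𝑥 by walking the two strings from opposite ends; it is a sketch of the idea, not a Turing-machine construction:

#include <string>

// Returns true exactly when y is the reverse of x, i.e. (x, y) ∈ PALINDROME.
// Each character is examined once, so the running time is O(n), where n is
// the total size of the input.
bool is_reverse_pair(const std::string& x, const std::string& y) {
  if (x.size() != y.size()) return false;
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (x[i] != y[y.size() - 1 - i]) return false;
  }
  return true;
}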

A model-dependent complexity class is very inconvenient for analysis – we wish to analyze algorithms in pseudocode without worrying about details of how an algorithm would be implemented on a Turing machine. Furthermore, we would like to ignore other implementation details such as choice of data structure, which can affect the asymptotic running time.

39 Henceforth, we will elide the encoding (e.g. ⟨𝑥, 𝑦⟩) from our notation when defining a language, taking it to be implicit instead.


As a step towards establishing a model-independent complexity class, we note that there is an enormous difference between the growth of polynomial and exponential functions. The following illustrates how several polynomials compare to the growth of 2ⁿ:

(Figure: “Polynomial vs. Exponential Growth” – a plot of n, n², n³, n¹⁰, and 2ⁿ for n from 0 to 200, with a logarithmic vertical axis ranging from 1 to 10⁶⁰.)

The vertical axis in this plot is logarithmic, so that the growth of 2ⁿ appears as a straight line. Observe that even a polynomial such as 𝑛¹⁰ with a relatively large exponent grows far slower than the exponential 2ⁿ, with the latter exceeding the former for all 𝑛 ≥ 60.

In addition to the dramatic difference in growth between polynomials and exponentials, we observe that the set of polynomials is closed under composition – if 𝑓(𝑛) = O(𝑛ᵏ) and 𝑔(𝑛) = O(𝑛ᵐ), then 𝑓(𝑔(𝑛)) = O(𝑛ᵏᵐ), which is also polynomial. This means that if we compose a polynomial-time algorithm with another polynomial-time algorithm, the resulting algorithm is also polynomial in time.

As a final remark, the extended Church-Turing thesis states that polynomial time translates between models of computation:

Theorem 99 (Extended Church-Turing Thesis) An algorithm that runs in polynomial time in one deterministic computational model also runs in polynomial time in any other realistic, deterministic model.

We thus establish a model-independent complexity class of languages decidable by a polynomial-time algorithm.

Definition 100 (P) P is the class of languages decidable by a polynomial-time Turing machine, where

P = ⋃𝑘≥1 DTIME(𝑛ᵏ)

Given the significant difference between polynomial and exponential time, we informally consider P to be the class of efficiently decidable languages. The class P includes fundamental problems, such as the decision versions of integer arithmetic and sorting. (Since P is a class of decision problems, it does not include the functional variants of these problems.)


15.1 Efficiently Verifiable Languages

For some languages, the task of verifying that an input is in the language, given some extra information about the input, may be easier than deciding whether it is in the language from scratch. Consider the following maze, courtesy of Kees Meijer’s maze generator40:

At first glance, it is not clear that there is a path from start to finish. However, if someone were to give us an actual path, it is straightforward to follow that path to determine whether it takes us from entrance to exit:

On the other hand, for a maze that has no path from start to finish, no matter what path someone gives us, we would be unable to verify that it takes us from entrance to exit:

40 https://keesiemeijer.github.io/maze-generator/


As another example, the traveling salesperson problem (TSP) is the task of finding the minimal-weight tour that visits all the vertices of a weighted, complete (i.e. fully connected) graph, starting and ending at the same vertex. Suppose we wish to determine the minimal tour of the following graph:

(Figure: a weighted, complete graph on the four vertices A, B, C, and D, with edge weights 2, 11, 18, 19, 27, and 29.)

If someone were to tell us that the path 𝐴 → 𝐵 → 𝐷 → 𝐶 → 𝐴 is the minimal tour, how could we verify that it is indeed minimal? There doesn’t seem to be an obvious way to do so without considering all other tours and checking whether any of them produce a tour of lower weight. (This particular graph only has three distinct tours, but in general, the number of distinct tours is exponential in the number of vertices in a graph.) On the other hand, if we modify the problem so that we are looking for a tour that comes in under a specific budget, verifying that such a tour exists becomes straightforward. This limited-budget version of TSP is as follows:

Given a graph 𝐺 = (𝑉,𝐸) with weight function 𝑤 ∶ 𝐸 → R, is there a tour of weight at most 𝑘 that visits all vertices?

For the graph above with 𝑘 = 60, the tour 𝐴 → 𝐵 → 𝐷 → 𝐶 → 𝐴 gives us a total weight of 61, so it does not meet the budget. On the other hand, the tour 𝐴 → 𝐵 → 𝐶 → 𝐷 → 𝐴 gives us a total weight of 58 and does come under the budget. Verifying that a tour meets the budget is as simple as adding up the weights of the edges along the tour. Thus, if a given graph does have a tour that meets the budget, we can verify this is the case if we are given the actual tour that is under budget. On the other hand, if a graph has no tour that meets the budget, no matter what tour we are given, we will never erroneously conclude that the graph has a tour that comes under budget.

Generalizing the reasoning above, we say that a decision problem 𝐿 is efficiently verifiable when:

• If an input 𝑥 is in 𝐿, then there is an efficient way to verify that it is, given some additional information, which we call a certificate.

• If an input 𝑥 is not in 𝐿, then there is no additional information, even maliciously produced, that could convince us that 𝑥 is in 𝐿.

As one more example, consider the language of polynomials that have positive roots:

POLY = {𝑝(𝑥) ∶ 𝑝(𝑥) is a polynomial such that 𝑝(𝑥) = 0 for some 𝑥 > 0}

Is 𝑥³ − 𝑥 − 60 in POLY? Given 𝑥 = 4 as the certificate, we can indeed verify that 4³ − 4 − 60 = 64 − 4 − 60 = 0, so 𝑥³ − 𝑥 − 60 ∈ POLY. On the other hand, for the polynomial 𝑥² + 1, no certificate can convince us that this polynomial has a positive root. Since we can compute the value of a polynomial efficiently on a given 𝑥, POLY is efficiently verifiable.
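As a sketch of such a verifier, the following C++ function evaluates a polynomial, represented here (as an illustrative assumption) by its coefficient vector, at the certificate value using Horner’s rule; the floating-point tolerance is an artifact of this sketch, and an exact integer or rational evaluation would test equality directly:

#include <cmath>
#include <vector>

// Verifier sketch for POLY: coeffs[i] is the coefficient of x^i, and x is the
// certificate, a claimed positive root. Evaluation takes linear time.
bool verify_positive_root(const std::vector<double>& coeffs, double x) {
  if (x <= 0) return false;                 // certificate must be positive
  double value = 0;
  for (int i = static_cast<int>(coeffs.size()) - 1; i >= 0; --i) {
    value = value * x + coeffs[i];          // Horner's rule
  }
  return std::fabs(value) < 1e-9;           // p(x) = 0, up to rounding error
}

For example, 𝑥³ − 𝑥 − 60 corresponds to the coefficient vector {−60, −1, 0, 1}, and verify_positive_root({-60, -1, 0, 1}, 4) returns true.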

More formally, a language 𝐿 is efficiently verifiable if there exists an algorithm 𝑉, called a verifier, that takes two inputs 𝑥, 𝑐 such that:

• The computation 𝑉 (𝑥, 𝑐) takes polynomial time with respect to ∣𝑥∣, the size of 𝑥.


• If 𝑥 ∈ 𝐿, then there exists some certificate 𝑐 such that 𝑉 (𝑥, 𝑐) accepts.

• If 𝑥 ∉ 𝐿, then 𝑉 (𝑥, 𝑐) rejects for all certificates 𝑐.

This gives rise to the class of languages that are efficiently verifiable.

Definition 101 (NP) NP is the class of efficiently verifiable languages41.

41 Note that NP does not stand for “not polynomial” – rather, it is short for “nondeterministic polynomial,” based on an equivalent definition of the class using the computational model of nondeterministic Turing machines. This model is beyond the scope of this text.

Let us return to our first example, that of determining whether a maze has a path from start to finish. A maze can be represented as an undirected graph, with a vertex for each position in the maze and edges between adjacent positions. Then the problem is to determine whether there exists a path between the start vertex 𝑠 and the end vertex 𝑡. This gives us the language:

MAZE = {(𝐺, 𝑠, 𝑡) ∶ 𝐺 is a graph that has a path between 𝑠 and 𝑡}

We can define a verifier for MAZE as follows. The certificate is a sequence of vertices, representing a path.

𝑉 on inputs (𝐺 = (𝑉,𝐸), 𝑠, 𝑡), 𝑐 = (𝑣1, . . . , 𝑣𝑚) ∶
    If 𝑚 > ∣𝑉∣ or 𝑣1 ≠ 𝑠 or 𝑣𝑚 ≠ 𝑡, reject
    For 𝑖 = 1 to 𝑚 − 1 ∶
        If (𝑣𝑖, 𝑣𝑖+1) ∉ 𝐸, reject
    Accept

We analyze this verifier as follows. First, the verifier runs in polynomial time with respect to ∣𝐺∣ – it examines at most ∣𝑉∣ pairs of adjacent vertices, and each pair of vertices can be examined in polynomial time (the exact time depends on how the graph 𝐺 is represented). As for correctness, we have:

• If (𝐺, 𝑠, 𝑡) ∈ MAZE, there exists some path (𝑠, 𝑣2, . . . , 𝑣𝑚−1, 𝑡) between 𝑠 and 𝑡 in 𝐺. Then 𝑉 ((𝐺, 𝑠, 𝑡), (𝑠, 𝑣2, . . . , 𝑣𝑚−1, 𝑡)) accepts.

• If (𝐺, 𝑠, 𝑡) ∉ MAZE, then there is no valid path from 𝑠 to 𝑡 in 𝐺. Any certificate 𝑐 will either not start at 𝑠 and end at 𝑡, or it will have some pair of adjacent vertices 𝑣𝑖, 𝑣𝑖+1 such that no edge connects the two vertices. Thus, 𝑉 ((𝐺, 𝑠, 𝑡), 𝑐) will reject.

Thus, 𝑉 is a correct and efficient verifier for MAZE, so MAZE ∈ NP.
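The following C++ sketch carries out the same checks for one possible concrete representation of the input – a vertex count plus an edge set – which, along with the names, is an illustrative assumption rather than something fixed by the text:

#include <set>
#include <utility>
#include <vector>

// Verifier sketch for MAZE: the certificate path is a sequence of vertices
// claimed to lead from s to t, mirroring the pseudocode verifier V above.
bool verify_maze(int num_vertices,
                 const std::set<std::pair<int, int>>& edges,
                 int s, int t, const std::vector<int>& path) {
  int m = static_cast<int>(path.size());
  if (m == 0 || m > num_vertices) return false;
  if (path.front() != s || path.back() != t) return false;
  for (int i = 0; i + 1 < m; ++i) {
    // accept an edge listed in either direction; the graph is undirected
    if (!edges.count({path[i], path[i + 1]}) &&
        !edges.count({path[i + 1], path[i]})) return false;
  }
  return true;
}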

Returning to the limited-budget TSP example, we define the corresponding language:

TSP = {(𝐺, 𝑘) ∶ 𝐺 is a weighted, complete graph that has a tour that visits all vertices in 𝐺 and has a total weight at most 𝑘}

We take as our certificate the actual tour and define a verifier for TSP as follows:

𝑉 on inputs (𝐺 = (𝑉,𝐸,𝑤 ∶ 𝐸 → R), 𝑘), 𝑐 = (𝑣1, . . . , 𝑣𝑚) ∶
    If 𝑚 ≠ ∣𝑉∣ + 1 or 𝑣1 ≠ 𝑣𝑚, reject
    If there are duplicates in 𝑣1, . . . , 𝑣𝑚−1, reject
    𝑡𝑜𝑡𝑎𝑙 ∶= 0
    For 𝑖 = 1 to 𝑚 − 1 ∶
        If (𝑣𝑖, 𝑣𝑖+1) ∉ 𝐸, reject
        𝑡𝑜𝑡𝑎𝑙 ∶= 𝑡𝑜𝑡𝑎𝑙 + 𝑤(𝑣𝑖, 𝑣𝑖+1)
    If 𝑡𝑜𝑡𝑎𝑙 ≤ 𝑘, accept
    Else reject


This verifier runs in time polynomial with respect to ∣𝐺∣ – finding duplicates can be done in polynomial time, e.g. by sorting the vertices, and the verifier additionally examines at most ∣𝑉∣ edges. Then if (𝐺, 𝑘) ∈ TSP, the actual tour of weight at most 𝑘 is a valid certificate. Given this certificate, 𝑉 verifies that it actually is a tour and has weight no more than 𝑘, so 𝑉 accepts. On the other hand, if (𝐺, 𝑘) ∉ TSP, then 𝑉 rejects all certificates – either the given certificate is not a valid path (e.g. it visits a nonexistent vertex), or it doesn’t start and end at the same vertex, or it doesn’t visit all the vertices, or its total weight is more than 𝑘. Thus, 𝑉 correctly and efficiently verifies TSP, and TSP ∈ NP.
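A C++ sketch of this verifier follows, again for one possible concrete representation of the weighted graph (a map from edges to weights); the representation and names are illustrative assumptions:

#include <map>
#include <set>
#include <utility>
#include <vector>

// Verifier sketch for TSP: the certificate tour is a sequence of vertices
// v_1, ..., v_m claimed to visit every vertex with total weight at most k,
// mirroring the pseudocode verifier above.
bool verify_tsp(int num_vertices,
                const std::map<std::pair<int, int>, double>& weights,
                double k, const std::vector<int>& tour) {
  int m = static_cast<int>(tour.size());
  if (m != num_vertices + 1 || tour.front() != tour.back()) return false;
  std::set<int> seen(tour.begin(), tour.end() - 1);
  if (static_cast<int>(seen.size()) != num_vertices) return false;  // duplicates
  double total = 0;
  for (int i = 0; i + 1 < m; ++i) {
    auto it = weights.find({tour[i], tour[i + 1]});
    if (it == weights.end()) it = weights.find({tour[i + 1], tour[i]});
    if (it == weights.end()) return false;    // not an edge of the graph
    total += it->second;
  }
  return total <= k;
}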

15.2 P Versus NP

We have now defined two distinct classes of languages:

• P is the class of languages that can be decided efficiently

• NP is the class of languages that can be verified efficiently

More formally, for a language 𝐿, we have 𝐿 ∈ P if there exists a polynomial-time algorithm 𝐷 such that:

• if 𝑥 ∈ 𝐿, then 𝐷 accepts 𝑥

• if 𝑥 ∉ 𝐿, then 𝐷 rejects 𝑥

Similarly, we have 𝐿 ∈ NP if there exists a polynomial-time algorithm 𝑉 such that:

• if 𝑥 ∈ 𝐿, then 𝑉 (𝑥, 𝑐) accepts for at least one certificate 𝑐

• if 𝑥 ∉ 𝐿, then 𝑉 (𝑥, 𝑐) rejects for all certificates 𝑐

How are these two classes related? First, if a language is efficiently decidable, it is trivially efficiently verifiable – a verifier can just ignore the certificate and determine independently whether the input is in the language or not, and it can do so efficiently. Thus, we have

P ⊆ NP

This relationship allows two possibilities:

• P ⊂ NP or

• P = NP

The latter possibility would mean that every efficiently verifiable problem is also efficiently decidable. Is that the case? What is the answer to the question

P ?= NP

We don’t actually know! This is perhaps the greatest unsolved problem in all of Computer Science.

Consider two of the languages we just saw, MAZE and TSP. We saw that both are in NP, and in fact, the languages have similar definitions and their verifiers have much in common. However, we know that MAZE ∈ P – a decider can perform a breadth-first search from 𝑠 to see whether it reaches 𝑡, and this search can be done efficiently. On the other hand, we do not know whether or not TSP is in P – we know of no algorithm that efficiently decides TSP, but we also do not know that such an algorithm does not exist.

How can we resolve the P ?= NP question? If the two are not actually equal, one way we could show this is to find a language in NP and show that it is not in P. However, this seems exceedingly difficult – we would need to somehow demonstrate that of all the infinitely many possible algorithms, none of them decide the language in question in polynomial time. On the other hand, showing that P = NP also seems difficult – we would need to demonstrate that for any language in NP, it is possible to write an efficient decider for that language.


How can we make the task simpler? Suppose we identify a "hard" language 𝐿, such that a solution to 𝐿 would lead to a solution to all languages in NP. Then showing P = NP would only require proving that 𝐿 ∈ P. Similarly, if 𝐿 is a "hard" language and is also in NP, it is likely to be easier to prove that 𝐿 ∉ P than it would be for an "easier" language in NP.

Informally, we say that a language 𝐿 is NP-Hard if a polynomial-time algorithm for 𝐿 can be converted to yield a polynomial-time algorithm for all languages in NP. We will formalize the definition (page 157) of an NP-Hard language later.

Do NP-Hard languages actually exist? Indeed they do, and we will discuss one such example next.


CHAPTER

SIXTEEN

THE COOK-LEVIN THEOREM

Our first example of an NP-Hard problem is Boolean satisfiability. Given a Boolean formula such as

𝜑 = (𝑦 ∨ (¬𝑥 ∧ 𝑧) ∨ ¬𝑧) ∧ (¬𝑥 ∨ 𝑧)

we wish to determine whether there is an assignment of its variables such that the formula is true. A formula is comprised of several components:

• variables such as 𝑥 and 𝑦, which can either be true (often represented as 1) or false (represented as 0)

• literals, which are either a variable or its negation (e.g. 𝑥 or ¬𝑥); each appearance is considered to be a distinct literal

• operators, which may either be conjunction (and), represented as ∧, or disjunction (or), represented as ∨

The size of a formula is the number of literals it contains; the formula 𝜑 above has size 6.

An assignment is a mapping of the variables in a formula to truth values. We can represent an assignment over 𝑛 variables as an 𝑛-tuple, such as (𝑎1, . . . , 𝑎𝑛). For the formula above, the assignment (0, 1, 0) maps 𝑥 to the value 0, 𝑦 to the value 1, and 𝑧 to the value 0. We use the notation 𝜑(𝑎1, . . . , 𝑎𝑛) to denote the value of 𝜑 given the assignment (𝑎1, . . . , 𝑎𝑛).

A satisfying assignment is an assignment that causes the formula as a whole to be true. In the example above, (0, 1, 0) is a satisfying assignment:

𝜑(0, 1, 0) = (1 ∨ (¬0 ∧ 0) ∨ ¬0) ∧ (¬0 ∨ 0)
           = (1 ∨ (1 ∧ 0) ∨ 1) ∧ (1 ∨ 0)
           = 1 ∧ 1
           = 1

On the other hand, (1, 0, 1) is not a satisfying assignment:

𝜑(1, 0, 1) = (0 ∨ (¬1 ∧ 1) ∨ ¬1) ∧ (¬1 ∨ 1)
           = (0 ∨ (0 ∧ 1) ∨ 0) ∧ (0 ∨ 1)
           = 0 ∧ 1
           = 0

A formula 𝜑 is satisfiable if it has a satisfying assignment. The language corresponding to whether a formula is satisfiable is just the set of satisfiable formulas:

SAT = {𝜑 ∶ 𝜑 is a satisfiable Boolean formula}

This language is decidable. A formula 𝜑 of size 𝑛 has at most 𝑛 variables, so there are at most 2ⁿ possible assignments.

A decider can just iterate over each of these assignments, compute the value of the formula on each assignment, accept if an assignment results in a true value, and reject if none of the assignments results in a true value.
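As a rough illustration, the following Python sketch implements this exponential-time decider for a formula represented as a Python function of its variables; the representation is chosen here purely for brevity and is not fixed by the text:

from itertools import product

def brute_force_sat(phi, num_vars):
    # Try all 2^n assignments; accept if any one of them satisfies phi.
    for assignment in product([False, True], repeat=num_vars):
        if phi(*assignment):
            return True    # accept
    return False           # reject

# The example formula from above: (y or (not x and z) or not z) and (not x or z)
phi = lambda x, y, z: (y or (not x and z) or not z) and (not x or z)
print(brute_force_sat(phi, 3))   # True, e.g. via the assignment (0, 1, 0)

The loop runs up to 2ⁿ times, which is exactly why this decider is not efficient.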


SAT can also be efficiently verified. The certificate is just an assignment, and the verifier need only compute the value of the formula for the given assignment. This can be done in linear time. By definition, a formula 𝜑 is satisfiable if and only if it has a satisfying assignment. Then if 𝜑 is satisfiable, there is a certificate (a satisfying assignment) that causes the verifier to accept, and if 𝜑 is not satisfiable, no certificate results in the verifier accepting. Thus, SAT is efficiently verifiable, and SAT ∈ NP.

Is SAT efficiently decidable? The decider we discussed above is not efficient – it takes time exponential in the number of variables in the formula, and the size of the formula (i.e. the number of literals) may be the same as the number of variables. We actually do not know whether SAT is efficiently decidable – the question of whether SAT ∈ P is unresolved. However, the Cook-Levin theorem states that SAT is NP-Hard; thus, if SAT is actually in P, then P = NP.

Theorem 102 (Cook-Levin) SAT is NP-Hard.

The key idea in the proof of this theorem is that for an arbitrary language 𝐿 ∈ NP and input 𝑥, we can turn an efficient verifier 𝑉 for 𝐿 and the input 𝑥 into a Boolean formula 𝜑𝑉,𝑥 such that:

• if 𝑥 ∈ 𝐿, then 𝜑𝑉,𝑥 ∈ SAT

• if 𝑥 ∉ 𝐿, then 𝜑𝑉,𝑥 ∉ SAT

The size of 𝜑𝑉,𝑥 will be polynomial in the size of 𝑥. Then if SAT has an efficient decider 𝐷SAT, we can decide 𝐿 as follows:

𝐷𝐿 on input 𝑥 ∶
    Construct 𝜑𝑉,𝑥
    Run 𝐷SAT on 𝜑𝑉,𝑥 and answer as 𝐷SAT

Since 𝜑𝑉,𝑥 can be constructed in polynomial time with respect to the size of 𝑥 and 𝐷SAT runs in time polynomial with respect to the size of 𝜑𝑉,𝑥, 𝐷𝐿 as a whole takes polynomial time. Thus, 𝐷𝐿 is an efficient decider for 𝐿. So if an efficient decider 𝐷SAT for SAT exists, then 𝐿 ∈ P. Since this is true for an arbitrary language 𝐿 ∈ NP, we have that if SAT ∈ P, then P = NP.

Before we proceed to the construction of 𝜑𝑉,𝑥, we review the string representation of the configuration of a Turing machine, which we previously discussed in Wang Tiling (page 111). A configuration encodes the contents of the machine's tape, the state 𝑞 ∈ 𝑄 in which the machine is, and the position of the head. We represent these details with an infinite string over the alphabet Γ ∪ 𝑄, where the string contains:

• a symbol for the contents of each cell, in order starting from the leftmost cell

• the state 𝑞 ∈ 𝑄, positioned immediately to the left of the symbol corresponding to the cell on which the head is located

Then if the input is 𝑥1𝑥2 . . . 𝑥𝑛 and the initial state is 𝑞0, the following encodes the starting configuration of the machine:

𝑞0𝑥1𝑥2 . . . 𝑥𝑛⊥⊥ . . .

Since the head is at the leftmost cell and the state is 𝑞0, the string has 𝑞0 to the left of the leftmost tape symbol. If the transition function gives 𝛿(𝑞0, 𝑥1) = (𝑞′, 𝑥′, 𝑅), the new configuration is:

𝑥′𝑞′𝑥2 . . . 𝑥𝑛⊥⊥ . . .

The first cell has been modified to 𝑥′, the machine is in state 𝑞′, and the head is over the second cell, represented here by writing 𝑞′ to the left of that cell's symbol.

For a language 𝐿 ∈ NP, there exists an efficient verifier 𝑉(𝑥, 𝑐) that runs in time polynomial with respect to the size of 𝑥. In other words, there is some fixed 𝑘 such that 𝑉(𝑥, 𝑐) makes at most ∣𝑥∣ᵏ steps before halting. Since the head starts on the leftmost cell and can only move one position in each step, this implies that 𝑉(𝑥, 𝑐) can only read or write the first ∣𝑥∣ᵏ cells on the tape. Thus, we only need a string of size ∣𝑥∣ᵏ to encode the configuration of 𝑉, as opposed to an infinite string.

For an input of size 𝑛, we can represent the configurations of the machine over the (at most) 𝑛ᵏ steps with an 𝑛ᵏ × 𝑛ᵏ tableau, with one configuration per row. In each row, we place the symbol # at the start and end⁴².

[Figure: an 𝑛ᵏ × 𝑛ᵏ tableau, each row bracketed by # symbols – the first row holds the initial configuration, the second the configuration after one step, and so on; 𝑉 halts after at most 𝑛ᵏ steps.]

Each cell of the tableau contains a symbol out of the set 𝑆 = Γ ∪ 𝑄 ∪ {#}, where Γ is the tape alphabet of 𝑉 and 𝑄 is the set of states in 𝑉. For each cell 𝑖, 𝑗 and each symbol 𝑠 ∈ 𝑆, we define a Boolean variable 𝑡𝑖,𝑗,𝑠 such that:

𝑡𝑖,𝑗,𝑠 = 1 if the cell in row 𝑖 and column 𝑗 contains the symbol 𝑠, and 𝑡𝑖,𝑗,𝑠 = 0 otherwise

This gives us a total of ∣𝑆∣ ⋅ 𝑛²ᵏ variables, which is polynomial in 𝑛; 𝑆 does not depend on 𝑛 in any way, so its size is constant with respect to 𝑛.

Given an input 𝑥, we construct a formula 𝜑𝑉,𝑥 over the ∣𝑆∣ ⋅ 𝑛²ᵏ variables 𝑡𝑖,𝑗,𝑠, where 𝑛 = ∣𝑥∣, such that:

• if 𝑉(𝑥, 𝑐) accepts for some certificate 𝑐, then 𝜑𝑉,𝑥 ∈ SAT

• if 𝑉(𝑥, 𝑐) rejects for all certificates 𝑐, then 𝜑𝑉,𝑥 ∉ SAT

The formula 𝜑𝑉,𝑥 will be defined as the conjunction of several subformulas:

𝜑𝑉,𝑥 = 𝜑𝑉,𝑥,𝑠𝑡𝑎𝑟𝑡 ∧ 𝜑𝑉,𝑥,𝑐𝑒𝑙𝑙 ∧ 𝜑𝑉,𝑥,𝑎𝑐𝑐𝑒𝑝𝑡 ∧ 𝜑𝑉,𝑥,𝑚𝑜𝑣𝑒

The purpose of these subformulas is as follows:

• 𝜑𝑉,𝑥,𝑠𝑡𝑎𝑟𝑡 enforces the starting configuration in the first row of the tableau

• 𝜑𝑉,𝑥,𝑐𝑒𝑙𝑙 ensures that each cell contains exactly one symbol

• 𝜑𝑉,𝑥,𝑎𝑐𝑐𝑒𝑝𝑡 ensures that 𝑉 reaches an accepting configuration

• 𝜑𝑉,𝑥,𝑚𝑜𝑣𝑒 enforces that each configuration follows from the previous configuration according to the transition function of 𝑉

⁴² Technically, this means we need 𝑛ᵏ + 3 columns to represent a configuration: 𝑛ᵏ tape cells, plus the state 𝑞 and the two # symbols. However, this technicality is not important to the proof, so we ignore it for the rest of our discussion.


16.1 Starting Configuration

For the starting configuration, we assume an encoding of the input (𝑥, 𝑐) such that the symbol $ separates 𝑥 from 𝑐. Let 𝑚 be the size of the certificate. Then the starting configuration is as follows:

# 𝑞𝑠𝑡𝑎𝑟𝑡 𝑥1 ⋯ 𝑥𝑛 $ 𝑐1 ⋯ 𝑐𝑚 ⊥ ⋯ #

The configuration starts with the # symbol. The head is over the leftmost tape cell and the machine is in state 𝑞𝑠𝑡𝑎𝑟𝑡, so 𝑞𝑠𝑡𝑎𝑟𝑡 is in the second position of the configuration. The symbols of the input 𝑥 follow, then the $ separator and the symbols of the certificate 𝑐. The remaining cells are blank, except for the last which contains the symbol #.

For each of the input symbols 𝑥𝑗, we require the tableau cell at position 1, 𝑗 + 2 to contain the symbol 𝑥𝑗. The corresponding variable 𝑡1,𝑗+2,𝑥𝑗 must be true. Thus, we ensure the correct input with:

𝜑𝑖𝑛𝑝𝑢𝑡 = 𝑡1,3,𝑥1 ∧ 𝑡1,4,𝑥2 ∧ ⋯ ∧ 𝑡1,𝑛+2,𝑥𝑛

For the cells corresponding to the certificate, we do not actually know in advance what certificate causes 𝑉 to accept, or even what the size 𝑚 is (other than that it is less than 𝑛ᵏ). As a result, we allow any symbol 𝛾 ∈ Γ to appear in the positions after the input 𝑥. Then the portion of the formula corresponding to the certificate is:

𝜑𝑐𝑒𝑟𝑡𝑖𝑓𝑖𝑐𝑎𝑡𝑒 = (⋁𝛾∈Γ 𝑡1,𝑛+4,𝛾) ∧ ⋯ ∧ (⋁𝛾∈Γ 𝑡1,𝑛ᵏ−1,𝛾)

Putting these pieces together with the rest of the starting configuration, we have:

𝜑𝑉,𝑥,𝑠𝑡𝑎𝑟𝑡 = 𝑡1,1,# ∧ 𝑡1,2,𝑞𝑠𝑡𝑎𝑟𝑡 ∧ 𝜑𝑖𝑛𝑝𝑢𝑡 ∧ 𝑡1,𝑛+3,$ ∧ 𝜑𝑐𝑒𝑟𝑡𝑖𝑓𝑖𝑐𝑎𝑡𝑒 ∧ 𝑡1,𝑛ᵏ,#

This formula contains one literal for each column of the input, the state, and the # and $ symbols. It contains ∣Γ∣ literals for each possible component of the certificate. Observe that ∣Γ∣ is a constant that depends on the verifier 𝑉 but not on the input 𝑥. Thus, there are a constant number of literals for each element of the starting configuration, for a total of O(𝑛ᵏ) literals.

16.2 Cell Consistency

To ensure that each cell 𝑖, 𝑗 contains exactly one symbol, we need at least one of the variables 𝑡𝑖,𝑗,𝑠 to be true:

⋁𝑠∈𝑆 𝑡𝑖,𝑗,𝑠

We also cannot have more than one such variable be true – otherwise the cell would contain more than one symbol. For any pair of distinct symbols 𝑠 ≠ 𝑢, we require that at least one of 𝑡𝑖,𝑗,𝑠 or 𝑡𝑖,𝑗,𝑢 be false:

⋀𝑠≠𝑢∈𝑆 (¬𝑡𝑖,𝑗,𝑠 ∨ ¬𝑡𝑖,𝑗,𝑢)

Putting these together for all cells 𝑖, 𝑗, we get:

𝜑𝑉,𝑥,𝑐𝑒𝑙𝑙 = ⋀1≤𝑖,𝑗≤𝑛ᵏ [⋁𝑠∈𝑆 𝑡𝑖,𝑗,𝑠 ∧ ⋀𝑠≠𝑢∈𝑆 (¬𝑡𝑖,𝑗,𝑠 ∨ ¬𝑡𝑖,𝑗,𝑢)]

The formula contains O(∣𝑆∣²) literals for each cell. Since 𝑆 does not depend on the input size, there are a constant number of literals per cell, and 𝜑𝑉,𝑥,𝑐𝑒𝑙𝑙 has O(𝑛²ᵏ) literals in total.
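To make the structure of 𝜑𝑉,𝑥,𝑐𝑒𝑙𝑙 concrete, the following Python sketch enumerates its clauses for a given symbol set and tableau dimension, with the variable 𝑡𝑖,𝑗,𝑠 represented simply as the tuple (i, j, s) and a literal as a (variable, polarity) pair; the names and the toy sizes are illustrative assumptions:

from itertools import combinations

def cell_consistency_clauses(side, symbols):
    # Yield the clauses of phi_cell; each clause is a list of literals.
    for i in range(1, side + 1):
        for j in range(1, side + 1):
            # At least one symbol occupies cell (i, j).
            yield [((i, j, s), True) for s in symbols]
            # No two distinct symbols occupy cell (i, j).
            for s, u in combinations(symbols, 2):
                yield [((i, j, s), False), ((i, j, u), False)]

# A toy 2-by-2 tableau over three symbols: 4 cells * (1 + C(3,2)) = 16 clauses.
print(len(list(cell_consistency_clauses(2, ['0', '1', '#']))))   # 16

Each cell contributes a constant number of clauses, mirroring the O(𝑛²ᵏ) bound above.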


16.3 Accepting Configuration

We ensure that an accepting configuration is reached by requiring at least one cell to have 𝑞𝑎𝑐𝑐𝑒𝑝𝑡 as its symbol:

𝜑𝑉,𝑥,𝑎𝑐𝑐𝑒𝑝𝑡 = ⋁1≤𝑖,𝑗≤𝑛ᵏ 𝑡𝑖,𝑗,𝑞𝑎𝑐𝑐𝑒𝑝𝑡

This has one literal per cell, for a total of O(𝑛²ᵏ) literals.

16.4 Transitions

To ensure that each row follows from the previous one according to the transition function of 𝑉, we divide the tableau into 2-by-3 windows and ensure that each window is valid:

[Figure: the 𝑛ᵏ × 𝑛ᵏ tableau with one 2-by-3 window highlighted, spanning three adjacent columns of two consecutive rows.]

There are ∣𝑆∣⁶ distinct possibilities for a 2-by-3 window. However, not all windows are valid according to the behavior of 𝑉 and the construction of the tableau. We describe the conditions for valid windows below. We use the following notation in the descriptions and diagrams:

• 𝛾 for an element of Γ

• 𝑞 for an element of 𝑄

• 𝜌 for an element of Γ ∪𝑄

• 𝜎 for an element of Γ ∪ {#}.

For a window to be valid with respect to the # symbol, one of the following must hold:

• the window does not contain any # symbols at all


• the window contains the # symbol in the first position of both rows

• the window contains the # symbol in the third position of both rows

The following are valid windows with respect to #:

# 𝜌1 𝜌2        𝜌1 𝜌2 𝜌3        𝜌1 𝜌2 #
# 𝜌3 𝜌4        𝜌4 𝜌5 𝜌6        𝜌3 𝜌4 #

The remaining symbols 𝜌1, . . . , 𝜌6 must be valid according to the rules below.

If a window is not near a state symbol 𝑞 ∈ 𝑄, then a transition does not affect the portion of the configuration corresponding to the window, so the top and bottom row match:

𝜎1 𝛾 𝜎2
𝜎1 𝛾 𝜎2

To reason about what happens to a window that is near a state symbol, we take a look at what happens to the configuration as a result of a rightward transition 𝛿(𝑞, 𝛾) = (𝑞′, 𝛾′, 𝑅):

𝛾1 𝛾2 𝛾3 𝛾4 𝑞  𝛾  𝛾5 𝛾6 𝛾7
𝛾1 𝛾2 𝛾3 𝛾4 𝛾′ 𝑞′ 𝛾5 𝛾6 𝛾7

The head moves to the right, so 𝑞′ appears in the bottom row at the column to the right of where 𝑞 appears in the top row. The symbol 𝛾 is replaced by 𝛾′, though its position moves to the left to compensate for the rightward movement of the state. There are four windows affected by the transition. We thus have the following four valid 2-by-3 windows corresponding to a rightward transition 𝛿(𝑞, 𝛾) = (𝑞′, 𝛾′, 𝑅):

𝜎 𝛾1 𝑞        𝜎 𝑞 𝛾        𝑞 𝛾 𝜎        𝛾 𝛾1 𝜎
𝜎 𝛾1 𝛾′       𝜎 𝛾′ 𝑞′      𝛾′ 𝑞′ 𝜎      𝑞′ 𝛾1 𝜎

We have used 𝜎 for the edges of the window, as it can actually either be a tape symbol or the tableau-edge marker #.

We now take a look at what happens to the configuration as a result of a leftward transition 𝛿(𝑞, 𝛾) = (𝑞′, 𝛾 ′, 𝐿):


𝛾1 𝛾2 𝛾3 𝛾4 𝑞  𝛾  𝛾5 𝛾6 𝛾7
𝛾1 𝛾2 𝛾3 𝑞′ 𝛾4 𝛾′ 𝛾5 𝛾6 𝛾7

The head moves to the left, so 𝑞′ appears in the bottom row at the column to the left of where 𝑞 appears in the top row. As with a rightward transition, we replace the old symbol 𝛾 with the new symbol 𝛾′. This time, however, it is the symbol to the left of the original position of the head that moves to compensate for the movement of the state. We now have five windows affected by this transition, giving us the following five valid 2-by-3 windows for a leftward transition 𝛿(𝑞, 𝛾) = (𝑞′, 𝛾′, 𝐿):

𝜎 𝛾1 𝛾2       𝜎 𝛾2 𝑞       𝑞 𝛾 𝜎        𝛾 𝛾2 𝜎       𝛾2 𝑞 𝛾
𝜎 𝛾1 𝑞′       𝜎 𝑞′ 𝛾2      𝛾2 𝛾′ 𝜎      𝛾′ 𝛾2 𝜎      𝑞′ 𝛾2 𝛾′

We also need to account for the case where the head is over the leftmost cell on the tape, in which case the head does not move:

# 𝑞  𝛾  𝛾1 𝛾2 𝛾3
# 𝑞′ 𝛾′ 𝛾1 𝛾2 𝛾3

Here, we have three windows affected, but the last actually has the same structure as the last window in the normal case for a leftward transition. Thus, we only need two additional valid windows:

# 𝑞 𝛾         𝑞 𝛾 𝛾1
# 𝑞′ 𝛾′       𝑞′ 𝛾′ 𝛾1

We need one last set of valid windows to account for the machine reaching the accept state before the last row. We allow the accepting configuration to repeat indefinitely as needed by including windows with 𝑞𝑎𝑐𝑐𝑒𝑝𝑡 at the same position in both rows:


𝑞𝑎𝑐𝑐𝑒𝑝𝑡 𝛾 𝜎        𝜎1 𝑞𝑎𝑐𝑐𝑒𝑝𝑡 𝜎2        𝜎 𝛾 𝑞𝑎𝑐𝑐𝑒𝑝𝑡
𝑞𝑎𝑐𝑐𝑒𝑝𝑡 𝛾 𝜎        𝜎1 𝑞𝑎𝑐𝑐𝑒𝑝𝑡 𝜎2        𝜎 𝛾 𝑞𝑎𝑐𝑐𝑒𝑝𝑡

For each valid window 𝑤 and each starting position 1 ≤ 𝑖 ≤ 𝑛ᵏ − 1, 1 ≤ 𝑗 ≤ 𝑛ᵏ − 2, we define a clause 𝜑𝑖,𝑗,𝑤 corresponding to that window. Let 𝑠1, . . . , 𝑠6 be the six symbols in 𝑤. The clause is then:

𝜑𝑖,𝑗,𝑤 = (𝑡𝑖,𝑗,𝑠1 ∧ 𝑡𝑖,𝑗+1,𝑠2 ∧ 𝑡𝑖,𝑗+2,𝑠3 ∧ 𝑡𝑖+1,𝑗,𝑠4 ∧ 𝑡𝑖+1,𝑗+1,𝑠5 ∧ 𝑡𝑖+1,𝑗+2,𝑠6)

For each position, we only need one such clause to be satisfied over the set of valid windows. Let 𝑊 be the set of all valid windows. Then we need:

𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒 = ⋁𝑤∈𝑊 𝜑𝑖,𝑗,𝑤

Finally, we need every window in the tableau to be valid, so we have:

𝜑𝑉,𝑥,𝑚𝑜𝑣𝑒 = ⋀1≤𝑖≤𝑛ᵏ−1, 1≤𝑗≤𝑛ᵏ−2 𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒

The set of valid windows 𝑊 does not depend on the input size 𝑛, so 𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒 has a constant number of literals. Then 𝜑𝑉,𝑥,𝑚𝑜𝑣𝑒 has a different copy of 𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒 for each cell window that starts at 𝑖, 𝑗. There are O(𝑛²ᵏ) windows, so 𝜑𝑉,𝑥,𝑚𝑜𝑣𝑒 has O(𝑛²ᵏ) literals in total.

16.5 Conclusion

We have demonstrated how to construct a formula 𝜑𝑉,𝑥 from an efficient verifier 𝑉 for a language 𝐿 ∈ NP and an input 𝑥. The total size of 𝜑𝑉,𝑥 is O(𝑛²ᵏ) literals, where 𝑛 = ∣𝑥∣. This is polynomial in the size of 𝑥, and the construction can be done in polynomial time.

For the formula to be satisfied, 𝜑𝑉,𝑥,𝑎𝑐𝑐𝑒𝑝𝑡 must be satisfied, and it ensures that an accepting configuration is reached somewhere in the tableau. The remaining pieces of 𝜑𝑉,𝑥 ensure that the computation leading to that accepting configuration is valid. If 𝛼 is a satisfying assignment for 𝜑𝑉,𝑥, the certificate 𝑐 that causes 𝑉(𝑥, 𝑐) to accept is comprised of the variables 𝑡1,𝑛+4,𝛾, . . . , 𝑡1,𝑛ᵏ−1,𝛾 that are true in 𝛼 (ignoring the trailing ⊥ symbols). A satisfying assignment exists if and only if a certificate 𝑐 exists such that 𝑉(𝑥, 𝑐) accepts.

Thus, if we have an efficient decider 𝐷SAT for SAT, we can construct 𝜑𝑉,𝑥 and run 𝐷SAT on it to decide 𝐿. As a result, an efficient decider for SAT implies that P = NP, and SAT is an NP-Hard problem.


CHAPTER

SEVENTEEN

NP-COMPLETENESS

The Cook-Levin theorem gives us a process for converting an instance of an arbitrary NP language 𝐿 to an instance of SAT, such that:

• if 𝑥 ∈ 𝐿, then 𝜑𝑉,𝑥 ∈ SAT

• if 𝑥 ∉ 𝐿, then 𝜑𝑉,𝑥 ∉ SAT

The process for doing so is efficient, taking time polynomial in the size of 𝑥. Thus, if we have an efficient decider for SAT, we can also efficiently decide 𝐿 by performing the Cook-Levin conversion and running the SAT decider on the resulting formula.

This conversion of an instance of one problem to an instance of another is called a mapping reduction. Formally, a mapping reduction is defined as follows:

Definition 103 (Mapping Reduction) A mapping reduction from language 𝐴 to language 𝐵 is a computable function (page 128) 𝑓 ∶ Σ∗ → Σ∗ such that:

• 𝑥 ∈ 𝐴 ⟹ 𝑓(𝑥) ∈ 𝐵

• 𝑥 ∉ 𝐴 ⟹ 𝑓(𝑥) ∉ 𝐵

[Figure: a mapping reduction 𝑓 from 𝐴 to 𝐵, mapping each yes instance 𝑥 ∈ 𝐴 to a yes instance 𝑓(𝑥) ∈ 𝐵 and each no instance 𝑦 ∉ 𝐴 to a no instance 𝑓(𝑦) ∉ 𝐵, within the universe Σ∗ on each side.]

A mapping reduction 𝑓 may or may not be computable efficiently – it is only efficient if 𝑓 is polynomial-time computable.

Definition 104 (Polynomial-time Computable) A function 𝑓 is polynomial-time computable if 𝑓(𝑥) can be computed in time polynomial with respect to ∣𝑥∣, the size of 𝑥.


Definition 105 (Polynomial-time Mapping Reduction) A polynomial-time mapping reduction, also called a polynomial-time reduction or just a polytime reduction, is a mapping reduction 𝑓 that is polynomial-time computable.

A language 𝐴 is polynomial-time reducible, or polytime reducible, to a language 𝐵 if there is a polynomial-time reduction from 𝐴 to 𝐵. We use the notation

𝐴 ≤p 𝐵

to indicate that 𝐴 is polytime reducible to 𝐵.

Observe that there are several differences between Turing reducibility (page 103) and polynomial-time reducibility.

• A Turing reduction 𝐴 ≤T 𝐵 requires demonstrating that a decider for 𝐴 can be written using a black-box decider for 𝐵. A polynomial-time mapping reduction 𝐴 ≤p 𝐵 does not involve a black box. Instead, it requires demonstrating a computable function that converts instances of 𝐴 to instances of 𝐵 and non-instances of 𝐴 to non-instances of 𝐵.

• There are no efficiency requirements on a Turing reduction – the black box may be invoked any number of times, and the decider that uses the black box may take arbitrary (but finite) time. For a polynomial-time mapping reduction, the conversion function 𝑓(𝑥) must be computable in time polynomial in ∣𝑥∣.

We saw previously that 𝐴 ≤T 𝐵 and 𝐵 decidable implies that 𝐴 is decidable⁴³. A polytime reduction 𝐴 ≤p 𝐵 gives us a similar result with respect to membership in P.

Theorem 106 If 𝐴 ≤p 𝐵 and 𝐵 ∈ P, then 𝐴 ∈ P.

Proof 107 Since 𝐵 ∈ P, there exists a polynomial-time decider 𝐷𝐵 such that:

• if 𝑦 ∈ 𝐵, then 𝐷𝐵 accepts 𝑦 in time O(∣𝑦∣ᵏ) for some constant 𝑘 ≥ 1

• if 𝑦 ∉ 𝐵, then 𝐷𝐵 rejects 𝑦 in time O(∣𝑦∣ᵏ)

Additionally, since 𝐴 ≤p 𝐵, there exists a function 𝑓 such that:

• if 𝑥 ∈ 𝐴, then 𝑓(𝑥) ∈ 𝐵

• if 𝑥 ∉ 𝐴, then 𝑓(𝑥) ∉ 𝐵

Furthermore, 𝑓 is computable in polynomial time, so there exists a Turing machine 𝐶 such that:

• given 𝑥 ∈ 𝐴, 𝐶 outputs 𝑓(𝑥) ∈ 𝐵 in O(∣𝑥∣ᵐ) time for some constant 𝑚 ≥ 1

• given 𝑥 ∉ 𝐴, 𝐶 outputs 𝑓(𝑥) ∉ 𝐵 in O(∣𝑥∣ᵐ) time

We can use 𝐶 and 𝐷𝐵 to construct a polynomial-time decider for 𝐴:

𝐷𝐴 on input 𝑥 ∶
    Let 𝑦 be the result of 𝐶 on 𝑥
    Run 𝐷𝐵 on 𝑦 and answer as 𝐷𝐵

We analyze this decider as follows:

• If 𝑥 ∈ 𝐴, then 𝑦 ∈ 𝐵 since 𝐶 computes 𝑓(𝑥), and 𝑥 ∈ 𝐴 ⟹ 𝑓(𝑥) ∈ 𝐵. Then 𝐷𝐵 accepts 𝑦, so 𝐷𝐴 accepts 𝑥.

• If 𝑥 ∉ 𝐴, then 𝑦 ∉ 𝐵 since 𝑥 ∉ 𝐴 ⟹ 𝑓(𝑥) ∉ 𝐵. Then 𝐷𝐵 rejects 𝑦, so 𝐷𝐴 rejects 𝑥.

⁴³ Denoting the class of decidable languages by R, we have: if 𝐴 ≤T 𝐵 and 𝐵 ∈ R, then 𝐴 ∈ R.


Thus, 𝐷𝐴 is a decider for 𝐴. Furthermore, 𝐷𝐴 runs in time polynomial with respect to ∣𝑥∣. The invocation of 𝐶 on 𝑥 takes time O(∣𝑥∣ᵐ), and the size of its output 𝑦 can be at most O(∣𝑥∣ᵐ) – a Turing machine can write at most one symbol of the output in each step. Then the invocation of 𝐷𝐵 on 𝑦 takes time in O(∣𝑦∣ᵏ) = O(∣𝑥∣ᵐᵏ). The total time for 𝐷𝐴 is then O(∣𝑥∣ᵐ + ∣𝑥∣ᵐᵏ) = O(∣𝑥∣ᵐᵏ), which is polynomial in ∣𝑥∣ since 𝑚𝑘 is a constant. Thus, 𝐷𝐴 is an efficient decider for 𝐴.

Since 𝐴 has an efficient decider, we conclude that 𝐴 ∈ P.
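The decider 𝐷𝐴 in this proof is nothing more than the composition of the conversion machine 𝐶 with the decider 𝐷𝐵. A schematic Python sketch, with f and decide_B standing in for the (assumed) polynomial-time reduction and decider:

def make_decider_A(f, decide_B):
    # Compose a polytime reduction f with a polytime decider for B.
    def decide_A(x):
        y = f(x)              # takes O(|x|^m) steps, so |y| is at most O(|x|^m)
        return decide_B(y)    # takes O(|y|^k) = O(|x|^(m*k)) steps
    return decide_A

The composition runs in polynomial time because a polynomial of a polynomial is again a polynomial.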

The Cook-Levin theorem demonstrates that 𝐿 ≤p SAT for any language 𝐿 ∈ NP. Thus, if SAT ∈ P, all languages in NP are in P, so P = NP. We stated previously that informally, a language 𝐴 is NP-Hard if 𝐴 ∈ P implies that P = NP. We now formalize the definition of NP-Hard.

Definition 108 (NP-Hard) A language 𝐴 is NP-Hard if for every language 𝐿 ∈ NP, 𝐿 ≤p 𝐴.

A language need not be in NP or even decidable to be NP-Hard. For instance, 𝐿ACC is an NP-Hard language; we leave the proof as an exercise.

If a language 𝐴 is both in NP and also NP-Hard, then we say that 𝐴 is NP-Complete.

Definition 109 (NP-Complete) A language 𝐴 is NP-Complete if 𝐴 ∈ NP and 𝐴 is NP-Hard.

Since SAT ∈ NP and SAT is NP-Hard, SAT is an NP-Complete language.

The following demonstrates the relationships between P, NP, and NP-Hard and NP-Complete languages, under the two possibilities P ⊂ NP and P = NP:

[Figure: two diagrams. Under P ⊂ NP, P lies strictly inside NP, NP-Hard lies outside P, and NP-Complete = NP ∩ NP-Hard. Under P = NP, the classes P, NP, and NP-Complete (approximately) coincide, with NP-Hard containing them.]

If P = NP, then all languages in NP are NP-Complete, except the two trivial languages Σ∗ and ∅.

To show that a language 𝐴 ≠ SAT is NP-Hard, do we need to repeat the work of the Cook-Levin theorem on 𝐴? Recall that for showing that a language is undecidable, we could either do so directly like we did for 𝐿ACC, or we could do so via a Turing reduction from a known undecidable language. Similarly, to show that a language is NP-Hard, we can either do so directly as we did for SAT, or we can do so via a polytime reduction from a known NP-Hard language.

Claim 110 If 𝐴 ≤p 𝐵 and 𝐴 is NP-Hard, then 𝐵 is NP-Hard.


This claim follows from the fact that polytime reductions are transitive. By definition, 𝐴 is NP-Hard if for every language 𝐿 ∈ NP, 𝐿 ≤p 𝐴. Since we have 𝐿 ≤p 𝐴 ≤p 𝐵, we have 𝐿 ≤p 𝐵 by transitivity. Thus, for every language 𝐿 ∈ NP, 𝐿 ≤p 𝐵, and 𝐵 is NP-Hard.

Combining Theorem 106 and Claim 110, the reduction 𝐴 ≤p 𝐵 gives us the following information:

Known Fact       Conclusion
𝐴 ∈ P            ?
𝐴 NP-Hard        𝐵 NP-Hard
𝐵 ∈ P            𝐴 ∈ P
𝐵 NP-Hard        ?

Note, however, that 𝐿 ∈ P and 𝐿 being NP-Hard are only mutually exclusive if P ≠ NP, and we do not know if that is the case.

Exercise 111 Show that 𝐿ACC is NP-Hard.

Exercise 112 Show that polytime mapping reductions are transitive. That is, if 𝐴 ≤p 𝐵 and 𝐵 ≤p 𝐶, then 𝐴 ≤p 𝐶.

Exercise 113 Show that the languages Σ∗ and ∅ cannot be NP-Complete, even if P = NP.

17.1 3SAT

As a second example of an NP-Complete problem, we consider satisfiability of Boolean formulas with a specific structure. First, we introduce a few more terms that apply to Boolean formulas:

• A clause is a disjunction of literals, such as (𝑎 ∨ 𝑏 ∨ ¬𝑐 ∨ 𝑑).

• A Boolean formula is in conjunctive normal form (CNF) if it consists of a conjunction of clauses. An example of a CNF formula is:

𝜑 = (𝑎 ∨ 𝑏 ∨ ¬𝑐 ∨ 𝑑) ∧ (¬𝑎) ∧ (𝑎 ∨ ¬𝑏) ∧ (𝑐 ∨ 𝑑)

• A 3CNF formula is a Boolean formula in conjunctive normal form where each clause has exactly three literals, such as the following:

𝜑 = (𝑎 ∨ 𝑏 ∨ ¬𝑐) ∧ (¬𝑎 ∨ ¬𝑎 ∨ 𝑑) ∧ (𝑎 ∨ ¬𝑏 ∨ 𝑑) ∧ (𝑐 ∨ 𝑑 ∨ 𝑑)

We then define the language 3SAT as the set of satisfiable 3CNF formulas:

3SAT = {𝜑 ∶ 𝜑 is a satisfiable 3CNF formula}

Observe that 3SAT contains strictly fewer formulas than SAT, i.e. 3SAT ⊂ SAT. Despite this, we will demonstrate that 3SAT is NP-Complete.

First, we must show that 3SAT ∈ NP. The verifier is the same as for SAT, except that it also checks whether the formula is in 3CNF:

𝑉 on inputs 𝜑 (a Boolean formula), 𝛼 (an assignment) ∶
    If 𝜑 is not in 3CNF or 𝛼 is not an assignment for the variables in 𝜑, reject
    Compute 𝜑(𝛼)
    If 𝜑(𝛼) is true, accept
    Else reject

The value of 𝜑(𝛼) can be computed in linear time, and we can also check that 𝜑 is in 3CNF and that 𝛼 is a valid assignment in linear time, so 𝑉 is efficient in ∣𝜑∣. Furthermore, if 𝜑 is in 3CNF and is satisfiable, there is some certificate 𝛼 such that 𝑉(𝜑, 𝛼) accepts, but if 𝜑 is not in 3CNF or is unsatisfiable, 𝑉(𝜑, 𝛼) rejects for all 𝛼. Thus, 𝑉 efficiently verifies 3SAT.
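A small Python sketch of this verifier, assuming a 3CNF formula is encoded as a list of clauses, each clause a list of three (variable, is_positive) literals, and an assignment as a dictionary from variables to booleans; the encoding is an illustrative assumption:

def verify_3sat(clauses, assignment):
    # Check the 3CNF structure and that the assignment covers exactly the variables of phi.
    variables = {var for clause in clauses for var, _ in clause}
    if any(len(clause) != 3 for clause in clauses):
        return False
    if set(assignment) != variables:
        return False
    # The formula is satisfied when every clause has at least one true literal.
    return all(any(assignment[var] == positive for var, positive in clause)
               for clause in clauses)

# (a or b or not c) and (not a or not a or d)
phi = [[('a', True), ('b', True), ('c', False)],
       [('a', False), ('a', False), ('d', True)]]
print(verify_3sat(phi, {'a': False, 'b': True, 'c': True, 'd': False}))   # True

Both checks and the evaluation are linear in the size of the formula, matching the analysis above.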

We now show that 3SAT is NP-Hard with the reduction SAT ≤p 3SAT. Given a Boolean formula 𝜑, we need to efficiently compute a new formula 𝑓(𝜑) such that 𝜑 ∈ SAT ⟺ 𝑓(𝜑) ∈ 3SAT. We can do so as follows:

1. First, convert 𝜑 into a formula 𝜑′ in conjunctive normal form. We can do so efficiently, as described in detail below (page 159).

2. Second, convert the CNF formula 𝜑′ into the 3CNF formula 𝑓(𝜑). We can do so as follows:

a) For each clause consisting of fewer than three literals, repeat the first literal to produce a clause with three literals. For example, (𝑎) becomes (𝑎 ∨ 𝑎 ∨ 𝑎), and (¬𝑏 ∨ 𝑐) becomes (¬𝑏 ∨ ¬𝑏 ∨ 𝑐).

b) For each clause with more than three literals, subdivide it into two clauses by introducing a new dummy variable. In particular, convert the clause

(𝑙1 ∨ 𝑙2 ∨ ⋅ ⋅ ⋅ ∨ 𝑙𝑚)

into

(𝑧 ∨ 𝑙1 ∨ ⋅ ⋅ ⋅ ∨ 𝑙⌊𝑚/2⌋) ∧ (¬𝑧 ∨ 𝑙⌊𝑚/2⌋+1 ∨ ⋅ ⋅ ⋅ ∨ 𝑙𝑚)

Observe that if all 𝑙𝑖 are false, then the original clause is not satisfied. The transformed subformula is not satisfied for any value of 𝑧, since it would require 𝑧 ∧ ¬𝑧, which is impossible. On the other hand, if at least one 𝑙𝑖 is true, then the original clause is satisfied. The transformed subformula can be satisfied by choosing 𝑧 to be true if 𝑖 > ⌊𝑚/2⌋ or 𝑧 to be false if 𝑖 ≤ ⌊𝑚/2⌋. Thus, this transformation preserves the satisfiability of the original clause.

We repeat the process on the clauses of the transformed subformula, until all clauses have size exactly three. Each transformation introduces O(1) literals, giving us the (approximate) recurrence

𝑇(𝑚) = 2𝑇(𝑚/2) + O(1)

By the Master Theorem, we conclude that 𝑇(𝑚) = O(𝑚), so the recursive transformation increases the size of the formula by only a linear amount.

Applying both transformations gives us a new formula 𝑓(𝜑) that is polynomial in size with respect to ∣𝜑∣, and the transformations can be done in polynomial time. The result 𝑓(𝜑) is in 3CNF, and it is satisfiable exactly when 𝜑 is. Thus, we have 𝜑 ∈ SAT ⟺ 𝑓(𝜑) ∈ 3SAT, and 𝑓 is computable in polynomial time, so we have SAT ≤p 3SAT. Since SAT is NP-Hard, so is 3SAT.
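The two clause adjustments of step 2 can be sketched directly in Python, with a clause represented as a list of (variable, is_positive) literals; the representation and the fresh-variable naming are assumptions made for illustration:

import itertools

fresh = itertools.count()   # source of fresh dummy-variable names

def to_three_literals(clause):
    # Turn one CNF clause into a satisfiability-equivalent list of 3-literal clauses.
    if len(clause) < 3:
        # Pad by repeating the first literal, e.g. (a) becomes (a or a or a).
        return [clause + [clause[0]] * (3 - len(clause))]
    if len(clause) == 3:
        return [clause]
    # Split around a fresh variable z:
    # (l1 ... lm) becomes (z or l1 ... l_{m//2}) and (not z or l_{m//2+1} ... lm).
    z = 'z' + str(next(fresh))
    half = len(clause) // 2
    left = [(z, True)] + clause[:half]
    right = [(z, False)] + clause[half:]
    return to_three_literals(left) + to_three_literals(right)

clause = [('l1', True), ('l2', True), ('l3', True), ('l4', True), ('l5', True)]
print(to_three_literals(clause))   # three clauses of exactly three literals

Each split introduces a constant number of literals, so the output is linear in the size of the input clause, as the recurrence above indicates.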

Having shown that 3SAT ∈ NP and 3SAT is NP-Hard, we conclude that 3SAT is NP-Complete.


Converting a formula to CNF

If a formula 𝜑 is not in conjunctive normal form, then its structure is as follows:

𝜑 = (𝑋1,1 ∧ ⋅ ⋅ ⋅ ∧𝑋1,𝑚1) ∨

(𝑋2,1 ∧ ⋅ ⋅ ⋅ ∧𝑋2,𝑚2) ∨

⋅ ⋅ ⋅ ∨

(𝑋𝑛,1 ∧ ⋅ ⋅ ⋅ ∧𝑋𝑛,𝑚𝑛)

Here, 𝑋𝑖,𝑘 is an arbitrary subformula. We can transform 𝜑 to the following:

𝜑′= (𝑧1 ∨ 𝑧2 ∨ ⋅ ⋅ ⋅ ∨ 𝑧𝑛) ∧

(¬𝑧1 ∨𝑋1,1) ∧ ⋅ ⋅ ⋅ ∧ (¬𝑧1 ∨𝑋1,𝑚1) ∧

(¬𝑧2 ∨𝑋2,1) ∧ ⋅ ⋅ ⋅ ∧ (¬𝑧2 ∨𝑋2,𝑚2) ∧

⋅ ⋅ ⋅ ∧

(¬𝑧𝑛 ∨𝑋𝑛,1) ∧ ⋅ ⋅ ⋅ ∧ (¬𝑧𝑛 ∨𝑋𝑛,𝑚𝑛)

This transformation preserves satisfiability. To see why this is so, we consider two cases for the original formula 𝜑:

• Case 1: There is an assignment 𝛼 such that all of 𝑋𝑖,1, . . . , 𝑋𝑖,𝑚𝑖 are true in 𝜑(𝛼) for some 𝑖. Then the original formula 𝜑(𝛼) is satisfied by the 𝑖th term in its disjunction. The new formula 𝜑′ is satisfied by the same assignment for the original variables, but with the addition of 𝑧𝑖 = 1 and 𝑧𝑗≠𝑖 = 0. Specifically, 𝑧𝑖 = 1 causes the first clause to be satisfied, 𝑧𝑗≠𝑖 = 0 results in any clause containing an 𝑋𝑗≠𝑖,𝑘 being satisfied, and 𝑋𝑖,𝑘 = 1 results in all clauses containing 𝑋𝑖,𝑘 being satisfied. Thus, all clauses in 𝜑′ are satisfied.

• Case 2: For all assignments 𝛼, at least one of 𝑋𝑖,1, . . . , 𝑋𝑖,𝑚𝑖 is false in 𝜑(𝛼) for all 𝑖. Then all terms of the disjunction in the original formula are false, so 𝜑(𝛼) as a whole is unsatisfied. Consider an assignment 𝛼′ to the new formula 𝜑′ such that 𝛼 and 𝛼′ assign the same values to the original variables in 𝜑. Let 𝑋𝑖,𝑘𝑖 be a false term in 𝜑(𝛼) for each 𝑖. Then the clause (¬𝑧𝑖 ∨ 𝑋𝑖,𝑘𝑖) in 𝜑′(𝛼′) can only be satisfied by setting 𝑧𝑖 = 0. However, if all 𝑧𝑖 = 0, then the first clause in 𝜑′(𝛼′) is unsatisfied, so the entire formula is unsatisfied.

We see that if 𝜑 is satisfiable by an assignment 𝛼, then the new formula 𝜑′ is satisfiable by a modified assignment 𝛼′ as described above. However, if 𝜑 is unsatisfiable, then so is 𝜑′. Thus, this transformation preserves the satisfiability of the original formula.

The transformation only increases the size of the formula by a linear amount – the new formula 𝜑′ contains an additional literal ¬𝑧𝑖 for each subformula 𝑋𝑖,𝑘, plus another 𝑛 literals for its first clause, where 𝑛 is the number of terms in the outer disjunction of 𝜑. The number of literals in 𝜑′ is at most triple the number in 𝜑.

We can repeat this transformation on the subformulas 𝑋𝑖,𝑘 to convert them into CNF formulas 𝑋′𝑖,𝑘, followed by an application of the distributive law

𝑎 ∨ (𝑏1 ∧ ⋅ ⋅ ⋅ ∧ 𝑏𝑛) = (𝑎 ∨ 𝑏1) ∧ ⋅ ⋅ ⋅ ∧ (𝑎 ∨ 𝑏𝑛)

to turn (¬𝑧𝑖 ∨ 𝑋′𝑖,𝑘) into CNF. The latter doubles the size of the formula. Overall, we do at most a linear number of transformations, each of which increases the size of the formula by a linear amount, so the total increase is at most quadratic.

Modifying the proof of Cook-Levin to show that 3SAT is NP-Hard

We can modify our proof of the Cook-Levin theorem to directly show that 3SAT is NP-Hard. In particular, we demonstrate how to construct 𝜑𝑉,𝑥 so that it is in conjunctive normal form. We can then apply the transformation discussed above to turn a CNF formula into a 3CNF one.

We first observe that a CNF formula can be constructed recursively from smaller subformulas:

• A formula 𝜑 = 𝑥 with a single variable is in CNF – it has the single clause (𝑥).

• A formula 𝜑 = 𝜑1 ∧ 𝜑2 is in CNF if 𝜑1 and 𝜑2 are in CNF – it consists of the combination of the clauses in 𝜑1 and 𝜑2.

• A formula 𝜑 = 𝜑1 ∨ 𝜑2 is only in CNF if 𝜑1 and 𝜑2 each have a single clause – 𝜑 is then just a single clause that combines the literals in 𝜑1 and 𝜑2.

Examining each piece of 𝜑𝑉,𝑥, we see:

• The starting configuration

𝜑𝑉,𝑥,𝑠𝑡𝑎𝑟𝑡 = 𝑡1,1,# ∧ 𝑡1,2,𝑞𝑠𝑡𝑎𝑟𝑡 ∧ 𝜑𝑖𝑛𝑝𝑢𝑡 ∧ 𝑡1,𝑛+3,$ ∧ 𝜑𝑐𝑒𝑟𝑡𝑖𝑓𝑖𝑐𝑎𝑡𝑒 ∧ 𝑡1,𝑛ᵏ,#

is in CNF – it consists of 𝑛ᵏ clauses, each with either a single literal (e.g. (𝑡1,1,#)) or ∣Γ∣ literals (for each clause in 𝜑𝑐𝑒𝑟𝑡𝑖𝑓𝑖𝑐𝑎𝑡𝑒).

• The cell-consistency subformula

𝜑𝑉,𝑥,𝑐𝑒𝑙𝑙 = ⋀1≤𝑖,𝑗≤𝑛ᵏ [⋁𝑠∈𝑆 𝑡𝑖,𝑗,𝑠 ∧ ⋀𝑠≠𝑢∈𝑆 (¬𝑡𝑖,𝑗,𝑠 ∨ ¬𝑡𝑖,𝑗,𝑢)]

is in CNF. The inner conjunction is a conjunction of disjunctions, so it is clearly in CNF. The inner disjunction is a single clause, so it is in CNF. We combine the two with a conjunction, and the result is also in CNF since the two subformulas are in CNF. Finally, the outer conjunction also produces a CNF formula since the individual pieces are in CNF.

• The acceptance subformula

𝜑𝑉,𝑥,𝑎𝑐𝑐𝑒𝑝𝑡 = ⋁1≤𝑖,𝑗≤𝑛ᵏ 𝑡𝑖,𝑗,𝑞𝑎𝑐𝑐𝑒𝑝𝑡

is a single clause in CNF.

• The transition formula for a window that starts at 𝑖, 𝑗

𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒 = ⋁𝑤∈𝑊 𝜑𝑖,𝑗,𝑤

is not in CNF. The individual 𝜑𝑖,𝑗,𝑤 consist of a sequence of conjunctions, so 𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒 is a disjunction of conjunctions, rather than a conjunction of disjunctions required for CNF. However, we can convert 𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒 to CNF by applying the distributive law for ∨ and ∧. In particular, we have

(⋀𝑖 𝑎𝑖) ∨ (⋀𝑗 𝑏𝑗) = ⋀𝑖,𝑗 (𝑎𝑖 ∨ 𝑏𝑗)

We can see that this law holds if we consider two cases:

– If all 𝑎𝑖 are true, then the left-hand side is true. The right-hand side is also true – each clause (𝑎𝑖 ∨ 𝑏𝑗) contains an 𝑎𝑖, so each clause is satisfied, which makes the formula as a whole true.

This same reasoning applies for the case when all 𝑏𝑗 are true.

– If at least one 𝑎𝑖 is false and at least one 𝑏𝑗 is false, then the left-hand side is false. The right-hand side is also false, since there is a clause (𝑎𝑖 ∨ 𝑏𝑗) where both 𝑎𝑖 and 𝑏𝑗 are false.


How does the size of the right-hand formula compare to the left-hand one? If the two conjunctions on the left contain 𝑚 and 𝑛 literals, respectively, then the left-hand side has 𝑚 + 𝑛 total literals. The right-hand side has 𝑚 ⋅ 𝑛 clauses, each with two literals, for a total size of 2𝑚𝑛. When 𝑚 = 𝑛, we go from 2𝑚 literals to a size of 2𝑚².

Suppose we have 𝑐 conjunctions on the left-hand side, each with 𝑚 literals. Then repeated application of the distributive law takes us from a formula with 𝑐𝑚 literals to one with 𝑐𝑚ᶜ. For 𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒, we go from a size of 6∣𝑊∣ in its original form to a size of ∣𝑊∣ ⋅ 6^∣𝑊∣ in CNF. This is exponential in the number of distinct valid windows ∣𝑊∣, but this number is a characteristic of the verifier 𝑉, not the input 𝑥. So even though it is a larger constant than before, it is still a constant.

Once we have 𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒 in CNF, then

𝜑𝑉,𝑥,𝑚𝑜𝑣𝑒 = ⋀1≤𝑖≤𝑛ᵏ−1, 1≤𝑗≤𝑛ᵏ−2 𝜑𝑖,𝑗,𝑚𝑜𝑣𝑒

is also in CNF. And while its size is larger than previously, it is only by a constant factor, and its total size is still O(𝑛²ᵏ) literals.

Since 𝜑𝑉,𝑥 is just a conjunction of the individual pieces above, it is in CNF if those pieces are each in CNF. We can then apply the CNF-to-3CNF transformation to obtain a formula in 3CNF. Thus, if we can decide 3SAT, we can decide an arbitrary language in NP, and 3SAT is NP-Hard.

17.2 Vertex Cover

NP-Completeness is not exclusive to Boolean satisfiability problems. Rather, there is a wide range of problems across all fields that are NP-Complete. As an example, we take a look at the vertex-cover problem.

Given a graph 𝐺 = (𝑉,𝐸), a vertex cover is a subset of the vertices 𝐶 ⊆ 𝑉 such that every edge in 𝐸 is covered by a vertex in 𝐶. An edge 𝑒 = (𝑢, 𝑣) is covered by 𝐶 if at least one of the endpoints 𝑢, 𝑣 is in 𝐶. In other words, 𝐶 is a vertex cover for 𝐺 = (𝑉,𝐸) if 𝐶 ⊆ 𝑉 and

∀𝑒 = (𝑢, 𝑣) ∈ 𝐸. 𝑢 ∈ 𝐶 ∨ 𝑣 ∈ 𝐶

The following are two vertex covers for the same graph:

The cover on the left has a size of five vertices, while the cover on the right has a size of four.

As a motivating example of the vertex-cover problem, consider a museum structured as a set of interconnected hallways, with exhibits on the walls. The museum needs to hire guards to ensure the security of the exhibits, but guards are expensive, so it seeks to minimize the number of guards hired while still ensuring that each hallway is covered by a guard. The museum layout can be represented as a graph with the hallways as edges and the vertices as hallway endpoints. Then the museum just needs to figure out what the minimal vertex cover is to determine how many guards to hire (or security cameras/motion detectors to install) and where to place them.

As we did for the traveling salesperson problem (page 142), we define a limited-budget, decision version of vertex cover:

VERTEX-COVER = {(𝐺, 𝑘) ∶ 𝐺 is an undirected graph that has a vertex cover of size 𝑘}

Here, we have defined VERTEX-COVER with an exact budget 𝑘 rather than an upper bound – observe that if a graph 𝐺 = (𝑉,𝐸) has a vertex cover 𝐶 of size 𝑚, then it has one for any size 𝑚 ≤ 𝑘 ≤ ∣𝑉∣ by adding arbitrary vertices from 𝑉 \ 𝐶 into 𝐶. Put in museum terms, if the museum's budget allows, it can hire more guards than are strictly necessary to cover all hallways.

We now demonstrate that VERTEX-COVER is NP-Hard, leaving the proof that VERTEX-COVER ∈ NP as an exercise. Combining the two, we will be able to conclude that VERTEX-COVER is NP-Complete.

Exercise 114 Show that VERTEX-COVER ∈ NP.

We show that VERTEX-COVER is NP-Hard with a polytime reduction from the known NP-Hard 3SAT, i.e. 3SAT ≤p VERTEX-COVER. To do so, we must define a polynomial-time computable function 𝑓 that converts a 3CNF formula 𝜑 into an instance (𝐺, 𝑘) of VERTEX-COVER. More specifically, we require:

• 𝜑 ∈ 3SAT ⟹ 𝑓(𝜑) ∈ VERTEX-COVER

• 𝜑 ∉ 3SAT ⟹ 𝑓(𝜑) ∉ VERTEX-COVER

Given a 3CNF formula 𝜑 with 𝑛 variables and 𝑚 clauses, we construct a graph 𝐺 corresponding to that formula. The graph is comprised of gadgets that are subgraphs, representing individual variables and clauses. We will then connect the gadgets together such that the resulting graph 𝐺 has a vertex cover of size 𝑛 + 2𝑚 exactly when 𝜑 is satisfiable.

The following are our gadgets for variables and clauses:

[Figure: the variable gadget for 𝑥 – two vertices labeled 𝑥 and ¬𝑥 joined by an edge – and the clause gadget for (𝑥 ∨ 𝑦 ∨ 𝑧) – a triangle on three vertices labeled 𝑥, 𝑦, and 𝑧.]

A gadget for variable 𝑥 is a "barbell," consisting of two vertices labeled by 𝑥 and ¬𝑥 connected by an edge. A gadget for a clause 𝑥 ∨ 𝑦 ∨ 𝑧 is a triangle, consisting of a vertex for each literal, with the three vertices all connected together.

As a concrete example, consider the formula

𝜑 = (𝑥 ∨ 𝑦 ∨ 𝑧) ∧ (¬𝑥 ∨ ¬𝑦 ∨ ¬𝑧) ∧ (𝑥 ∨ ¬𝑦 ∨ 𝑧) ∧ (¬𝑥 ∨ 𝑦 ∨ ¬𝑧)

We start building the corresponding graph 𝐺 by incorporating gadgets for each clause and variable:


[Figure: the 𝑛 = 3 variable gadgets (𝑥/¬𝑥, 𝑦/¬𝑦, 𝑧/¬𝑧) drawn in a row at the top, and the 𝑚 = 4 clause-gadget triangles drawn in a row at the bottom, not yet connected to each other.]

Here, we have placed the variable gadgets at the top of the figure and the clause gadgets at the bottom.

The next step in the construction is to connect each vertex in a variable gadget to the vertices in the clause gadgets that have the same literal as the label. For instance, we connect 𝑥 in the respective variable gadget to each vertex labeled by 𝑥 in the clause gadgets, ¬𝑥 in the variable gadget to each vertex labeled by ¬𝑥 in the clause gadgets, and so on. The result is as follows:

[Figure: the same variable and clause gadgets, now with crossing edges joining each variable-gadget vertex to every clause-gadget vertex that carries the same literal.]

This completes the construction of 𝐺, and we have 𝑓(𝜑) = (𝐺, 𝑛 + 2𝑚). There are 3𝑚 + 2𝑛 vertices in 𝐺, 𝑛 edges within the variable gadgets, 3𝑚 edges within the clause gadgets, and 3𝑚 edges that cross between the clause and variable gadgets, for a total of O(𝑚 + 𝑛) edges. Thus, 𝐺 has size O(𝑛 + 𝑚), and it can be constructed efficiently.
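The gadget construction can be written out mechanically. The following Python sketch builds (𝐺, 𝑘) for a 3CNF formula given as a list of clauses of (variable, is_positive) literals; the vertex naming scheme is an assumption made for illustration:

def three_sat_to_vertex_cover(clauses):
    # Build (G, k) where G = (vertices, edges) and k = n + 2m.
    variables = sorted({var for clause in clauses for var, _ in clause})
    vertices, edges = set(), set()
    # Variable gadgets: an edge between the x vertex and the not-x vertex.
    for var in variables:
        pos, neg = (var, True), (var, False)
        vertices |= {pos, neg}
        edges.add((pos, neg))
    # Clause gadgets: a triangle per clause, plus crossing edges to the matching literal vertices.
    for j, clause in enumerate(clauses):
        gadget = [('clause', j, idx, lit) for idx, lit in enumerate(clause)]
        vertices |= set(gadget)
        edges |= {(gadget[0], gadget[1]), (gadget[1], gadget[2]), (gadget[0], gadget[2])}
        for node in gadget:
            edges.add((node, node[3]))   # crossing edge to the variable-gadget vertex
    k = len(variables) + 2 * len(clauses)
    return (vertices, edges), k

For the four-clause formula above, this produces a graph with 2 ⋅ 3 + 3 ⋅ 4 = 18 vertices and budget 𝑘 = 3 + 8 = 11.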

We proceed to show that if 𝜑 ∈ 3SAT, then 𝑓(𝜑) = (𝐺, 𝑛 + 2𝑚) ∈ VERTEX-COVER. Since 𝜑 ∈ 3SAT, there is some satisfying assignment 𝛼 such that 𝜑(𝛼) is true. Then 𝐶 is a vertex cover for 𝐺, where:

• For the variable gadget corresponding to each variable 𝑥, 𝑥 is in 𝐶 if 𝛼(𝑥) = 1, otherwise ¬𝑥 is in 𝐶 (i.e. if 𝛼(𝑥) = 0).


[Figure: the constructed graph with one vertex per variable gadget highlighted, namely the literal vertices made true by 𝛼.]

Observe that every edge internal to a variable gadget is covered. Furthermore, since 𝛼 is a satisfying assignment, each clause has at least one literal that is true. Then the edge connecting the vertex in the corresponding clause gadget to the same literal in a variable gadget is already covered – the vertex corresponding to that literal in the variable gadget is in 𝐶.

Thus, every clause gadget has at least one out of its three "crossing" edges covered by a vertex in a variable gadget, and so far, we have 𝑛 vertices in 𝐶.

• For each uncovered crossing edge (𝑢, 𝑣), where 𝑢 is in a variable gadget and 𝑣 is in a clause gadget, we have 𝑣 ∈ 𝐶. There are at most two such crossing edges per clause. If a clause has only one uncovered crossing edge, then pick another arbitrary vertex in the clause to be in 𝐶. Similarly, if a clause has no uncovered crossing edges, then pick two of its vertices arbitrarily to be in 𝐶.

[Figure: the constructed graph with the full cover highlighted – one vertex per variable gadget and two vertices per clause gadget.]

Now all crossing edges are covered by a vertex in a variable gadget, or one in a clause gadget, or both. Furthermore, since we pick exactly two vertices out of each clause gadget, the edges internal to each clause gadget are also covered. Thus, all edges are covered, and 𝐶 is a vertex cover. This second step adds two vertices per clause, or 2𝑚 vertices. Combined with the previous vertices, 𝐶 has 𝑛 + 2𝑚 vertices, and we have shown that 𝐺 has a vertex cover of size 𝑛 + 2𝑚.

We now show that 𝜑 ∉ 3SAT ⟹ (𝐺, 𝑛 + 2𝑚) ∉ VERTEX-COVER. More precisely, we demonstrate the contrapositive (𝐺, 𝑛 + 2𝑚) ∈ VERTEX-COVER ⟹ 𝜑 ∈ 3SAT. In other words, if 𝐺 has a vertex cover of size 𝑛 + 2𝑚, then 𝜑 has a satisfying assignment.

A vertex cover of 𝐺 of size 𝑛 + 2𝑚 must include the following:

• exactly one vertex from each variable gadget to cover the edge internal to the variable gadget

17.2. Vertex Cover 165

Page 170: Foundations of Computer Science - GitHub Pages

Foundations of Computer Science, Release 0.3

• exactly two vertices from each clause gadget to cover the edges internal to the clause gadget

This brings us to our total of 𝑛 + 2𝑚 vertices in the cover. While all internal edges are covered, we must also consider the edges that cross between clause and variable gadgets. With the two vertices from each clause gadget, only two of the three crossing edges are covered by those vertices. The remaining crossing edge must be covered by a vertex from a variable gadget. Thus, if 𝐺 has a vertex cover 𝐶 of size 𝑛 + 2𝑚, each clause 𝑘 has a crossing edge (𝑢𝑙, 𝑣𝑘) where:

• 𝑢𝑙 is a variable vertex with label 𝑙 for some literal 𝑙

• 𝑣𝑘 is a vertex in the clause gadget with the same label 𝑙

• 𝑢𝑙 ∈ 𝐶, so that the crossing edge is covered

Then the assignment 𝛼 such that 𝛼(𝑙) = 1 if 𝑢𝑙 ∈ 𝐶 is a satisfying assignment – for each clause 𝑘, we have the true literal 𝑙 corresponding to the vertex 𝑣𝑘. Since all clauses are satisfied, 𝜑 as a whole is satisfied.

This completes our proof that 3SAT ≤p VERTEX-COVER. We conclude from the facts that 3SAT is NP-Hard and VERTEX-COVER ∈ NP that VERTEX-COVER is NP-Complete.

Exercise 115 a) Define the language

INDEPENDENT-SET = {(𝐺, 𝑘) ∶ 𝐺 is an undirected graph with an independent set of size 𝑘}

An independent set of a graph 𝐺 = (𝑉,𝐸) is a subset of the vertices 𝐼 ⊆ 𝑉 such that no two vertices in 𝐼 share an edge. Show that INDEPENDENT-SET is NP-Complete.

Hint: If 𝐺 has a vertex cover of size 𝑚, what can be said about the vertices that are not in the cover?

b) Define the language

CLIQUE = {(𝐺, 𝑘) ∶ 𝐺 is an undirected graph that has a clique of size 𝑘}

A clique of a graph 𝐺 = (𝑉,𝐸) is a subset of the vertices 𝐾 ⊆ 𝑉 such that every pair of vertices in 𝐾 has an edge between the two vertices. In other words, 𝐾 and its associated edges comprise a complete subgraph of 𝐺. Show that CLIQUE is NP-Complete.

17.3 Set Cover

Consider another problem, that of hiring workers for a project. To successfully complete the project, we need workers with some combined set of skills 𝑆. For instance, we might need an architect, a general contractor, an engineer, an electrician, a plumber, and so on for a construction project. We have a candidate pool of 𝑛 workers, where worker 𝑖 has a set of skills 𝑆𝑖. As an example, there may be a single worker who is proficient in plumbing and electrical work. Our goal is to hire the minimal team of workers to cover all the required skills.

We formalize this problem as follows. We are given a set of elements 𝑆, representing the required skills. We are also given 𝑛 subsets 𝑆𝑖 ⊆ 𝑆, representing the set of skills that each worker 𝑖 has. We want to find the smallest collection of 𝑆𝑖's that cover the set 𝑆. In other words, we wish to find the smallest set of indices 𝐶 such that

𝑆 = ⋃𝑖∈𝐶 𝑆𝑖


As a concrete example, let 𝑆 and 𝑆1, . . . , 𝑆6 be as follows:

𝑆 = {1, 2, 3, 4, 5, 6, 7}
𝑆1 = {1, 2, 3}
𝑆2 = {3, 4, 6, 7}
𝑆3 = {1, 4, 7}
𝑆4 = {1, 2, 6}
𝑆5 = {3, 5, 7}
𝑆6 = {4, 5}

There are no set covers of size two, but there are several with size three. One example is 𝐶 = {1, 2, 6}, which gives us

⋃𝑖∈𝐶 𝑆𝑖 = 𝑆1 ∪ 𝑆2 ∪ 𝑆6
        = {1, 2, 3} ∪ {3, 4, 6, 7} ∪ {4, 5}
        = {1, 2, 3, 4, 5, 6, 7}
        = 𝑆

We proceed to define a language corresponding to the set-cover problem. As with vertex cover, we include a budget 𝑘 for the size of the cover 𝐶.

SET-COVER = {(𝑆, 𝑆1, 𝑆2, . . . , 𝑆𝑛, 𝑘) ∶ 𝑆𝑖 ⊆ 𝑆 for each 𝑖, and there is a collection of 𝑘 sets 𝑆𝑖 that cover 𝑆}

It is clear that SET-COVER ∈ NP – we can use as a certificate a set of indices 𝐶, and the verifier need only check that 𝐶 is a size-𝑘 subset of [1, 𝑛], and that 𝑆 = ∪𝑖∈𝐶𝑆𝑖. This can be done efficiently.

We now perform the reduction VERTEX-COVER ≤p SET-COVER to demonstrate that SET-COVER is NP-Hard. We need to convert an instance (𝐺 = (𝑉,𝐸), 𝑘) to an instance (𝑆, 𝑆1, . . . , 𝑆𝑛, 𝑘′) such that 𝐺 has a vertex cover of size 𝑘 exactly when 𝑆 has a set cover of size 𝑘′. Observe that an element of a vertex cover is a vertex 𝑣, and it covers some subset of the edges in 𝐸. Similarly, a set cover consists of sets 𝑆𝑖, each of which cover some subset of the items in 𝑆. Thus, an edge corresponds to a skill, while a vertex corresponds to a worker.

Given a graph 𝐺 = (𝑉,𝐸) and a budget 𝑘, we define the function 𝑓 such that it translates the set of edges 𝐸 to the set of skills 𝑆, and each vertex 𝑣𝑖 ∈ 𝑉 to a set 𝑆𝑖 consisting of the edges incident to 𝑣𝑖. Formally, we have 𝑓(𝐺, 𝑘) = (𝑆, 𝑆1, . . . , 𝑆∣𝑉∣, 𝑘), where 𝑆 = 𝐸 and:

𝑆𝑖 = {𝑒 ∈ 𝐸 ∶ 𝑒 is incident to 𝑣𝑖, i.e. 𝑒 = (𝑢, 𝑣𝑖) for some 𝑢 ∈ 𝑉}

This transformation can be done efficiently, as it is a direct translation of the graph 𝐺.
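A sketch of this translation in Python, with the graph given as a list of vertices and a set of edge tuples; the representation is assumed for illustration:

def vertex_cover_to_set_cover(vertices, edges, k):
    # f(G, k): the skills are the edges, and worker i's skill set S_i is the
    # set of edges incident to vertex v_i; the budget k is unchanged.
    S = set(edges)
    subsets = [{e for e in edges if v in e} for v in vertices]
    return S, subsets, k

The translation touches each vertex-edge incidence once, so it runs in time linear in the size of the graph.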

As an example, suppose that the graph 𝐺 is as follows:


[Figure: an example graph with seven vertices, labeled 𝑆1 through 𝑆7, and eleven edges, numbered 1 through 11.]

This graph has a cover of size four consisting of the vertices labeled 𝑆2, 𝑆4, 𝑆5, 𝑆6. The transformation 𝑓(𝐺, 𝑘) produces the following sets:

𝑆 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
𝑆1 = {2, 5, 6}
𝑆2 = {3, 7, 8, 9}
𝑆3 = {9, 10}
𝑆4 = {4, 6, 8, 11}
𝑆5 = {5, 7, 10, 11}
𝑆6 = {1, 2, 3, 4}
𝑆7 = {1}

Then 𝐶 = {2, 4, 5, 6} is a set cover of size four.

We now demonstrate that in the general case, (𝐺 = (𝑉,𝐸), 𝑘) ∈ VERTEX-COVER exactly when 𝑓(𝐺, 𝑘) = (𝑆, 𝑆1, . . . , 𝑆∣𝑉∣, 𝑘) ∈ SET-COVER.

• If (𝐺, 𝑘) ∈ VERTEX-COVER, then 𝐺 has a vertex cover 𝐶 ⊆ 𝑉 of size 𝑘 that covers all the edges in 𝐸. This means that

𝐸 = ⋃𝑣𝑖∈𝐶 {𝑒 ∈ 𝐸 ∶ 𝑒 is incident to 𝑣𝑖}

Let 𝐶′ = {𝑖 ∶ 𝑣𝑖 ∈ 𝐶}. Then we have

⋃𝑖∈𝐶′ 𝑆𝑖 = ⋃𝑖∈𝐶′ {𝑒 ∈ 𝐸 ∶ 𝑒 is incident to 𝑣𝑖} = 𝐸 = 𝑆

Thus, 𝐶′ covers 𝑆, and 𝑆 has a set cover of size 𝑘.

• If (𝐺, 𝑘) ∉ VERTEX-COVER, then 𝐺 does not have a vertex cover of size 𝑘. Either 𝑘 > ∣𝑉∣, or for any subset of vertices 𝐶 ⊆ 𝑉 of size 𝑘, we have

𝐸 ≠ ⋃𝑣𝑖∈𝐶 {𝑒 ∈ 𝐸 ∶ 𝑒 is incident to 𝑣𝑖}


Then for any subset 𝐶′ ⊆ {1, . . . , ∣𝑉∣} of size 𝑘, we also have

⋃𝑖∈𝐶′ 𝑆𝑖 = ⋃𝑖∈𝐶′ {𝑒 ∈ 𝐸 ∶ 𝑒 is incident to 𝑣𝑖} ≠ 𝐸 = 𝑆

Thus 𝑆 does not have a set cover of size 𝑘.

This completes our reduction VERTEX-COVER ≤p SET-COVER.

We conclude our discussion of set cover by observing that a vertex cover is actually a special case of a set cover. When we transform an instance of the vertex-cover problem to one of set cover, each edge in a graph is interpreted as a skill and each vertex as a worker, which means that each skill is shared by exactly two workers. This is more restrictive than the general case for set cover, where any number of workers (including none) may have a particular skill. The reduction VERTEX-COVER ≤p SET-COVER just exemplifies that if we can solve the general case of a problem efficiently, we can also solve special cases efficiently. Conversely, if a special case is "hard," then so is the general case⁴⁴.

17.4 Hamiltonian Cycle

Given a graph 𝐺, a Hamiltonian path from 𝑠 to 𝑡 is a path that starts at 𝑠, ends at 𝑡, and visits each vertex exactly once. A Hamiltonian cycle is a path that starts and ends at the same vertex and visits all other vertices exactly once. For example, 𝐺1 below has a Hamiltonian cycle as well as a Hamiltonian path between 𝑠1 and 𝑡, but it does not have a Hamiltonian path between 𝑠2 and 𝑡. On the right, 𝐺2 has a Hamiltonian path between 𝑠 and 𝑡, but it does not have a Hamiltonian cycle.

[Figure: two example graphs – 𝐺1 on the left, with vertices including 𝑠1, 𝑠2, and 𝑡, and 𝐺2 on the right, with vertices including 𝑠 and 𝑡 – illustrating the cases described above.]

We define the languages corresponding to the existence of a Hamiltonian path or cycle as:

HAMPATH = {(𝐺, 𝑠, 𝑡) ∶ 𝐺 is an undirected graph with a Hamiltonian path between 𝑠 and 𝑡}
HAMCYCLE = {𝐺 ∶ 𝐺 is an undirected graph with a Hamiltonian cycle}

Both languages are in NP – we take the actual Hamiltonian path or cycle as a certificate, and a verifier need only follow the path to check that it is valid as in the verifier for TSP (page 142). We claim without proof that HAMPATH is NP-Hard – it is possible to reduce 3SAT to HAMPATH, but the reduction is quite complicated, so we will not consider it here. Instead, we perform the reduction HAMPATH ≤p HAMCYCLE to show that HAMCYCLE is also NP-Hard. We need to demonstrate an efficient function 𝑓 such that (𝐺, 𝑠, 𝑡) ∈ HAMPATH if and only if 𝑓(𝐺, 𝑠, 𝑡) has a Hamiltonian cycle.

⁴⁴ Since both VERTEX-COVER and SET-COVER are NP-Complete, they are actually equally hard, so an instance of either can be converted to an instance of the other. However, the conversion of an instance of set cover to vertex cover is not as straightforward as that of vertex cover to set cover.


Our first attempt is to produce a new graph 𝐺′ that is the same as 𝐺, except that it has an additional edge between 𝑠 and 𝑡. Then if 𝐺 has a Hamiltonian path from 𝑠 to 𝑡, 𝐺′ has a Hamiltonian cycle that follows that same path from 𝑠 to 𝑡 and returns to 𝑠 through the additional edge. While this construction works in mapping yes instances of HAMPATH to yes instances of HAMCYCLE, it does not properly map no instances of HAMPATH to no instances of HAMCYCLE. The following illustrates the construction on both a no and a yes instance of HAMPATH, with the dotted line representing the added edge:

[Figure: the two example graphs from above, each with a dotted edge added between 𝑠2 (respectively 𝑠) and 𝑡.]

In the original graph on the left, there is no Hamiltonian path from 𝑠2 to 𝑡, yet the new graph does have a Hamiltonian cycle. Since the function maps a no instance to a yes instance, it does not demonstrate a valid mapping reduction.

The key problem with the transformation above is that it does not force a Hamiltonian cycle to go through the additional edge. If the cycle did go through the additional edge, we could remove that edge from the cycle to obtain a Hamiltonian path in the original graph. To force the cycle through the additions to the graph, we need to add a new vertex, not just an edge. Thus, in place of the single edge, we add two edges with a vertex between them, as in the following:

[Figure: the same two graphs, now with a new vertex 𝑢 added and connected by edges to 𝑠2 (respectively 𝑠) and 𝑡.]

Now the generated graph on the left does not have a Hamiltonian cycle, but the new graph on the right still does. Thus, this procedure looks like it should work to map yes instances to yes instances and no instances to no instances.

Formally, given (𝐺 = (𝑉,𝐸), 𝑠, 𝑡), we have

𝑓(𝐺, 𝑠, 𝑡) = 𝐺′ = (𝑉 ∪ {𝑢}, 𝐸 ∪ {(𝑠, 𝑢), (𝑡, 𝑢)})

The function is clearly efficient, as it just adds one vertex and two edges to the input graph. We prove that it is also correct:

• Suppose 𝐺 has a Hamiltonian path from 𝑠 to 𝑡. This path visits all the vertices in 𝐺, and it has the structure (𝑠, 𝑣2, . . . , 𝑣𝑛−1, 𝑡). The same path exists in 𝐺′, and it visits all vertices except the added vertex 𝑢. The modified path (𝑠, 𝑣2, . . . , 𝑣𝑛−1, 𝑡, 𝑢, 𝑠) with the additions of the edges (𝑡, 𝑢) and (𝑢, 𝑠) visits all the vertices and returns to 𝑠, so it is a Hamiltonian cycle, and 𝐺′ ∈ HAMCYCLE.

• Suppose 𝐺′ has a Hamiltonian cycle. Since it is a cycle, we can start from any vertex in the cycle and return back to it. Consider the specific path that starts at and returns to 𝑠. It must also go through 𝑢 along the path, so it has the structure (𝑠, . . . , 𝑢, . . . , 𝑠). Since the only edges incident to 𝑢 are (𝑠, 𝑢) and (𝑡, 𝑢), the path must either enter 𝑢 from 𝑡 and exit to 𝑠, or it must enter from 𝑠 and exit to 𝑡. We consider the two cases separately.

– Case 1: 𝑡 precedes 𝑢. Then the path must have the form (𝑠, . . . , 𝑡, 𝑢, 𝑠). Since the subpath (𝑠, . . . , 𝑡) visits all vertices except 𝑢 and only includes edges from 𝐺, it constitutes a Hamiltonian path in 𝐺 from 𝑠 to 𝑡.


– Case 2: 𝑡 follows 𝑢. Then the path has the form (𝑠, 𝑢, 𝑡, . . . , 𝑠). Since the subpath (𝑡, . . . , 𝑠) visits all vertices except 𝑢 and only includes edges from 𝐺, it constitutes a Hamiltonian path in 𝐺 from 𝑡 to 𝑠. The graph is undirected, so following the edges in reverse order gives a Hamiltonian path from 𝑠 to 𝑡.

Thus, 𝐺 has a Hamiltonian path from 𝑠 to 𝑡.

We have shown that 𝐺 has a Hamiltonian path from 𝑠 to 𝑡 exactly when 𝐺′ = 𝑓(𝐺, 𝑠, 𝑡) has a Hamiltonian cycle. This demonstrates that 𝑓 is a valid mapping from instances of HAMPATH to instances of HAMCYCLE, so HAMPATH ≤p HAMCYCLE. We conclude that HAMCYCLE is NP-Hard.
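The reduction itself is a one-liner over any concrete graph representation. A minimal Python sketch, assuming vertices and edges are given as sets and that the chosen name for the new vertex does not already occur in 𝐺 (both assumptions are for illustration only):

def hampath_to_hamcycle(vertices, edges, s, t):
    # f(G, s, t): add a fresh vertex u adjacent only to s and t.
    u = ('fresh', len(vertices))      # any name not already used in V
    new_vertices = vertices | {u}
    new_edges = edges | {(s, u), (t, u)}
    return new_vertices, new_edges

Since only one vertex and two edges are added, the function runs in time linear in the size of the input graph.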

Exercise 116 Define the language

LONG-PATH = {(𝐺, 𝑘) ∶ 𝐺 is an undirected graph with a simple path of length 𝑘}

A simple path is a path without any cycles. Show that LONG-PATH is NP-Complete.

Exercise 117 Recall the TSP language:

TSP = {(𝐺, 𝑘) ∶ 𝐺 is a weighted, complete graph that has a tour that visits all vertices in 𝐺 and has a total weight at most 𝑘}

Show that TSP is NP-Hard with the reduction HAMCYCLE ≤p TSP.

17.5 Subset Sum

The subset-sum problem is the task of finding a subset of a set of integers 𝑆 that adds up to a target value 𝑘. The decision version of this problem is as follows:

SUBSET-SUM = {(𝑆, 𝑘) ∶ 𝑆 is a set of integers that has a subset 𝑆∗ ⊆ 𝑆 such that ∑𝑠∈𝑆∗ 𝑠 = 𝑘}

This language is in NP – we can take the actual subset 𝑆∗ ⊆ 𝑆 as a certificate, and a verifier can efficiently check that it is a valid subset of 𝑆 and that its elements sum to the value 𝑘. We show 3SAT ≤p SUBSET-SUM to demonstrate that SUBSET-SUM is also NP-Hard: we define a polytime computable function 𝑓 that takes a 3CNF formula 𝜑 with 𝑛 variables and 𝑚 clauses and produces a set of integers 𝑆 and target value 𝑘 such that

𝜑 ∈ 3SAT ⟺ 𝑓(𝜑) = (𝑆, 𝑘) ∈ SUBSET-SUM

A satisfying assignment 𝛼 of 𝜑 has two key properties:

• for each variable 𝑥𝑖 ∈ 𝜑, exactly one of 𝛼(𝑥𝑖) or 𝛼(¬𝑥𝑖) is 1

• for each clause 𝐶𝑗 = (𝑙𝑗,1 ∨ 𝑙𝑗,2 ∨ 𝑙𝑗,3) ∈ 𝜑, at least one of 𝛼(𝑙𝑗,1), 𝛼(𝑙𝑗,2), 𝛼(𝑙𝑗,3) is 1

The set of integers 𝑆 and value 𝑘 produced by 𝑓(𝜑) are carefully designed to correspond to these properties. Specifically, the set 𝑆 consists of decimal (base-10) integers that are 𝑛 + 𝑚 digits long – the first 𝑛 digits correspond to the 𝑛 variables, while the last 𝑚 digits correspond to the clauses. For each variable 𝑥𝑖, 𝑆 contains two numbers 𝑣𝑖 and 𝑤𝑖:

• 𝑣𝑖 is 1 in digit 𝑖 and in digit 𝑛 + 𝑗 for each clause 𝐶𝑗 that contains the literal 𝑥𝑖, 0 in all other digits

• 𝑤𝑖 is 1 in digit 𝑖 and in digit 𝑛 + 𝑗 for each clause 𝐶𝑗 that contains the literal ¬𝑥𝑖, 0 in all other digits

We also set the 𝑖th digit of the target value 𝑘 to be 1 for 1 ≤ 𝑖 ≤ 𝑛. For a subset 𝑆∗ to add up to 𝑘, it must contain exactly one of 𝑣𝑖 or 𝑤𝑖 for each 1 ≤ 𝑖 ≤ 𝑛, since only those two numbers contain a 1 in the 𝑖th digit. (Since we are using decimal numbers, there aren’t enough numbers in 𝑆 to get a 1 in the 𝑖th digit through a carry from less-significant digits.) Thus, a subset 𝑆∗ that adds up to 𝑘 corresponds to an actual assignment of values to the variables, enforcing the first property above.

For a clause digit 𝑛 + 𝑗, 1 ≤ 𝑗 ≤ 𝑚, the subset 𝑆∗ can produce one of the following values corresponding to the clause 𝐶𝑗 = (𝑙𝑗,1 ∨ 𝑙𝑗,2 ∨ 𝑙𝑗,3):

• 0 if 𝑆∗ contains none of the numbers corresponding to 𝑙𝑗,1, 𝑙𝑗,2, 𝑙𝑗,3 being true (e.g. if 𝑙𝑗,1 = 𝑥𝑖, then 𝑆∗ does not contain 𝑣𝑖, and if 𝑙𝑗,1 = ¬𝑥𝑖, then 𝑆∗ does not contain 𝑤𝑖)

• 1 if 𝑆∗ contains exactly one number corresponding to 𝑙𝑗,1, 𝑙𝑗,2, 𝑙𝑗,3 being true

• 2 if 𝑆∗ contains exactly two numbers corresponding to 𝑙𝑗,1, 𝑙𝑗,2, 𝑙𝑗,3 being true

• 3 if 𝑆∗ contains all three numbers corresponding to 𝑙𝑗,1, 𝑙𝑗,2, 𝑙𝑗,3 being true

For the clause 𝐶𝑗 to be satisfied by the assignment corresponding to 𝑆∗, all but the first case work. We need to add more numbers to 𝑆 and set digit 𝑛 + 𝑗 of 𝑘 such that the three latter cases are all viable. To do so, for each clause 𝐶𝑗 :

• set the value for digit 𝑛 + 𝑗 to be 3 in the target number 𝑘

• add two numbers 𝑦𝑗 and 𝑧𝑗 to 𝑆 that have a 1 in digit 𝑛 + 𝑗 and 0 in all other digits

This completes our construction of 𝑆 and 𝑘. This can be done efficiently – we generate 2𝑛 + 2𝑚 + 1 numbers, each of which is 𝑛 + 𝑚 digits long, and we can do so in quadratic time with respect to ∣𝜑∣ = O(𝑛 + 𝑚). For a subset 𝑆∗ ⊆ 𝑆 to add up to 𝑘, it must contain:

• exactly one of 𝑣𝑖 or 𝑤𝑖 for each variable 𝑥𝑖, 1 ≤ 𝑖 ≤ 𝑛, corresponding to an assignment to the variables

• none, one, or both of 𝑦𝑗 and 𝑧𝑗 for each clause 𝐶𝑗 , 1 ≤ 𝑗 ≤ 𝑚, depending on whether the clause 𝐶𝑗 has three, two, or one literal(s) satisfied by the corresponding assignment

As a concrete example, consider the formula

𝜑(𝑥1, 𝑥2, 𝑥3, 𝑥4) = (𝑥1 ∨ 𝑥2 ∨ 𝑥3) ∧ (¬𝑥2 ∨ ¬𝑥3 ∨ ¬𝑥4) ∧ (¬𝑥1 ∨ 𝑥2 ∨ 𝑥4)

We generate the following numbers, with columns corresponding to digits and unspecified digits implicitly 0:

       𝑥1  𝑥2  𝑥3  𝑥4  𝐶1  𝐶2  𝐶3
𝑣1      1               1
𝑣2          1           1       1
𝑣3              1       1
𝑣4                  1           1
𝑤1      1                       1
𝑤2          1               1
𝑤3              1           1
𝑤4                  1       1
𝑦1                      1
𝑦2                          1
𝑦3                              1
𝑧1                      1
𝑧2                          1
𝑧3                              1
𝑘       1   1   1   1   3   3   3

Consider the satisfying assignment 𝛼(𝑥1, 𝑥2, 𝑥3, 𝑥4) = (0, 1, 0, 1). The corresponding subset 𝑆∗ contains 𝑤1, 𝑣2, 𝑤3, 𝑣4, which add up to 1111113 – clauses 𝐶1 and 𝐶2 are satisfied by one literal each (𝑥2 and ¬𝑥3, respectively), while clause 𝐶3 is satisfied by all three literals. To add up to 𝑘 = 1111333, 𝑆∗ also contains 𝑦1, 𝑧1, 𝑦2, 𝑧2.


Now consider the assignment 𝛼′(𝑥1, 𝑥2, 𝑥3, 𝑥4) = (0, 0, 0, 1) that does not satisfy the formula. The corresponding subset 𝑆∗ contains 𝑤1, 𝑤2, 𝑤3, 𝑣4, which add up to 1111022. The highest value we can get in digit 5 (corresponding to clause 𝐶1) is 2, by adding both 𝑦1 and 𝑧1 to the subset. This does not get us to the target value of 3.
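The construction is mechanical enough to express in a few lines of code. The following Python sketch (an added illustration, not part of the original text) builds 𝑆 and 𝑘 from a formula given as a list of clauses, where each clause is a list of three nonzero integers, the literal 𝑥𝑖 encoded as i and ¬𝑥𝑖 as -i; the function name and input encoding are illustrative assumptions.

def subset_sum_instance(num_vars, clauses):
    # Build (S, k) for the 3SAT-to-SUBSET-SUM reduction sketched above.
    # clauses: list of 3-element lists of nonzero ints; i means x_i, -i means NOT x_i.
    n, m = num_vars, len(clauses)
    digits = n + m

    def number(i, clause_indices):
        # a 1 in variable digit i, plus a 1 in clause digit n + j for each listed j
        total = 10 ** (digits - i)
        for j in clause_indices:
            total += 10 ** (m - j)
        return total

    S = []
    for i in range(1, n + 1):
        S.append(number(i, [j + 1 for j, c in enumerate(clauses) if i in c]))    # v_i
        S.append(number(i, [j + 1 for j, c in enumerate(clauses) if -i in c]))   # w_i
    for j in range(1, m + 1):
        S.extend([10 ** (m - j)] * 2)    # y_j and z_j (equal values, so S is a multiset)

    k = sum(10 ** (digits - i) for i in range(1, n + 1))      # 1 in each variable digit
    k += sum(3 * 10 ** (m - j) for j in range(1, m + 1))      # 3 in each clause digit
    return S, k

On the example formula above, subset_sum_instance(4, [[1, 2, 3], [-2, -3, -4], [-1, 2, 4]]) produces 𝑘 = 1111333 and the fourteen numbers from the table.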

Formally, for an arbitrary formula 𝜑 with 𝑛 variables and 𝑚 clauses and resulting 𝑓(𝜑) = (𝑆, 𝑘), we have:

• If 𝜑 ∈ 3SAT, then it has an assignment 𝛼 that satisfies all clauses, meaning that at least one literal in each clause is true. The subset

𝑆∗𝑎 = {𝑣𝑖 ∶ 𝛼(𝑥𝑖) = 1} ∪ {𝑤𝑖 ∶ 𝛼(𝑥𝑖) = 0}

sums to 1 in each of the first 𝑛 digits, and at least 1 in each of the last 𝑚 digits – each clause 𝐶𝑗 is satisfied by at least one literal, and the digit 𝑛 + 𝑗 is set to 1 in the number corresponding to that literal in the subset 𝑆∗𝑎 . Thus, the full subset

𝑆∗ = 𝑆∗𝑎 ∪ {𝑦𝑗 ∶ 𝜑(𝛼) sets ≤ 2 literals of 𝐶𝑗 to 1} ∪ {𝑧𝑗 ∶ 𝜑(𝛼) sets 1 literal of 𝐶𝑗 to 1}

adds up to the target value 𝑘, so 𝑓(𝜑) = (𝑆, 𝑘) ∈ SUBSET-SUM.

• If 𝑓(𝜑) = (𝑆, 𝑘) ∈ SUBSET-SUM, then there is some subset 𝑆∗ of the numbers 𝑆 = {𝑣𝑖} ∪ {𝑤𝑖} ∪ {𝑦𝑗} ∪ {𝑧𝑗} that adds up to 𝑘. This subset 𝑆∗ must contain exactly one of 𝑣𝑖 or 𝑤𝑖 for each 1 ≤ 𝑖 ≤ 𝑛 so that the 𝑖th digit in the sum is the target 1. This corresponds to the assignment

𝛼(𝑥𝑖) = 1 if 𝑣𝑖 ∈ 𝑆∗, and 𝛼(𝑥𝑖) = 0 if 𝑤𝑖 ∈ 𝑆∗

For each digit 𝑛 + 𝑗, 1 ≤ 𝑗 ≤ 𝑚, there must be some 𝑣𝑖 or 𝑤𝑖 in 𝑆∗ that has that digit set to 1 – there are only two numbers among the 𝑦𝑗 and 𝑧𝑗 that have this digit set to 1, so they cannot add up to the target value of 3 on their own. If there is a 𝑣𝑖 ∈ 𝑆∗ that has the (𝑛 + 𝑗)th digit set to 1, then by construction, the clause 𝐶𝑗 contains the literal 𝑥𝑖, so it is satisfied by the assignment 𝛼 since 𝛼(𝑥𝑖) = 1. On the other hand, if there is a 𝑤𝑖′ ∈ 𝑆∗ that has the (𝑛 + 𝑗)th digit set to 1, then the clause 𝐶𝑗 contains the literal ¬𝑥𝑖′ , and it is satisfied by the assignment 𝛼 since 𝛼(𝑥𝑖′) = 0. In either case, the clause 𝐶𝑗 is satisfied. Since this reasoning applies to all clauses, the assignment 𝛼 satisfies the formula as a whole, and 𝜑 ∈ 3SAT.

We have shown that 𝑓 is efficiently computable and that 𝜑 ∈ 3SAT ⟺ 𝑓(𝜑) ∈ SUBSET-SUM, so we have 3SAT ≤p SUBSET-SUM. We can conclude that SUBSET-SUM is NP-Hard, and since it is also in NP, that it is NP-Complete.

Exercise 118 Define the language KNAPSACK as follows:

KNAPSACK = {(𝑉, 𝑊, 𝑛, 𝑣, 𝐶) ∶ ∣𝑉 ∣ = ∣𝑊 ∣ = 𝑛 and there exist indices 𝑖 ∈ {1, . . . , 𝑛} such that ∑𝑖 𝑉 [𝑖] ≥ 𝑣 and ∑𝑖 𝑊 [𝑖] ≤ 𝐶}

This corresponds to the following problem:

• There are 𝑛 items.

• Each item 𝑖 has a value 𝑉 (𝑖) and a weight 𝑊 (𝑖), where 𝑉 (𝑖) and 𝑊 (𝑖) are integers.

• We have a target value of at least 𝑣, where 𝑣 is an integer.

• We have a weight capacity limit of at most 𝐶, where 𝐶 is an integer.

• Is it possible to select a subset of the items such that the total value of the subset is at least 𝑣 but the total weight of the subset is at most 𝐶?

Show that KNAPSACK is NP-Complete.

17.6 Concluding Remarks

We have explored only the tip of the iceberg when it comes to NP-Complete problems. Such problems are everywhere, including constraint satisfaction (SAT, 3SAT), covering problems (vertex cover, set cover), resource allocation (knapsack, subset sum), scheduling, graph colorability, model-checking, social networks (clique, maximal cut), routing (HAMPATH, TSP), games (Sudoku, Battleship, Super Mario Brothers, Pokémon), and so on. An efficient algorithm for any of these problems admits an efficient solution for all of them.

The skills to reason about a problem and determine whether it is NP-Hard are critical. Researchers have been working for decades to find an efficient solution for NP-Complete problems to no avail. This makes it exceedingly unlikely that we will be able to do so. Instead, when we encounter such a problem, a better path is to focus on approximation algorithms, which we turn to next.


CHAPTER

EIGHTEEN

SEARCH AND APPROXIMATION

Thus far, we have focused our attention on decision problems, formalized as languages. We now turn to functional or search problems, those that do not have a yes/no answer. Restricting ourselves to decision problems does not leave us any room to find an answer that is “close” to correct, but a larger codomain will enable us to consider approximation algorithms for NP-Hard problems. We previously encountered Kolmogorov complexity (page 128) as an example of an uncomputable functional problem. Here, we consider problems that are computable, with the intent of finding efficient algorithms to solve such problems. Some examples of search problems are:

• Given an array of integers, produce a sorted version of that array.

• Given a Boolean formula, find a satisfying assignment.

• Given a graph, find a minimal vertex cover.

In general, a search problem may be:

• a maximization problem, such as finding a maximal clique (a fully connected subgraph) in a graph

• a minimization problem, such as finding a minimal vertex cover

• an exact problem, such as finding a satisfying assignment or a Hamiltonian path

Maximization and minimization problems are also called optimization problems.

For each kind of search problem, we can define a corresponding decision problem45. For maximization and minimization, we introduce a limited budget. Specifically, does a solution exist that meets the budget? The following are examples:

VERTEX-COVER = {(𝐺, 𝑘) ∶ 𝐺 is an undirected graph that has a vertex cover of size 𝑘}
CLIQUE = {(𝐺, 𝑘) ∶ 𝐺 is an undirected graph that has a clique of size 𝑘}

For an exact problem, the corresponding decision problem is whether a solution exists that meets the required criteria. The following is an example:

SAT = {𝜑 ∶ 𝜑 is a satisfiable Boolean formula}

We have seen that the languages VERTEX-COVER and SAT are NP-Complete, and CLIQUE too is NP-Complete, though we do not demonstrate that here. How do the search problems compare in difficulty to the decision problems?

It is clear that given an efficient algorithm to compute the answer to a search problem, we can efficiently decide the corresponding language. For instance, a decider for VERTEX-COVER can invoke the algorithm to find a minimal vertex cover and then check whether the result has size no more than 𝑘. Thus, the decision problem is no harder than the search problem.

45 We previously saw that a functional problem has an equivalent formulation as a decision problem (page 128). However, this correspondence was based on translating a functional problem to a sequence of decision problems on the binary representation of the answer. This is different from what we are considering now, which is between a search problem and a “natural” formulation of a corresponding decision problem.

What if we have an efficient decider 𝐷 for VERTEX-COVER? Can we then efficiently find a minimal vertex cover? It turns out that we can. Given a graph 𝐺 = (𝑉,𝐸), we do the following:

• Find the size of a minimal vertex cover by running 𝐷 on (𝐺, 𝑘) for each 𝑘 from 1 to ∣𝑉 ∣, stopping on the first 𝑘 for which 𝐷 accepts.

• Pick a vertex 𝑣. If it is in a minimal vertex cover of size 𝑚, we can remove it and all adjacent edges, and the resulting graph has a vertex cover of size 𝑚 − 1. If 𝑣 is not in a minimal vertex cover, the resulting graph will not have a vertex cover of size 𝑚 − 1. Thus, we can use a call to 𝐷 on the new graph to determine whether or not 𝑣 is in a minimal cover. If it is, we repeat this process on the smaller graph until we discover all the vertices in a minimal cover.

The full search algorithm is as follows:

𝑆 on input 𝐺 = (𝑉,𝐸) ∶
    For 𝑘 = 1 to ∣𝑉 ∣ ∶
        If 𝐷(𝐺, 𝑘) accepts, let 𝑚 ∶= 𝑘 and exit this loop
    𝐶 ∶= ∅
    While 𝑚 > 0 ∶
        For each 𝑣 ∈ 𝑉 ∶
            𝐺′ ∶= (𝑉 \ {𝑣}, 𝐸 \ {(𝑢, 𝑣) ∈ 𝐸})   [remove 𝑣 and all adjacent edges]
            If 𝐷(𝐺′, 𝑚 − 1) accepts ∶
                𝐶 ∶= 𝐶 ∪ {𝑣}
                𝐺 ∶= 𝐺′
                𝑚 ∶= 𝑚 − 1
                Exit this inner loop
    Return 𝐶

The first loop invokes 𝐷 at most ∣𝑉 ∣ times, and since 𝐷 is efficient, the loop completes in polynomial time. The while loop invokes 𝐷 at most 𝑚 ⋅ ∣𝑉 ∣ ≤ ∣𝑉 ∣² times, and 𝐺′ can be constructed efficiently, so the loop as a whole is efficient. Thus, the algorithm runs in polynomial time with respect to ∣𝐺∣. We conclude that if we can decide VERTEX-COVER efficiently, we can find an actual minimal vertex cover efficiently.

Exercise 119 Given an efficient decider 𝐷 for SAT, demonstrate an efficient algorithm that finds a satisfying assignment for a Boolean formula 𝜑, if such an assignment exists.

We have shown that the search problem of finding a minimal vertex cover has the same difficulty as the decision problem VERTEX-COVER. Since VERTEX-COVER is NP-Complete, this means that an efficient solution to the search problem leads to P = NP. Informally, we say that the search problem is NP-Hard, much as we did for a decision problem for which an efficient solution implies P = NP. (Note, however, that the search problem is not NP-Complete, as the class NP consists only of decision problems.)

We claim without proof that an NP-Hard search problem has an efficient algorithm if and only if its corresponding decision problem is efficiently solvable. Thus, we are unlikely to construct efficient algorithms to find exact solutions to NP-Hard search problems. Instead, we focus our attention on designing efficient algorithms to approximate NP-Hard optimization problems.

Given an input 𝑥, an optimization problem seeks to find an output 𝑦 such that 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦) is optimized for some metric 𝑣𝑎𝑙𝑢𝑒. For a maximization problem, the goal is to find an optimal value 𝑦∗ such that:

∀𝑦. 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦∗) ≥ 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦)


That is, 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦∗) is at least as large as 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦) for all other outputs 𝑦, so that 𝑦∗ maximizes the metric for a given input 𝑥. Similarly, in a minimization problem, the goal is to find 𝑦∗ such that:

∀𝑦. 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦∗) ≤ 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦)

Here, 𝑦∗ produces the minimal value for the metric 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦) over all 𝑦. In both cases, we define 𝑂𝑃𝑇 (𝑥) = 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦∗) as the optimal value of the metric for an input 𝑥. We then define an approximation as follows:

Definition 120 (𝛼-approximation) Given an input 𝑥, a solution 𝑦 is an 𝛼-approximation if:

• For a maximization problem,

𝛼 ⋅𝑂𝑃𝑇 (𝑥) ≤ 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦) ≤ 𝑂𝑃𝑇 (𝑥)

where 𝛼 < 1.

• For a minimization problem,

𝑂𝑃𝑇 (𝑥) ≤ 𝑣𝑎𝑙𝑢𝑒(𝑥, 𝑦) ≤ 𝛼 ⋅𝑂𝑃𝑇 (𝑥)

where 𝛼 > 1.

In both cases, 𝛼 is the approximation ratio.

The closer 𝛼 is to one, the better the approximation.

Definition 121 (𝛼-approximation Algorithm) An 𝛼-approximation algorithm is an algorithm 𝐴 such that for all inputs 𝑥, the output 𝐴(𝑥) is an 𝛼-approximation.

18.1 Approximation for Minimal Vertex Cover

Let us return to the problem of finding a minimal vertex cover. Can we design an efficient algorithm to approximate the minimal cover with a good approximation ratio 𝛼? Our first attempt is the cover-and-remove algorithm, which repeatedly finds an edge, removes one of its endpoints, and removes all edges incident to that endpoint:

Algorithm 𝐶𝑜𝑣𝑒𝑟-𝑎𝑛𝑑-𝑅𝑒𝑚𝑜𝑣𝑒(𝐺) ∶
    𝐶 ∶= ∅
    While 𝐺 = (𝑉,𝐸) has at least one edge ∶
        Pick an edge 𝑒 = (𝑢, 𝑣) ∈ 𝐸
        𝐺 ∶= (𝑉 \ {𝑣}, 𝐸 \ {(𝑢, 𝑣) ∈ 𝐸})   [remove 𝑣 and all adjacent edges]
        𝐶 ∶= 𝐶 ∪ {𝑣}
    Return 𝐶

How well does this algorithm perform? Consider the following star-shaped graph:

[Figure: a star-shaped graph with a central vertex connected to outer vertices labeled 1 through 4.]


On this graph, it is possible for the algorithm to pick the vertices on the outside of the star to remove rather than the one in the middle. The result would be a cover of size four, while the optimal cover has size one (just the vertex in the middle), giving an approximation ratio of 4 for this graph. However, larger star-shaped graphs can give rise to arbitrarily larger approximation ratios – there is no constant 𝛼 such that the cover-and-remove algorithm achieves a ratio of 𝛼 on all possible input graphs.

Observe that in the star-shaped graph, the vertex in the center has the largest degree, the number of edges that are incident to it. If we pick that vertex first, we cover all edges, achieving the optimal cover of just a single vertex for the star-shaped graph. This motivates a greedy strategy that repeatedly adds the vertex with the highest remaining degree to the cover:

Algorithm 𝐺𝑟𝑒𝑒𝑑𝑦-𝐶𝑜𝑣𝑒𝑟(𝐺) ∶
    𝐶 ∶= ∅
    While 𝐺 = (𝑉,𝐸) has at least one edge ∶
        Pick the vertex 𝑣 ∈ 𝑉 with the most incident edges
        𝐺 ∶= (𝑉 \ {𝑣}, 𝐸 \ {(𝑢, 𝑣) ∈ 𝐸})   [remove 𝑣 and all adjacent edges]
        𝐶 ∶= 𝐶 ∪ {𝑣}
    Return 𝐶

While this algorithm works well for a star-shaped graph, there are other graphs for which it does not achieve an approximation within a constant 𝛼. Consider the following bipartite graph:

[Figure: a bipartite graph with six vertices on top and eight vertices, labeled 1 through 8, on the bottom.]

The optimal cover consists of the six vertices at the top, while the algorithm instead chooses the eight vertices at the bottom. The vertex at the bottom left has a degree of six, compared to the maximum degree of five among the top vertices, so the bottom-left vertex is picked first. Then the second vertex on the bottom has degree five, while the top vertices now have degree no more than four. Once the second vertex is removed, the vertex with highest degree is the third on the bottom, and so on until all the bottom vertices have been chosen for the cover. While the algorithm achieves a ratio of 4/3 on this graph, larger graphs can be constructed with a similar structure, with 𝑘 vertices at the top and ~𝑘 log 𝑘 at the bottom. The algorithm produces a cover with ratio log 𝑘 compared to the optimal, and that ratio can be made arbitrarily large by increasing 𝑘.

The problem with both the cover-and-remove and greedy algorithms is the absence of a benchmark, a way of comparing the algorithm against the optimal result. Before we proceed to design another algorithm, we establish a measure for a minimal vertex cover that we can benchmark against, using a set of pair-wise disjoint edges from the input graph.

Given a graph 𝐺 = (𝑉,𝐸), a subset of edges 𝑆 ⊆ 𝐸 is pair-wise disjoint if no two edges in 𝑆 share a common vertex. Given any pair-wise disjoint set of edges 𝑆 from 𝐺, any vertex cover 𝐶 must have at least ∣𝑆∣ vertices – since no two edges in 𝑆 share a common vertex, we must choose a distinct vertex to cover each edge in 𝑆. This gives us a way of benchmarking an approximation algorithm – ensure that the algorithm picks no more than a constant number of vertices for each edge in some pair-wise disjoint set of edges 𝑆.


In fact, only a small modification to the cover-and-remove algorithm is necessary to meet this benchmark. Rather than choosing one of the two endpoints when we pick an edge, we can include both of them in the cover, removing the two endpoints and all their incident edges before proceeding. The remaining edges do not share a vertex with the edge we picked, so they are each disjoint with respect to that edge. By induction, the full set of edges picked by this double-cover algorithm is pair-wise disjoint. The algorithm is as follows:

Algorithm 𝐷𝑜𝑢𝑏𝑙𝑒-𝐶𝑜𝑣𝑒𝑟(𝐺) ∶
    𝐶 ∶= ∅
    While 𝐺 = (𝑉,𝐸) has at least one edge ∶
        Pick an edge 𝑒 = (𝑢, 𝑣) ∈ 𝐸
        𝐺 ∶= (𝑉 \ {𝑢, 𝑣}, 𝐸 \ {(𝑠, 𝑡) ∈ 𝐸 ∶ 𝑠 ∈ {𝑢, 𝑣}})   [remove 𝑢 and 𝑣 and all adjacent edges]
        𝐶 ∶= 𝐶 ∪ {𝑢, 𝑣}
    Return 𝐶

Claim 122 The edges picked by the double-cover algorithm are pair-wise disjoint for any input graph.

Proof 123 Let 𝑆𝑖 be the set of edges picked by the double-cover algorithm in the first 𝑖 iterations on an arbitrary graph 𝐺. We show by induction that 𝑆𝑖 is pair-wise disjoint.

• Base case: 𝑆1 contains only a single edge, so it is trivially pair-wise disjoint.

• Inductive step: Assume 𝑆𝑖 is pair-wise disjoint. When the algorithm picks an edge (𝑢, 𝑣), it removes both 𝑢 and 𝑣 and all edges incident to either 𝑢 or 𝑣. Thus, the graph that remains after iteration 𝑖 does not contain any of the vertices 𝑢, 𝑣 for each edge (𝑢, 𝑣) ∈ 𝑆𝑖. The edges that remain in the graph all have endpoints that are distinct from those that appear in 𝑆𝑖, so all remaining edges are disjoint from any edge in 𝑆𝑖. Then no matter what edge 𝑒 the algorithm picks next, it will be disjoint from any edge in 𝑆𝑖. Since 𝑆𝑖 itself is pair-wise disjoint, 𝑆𝑖+1 = 𝑆𝑖 ∪ {𝑒} is also pair-wise disjoint.

How good is this algorithm? Let 𝑆 be the full set of edges picked by the algorithm on an input graph 𝐺. Since 𝑆 is pair-wise disjoint, the minimal vertex cover 𝐶∗ of 𝐺 must have at least ∣𝑆∣ vertices. The cover 𝐶 produced by the double-cover algorithm has 2∣𝑆∣ vertices. We have:

𝑣𝑎𝑙𝑢𝑒(𝐺,𝐶) = 2∣𝑆∣ ≤ 2 ⋅ 𝑣𝑎𝑙𝑢𝑒(𝐺,𝐶∗)

Thus, the resulting cover is a 2-approximation, and double-cover is a 2-approximation algorithm.
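The double-cover algorithm translates almost directly into code. The following Python sketch (an added illustration, not from the text) represents a graph by a vertex set and a set of edges, each edge a frozenset of two distinct vertices; these representation choices are assumptions for the example.

def double_cover(V, E):
    # 2-approximation for minimal vertex cover: take both endpoints of an
    # arbitrary remaining edge, then discard every edge that is now covered.
    E = set(E)
    C = set()
    while E:
        u, v = next(iter(E))                              # an arbitrary remaining edge
        C |= {u, v}
        E = {e for e in E if u not in e and v not in e}
    return C

The edges picked in the loop are pair-wise disjoint, so any cover needs at least one vertex per picked edge, while C uses exactly two.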

18.2 Maximal Cut

As another example, we design an approximation algorithm for the NP-Hard problem of finding the maximal cut of an undirected graph. A cut of a graph 𝐺 = (𝑉,𝐸) is a partition of the vertices 𝑉 into two disjoint sets 𝑆 and 𝑇 , so that 𝑇 = 𝑉 \ 𝑆. The size of a cut is the number of edges that cross between the partitions 𝑆 and 𝑇 . More formally, we define the set of edges that cross between 𝑆 and 𝑇 as:

𝐶(𝑆, 𝑇 ) = {𝑒 = (𝑢, 𝑣) ∶ 𝑒 ∈ 𝐸 and 𝑢 ∈ 𝑆 and 𝑣 ∈ 𝑇}

Then the size of the cut 𝑆, 𝑇 is ∣𝐶(𝑆, 𝑇 )∣. The following is an example of a cut, with the crossing edges 𝐶(𝑆, 𝑇 ) represented as dashed lines:


The maximal cut of a graph is the cut 𝑆, 𝑇 that maximizes the size ∣𝐶(𝑆, 𝑇 )∣. Finding a maximal cut has important applications to circuit design, combinatorial optimization, and statistical physics.

The limited-budget decision version of this problem is defined as:

MAX-CUT = {(𝐺, 𝑘) ∶ 𝐺 has a cut of size 𝑘}

This is an NP-Complete problem, though we do not prove it here. This implies that the search problem of finding a maximal cut is NP-Hard. Rather than trying to find an exact solution to the search problem, we design an approximation algorithm. We start by identifying an appropriate benchmark.

Given a graph 𝐺 = (𝑉,𝐸), we define the set 𝐼𝑣 of edges incident to vertex 𝑣 ∈ 𝑉 as:

𝐼𝑣 = {𝑒 = (𝑢, 𝑣) ∶ 𝑢 ∈ 𝑉 and 𝑒 ∈ 𝐸}

In other words, 𝐼𝑣 is the set of edges that have 𝑣 as an endpoint. Let 𝑆∗, 𝑇∗ be a maximal cut of 𝐺. We claim the following:

Claim 124 For each vertex 𝑣 ∈ 𝑉 in a graph 𝐺 = (𝑉,𝐸), at least half the edges in 𝐼𝑣 must be in the crossing set 𝐶(𝑆∗, 𝑇∗).

Proof 125 Assume for the purposes of contradiction that fewer than half the edges in 𝐼𝑣 are in 𝐶(𝑆∗, 𝑇∗) for some vertex 𝑣 ∈ 𝑉 . Without loss of generality, assume that 𝑣 ∈ 𝑆∗. Then the edges 𝐼𝑣 can be partitioned into two subsets:

𝐼𝑣,𝑐 = {𝑒 = (𝑢, 𝑣) ∈ 𝐼𝑣 ∶ 𝑢 ∈ 𝑇∗}
𝐼𝑣,𝑛𝑐 = {𝑒 = (𝑢, 𝑣) ∈ 𝐼𝑣 ∶ 𝑢 ∈ 𝑆∗}

𝐼𝑣,𝑐 ⊆ 𝐶(𝑆∗, 𝑇∗) consists of edges that cross between 𝑆∗ and 𝑇∗, while 𝐼𝑣,𝑛𝑐 consists of those that are internal to 𝑆∗. By assumption, we have ∣𝐼𝑣,𝑐∣ < ∣𝐼𝑣,𝑛𝑐∣.

If we move 𝑣 from 𝑆∗ into 𝑇∗, the edges in 𝐼𝑣,𝑛𝑐 become crossing edges, while those in 𝐼𝑣,𝑐 become internal edges, as illustrated below.


[Figure: the same cut before and after vertex 𝑣 is moved to the opposite side; one panel has cut size 3 and the other has cut size 4.]

Thus, we have a net gain of ∣𝐼𝑣,𝑛𝑐∣ − ∣𝐼𝑣,𝑐∣ > 0 crossing edges, which means that the cut 𝑆∗ \ {𝑣}, 𝑇∗ ∪ {𝑣} has more crossing edges than 𝑆∗, 𝑇∗. This is a contradiction, since 𝑆∗, 𝑇∗ is a maximal cut. Thus, our assumption that such a 𝑣 exists is false, and 𝐼𝑣 must have at least as many crossing edges as internal edges for all 𝑣 ∈ 𝑉 .

Corollary 126 The size of a maximal cut 𝑆∗, 𝑇∗ of a graph 𝐺 = (𝑉,𝐸) is at least (1/2)∣𝐸∣, i.e. ∣𝐶(𝑆∗, 𝑇∗)∣ ≥ (1/2)∣𝐸∣.

Proof 127 We first observe that

∣𝐸∣ = (1/2) ∑𝑣∈𝑉 ∣𝐼𝑣∣

since each edge appears in exactly two sets 𝐼𝑣 . Similarly, we have

∣𝐶(𝑆∗, 𝑇∗)∣ = (1/2) ∑𝑣∈𝑉 ∣𝐼𝑣,𝑐∣

where 𝐼𝑣,𝑐 is the set of crossing edges incident to 𝑣. By Claim 124, we have ∣𝐼𝑣,𝑐∣ ≥ (1/2)∣𝐼𝑣∣, which gives us:

∣𝐶(𝑆∗, 𝑇∗)∣ = (1/2) ∑𝑣∈𝑉 ∣𝐼𝑣,𝑐∣ ≥ (1/2) ∑𝑣∈𝑉 (1/2)∣𝐼𝑣∣ = (1/2)∣𝐸∣

Thus, at least half the edges in 𝐸 are in the crossing set 𝐶(𝑆∗, 𝑇∗).

We have placed a lower bound on the number of edges in a maximal cut. An obvious upper bound is the full set of edges 𝐸; a cut can only contain edges that are in the graph. Thus, we have:

(1/2)∣𝐸∣ ≤ ∣𝐶(𝑆∗, 𝑇∗)∣ ≤ ∣𝐸∣

Having established bounds on the number of edges in a maximal cut, we design an algorithm that produces a cut of size at least (1/2)∣𝐸∣, giving us a 1/2-approximation of the optimal. We take our inspiration from our argument in Proof 125 – repeatedly find a vertex such that moving it to the opposite partition increases the cut size, until no such vertices exist. The resulting cut will have at least half the edges in 𝐸. We refer to this as the local-search algorithm.


Algorithm 𝐿𝑜𝑐𝑎𝑙-𝑆𝑒𝑎𝑟𝑐ℎ(𝐺 = (𝑉,𝐸)) ∶
    𝑆 ∶= ∅
    𝑇 ∶= 𝑉
    Repeat ∶
        For each 𝑣 ∈ 𝑆 ∶
            If ∣𝐶(𝑆 \ {𝑣}, 𝑇 ∪ {𝑣})∣ > ∣𝐶(𝑆, 𝑇 )∣ ∶   [moving 𝑣 to 𝑇 increases the cut size]
                𝑆 ∶= 𝑆 \ {𝑣}
                𝑇 ∶= 𝑇 ∪ {𝑣}
        For each 𝑢 ∈ 𝑇 ∶
            If ∣𝐶(𝑆 ∪ {𝑢}, 𝑇 \ {𝑢})∣ > ∣𝐶(𝑆, 𝑇 )∣ ∶   [moving 𝑢 to 𝑆 increases the cut size]
                𝑆 ∶= 𝑆 ∪ {𝑢}
                𝑇 ∶= 𝑇 \ {𝑢}
        If no move was made in this iteration, exit this loop
    Return 𝑆, 𝑇

The algorithm terminates when there is no vertex such that moving it to the opposite partition increases the cut size. By the reasoning in Proof 125, this means that for every vertex 𝑣, at least half the edges in 𝐼𝑣 are edges that cross between 𝑆 and 𝑇 . Thus, 𝐶(𝑆, 𝑇 ) contains at least half the edges in 𝐸, and 𝑆, 𝑇 is a 1/2-approximation.

Since the algorithm makes at least one move in each iteration (except the last) of the outer loop, and each move increases the size of the cut, we can argue by the potential method (page 7) that it makes at most ∣𝐸∣ + 1 iterations (a cut can have no more than ∣𝐸∣ edges).

The following illustrates the execution of the local-search algorithm on a sample graph:

[Figure: three snapshots of the local-search algorithm on a sample graph, with cut sizes 0, 3, and 5.]

Initially, all the vertices are in 𝑇 , illustrated above as red, solid circles. The algorithm picks the bottom-right vertex to swap, since doing so increases the cut size from zero to three. We represent its membership in 𝑆 by depicting the vertex as a blue, open circle. The algorithm then happens to pick the vertex at the middle left – it has three adjacent internal edges and only one adjacent crossing edge, so moving it to 𝑆 results in one internal edge and three crossing edges, increasing the cut size to five. At this point, all vertices have at least as many adjacent crossing edges as internal edges, so the algorithm terminates. (The maximal cut for this graph has six edges – can you find it?)
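A compact version of local search can be written as follows; this Python sketch (not from the text) flips one vertex at a time rather than maintaining two explicit loops over 𝑆 and 𝑇, which is an equivalent formulation. Graphs are assumed to be given as a vertex set and a set of frozenset edges.

def local_search_cut(V, E):
    # Start with everything in T (S empty) and repeatedly flip any vertex whose
    # move increases the cut size; each flip raises the cut, so this terminates.
    S = set()

    def cut_size(side):
        return sum(1 for e in E if len(e & side) == 1)   # exactly one endpoint in side

    improved = True
    while improved:
        improved = False
        for v in V:
            candidate = S - {v} if v in S else S | {v}
            if cut_size(candidate) > cut_size(S):
                S = candidate
                improved = True
    return S, set(V) - S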


18.3 Knapsack

The knapsack problem involves selecting a subset of items that maximizes their total value, subject to a weight constraint. More specifically, we have 𝑛 items, and each item 𝑖 has a value 𝑣𝑖 and a weight 𝑤𝑖. We have a weight capacity 𝐶, and we wish to find a subset 𝑆 of the items that maximizes the total value

𝑣𝑎𝑙𝑢𝑒(𝑆) = ∑𝑖∈𝑆 𝑣𝑖

without exceeding the weight limit:

∑𝑖∈𝑆 𝑤𝑖 ≤ 𝐶

In this problem formulation, we are not permitted to select the same item more than once; for each item, we either select it to be in 𝑆 or we do not. Thus, this formulation is also known as the 0-1 knapsack problem. For simplicity, we assume that each item’s weight is no more than 𝐶 – otherwise, we can immediately remove this item from consideration.

A natural dynamic-programming algorithm to compute the maximal set of items does 𝑂(𝑛𝐶) operations, given 𝑛 items and a weight limit 𝐶. Assuming that the weights and values of each item are numbers that are similar in size to (or smaller than) 𝐶, the total input size is 𝑂(𝑛 log𝐶 + log𝐶) = 𝑂(𝑛 log𝐶), with all numbers encoded in a non-unary base46. Thus, the dynamic-programming solution is not efficient with respect to the input size. In fact, this problem is known to be NP-Hard, so we are unlikely to find an efficient solution to solve it exactly.

Instead, we attempt to approximate the optimal solution to knapsack with a greedy algorithm. Our first attempt is the relatively greedy algorithm, where we select the items with largest value relative to their weight. In other words, we choose items in decreasing order of the ratio 𝑣𝑖/𝑤𝑖.

Algorithm 𝑅𝑒𝑙𝑎𝑡𝑖𝑣𝑒𝑙𝑦-𝐺𝑟𝑒𝑒𝑑𝑦(𝑉 = (𝑣1, . . . , 𝑣𝑛),𝑊 = (𝑤1, . . . , 𝑤𝑛), 𝐶) ∶
    Let 𝑠𝑜𝑟𝑡𝑒𝑑 be the indices 1, . . . , 𝑛 sorted in decreasing order of 𝑣𝑖/𝑤𝑖
    𝑆 ∶= ∅
    𝑤𝑒𝑖𝑔ℎ𝑡 ∶= 0
    For 𝑖 = 1 to 𝑛 ∶
        If 𝑤𝑒𝑖𝑔ℎ𝑡 + 𝑤𝑠𝑜𝑟𝑡𝑒𝑑[𝑖] ≤ 𝐶 ∶
            𝑆 ∶= 𝑆 ∪ {𝑠𝑜𝑟𝑡𝑒𝑑[𝑖]}
            𝑤𝑒𝑖𝑔ℎ𝑡 ∶= 𝑤𝑒𝑖𝑔ℎ𝑡 + 𝑤𝑠𝑜𝑟𝑡𝑒𝑑[𝑖]
    Return 𝑆

This algorithm runs in O(𝑛 log 𝑛 log𝐶) time, where 𝑛 = ∣𝑉 ∣; the time is dominated by sorting, which does O(𝑛 log 𝑛) comparisons (e.g. using merge sort (page 12)), and each comparison takes O(log𝐶) time for numbers that are similar in size to 𝐶. Thus, this algorithm is efficient with respect to the input size.

Unfortunately, the relatively greedy algorithm is not guaranteed to achieve any approximation ratio 𝛼. Consider what happens when we have two items, the first with value 𝑣1 = 1 and weight 𝑤1 = 1, and the second with value 𝑣2 = 100 and weight 𝑤2 = 200. Then item 1 has a value-to-weight ratio of 1, while item 2 has a value-to-weight ratio of 0.5. This means that the relatively greedy algorithm will always try to pick item 1 before item 2. If the weight capacity is 𝐶 = 200, the algorithm produces 𝑆 = {1}, with a total value of 𝑣𝑎𝑙𝑢𝑒(𝑆) = 1. However, the optimal solution is 𝑆∗ = {2}, with a total value of 𝑣𝑎𝑙𝑢𝑒(𝑆∗) = 100. This gives us an approximation ratio of 0.01, but we can easily find other examples with arbitrarily smaller ratios.

46 An algorithm is said to run in pseudopolynomial time if its runtime complexity is polynomial in the value of the integers in the input, asopposed to the size of the integers. Another way of defining pseudopolynomial time is for an algorithm to run in polynomial time with respect tothe input when all numbers in the input are encoded in unary. The dynamic-programming algorithm for knapsack takes pseudopolynomial time.


The main problem in the example above is that the relatively greedy algorithm ignores the most valuable item in favor of one with a better value-to-weight ratio. A dumb-greedy algorithm that just picks the single item with the highest value actually produces the optimal result for the example above. This algorithm is as follows:

Algorithm 𝐷𝑢𝑚𝑏-𝐺𝑟𝑒𝑒𝑑𝑦(𝑉 = (𝑣1, . . . , 𝑣𝑛),𝑊 = (𝑤1, . . . , 𝑤𝑛), 𝐶) ∶
    Let 𝑠𝑜𝑟𝑡𝑒𝑑 be the indices 1, . . . , 𝑛 sorted in decreasing order of 𝑣𝑖
    Return {𝑠𝑜𝑟𝑡𝑒𝑑[1]}

This algorithm has the same O(𝑛 log 𝑛 log𝐶) runtime as the relatively greedy algorithm. While it achieves the optimal result for our previous example, it is straightforward to construct examples where the dumb-greedy algorithm fails to achieve any approximation ratio 𝛼. Suppose we have 201 items, where all but the last item have value 1 and weight 1, while the last item has value 2 and weight 200. In other words, we have

𝑣1 = 𝑣2 = ⋅ ⋅ ⋅ = 𝑣200 = 1

𝑤1 = 𝑤2 = ⋅ ⋅ ⋅ = 𝑤200 = 1

𝑣201 = 2

𝑤201 = 200

With a weight limit 𝐶 = 200, the optimal solution is to pick items 1, . . . , 200 for a total value of 200, but the dumb-greedy algorithm picks the single item 201, for a value of 2. Thus, it achieves a ratio of just 0.01 on this instance of the problem. Moreover, the example can be modified to make the ratio arbitrarily small.

Observe that for this second example, the relatively greedy algorithm actually would produce the optimal solution. It seems like the dumb-greedy algorithm does well where the relatively greedy algorithm fails, and the relatively greedy algorithm performs well when the dumb-greedy algorithm does not. This motivates us to construct a smart-greedy algorithm that runs both the relatively greedy and dumb-greedy algorithm and picks the better of the two solutions. It can be shown that the smart-greedy algorithm 1/2-approximates the optimal solution for any instance of the knapsack problem.

Algorithm 𝑆𝑚𝑎𝑟𝑡-𝐺𝑟𝑒𝑒𝑑𝑦(𝑉 = (𝑣1, . . . , 𝑣𝑛),𝑊 = (𝑤1, . . . , 𝑤𝑛), 𝐶) ∶
    𝑆1 ∶= 𝑅𝑒𝑙𝑎𝑡𝑖𝑣𝑒𝑙𝑦-𝐺𝑟𝑒𝑒𝑑𝑦(𝑉,𝑊,𝐶)
    𝑆2 ∶= 𝐷𝑢𝑚𝑏-𝐺𝑟𝑒𝑒𝑑𝑦(𝑉,𝑊,𝐶)
    If 𝑣𝑎𝑙𝑢𝑒(𝑆1) > 𝑣𝑎𝑙𝑢𝑒(𝑆2), return 𝑆1
    Else return 𝑆2
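Here is a Python sketch of the three greedy variants (an added illustration, not from the text); it uses 0-based item indices and assumes, as above, that every individual item fits within the capacity.

def relatively_greedy(values, weights, C):
    # take items in decreasing value-to-weight order while they still fit
    order = sorted(range(len(values)), key=lambda i: values[i] / weights[i], reverse=True)
    S, weight = set(), 0
    for i in order:
        if weight + weights[i] <= C:
            S.add(i)
            weight += weights[i]
    return S

def dumb_greedy(values, weights, C):
    # take only the single most valuable item
    return {max(range(len(values)), key=lambda i: values[i])}

def smart_greedy(values, weights, C):
    # run both greedy strategies and keep whichever subset has more total value
    S1, S2 = relatively_greedy(values, weights, C), dumb_greedy(values, weights, C)
    return S1 if sum(values[i] for i in S1) > sum(values[i] for i in S2) else S2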

Exercise 128 In this exercise, we will prove that the smart-greedy algorithm is a 1/2-approximation algorithm for the 0-1 knapsack problem.

a) Define the fractional knapsack problem as a variant that allows taking any partial amount of each item. (For example, we can take 3/7 of item 𝑖, for a value of (3/7)𝑣𝑖, at a weight of (3/7)𝑤𝑖.) Prove that for this fractional variant, the optimal value is no smaller than for the original 0-1 variant (for the same weights, values, and knapsack capacity).

b) Prove that the sum of the values obtained by the relatively greedy and dumb-greedy algorithm is at least the optimal value for the fractional knapsack problem.

c) Using the previous parts, prove that the smart-greedy algorithm is a 1/2-approximation algorithm for the 0-1 knapsack problem.


18.4 Other Approaches to NP-Hard Problems

While we do not know how to efficiently find exact solutions to NP-Hard problems, we can apply heuristics to these problems. There are two general kinds of heuristics:

• Those that are guaranteed to get the right answer, but may take a long time on some inputs. However, they may be efficient enough for inputs that we care about. Examples include the dynamic-programming algorithm for 0-1 knapsack where the weight capacity 𝐶 is small compared to the number of items 𝑛, as well as SAT solvers for Boolean formulas that emerge from software verification.

• Those that are guaranteed to run fast, but may produce an incorrect or suboptimal answer. The 𝛼-approximation algorithms we considered above are examples of this, as well as greedy algorithms for the traveling salesperson problem, which has no 𝛼-approximation algorithm for general instances of the problem.

Some algorithms are only applicable to special cases of a problem. For instance, there are efficient algorithms for finding a maximal cut on planar graphs, which can be depicted on a plane without any crossing edges.

Another general approach is randomized algorithms, which may be able to find good solutions with high probability. We will consider such algorithms in the next unit.


Part IV

Randomness


CHAPTER

NINETEEN

RANDOMIZED ALGORITHMS

The algorithms we have seen thus far have been deterministic, meaning that they execute the same steps each time they are run and produce the same result. In this unit, we consider how randomness can be applied to computation. We will exploit randomness to design simple, probabilistic algorithms, and we will discuss the tools that are necessary to analyze randomized algorithms.

As a first example, consider a game of rock-paper-scissors between two players. In this game, both players simultaneously choose one of three options: rock, paper, or scissors. The game is a tie if both players choose the same option. If they make different choices, then rock beats scissors, scissors beats paper, and paper beats rock:

                         Player A
                    Rock      Paper     Scissors
Player B  Rock      Tie       A wins    B wins
          Paper     B wins    Tie       A wins
          Scissors  A wins    B wins    Tie

Suppose we write a program to act as a player in a best-out-of-five game of rock-paper-scissors. If we hardcode a strategy such as playing rock in the first two moves, then paper, then scissors, and then paper, what is the probability that the program wins a game against an opponent? The program is deterministic, so a smart opponent would be able to observe the program’s strategy once and defeat it every single time thereafter. If an opponent is able to read the program’s source code, then a preliminary observation isn’t even necessary. This illustrates the problem with a deterministic strategy – it makes the program’s actions predictable and thus easily countered.

What if we change the program to choose a random action in each move, with equal probability for rock, paper, or scissors? Now even if an opponent knows the program’s strategy, the opponent cannot know which action the program will take and therefore cannot universally counter that action. In fact, no matter what strategy the opponent uses, the probability the opponent wins an individual round is always 1/3. Before we see why, we review some basic tools we use to analyze probabilistic outcomes.


19.1 Review of Probability

A random variable is a variable whose value is dependent on the outcome of a random phenomenon. The set of outcomes associated with a random phenomenon is called the sample space, and a random variable 𝑋 can be regarded as a function from the sample space Ω to the set of real numbers, i.e. 𝑋 ∶ Ω → R. Thus, the random variable associates a numerical value with each outcome.

A random variable 𝑋 can be described with a probability distribution – for each possible value 𝑎𝑖, the random variable has a probability 𝑝𝑖 of taking on the value 𝑎𝑖:

Pr[𝑋 = 𝑎𝑖] = 𝑝𝑖

The probability 𝑝𝑖 must be in the interval [0, 1], i.e. 0 ≤ 𝑝𝑖 ≤ 1. Furthermore, the probabilities 𝑝𝑖 over all possible 𝑖 must add up to 1:

∑𝑖 𝑝𝑖 = 1

A random variable 𝑋 takes on the value 𝑎𝑖 when the outcome 𝜔 of a random experiment has associated value 𝑋(𝜔) = 𝑎𝑖. Thus, 𝑋 = 𝑎𝑖 is an event, a subset of the sample space of an experiment. More specifically, it is the event:

(𝑋 = 𝑎𝑖) = {𝜔 ∈ Ω ∶ 𝑋(𝜔) = 𝑎𝑖}

The probability of an event can be computed as the sum of the probabilities for each outcome in the event. Thus, the probability Pr[𝑋 = 𝑎𝑖] is:

Pr[𝑋 = 𝑎𝑖] = ∑𝜔∈Ω∶𝑋(𝜔)=𝑎𝑖 Pr[𝜔]

If each outcome has equal probability 1/∣Ω∣, we get

Pr[𝑋 = 𝑎𝑖] = ∣{𝜔 ∈ Ω ∶ 𝑋(𝜔) = 𝑎𝑖}∣ / ∣Ω∣

Example 129 Suppose we roll a pair of distinguishable, six-sided dice. Let 𝑋 be a random variable corresponding to the sum of the values of the dice – if the values are (𝑖, 𝑗), then 𝑋 = 𝑖 + 𝑗. The events corresponding to each value of 𝑋 are as follows:

(𝑋 = 2) = {(1, 1)}
(𝑋 = 3) = {(1, 2), (2, 1)}
(𝑋 = 4) = {(1, 3), (2, 2), (3, 1)}
(𝑋 = 5) = {(1, 4), (2, 3), (3, 2), (4, 1)}
(𝑋 = 6) = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}
(𝑋 = 7) = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
(𝑋 = 8) = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}
(𝑋 = 9) = {(3, 6), (4, 5), (5, 4), (6, 3)}
(𝑋 = 10) = {(4, 6), (5, 5), (6, 4)}
(𝑋 = 11) = {(5, 6), (6, 5)}
(𝑋 = 12) = {(6, 6)}

If the dice are fair, each outcome has equal probability. This gives us the following probability distribution for 𝑋:

Pr[𝑋 = 2] = 1/36
Pr[𝑋 = 3] = 2/36 = 1/18
Pr[𝑋 = 4] = 3/36 = 1/12
Pr[𝑋 = 5] = 4/36 = 1/9
Pr[𝑋 = 6] = 5/36
Pr[𝑋 = 7] = 6/36 = 1/6
Pr[𝑋 = 8] = 5/36
Pr[𝑋 = 9] = 4/36 = 1/9
Pr[𝑋 = 10] = 3/36 = 1/12
Pr[𝑋 = 11] = 2/36 = 1/18
Pr[𝑋 = 12] = 1/36

The probability distribution provides full detail about a random variable. However, we often are interested in more succinct information that in a sense summarizes the details. One such piece of information is the expectation or expected value of a random variable, which corresponds to the “average” value that the random variable takes on. We compute the expectation E[𝑋] as follows:

E[𝑋] = ∑𝑖 𝑎𝑖 ⋅ Pr[𝑋 = 𝑎𝑖]

Thus, the expectation is the average value of the variable weighted by the probability of obtaining each value.

Recall that

Pr[𝑋 = 𝑎𝑖] = ∑𝜔∈Ω∶𝑋(𝜔)=𝑎𝑖 Pr[𝜔]

Plugging this into the formula for expectation, we get

E[𝑋] = ∑𝑖 𝑎𝑖 ⋅ Pr[𝑋 = 𝑎𝑖]
     = ∑𝑖 𝑎𝑖 ⋅ (∑𝜔∈Ω∶𝑋(𝜔)=𝑎𝑖 Pr[𝜔])
     = ∑𝑖 ∑𝜔∈Ω∶𝑋(𝜔)=𝑎𝑖 𝑋(𝜔) ⋅ Pr[𝜔]
     = ∑𝜔∈Ω 𝑋(𝜔) ⋅ Pr[𝜔]

as an alternate expression of E[𝑋].

Often, we can express a random variable 𝑋 in terms of other random variables 𝑋𝑖. We can then apply linearity of expectation to compute the expectation of 𝑋 from the expectations of each 𝑋𝑖.


Theorem 130 (Linearity of Expectation) Let 𝑋 be a weighted sum of random variables 𝑋𝑖, each multiplied by a constant 𝑐𝑖:

𝑋 = ∑𝑖 𝑐𝑖 ⋅ 𝑋𝑖

Then E[𝑋] is:

E[𝑋] = ∑𝑖 𝑐𝑖 ⋅ E[𝑋𝑖]

Proof 131 We prove the case where 𝑋 is the weighted sum of two variables

𝑋 = 𝑐1𝑋1 + 𝑐2𝑋2

From the alternate formulation of expectation, we have

E[𝑋] = ∑𝜔∈Ω 𝑋(𝜔) ⋅ Pr[𝜔]
     = ∑𝜔∈Ω (𝑐1𝑋1(𝜔) + 𝑐2𝑋2(𝜔)) ⋅ Pr[𝜔]
     = ∑𝜔∈Ω 𝑐1𝑋1(𝜔) ⋅ Pr[𝜔] + 𝑐2𝑋2(𝜔) ⋅ Pr[𝜔]
     = (∑𝜔∈Ω 𝑐1𝑋1(𝜔) ⋅ Pr[𝜔]) + (∑𝜔∈Ω 𝑐2𝑋2(𝜔) ⋅ Pr[𝜔])
     = 𝑐1 ⋅ (∑𝜔∈Ω 𝑋1(𝜔) ⋅ Pr[𝜔]) + 𝑐2 ⋅ (∑𝜔∈Ω 𝑋2(𝜔) ⋅ Pr[𝜔])
     = 𝑐1 ⋅ E[𝑋1] + 𝑐2 ⋅ E[𝑋2]

Example 132 Suppose we roll a pair of distinguishable, six-sided dice. Let 𝑋1 be the value of the first die and 𝑋2 the value of the second. If the dice are fair, then each outcome is equally likely, so we have

E[𝑋1] = E[𝑋2] = ∑𝑖=1..6 𝑖 ⋅ (1/6) = 3.5

Let 𝑋 = 𝑋1 +𝑋2 be the sum of the values of the two dice. By linearity of expectation, we have

E[𝑋] = E[𝑋1] + E[𝑋2] = 7

Computing the expectation from the probability distribution of 𝑋 gives us the same result, but it requires much more work to do so.
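As a quick sanity check of Example 132 (an added illustration, not from the text), both computations can be carried out exactly in a few lines of Python using the standard fractions module:

from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))           # 36 equally likely (i, j) pairs
p = Fraction(1, len(outcomes))
# directly from the distribution of the sum
print(sum((i + j) * p for i, j in outcomes))               # 7
# via linearity of expectation: E[X1] + E[X2]
print(2 * sum(i * Fraction(1, 6) for i in range(1, 7)))    # 7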

A sequence of random variables 𝑋1, 𝑋2, . . . , 𝑋𝑛 is independent if for every possible set of values 𝑏1, 𝑏2, . . . , 𝑏𝑛, we have:

Pr[𝑋1 = 𝑏1, 𝑋2 = 𝑏2, . . . , 𝑋𝑛 = 𝑏𝑛] = ∏𝑖 Pr[𝑋𝑖 = 𝑏𝑖]

The notation Pr[𝑋1 = 𝑏1, 𝑋2 = 𝑏2, . . . , 𝑋𝑛 = 𝑏𝑛] represents the joint probability of the events 𝑋1 = 𝑏1, 𝑋2 = 𝑏2, . . ., 𝑋𝑛 = 𝑏𝑛 occurring simultaneously. The random variables are independent if the joint probability of them taking on any sequence of values is the same as the product of each of them individually taking on the respective value.


A common type of random variable is an indicator random variable, also called a 0-1 random variable. As the latter name suggests, it is a random variable that only takes on the values 0 or 1. If 𝑋 is an indicator and Pr[𝑋 = 1] = 𝑝, we have Pr[𝑋 = 0] = 1 − 𝑝, and

E[𝑋] = 0 ⋅ Pr[𝑋 = 0] + 1 ⋅ Pr[𝑋 = 1] = Pr[𝑋 = 1] = 𝑝

If we can define a random variable 𝑌 as the sum of 𝑛 indicator random variables 𝑋𝑖, each with the same probability Pr[𝑋𝑖 = 1] = 𝑝, then:

E[𝑌 ] = ∑𝑖 E[𝑋𝑖] = ∑𝑖 Pr[𝑋𝑖 = 1] = 𝑛𝑝

Example 133 Suppose we repeatedly flip a biased coin with probability 𝑝 of heads. What is the expected number of heads over 𝑛 flips?

Define 𝑋𝑖 to be 0 if the 𝑖th flip is tails and 1 if it is heads. Then 𝑋𝑖 is an indicator with E[𝑋𝑖] = Pr[𝑋𝑖 = 1] = 𝑝. Let 𝑌 be the sum of the 𝑋𝑖. Since 𝑌 is the sum of 𝑛 indicators 𝑋𝑖, each with probability Pr[𝑋𝑖 = 1] = 𝑝, we have E[𝑌 ] = 𝑛𝑝. Observe also that 𝑌 is the total number of heads, since 𝑋𝑖 contributes 1 to the value of 𝑌 when the 𝑖th flip is heads and 0 otherwise. Thus, the expected number of heads is 𝑛𝑝.

Returning to the rock-paper-scissors example, let 𝑅, 𝑃 , and 𝑆 be arbitrary numbers corresponding to rock, paper, and scissors. Let 𝐴 be a random variable representing the action taken by our randomized program, and let 𝐵 be a random variable representing the opponent’s action. Let 𝑊 be the event that the opponent wins – this is when the program chooses rock and the opponent paper, or the program paper and the opponent scissors, or the program scissors and the opponent rock. These are disjoint events, so the probability of any one of them happening is the sum of their individual probabilities. Then we have:

Pr[𝑊 ] = Pr[𝐴 = 𝑅,𝐵 = 𝑃 ] + Pr[𝐴 = 𝑃,𝐵 = 𝑆] + Pr[𝐴 = 𝑆,𝐵 = 𝑅]
       = Pr[𝐴 = 𝑅] ⋅ Pr[𝐵 = 𝑃 ] + Pr[𝐴 = 𝑃 ] ⋅ Pr[𝐵 = 𝑆] + Pr[𝐴 = 𝑆] ⋅ Pr[𝐵 = 𝑅]
       = (1/3)(Pr[𝐵 = 𝑃 ] + Pr[𝐵 = 𝑆] + Pr[𝐵 = 𝑅])
       = 1/3

In the second step, we make use of the fact that the program’s action and the opponent’s are independent. Then we apply the fact that the program chooses each action with equal probability 1/3. Finally, the opponent must choose one of the three possible actions, so regardless of how the opponent makes their choice, the probabilities sum to one. The conclusion is that the probability the opponent wins is 1/3 no matter what strategy they use. This is a significant improvement over the deterministic program, where the opponent could always win once they discover the program’s strategy.
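A short simulation (an added illustration, not from the text) makes the point concrete: against a uniformly random program, any fixed opponent strategy wins roughly a third of the rounds.

import random

ACTIONS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}   # key beats value

def opponent_win_rate(opponent_strategy, rounds=100_000):
    # estimate the probability that the opponent wins a single round
    wins = 0
    for _ in range(rounds):
        program = random.choice(ACTIONS)          # the randomized program's action
        opponent = opponent_strategy()
        if BEATS[opponent] == program:
            wins += 1
    return wins / rounds

print(opponent_win_rate(lambda: "rock"))                              # about 0.333
print(opponent_win_rate(lambda: random.choice(["paper", "rock"])))    # about 0.333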

The rock-paper-scissors example is illustrative of a more general notion of considering algorithms as games. We allow one player to choose an algorithm to solve a problem. A second, more powerful player known as the adversary has the ability to choose an input. The goal of the adversary is to maximize the number of steps taken by the chosen algorithm on the given input. The game is unfair – the adversary knows the process by which the first player chooses an algorithm (e.g. from prior observations, or from reading the source code for the first player). If the first player chooses the algorithm deterministically, the adversary can always find an input that maximizes the number of steps. On the other hand, if the first player chooses an algorithm randomly according to some distribution, the adversary can no longer always find a worst input, even if the adversary knows the distribution the first player uses.


19.2 Randomized Approximations

We can exploit randomness to construct simple algorithms that produce an 𝛼-approximation (page 177) in expectation, meaning that the expected value of the output is within a factor 𝛼 of the value of the optimal solution. As an example, we consider the problem of satisfying as many clauses as possible given an exact 3CNF formula. Such a formula is in 3CNF (page 158), with the added stipulation that each clause has three distinct variables. For example, the clause

(𝑥 ∨ ¬𝑦 ∨ ¬𝑧)

is in exact 3CNF, while the clause

(𝑥 ∨ ¬𝑦 ∨ ¬𝑥)

is not, since it only has two distinct variables.

The problem of finding an assignment that maximizes the number of satisfied clauses in an exact 3CNF formula is known as Max-3SAT (also called Max-E3SAT). This is an NP-Hard optimization problem. However, a simple, randomized algorithm that assigns the truth value of each variable randomly (with equal probability for true or false) achieves a 7/8-approximation in expectation. We demonstrate this as follows.

Suppose 𝜑 is an exact 3CNF formula with 𝑛 variables and 𝑚 clauses. Let 𝑎 be a random assignment of the variables, and let 𝑋(𝑎) be the number of satisfied clauses for this assignment. Then 𝑋 is a random variable, since it associates each assignment with a number. We define indicator random variables 𝑋𝑖, where 𝑋𝑖 = 1 if clause 𝑖 is satisfied and 𝑋𝑖 = 0 if it is not. Then we have

𝑋 = 𝑋1 + ⋅ ⋅ ⋅ +𝑋𝑚

An exact 3CNF clause has three literals, each with a distinct variable, and the probability that an individual literal is satisfied by a random assignment to its variable is 1/2; a literal of the form 𝑥 is satisfied when 𝑥 = 1, which occurs with probability 1/2, and a literal of the form ¬𝑥 is satisfied when 𝑥 = 0, which also occurs with probability 1/2. A clause is unsatisfied when all three of its literals are false. Since the literals have distinct variables, the events that each is false are independent. Thus, we have

Pr[all three literals in clause 𝑖 are false] = ∏𝑗=1..3 Pr[literal 𝑗 in clause 𝑖 is false] = (1/2)³ = 1/8

If even a single literal is true, the clause is satisfied. Thus, the probability that clause 𝑖 is satisfied with a random assignment is

Pr[𝑋𝑖 = 1] = 1 − Pr[all three literals in clause 𝑖 are false] = 1 − 1/8 = 7/8

We then have E[𝑋𝑖] = 7/8, so by linearity of expectation,

E[𝑋] = E[𝑋1] + ⋅ ⋅ ⋅ + E[𝑋𝑚] = (7/8)𝑚

Thus, the expected number of satisfied clauses is (7/8)𝑚. Since the optimal assignment can satisfy at most 𝑚 clauses, the random assignment is a 7/8-approximation, in expectation, of the optimal.
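The randomized algorithm itself is tiny. The following Python sketch (an added illustration, not from the text) uses the same clause encoding assumed earlier: a clause is a list of three nonzero integers, with i standing for 𝑥𝑖 and -i for ¬𝑥𝑖.

import random

def random_max3sat(num_vars, clauses):
    # assign each variable True/False uniformly at random and count satisfied clauses
    alpha = {i: random.choice([False, True]) for i in range(1, num_vars + 1)}
    def satisfied(lit):
        return alpha[abs(lit)] == (lit > 0)
    count = sum(1 for clause in clauses if any(satisfied(l) for l in clause))
    return alpha, count      # in expectation, count is at least 7/8 of the clauses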

We can further conclude from this result that any exact 3CNF formula has an assignment that satisfies at least 7/8 of its clauses. We demonstrate this via contradiction. Suppose that a formula 𝜑 does not have an assignment that satisfies at least 7/8 of its clauses. Let 𝑚 be the number of clauses in 𝜑, and let 𝑋(𝑎) be the number of clauses satisfied by an assignment 𝑎. We showed above that E[𝑋] = (7/8)𝑚. By the alternate formulation of expectation, we have

E[𝑋] = ∑𝑎∈Ω 𝑋(𝑎) ⋅ Pr[𝑎]

where Ω is the set of all possible assignments. By assumption, we have 𝑋(𝑎) < (7/8)𝑚 for all 𝑎 ∈ Ω. This gives us

E[𝑋] < ∑𝑎∈Ω (7/8)𝑚 ⋅ Pr[𝑎] = (7/8)𝑚 ⋅ ∑𝑎∈Ω Pr[𝑎] = (7/8)𝑚

However, this contradicts our result that E[𝑋] = (7/8)𝑚. Thus, our assumption that no assignment exists that satisfies 7/8 of the clauses must be false.

More generally, the reasoning above gives us the following for an arbitrary random variable 𝑋:

Claim 134 Suppose 𝑋 is a random variable over a sample space Ω. Then:

• There is at least one outcome 𝜔 ∈ Ω such that

𝑋(𝜔) ≥ E[𝑋]

• There is at least one outcome 𝜔 ∈ Ω such that

𝑋(𝜔) ≤ E[𝑋]

Applying the above, given 𝑚 objects, if a property holds for each object with probability 𝑝, then there must be a case where it holds for at least 𝑚𝑝 objects simultaneously. This is because the expected number of objects for which the property holds is 𝑚𝑝, and by Claim 134, there must be some case where it holds for at least 𝑚𝑝 objects.

Another question we may have is, what is the probability that a random assignment satisfies at least half the clauses of an exact 3CNF formula? In terms of the random variable 𝑋 , what is Pr[𝑋 ≥ (1/2)𝑚] for a formula with 𝑚 clauses? We can apply Markov’s inequality to place a bound on this probability.

Theorem 135 (Markov’s Inequality) Let 𝑋 be a nonnegative random variable (i.e. 𝑋 never takes on a negative value), and let 𝑣 > 0. Then

Pr[𝑋 ≥ 𝑣] ≤ E[𝑋]/𝑣

Proof 136 We prove Markov’s inequality for a discrete random variable 𝑋 , one that can only take on a countable number of values. By definition of expectation, we have:

E[𝑋] = ∑𝑖 𝑎𝑖 ⋅ Pr[𝑋 = 𝑎𝑖]
     = (∑𝑎𝑖<𝑣 𝑎𝑖 ⋅ Pr[𝑋 = 𝑎𝑖]) + (∑𝑎𝑖≥𝑣 𝑎𝑖 ⋅ Pr[𝑋 = 𝑎𝑖])
     ≥ ∑𝑎𝑖≥𝑣 𝑎𝑖 ⋅ Pr[𝑋 = 𝑎𝑖]
     ≥ ∑𝑎𝑖≥𝑣 𝑣 ⋅ Pr[𝑋 = 𝑎𝑖]
     = 𝑣 ⋅ Pr[𝑋 ≥ 𝑣]

The third step follows from the second because 𝑋 is nonnegative, so 𝑎𝑖 ≥ 0 and therefore the left-hand summation is nonnegative.

Dividing both sides of the result by 𝑣, we get

E[𝑋]/𝑣 ≥ Pr[𝑋 ≥ 𝑣]

In the case of Max-3SAT, 𝑋 is a count of the number of satisfied clauses, so it never takes on a negative value. Thus, we can apply Markov’s inequality with 𝑣 = (1/2)𝑚 to obtain:

Pr[𝑋 ≥ (1/2)𝑚] ≤ ((7/8)𝑚) / ((1/2)𝑚) = 7/4

Unfortunately, this doesn’t tell us anything we didn’t know – we already know by definition of probability that Pr[𝑋 ≥ (1/2)𝑚] ≤ 1. Furthermore, we would like a lower bound on the probability that at least half the clauses are satisfied, not an upper bound. We need another tool to reason about this.

Theorem 137 (Reverse Markov’s Inequality) Let 𝑋 be a random variable that is never larger than some value 𝑏, and let 𝑣 < 𝑏. Then

Pr[𝑋 > 𝑣] ≥ (E[𝑋] − 𝑣) / (𝑏 − 𝑣)

More loosely, we have

Pr[𝑋 ≥ 𝑣] ≥ (E[𝑋] − 𝑣) / (𝑏 − 𝑣)

Proof 138 Define 𝑌 = 𝑏 −𝑋 . Since 𝑋 is never larger than 𝑏, 𝑌 is a nonnegative random variable. We have

Pr[𝑋 ≤ 𝑣] = Pr[𝑏 − 𝑌 ≤ 𝑣] = Pr[𝑌 ≥ 𝑏 − 𝑣]

Applying Markov’s inequality, we get:

Pr[𝑌 ≥ 𝑏 − 𝑣] ≤ E[𝑌 ]/(𝑏 − 𝑣) = (𝑏 − E[𝑋])/(𝑏 − 𝑣)


Then we have

Pr[𝑋 > 𝑣] = 1 − Pr[𝑋 ≤ 𝑣]
          = 1 − Pr[𝑌 ≥ 𝑏 − 𝑣]
          ≥ 1 − (𝑏 − E[𝑋])/(𝑏 − 𝑣)
          = (E[𝑋] − 𝑣)/(𝑏 − 𝑣)

To obtain the looser formulation, we note that

Pr[𝑋 ≥ 𝑣] = Pr[𝑋 > 𝑣] + Pr[𝑋 = 𝑣] ≥ Pr[𝑋 > 𝑣]

Combining this with the previous result, we obtain

Pr[𝑋 ≥ 𝑣] ≥ (E[𝑋] − 𝑣)/(𝑏 − 𝑣)

Observe that for Max-3SAT, we have an upper bound of 𝑚 on the value of 𝑋 , where 𝑚 is the number of clauses. Setting 𝑣 = (1/2)𝑚 < 𝑚, we apply the reverse Markov’s inequality to get

Pr[𝑋 ≥ (1/2)𝑚] ≥ ((7/8)𝑚 − (1/2)𝑚) / (𝑚 − (1/2)𝑚) = (3/8) / (1/2) = 3/4

Thus, the randomized algorithm produces an assignment that satisfies at least half the clauses with probability ≥ 3/4.

Exercise 139 The following is a randomized algorithm for producing a cut in an undirected graph 𝐺:

Algorithm 𝑅𝑎𝑛𝑑𝑜𝑚-𝐶𝑢𝑡(𝐺 = (𝑉,𝐸)) ∶
    𝑆 ∶= ∅
    For each 𝑣 ∈ 𝑉 ∶
        Add 𝑣 to 𝑆 with probability 1/2
    Return 𝑆, 𝑇 = 𝑉 \ 𝑆

Recall that 𝐶(𝑆, 𝑇 ) is the set of crossing edges for the cut 𝑆, 𝑇 , i.e., those that have one endpoint in 𝑆 and the other endpoint in 𝑇 . The size of the cut is ∣𝐶(𝑆, 𝑇 )∣.

a) Prove that the randomized algorithm is a 1/2-approximation in expectation for finding the maximal cut of an undirected graph. That is, the expected size of the returned cut is at least half the size of a maximal cut.

Hint: Use linearity of expectation to show that in expectation, half the edges of the graph are in 𝐶(𝑆, 𝑇 ).

b) Argue that, for any undirected graph 𝐺 = (𝑉,𝐸), there exists a cut of size at least (1/2)∣𝐸∣.


19.3 Primality Testing

A key component of many cryptography (page 238) algorithms is finding large prime numbers, with hundreds or even thousands of digits. A typical approach is to randomly choose a large number and then apply a primality test to determine whether it is prime – the prime number theorem47 states that approximately 1/𝑚 of the 𝑚-bit numbers are prime, so we only expect to check 𝑚 numbers before finding one that is prime48. As long as the primality test itself is efficient with respect to 𝑚, the expected time to find a prime number with 𝑚 bits is polynomial in 𝑚.

Formally, we wish to decide the following language:

PRIMES = {𝑚 ∈ N ∶ 𝑚 is prime}

A positive integer 𝑚 is a member of PRIMES if it has no positive factors other than 1 and 𝑚.

Is the language PRIMES in P? Let us consider the following simple algorithm to decide the language:

Algorithm 𝑃𝑟𝑖𝑚𝑒-𝑇𝑒𝑠𝑡(𝑚) ∶
    For 𝑖 = 2 to 𝑚 − 1 ∶
        If 𝑚 mod 𝑖 ≡ 0, reject
    Accept

This algorithm does O(𝑚) mod operations. Is this efficient? The size of the input 𝑚 is the number of bits required to represent 𝑚, which is O(log𝑚). Thus, the number of operations is O(𝑚) = O(2^(log 𝑚)), which is exponential in the size of 𝑚.

We can improve the algorithm by observing that if 𝑚 has a factor within the interval [2,𝑚 − 1], it must have a factor within [2,√𝑚] – otherwise, we would obtain a product larger than 𝑚 when we multiplied its factors together. Thus, we need only iterate up to √𝑚:

Algorithm 𝑃𝑟𝑖𝑚𝑒-𝑇𝑒𝑠𝑡(𝑚) ∶
    For 𝑖 = 2 to √𝑚 ∶
        If 𝑚 mod 𝑖 ≡ 0, reject
    Accept
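In code, the trial-division test looks like the following Python sketch (an added illustration, not from the text); the guard for 𝑚 < 2 is a convenience not present in the pseudocode.

def prime_test(m):
    # trial division up to the square root of m
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True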

This algorithm does O(√𝑚) mod operations. However, this still is not polynomial; we have:

O(√𝑚) = O(𝑚^(1/2)) = O(2^((1/2) log 𝑚))

To be efficient, a primality-testing algorithm must have runtime O(log^𝑘 𝑚) for some constant 𝑘. Neither of the algorithms above meets this threshold.

In fact, there is a known efficient algorithm for primality testing, the AKS primality test50. Thus, it is indeed the case that PRIMES ∈ P. However, this algorithm is somewhat complicated, and its running time is high enough to preclude it from being used in practice. Instead, we consider a randomized primality test that is efficient and works well in practice for most inputs. The algorithm we construct relies on the extended Fermat’s little theorem.

Theorem 140 (Extended Fermat’s Little Theorem) Let 𝑛 ∈ N be a natural number such that 𝑛 ≥ 2. Let 𝑎 be a witness in the range 1 ≤ 𝑎 ≤ 𝑛 − 1. Then:

• If 𝑛 is prime, then 𝑎^(𝑛−1) ≡ 1 (mod 𝑛) for any witness 𝑎.

• If 𝑛 is composite and 𝑛 is not a Carmichael number, then 𝑎^(𝑛−1) ≡ 1 (mod 𝑛) for at most half the witnesses 1 ≤ 𝑎 ≤ 𝑛 − 1.

47 https://en.wikipedia.org/wiki/Prime_number_theorem
48 This follows from the fact that the expected value of a geometric distribution49 with parameter 𝑝 is 1/𝑝.
49 https://en.wikipedia.org/wiki/Geometric_distribution
50 https://en.wikipedia.org/wiki/AKS_primality_test

We postpone discussion of Carmichael numbers for the moment. Instead, we take a look at some small cases of composite numbers to see that the extended Fermat’s little theorem holds. We first consider 𝑛 = 6. We have:

𝑎 = 1 ∶ 1^5 ≡ 1 (mod 6)
𝑎 = 2 ∶ 2^5 = 32 = 2 + 5 ⋅ 6 ≡ 2 (mod 6)
𝑎 = 3 ∶ 3^5 = 243 = 3 + 40 ⋅ 6 ≡ 3 (mod 6)
𝑎 = 4 ∶ 4^5 = 1024 = 4 + 170 ⋅ 6 ≡ 4 (mod 6)
𝑎 = 5 ∶ 5^5 = 3125 = 5 + 520 ⋅ 6 ≡ 5 (mod 6)

We see that 𝑎^(𝑛−1) ≡ 1 (mod 𝑛) for only the single witness 𝑎 = 1. Similarly, we consider 𝑛 = 9:

𝑎 = 1 ∶ 1^8 ≡ 1 (mod 9)
𝑎 = 2 ∶ 2^8 = 256 = 4 + 28 ⋅ 9 ≡ 4 (mod 9)
𝑎 = 3 ∶ 3^8 = 6561 = 0 + 729 ⋅ 9 ≡ 0 (mod 9)
𝑎 = 4 ∶ 4^8 = 65536 = 7 + 7281 ⋅ 9 ≡ 7 (mod 9)
𝑎 = 5 ∶ 5^8 = 390625 = 7 + 43402 ⋅ 9 ≡ 7 (mod 9)
𝑎 = 6 ∶ 6^8 = 1679616 = 0 + 186624 ⋅ 9 ≡ 0 (mod 9)
𝑎 = 7 ∶ 7^8 = 5764801 = 4 + 640533 ⋅ 9 ≡ 4 (mod 9)
𝑎 = 8 ∶ 8^8 = 16777216 = 1 + 1864135 ⋅ 9 ≡ 1 (mod 9)

Here, there are two witnesses 𝑎 where 𝑎^(𝑛−1) ≡ 1 (mod 𝑛), out of eight total. Finally, we consider 𝑛 = 7:

𝑎 = 1 ∶ 1^6 ≡ 1 (mod 7)
𝑎 = 2 ∶ 2^6 = 64 = 1 + 9 ⋅ 7 ≡ 1 (mod 7)
𝑎 = 3 ∶ 3^6 = 729 = 1 + 104 ⋅ 7 ≡ 1 (mod 7)
𝑎 = 4 ∶ 4^6 = 4096 = 1 + 585 ⋅ 7 ≡ 1 (mod 7)
𝑎 = 5 ∶ 5^6 = 15625 = 1 + 2232 ⋅ 7 ≡ 1 (mod 7)
𝑎 = 6 ∶ 6^6 = 46656 = 1 + 6665 ⋅ 7 ≡ 1 (mod 7)

Since 7 is prime, we have 𝑎^(𝑛−1) ≡ 1 (mod 𝑛) for all witnesses 𝑎.

The extended Fermat’s little theorem leads directly to a simple, efficient randomized algorithm for primality testing. This Fermat primality test is as follows:

Algorithm 𝐹𝑒𝑟𝑚𝑎𝑡-𝑇𝑒𝑠𝑡(𝑚) ∶
    Pick a random witness 1 ≤ 𝑎 ≤ 𝑚 − 1
    If 𝑎^(𝑚−1) mod 𝑚 ≡ 1, accept
    Else reject

The modular exponentiation in this algorithm can be done with O(log𝑚) multiplications using a divide and conquer(page 12) strategy, and each multiplication can be done efficiently. Thus, this algorithm is polynomial with respect tothe size of 𝑚.

As for correctness, we have:

• If m is prime, then the algorithm always accepts m. In other words:

  Pr[the Fermat test accepts m] = 1

• If m is composite and not a Carmichael number, then the algorithm rejects m with probability at least 1/2. In other words:

  Pr[the Fermat test rejects m] ≥ 1/2

Thus, if the algorithm accepts m, we can be fairly confident that m is prime. And as we will see shortly, we can repeat the algorithm to obtain higher confidence that we get the right answer.
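As a concrete sketch, the Fermat test can be written in a few lines of Python (the function name fermat_test and the rounds parameter for repeating the test are our additions); Python's built-in pow(a, m - 1, m) performs the fast modular exponentiation discussed later in this section:

    import random

    def fermat_test(m, rounds=1):
        """Fermat primality test; may accept Carmichael numbers."""
        if m < 2:
            return False
        for _ in range(rounds):
            a = random.randint(1, m - 1)       # random witness
            if pow(a, m - 1, m) != 1:          # fast modular exponentiation
                return False                   # definitely composite
        return True                            # probably prime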

We now return to the problem of Carmichael numbers. A Carmichael number is a composite number n such that a^{n−1} ≡ 1 (mod n) for all witnesses a that are relatively prime to n, i.e. gcd(a, n) = 1. This implies that for a Carmichael number, the Fermat test reports with high probability that the number is prime, despite it being composite. We call a number that passes the Fermat test with high probability a pseudoprime, and the Fermat test is technically a randomized algorithm for deciding the following language:

PSEUDOPRIMES = {m ∈ ℕ : a^{m−1} mod m ≡ 1 for at least half the witnesses 1 ≤ a ≤ m − 1}

Carmichael numbers are much rarer than prime numbers, so for many applications, the Fermat test is sufficient to determine with high confidence that a randomly chosen number is prime. On the other hand, if the number is chosen by an adversary, then the Fermat test is unsuitable, and a more complex randomized algorithm such as the Miller-Rabin primality test must be used instead.

19.3.1 The Miller-Rabin Test

The Miller-Rabin test is designed around the fact that for prime m, the only square roots of 1 modulo m are −1 and 1. More specifically, if x is a square root of 1 modulo m, we have

x^2 ≡ 1 (mod m)

By definition of modular arithmetic (page 239), this means that

x^2 − 1 = km

for some integer k (i.e. m evenly divides the difference of x^2 and 1). We can factor x^2 − 1 to obtain:

(x + 1)(x − 1) = km

When m is composite, so that m = pq for integers p, q > 1, then it is possible for p to divide x + 1 and q to divide x − 1, in which case pq = m divides their product (x + 1)(x − 1). However, when m is prime, the only way for the equation to hold is if m divides either x + 1 or x − 1; otherwise, the prime factorization of (x + 1)(x − 1) does not contain m, and by the fundamental theorem of arithmetic^51, it cannot be equal to a number whose prime factorization contains m^52. Thus, we have either x + 1 = am for some integer a, or x − 1 = bm for some b ∈ ℤ. The former gives us x ≡ −1 (mod m), while the latter results in x ≡ 1 (mod m). Thus, if m is prime and x^2 ≡ 1 (mod m), then either x ≡ 1 (mod m) or x ≡ −1 (mod m).

The Miller-Rabin test starts with the Fermat test: choose a witness 1 ≤ a ≤ m − 1 and check whether a^{m−1} ≡ 1 (mod m). If this does not hold, then m fails the Fermat test and therefore is not prime. If it does indeed hold, then the test checks the square root a^{(m−1)/2} of a^{m−1} to see if it is 1 or −1. If it is 1, the test checks the square root a^{(m−1)/4} of a^{(m−1)/2} to see if it is 1 or −1, and so on. The termination conditions are as follows:

• The test finds a square root of 1 that is not −1 or 1. By the reasoning above, m must be composite.

• The test finds a square root of 1 that is −1. The reasoning above only holds for square roots of 1, so the test cannot continue by computing the square root of −1. In this case, m may be prime or composite.

• The test reaches some r for which (1/2^r)·(m − 1) is no longer an integer. Then it cannot compute a to this power modulo m. In this case, m may be prime or composite.

51 https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic
52 Euclid's lemma^53 more directly states that if m is prime and m divides a product ab, then m must divide either a or b.
53 https://en.wikipedia.org/wiki/Euclid%27s_lemma

To compute these square roots, we first extract powers of 2 from the number m − 1 (which is even for odd m, the cases of interest):

Algorithm ExtractPowersOf2(x):
    If x mod 2 = 1, return (x, 0)
    (d, s′) := ExtractPowersOf2(x/2)
    Return (d, s′ + 1)

Then ExtractPowersOf2(m − 1) computes d and s such that

m − 1 = 2^s · d

where gcd(d, 2) = 1. Then a^{2^{s−1}d} is the square root of a^{2^s d}.

The full Miller-Rabin algorithm is as follows:

Algorithm Miller-Rabin(m):
    Pick a random witness 1 ≤ a ≤ m − 1
    Compute s, d such that m − 1 = 2^s · d and gcd(d, 2) = 1
    Compute a^d, a^{2d}, a^{4d}, . . . , a^{2^{s−1}d}, a^{2^s d}
    If a^{2^s d} ≢ 1 (mod m), reject   (Fermat test)
    For t = s − 1, s − 2, . . . , 0:
        If a^{2^t d} ≡ −1 (mod m), accept
        Else If a^{2^t d} ≢ 1 (mod m), reject
    Accept

This algorithm is efficient: a^d can be computed using O(log m) multiplications, and it takes another O(log m) multiplications to compute a^{2d}, a^{4d}, . . . , a^{2^s d} since s = O(log m). Each multiplication is efficient as it is done modulo m, so the entire algorithm takes polynomial time in the size of m.

As for correctness, we have:

• If m is prime, then the algorithm accepts m with probability 1.

• If m is composite, then the algorithm rejects m with probability at least 3/4.

The latter is due to the fact that when m is composite, m passes the Miller-Rabin test for at most 1/4 of the witnesses 1 ≤ a ≤ m − 1. Unlike the Fermat test, there are no exceptions to this, making the Miller-Rabin test a better choice in many applications.
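The following Python sketch performs one round of the test above (the function name miller_rabin is ours; it assumes an odd input m ≥ 3, and it squares upward from a^d rather than scanning downward, which checks the same sequence of values):

    import random

    def miller_rabin(m):
        """One round of Miller-Rabin for an odd integer m >= 3."""
        # Write m - 1 = 2^s * d with d odd.
        d, s = m - 1, 0
        while d % 2 == 0:
            d //= 2
            s += 1
        a = random.randint(1, m - 1)        # random witness
        x = pow(a, d, m)                    # a^d mod m
        if x == 1 or x == m - 1:
            return True                     # passes this round
        for _ in range(s - 1):
            x = (x * x) % m                 # square up toward a^(m-1)
            if x == m - 1:
                return True
            if x == 1:
                return False                # nontrivial square root of 1 found
        return False                        # fails the Fermat test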

Fast Modular Exponentiation

There are several ways to compute a modular exponentiation a^b (mod n) efficiently, and we take a look at two here.


The first is to apply a top-down, divide-and-conquer strategy. We have:

a^b = 1                      if b = 0
      (a^{b/2})^2            if b > 0 is even
      a · (a^{(b−1)/2})^2    if b > 0 is odd

This leads to the following algorithm:

Algorithm Power(a, b, n):
    If b = 0, return 1
    m := Power(a, ⌊b/2⌋, n)
    m := m · m mod n
    If b mod 2 ≡ 1:
        m := a · m mod n
    Return m

This gives us the following recurrence for the number of multiplications and modulo operations:

𝑇 (𝑏) = 𝑇 (𝑏/2) +O(1)

Applying the Master theorem (page 13), we get 𝑇 (𝑏) = O(log 𝑏).

Furthermore, the numbers in this algorithm are always computed modulo n, so they are at most as large as n. Thus, each multiplication and modulo operation can be done efficiently with respect to O(log n), and the algorithm as a whole is efficient.
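A minimal Python sketch of this top-down strategy (the name power_mod is ours); it computes the same result as Python's built-in pow(a, b, n):

    def power_mod(a, b, n):
        """Compute a^b mod n with O(log b) multiplications (top-down)."""
        if b == 0:
            return 1 % n
        m = power_mod(a, b // 2, n)
        m = (m * m) % n          # square the half-power
        if b % 2 == 1:
            m = (a * m) % n      # one extra factor of a for odd b
        return m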

An alternative method is to apply a bottom-up strategy. Here, we make use of the binary representation of b,

b = b_r·2^r + b_{r−1}·2^{r−1} + ··· + b_0·2^0

where b_i is either 0 or 1. Then

a^b = a^{b_r·2^r + b_{r−1}·2^{r−1} + ··· + b_0·2^0} = a^{b_r·2^r} × a^{b_{r−1}·2^{r−1}} × ··· × a^{b_0·2^0}

Thus, we can compute a^{2^i} for each 0 ≤ i ≤ r, where r = ⌊log b⌋. We do so as follows:

Algorithm BottomUpPower(a, b, n):
    powers := [] (an empty associative container)
    powers[0] := a
    For i = 1 to ⌊log b⌋:
        powers[i] := powers[i − 1] · powers[i − 1] mod n
    product := 1
    For i = 0 to ⌊log b⌋:
        If ⌊b/2^i⌋ mod 2 ≡ 1:
            product := product · powers[i] mod n
    Return product

The operation ⌊b/2^i⌋ is a right shift on the binary representation of b, and it can be done in linear time. As with the previous algorithm, we perform O(log b) multiplication and modulo operations, each on numbers that are O(log n) in size. Thus, the runtime is efficient in the size of the input.
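A Python sketch of the bottom-up strategy (the name bottom_up_power is ours); a list serves as the associative container, since the indices are the consecutive integers 0..⌊log b⌋:

    def bottom_up_power(a, b, n):
        """Compute a^b mod n by combining the powers a^(2^i) mod n."""
        if b == 0:
            return 1 % n
        # powers[i] holds a^(2^i) mod n
        powers = [a % n]
        while (1 << len(powers)) <= b:
            powers.append((powers[-1] * powers[-1]) % n)
        product = 1
        for i, p in enumerate(powers):
            if (b >> i) & 1:          # i-th bit of b is set
                product = (product * p) % n
        return product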


Exercise 141 Polynomial identity testing is the problem of determining whether two polynomials p(x) and q(x) are the same, or equivalently, whether p(x) − q(x) is the zero polynomial.

a) Let d be an upper bound on the degree of p(x) and q(x). Show that for a randomly chosen integer a ∈ [1, 4d], if p(x) ≠ q(x), then Pr[p(a) ≠ q(a)] ≥ 3/4.

Hint: A nonzero polynomial r(x) of degree at most d can have at most d roots s_i such that r(s_i) = 0.

b) Devise an efficient, randomized algorithm A to determine whether p(x) and q(x) are the same, with the behavior that:

• if p(x) = q(x), then

  Pr[A accepts p(x), q(x)] = 1

• if p(x) ≠ q(x), then

  Pr[A rejects p(x), q(x)] ≥ 3/4

Assume that a polynomial can be efficiently evaluated on a single input.

19.4 Skip Lists

Randomness can also be utilized to construct simple data structures that provide excellent expected performance. As an example, we consider a set abstract data type, which is a container that supports inserting, deleting, and searching for elements of some mutually comparable type. There are many ways to implement a set abstraction, using a variety of underlying data structures including linked lists, arrays, and binary search trees. However, simple, deterministic implementations suffer a linear runtime in the worst case for some operations. For example, if an adversary inserts the items 1, 2, . . . , n into a non-balancing binary search tree, the insertions take O(n) time each, for a total time of O(n^2). The usual way to avoid this is to perform complicated rebalancing operations, and there are a variety of schemes to do so (e.g. AVL trees^54, red-black trees^55, scapegoat trees^56, and so on). Can we use randomness instead to design a simple data structure that provides expected O(log n) runtime for all operations on a set of size n?

One solution is a skip list, which is essentially a linked list with multiple levels, where each level contains about half the elements of the level below (ratios other than 1/2 may be used instead, trading off the space and time overheads). Whether or not an element is duplicated to the next level is a random decision, which prevents an adversary from coming up with a sequence of operations that degrades performance. The following is an illustration of a skip list:

[Figure: a four-level skip list over the elements -7, -3, 1, 4, 9, 13, 17, 22; every level begins with a sentinel head node, the bottom level contains all the elements, and each higher level contains a subset of the level below.]

In the scheme illustrated above, each level has a sentinel head and is terminated by a null pointer. An element may appear at multiple levels, and the list may be traversed by following links rightward or by moving downward at a duplicated element.

54 https://en.wikipedia.org/wiki/AVL_tree
55 https://en.wikipedia.org/wiki/Red%E2%80%93black_tree
56 https://en.wikipedia.org/wiki/Scapegoat_tree


To search for an item, we start at the sentinel node at the top left. We maintain the invariant that we are always located at a node whose value is less than the item we are looking for (or is a sentinel head node). Then in each step, we check the node to the right. If its value is greater than the item we are looking for, or if the right pointer is null, we move downward; however, if we are on the lowest level, we cannot move downward, so the item is not in the list. If the value of the node to the right is less than the item, we move to the right. If the value is equal to the item, we have found the item and terminate the search. The following illustrates a search for the item 0 in the list above:

[Figure: the same skip list, highlighting the path taken when searching for the item 0.]

The search terminates in failure, since we reach the value -3 in the bottom level with the value 1 to the right. The following illustrates searching for the item 9:

[Figure: the same skip list, highlighting the path taken when searching for the item 9.]

To insert an item, we first do a search for the item. If the item is in the list, then no further work is necessary. If it is not, the search terminates in the bottom level at the maximal element that is less than the item. Therefore, we insert the item immediately to the right. We then make a random decision of whether or not to duplicate the element to the next level. If the answer is yes, we make another random decision for the next level, and so on until the answer is no. If the probability of duplicating an item is 1/2, the expected number of copies of an item is 2. Thus, insertion only takes an expected constant amount of time in addition to the search.
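To make the structure concrete, the following is a compact Python sketch of a skip list supporting search and insertion (the class and method names are ours, and deletion is omitted); it uses the standard equivalent formulation in which each node stores one value and an array of forward pointers, one per level on which it appears:

    import random

    class SkipNode:
        def __init__(self, value, height):
            self.value = value
            self.next = [None] * height      # one forward pointer per level

    class SkipList:
        def __init__(self, p=0.5, max_height=32):
            self.p = p                                # probability of promoting a node
            self.head = SkipNode(None, max_height)    # sentinel head
            self.height = 1                           # levels currently in use

        def _random_height(self):
            h = 1
            while random.random() < self.p and h < len(self.head.next):
                h += 1
            return h

        def search(self, value):
            node = self.head
            for level in range(self.height - 1, -1, -1):   # top level downward
                while node.next[level] is not None and node.next[level].value < value:
                    node = node.next[level]
            node = node.next[0]
            return node is not None and node.value == value

        def insert(self, value):
            # Find the rightmost node with a value less than `value` on every level.
            update = [self.head] * len(self.head.next)
            node = self.head
            for level in range(self.height - 1, -1, -1):
                while node.next[level] is not None and node.next[level].value < value:
                    node = node.next[level]
                update[level] = node
            if node.next[0] is not None and node.next[0].value == value:
                return                        # already present
            h = self._random_height()
            self.height = max(self.height, h)
            new = SkipNode(value, h)
            for level in range(h):            # splice the new node into each level
                new.next[level] = update[level].next[level]
                update[level].next[level] = new

For example, after inserting -7, -3, 1, 4, 9, 13, 17, and 22, search(9) returns True and search(0) returns False, matching the two searches illustrated above.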

We proceed to determine the expected time of a search on a skip list. We start by determining the expected number of elements on each level i. Suppose the skip list contains n elements total. Let n_i be the number of elements on level i. We have n_1 = n, since the first level contains all elements. For i > 1, define X_ij to be an indicator random variable for whether or not the jth element is replicated to level i. We have

Pr[X_ij = 1] = 1/2^{i−1}

since it takes i − 1 "yes" decisions for the element to make it to level i, and each decision is made independently with probability 1/2. Thus, E[X_ij] = Pr[X_ij = 1] = 1/2^{i−1}.

The total number of elements n_i on level i is just the sum of the indicators X_ij. This gives us:

E[n_i] = E[∑_j X_ij]
       = ∑_j E[X_ij]
       = ∑_j 1/2^{i−1}
       = n/2^{i−1}


Next, we consider how many levels we expect the list to contain. Suppose we only maintain levels that have at least one element. From our previous result, the expected number of elements is 1 for level i when

n/2^{i−1} = 1
n = 2^{i−1}
log n = i − 1
i = log n + 1

Intuitively, this implies we should expect somewhere around log n levels. To reason about the expected number more formally, consider the number of elements on some level i = log n + k. By Markov's inequality, we have

Pr[n_{log n + k} ≥ 1] ≤ E[n_{log n + k}]/1
                      = n/2^{log n + k − 1}
                      = n/(n · 2^{k−1})
                      = 1/2^{k−1}

Then let Z_k be an indicator for whether n_{log n + k} ≥ 1. The number of levels L is at most

L ≤ log n + ∑_{k=1}^{∞} Z_k

We have

E[Z_k] = Pr[Z_k = 1] = Pr[n_{log n + k} ≥ 1] ≤ 1/2^{k−1}

Then by linearity of expectation,

E[L] ≤ E[log n + ∑_{k=1}^{∞} Z_k]
     = log n + ∑_{k=1}^{∞} E[Z_k]
     ≤ log n + ∑_{k=1}^{∞} 1/2^{k−1}
     = log n + 2

Thus, the expected number of levels is O(log 𝑛).

Finally, we reason about how many nodes a search encounters on each level i. Suppose we are looking for some element y. Let w be the maximal element on level i that is less than y (or the sentinel head if no such element exists), and let z be the minimal element on level i that is greater than or equal to y (or the null sentinel if no such element exists). Let v_1, v_2, . . . be the elements on level i that are less than w, with v_1 the largest, v_2 the next largest, and so on.

[Figure: level i of the skip list, containing . . . , v_3, v_2, v_1, w, z, . . . , and level i − 1, in which y lies between w and z.]

The search encounters the nodes corresponding to w and z. It does not touch nodes on level i that come after z – either y = z, in which case the search terminates, or z > y (or z is null), in which case the search moves down to level i − 1. Thus, we need only consider whether the search encounters the nodes corresponding to v_1, v_2, . . ..

We observe that each element on level 𝑖 is replicated on level 𝑖 + 1 with probability 1/2.

[Figure: levels i − 1, i, and i + 1, with each of v_3, v_2, v_1, w on level i replicated to level i + 1 independently with probability 1/2.]

There are two possible routes the search can take to get to 𝑤 on level 𝑖:

• The search can come down from level i + 1 to level i at w. This happens exactly when w is replicated on level i + 1. The search process proceeds along the same level until it reaches a node that is greater than the target value y, and since w < y, it would not come down to level i before w if w appears on level i + 1. Thus, this possibility occurs with probability 1/2.

• The search can come from the node corresponding to v_1 on level i. This occurs when w is not replicated on level i + 1. In that case, the search must come down to level i before w, in which case it goes through v_1. This possibility also occurs with probability 1/2.

These two possibilities are illustrated below:

[Figure: the routes by which the search can enter level i, annotated with the probabilities 1/2, 1/4, and 1/8.]

Thus, v_1 is encountered on level i exactly when w is not replicated to level i + 1, which occurs with probability 1/2. We can inductively repeat the same reasoning for preceding elements. The search only encounters v_2 on level i if neither w nor v_1 are replicated on level i + 1. Since they are replicated independently with probability 1/2, the probability that neither is replicated is 1/4. In general, the search encounters v_j on level i exactly when none of w, v_1, v_2, . . . , v_{j−1} are replicated to level i + 1, which happens with probability 1/2^j.

We can now compute the expected number of nodes encountered on level i. Let Y_i be this number, and let V_ij be an indicator for whether or not element v_j is encountered. Then we have

Y_i = 2 + ∑_j V_ij

The first term 2 is due to the fact that the search encounters the nodes corresponding to w and z. From our reasoning above, we have

E[V_ij] = Pr[V_ij = 1] = 1/2^j


Then by linearity of expectation,

E[Y_i] = E[2 + ∑_j V_ij]
       = 2 + ∑_j E[V_ij]
       = 2 + ∑_j 1/2^j
       < 2 + ∑_{k=1}^{∞} 1/2^k
       = 3

The second-to-last step follows from the fact that ∑_j 1/2^j has finitely many terms since there are only finitely many elements less than y. Thus, the expected number of nodes encountered on a level i is O(1).

Combining this with our previous result that the expected number of levels is O(log n), the expected total number of nodes encountered across all levels is O(log n). Thus, a search on a skip list does an expected O(log n) comparisons. Since insertion does only a constant number of operations in addition to the search, it too does O(log n) operations, and the same reasoning holds for deletion as well. We achieve this expected runtime with a much simpler implementation than a self-balancing search tree.

19.5 Quick Sort

Another example of taking advantage of randomness in algorithm design is quick sort. The core algorithm requires selecting a pivot element from a sequence, and then partitioning the sequence into those elements that are less than the pivot and those that are greater than (or equal to) the pivot. The two partitions are recursively sorted. The following is an illustration of how this algorithm works, with the recursive steps elided:

[Figure: the array 99 6 15 70 52 37 86 4 0 is partitioned around the pivot 37 into 6 15 4 0 and 99 70 52 86; recursively sorting the partitions yields 0 4 6 15 37 52 70 86 99.]

In analyzing the runtime, we focus on the number of comparisons between elements – they dominate the runtime, and other aspects of the algorithm are dependent on the choice of data structure. Partitioning an array of n elements requires comparing each element to the pivot, for a total of n − 1 = O(n) comparisons in each recursive step. If the two partitions corresponding to the elements less than and greater than the pivot are approximately equal in size, the recurrence for the number of comparisons is:

T(n) = 2T(n/2) + O(n)

This is the same recurrence as merge sort (page 12), and the algorithm achieves a runtime complexity of O(n log n) comparisons for a sequence of n elements. One way to accomplish this balanced partitioning is to find the median element – this can be done in linear time^57. However, a simpler solution is to just pick a random element as the pivot. We proceed to analyze this version of the algorithm, which is as follows:

57 https://en.wikipedia.org/wiki/Median_of_medians


Algorithm 142 (Quick Sort)

Algorithm QuickSort(A[1..n] : array of n integers):
    If n ≤ 1, return A
    p := index chosen uniformly at random from 1, . . . , n
    L, R := partition(A, p)
    Return QuickSort(L) + A[p] + QuickSort(R)

Subroutine partition(A[1..n], p):
    L := empty array
    R := empty array
    For i = 1 to n:
        If i ≠ p and A[i] < A[p]:
            L := L + A[i]
        Else If i ≠ p:
            R := R + A[i]
    Return L, R
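A direct Python rendering of Algorithm 142 (the function name quick_sort is ours; it sorts out of place, for clarity, whereas practical implementations usually partition in place):

    import random

    def quick_sort(a):
        """Randomized quick sort; returns a new sorted list."""
        if len(a) <= 1:
            return a
        p = random.randrange(len(a))              # uniformly random pivot index
        pivot = a[p]
        left = [x for i, x in enumerate(a) if i != p and x < pivot]
        right = [x for i, x in enumerate(a) if i != p and x >= pivot]
        return quick_sort(left) + [pivot] + quick_sort(right)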

Let X be a random variable corresponding to the rank of the chosen pivot – the pivot has rank i if it is the ith largest element. The algorithm still works well when n/6 ≤ X ≤ 5n/6. By examining the probability distribution of X, we observe that:

E[X] = n/2

Pr[n/6 ≤ X ≤ 5n/6] ≈ 2/3

This follows from the fact that the element is chosen uniformly at random – half the outcomes have rank above n/2 and half below, and two-thirds have rank between n/6 and 5n/6. Thus, we obtain a good pivot both with high probability and in expectation. Over the full course of the algorithm, most of the pivots chosen are good, and the expected runtime, in terms of number of comparisons, is still O(n log n).

For a more detailed analysis of the algorithm, we first modify it so that all the randomness is fixed at the beginning of the algorithm. Given n elements, we start by choosing a permutation of the numbers 1, 2, . . . , n uniformly at random to be the priorities of the elements. Then when choosing a pivot for a subset of the elements, we pick the element that has the lowest priority over that subset. The following illustrates an execution of this modified algorithm:


[Figure: an execution of the modified algorithm on the items 99 6 15 70 52 37 86 4 0 with priorities 7 8 3 9 4 1 2 5 6, showing the recursive partitions and the pivot chosen at each step.]

In the first step, the element 37 has the lowest priority, so it is chosen as the pivot. At the second level, the element 15 has the minimal priority in the first recursive call, while the element 86 has the lowest priority in the second call. At the third level of the recursion, the element 4 has lowest priority over its subset, and similarly the element 52 for the subset 70, 52.

The full definition of the modified algorithm is as follows:


Algorithm 143 (Modified Quick Sort)

Algorithm ModifiedQuickSort(A[1..n] : array of n integers):
    P := a permutation of 1, . . . , n, chosen uniformly at random
    Return prioritizedSort(A, P)

Subroutine prioritizedSort(A[1..n], P[1..n]):
    If n ≤ 1, return A
    p := index of the minimal element in P
    L, P_L, R, P_R := partition(A, P, p)
    Return prioritizedSort(L, P_L) + A[p] + prioritizedSort(R, P_R)

Subroutine partition(A[1..n], P[1..n], p):
    L := empty array
    P_L := empty array
    R := empty array
    P_R := empty array
    For i = 1 to n:
        If i ≠ p and A[i] < A[p]:
            L := L + A[i]
            P_L := P_L + P[i]
        Else If i ≠ p:
            R := R + A[i]
            P_R := P_R + P[i]
    Return L, P_L, R, P_R

This modified algorithm matches the behavior of the original randomized quick sort in two important ways:

1. In both algorithms, when a pivot is chosen among a subset consisting of m elements, each element is chosen with a uniform probability 1/m. This is explicitly the case in the original version. In the modified version, the priorities assigned to each of the m elements are unique, and for any set of m priorities, each element receives the lowest priority in exactly 1/m of the permutations of those priorities.

2. The two algorithms compare elements in exactly the same situation – when partitioning the array, each element is compared to the pivot. (The modified version also compares priorities to find the minimal, but we can just ignore that in our analysis. Even if we did consider them, they just increase the number of comparisons by a constant factor of two.)

Thus, the modified algorithm models the execution of the original algorithm in both the evolution of the recursion as well as the element comparisons. We proceed to analyze the modified algorithm, and the result will equally apply to the original.

For simplicity, we assume for the purposes of analysis that the initial array does not contain any duplicate elements. We have defined our two versions of quick sort to be stable, meaning that they preserve the relative ordering of duplicate elements. Thus, even when duplicates are present, the algorithm distinguishes between them, resulting in the same behavior as if there were no duplicates.

Claim 144 Let A be an array of n elements, and let S be the result of sorting A. Let priority(S[i]) be the priority that was assigned to the element S[i] in the modified quick-sort algorithm. Then S[i] and S[j] were compared by the algorithm if and only if i ≠ j, and one of the following holds:

priority(S[i]) = min_{i≤k≤j} priority(S[k]), or

priority(S[j]) = min_{i≤k≤j} priority(S[k])

In other words, S[i] and S[j] are compared exactly when the lowest priority among the elements S[i], S[i + 1], . . . , S[j − 1], S[j] is that of either S[i] or S[j].

The claim above holds regardless of the original order of the elements in A. For example, take a look at the result from our illustration of the modified algorithm above:

[Figure: the sorted result 0 4 6 15 37 52 70 86 99 with priorities 6 5 8 3 1 4 9 2 7, annotating which pairs of elements were compared and which were not.]

As stated above, two elements are compared only in the partitioning step, and one of the two must be the pivot. Referring to the execution above, we can see that 0 and 15 were compared in the second level of the recursion, whereas 52 and 99 were never compared to each other. Now consider a different initial ordering of the elements, but with the same priorities assigned to each element:

[Figure: an execution on the initial ordering 52 37 99 6 4 86 70 15 0 with the same priorities 4 1 7 8 5 2 9 3 6 assigned to the corresponding elements; the same pivots are chosen and the same partitions are formed at each step.]

The execution of the algorithm is essentially the same! It still picks 37 as the first pivot, since it has the lowest priority, and forms the same partitions; the only possible difference is the ordering of a partition. Subsequent steps again pick the same pivots, since they are finding the minimal priority over the same subset of elements and associated priorities. Thus, given just the sorted set of elements and their priorities, we can determine which elements were compared as stated in Claim 144, regardless of how they were initially ordered. We proceed to prove Claim 144:

Proof 145 First, we show that if the lowest priority among the elements S[i], S[i + 1], . . . , S[j − 1], S[j] is that of either S[i] or S[j], then S[i] and S[j] are compared. Without loss of generality, suppose S[i] is the element with the lowest priority in this subset. This implies that no element in S[i + 1], . . . , S[j] will be picked as a pivot prior to S[i] being picked. Thus, all pivots chosen by the algorithm before S[i] must be either less than S[i] or greater than S[j]. For each such pivot, the elements S[i], S[i + 1], . . . , S[j] are all placed in the same partition – they are in sorted order, and if the pivot is less than S[i], all of these elements are larger than the pivot, while if the pivot is greater than S[j], then all these elements are smaller than the pivot. Then when the algorithm chooses S[i] as the pivot, S[j] is in the same subset of the array as S[i] and thus will be compared to S[i] in the partitioning step. The same reasoning holds for when S[j] has the lowest priority – it will eventually be chosen as the pivot, at which point S[i] will be in the same subset and will be compared to S[j].

Next, we show that if the lowest priority among the elements S[i], S[i + 1], . . . , S[j − 1], S[j] is neither that of S[i] nor of S[j], then S[i] and S[j] are never compared. As long as the algorithm chooses pivots outside of the set S[i], S[i + 1], . . . , S[j], the two elements S[i] and S[j] remain in the same subset of the array without having been compared to each other – they remain in the same subset since the pivots so far have been either less than both S[i] and S[j] or greater than both of them, and they have not been compared yet since neither has been chosen as a pivot. When the algorithm first chooses a pivot from among the elements S[i], S[i + 1], . . . , S[j − 1], S[j], the pivot it chooses must be one of S[i + 1], . . . , S[j − 1], since one of those elements has lower priority than both S[i] and S[j]. In the partitioning step for that pivot, S[i] and S[j] are placed into separate partitions, since S[i] is smaller than the pivot while S[j] is larger than the pivot. Since S[i] and S[j] end up in separate partitions, they cannot be compared in subsequent steps of the algorithm.

We can now proceed to compute the expected number of comparisons. Let X_ij be an indicator random variable such that X_ij = 1 if S[i] and S[j] were compared at some point in the algorithm, and X_ij = 0 otherwise. From Claim 144, we can conclude that

Pr[X_ij = 1] = 2/(j − i + 1)

This is because each of the j − i + 1 elements in S[i], S[i + 1], . . . , S[j − 1], S[j] has an equal chance of having the lowest priority among those elements, and S[i] and S[j] are compared when one of those two elements has the lowest priority among this set.

We also observe that a pair of elements is compared at most once – they are compared once in the partitioning step if they are in the same subarray and one of the two is chosen as a pivot, and after the partitioning step is over, the pivot is never compared to anything else. Thus, the total number of comparisons X is just the number of pairs that are compared, out of the n(n − 1)/2 distinct pairs:

X = ∑_{i=1}^{n} ∑_{j=i+1}^{n} X_ij

We have

E[X_ij] = Pr[X_ij = 1] = 2/(j − i + 1)

and by linearity of expectation, we get

E[X] = ∑_{i=1}^{n} ∑_{j=i+1}^{n} E[X_ij]
     = ∑_{i=1}^{n} ∑_{j=i+1}^{n} 2/(j − i + 1)


Let t = j − i + 1. We can then rewrite this sum as

E[X] = ∑_{i=1}^{n} ∑_{j=i+1}^{n} 2/(j − i + 1)
     = ∑_{i=1}^{n} ∑_{t=2}^{n−i+1} 2/t
     = 2 · ∑_{i=1}^{n} ∑_{t=2}^{n−i+1} 1/t

The quantity ∑_{t=2}^{n−i+1} 1/t is a partial sum of the harmonic series^58, which has an upper bound

∑_{m=1}^{k} 1/m ≤ ln k + 1

Thus, we have

∑_{t=2}^{n−i+1} 1/t = −1 + ∑_{t=1}^{n−i+1} 1/t
                    ≤ −1 + ln(n − i + 1) + 1
                    = ln(n − i + 1)
                    ≤ ln n    (for i ≥ 1)

Plugging this into our expression for E[X], we get

E[X] = 2 · ∑_{i=1}^{n} ∑_{t=2}^{n−i+1} 1/t
     ≤ 2 · ∑_{i=1}^{n} ln n
     = 2n ln n
     = O(n log n)

Thus, the expected number of comparisons is O(n log n), as previously claimed. Since the modified quick-sort algorithm models the behavior of the original randomized algorithm, this result also applies to the latter.

In practice, randomized quick sort is faster than the deterministic version as well as other deterministic O(n log n) sorts such as merge sort (page 12). It is also much simpler to implement in place (i.e. without using an auxiliary data structure). Thus, variants of randomized quick sort are widely used in practice.

Exercise 146 Randomized quick sort has worst-case running time Θ(n^2) but expected running time O(n log n) when run on any array of n elements.

Prove that for an array A of n elements and any constant c > 0,

lim_{n→∞} Pr[randomized quick sort on input A takes at least cn^2 steps] = 0

That is, randomized quick sort is very unlikely to run in Ω(n^2) time.

Hint: The number of steps a randomized algorithm takes is a random variable. You do not need to know anything about how randomized quick sort works; just use the facts above and Markov's inequality.

58 https://en.wikipedia.org/wiki/Harmonic_series_(mathematics)


CHAPTER TWENTY

PROBABILISTIC COMPLEXITY CLASSES

The Fermat primality test (page 197) is an example of a one-sided-error randomized algorithm – an input that is pseudoprime is always accepted, while a non-pseudoprime is sometimes accepted and sometimes rejected. We can define complexity classes corresponding to decision problems that have efficient one-sided-error randomized algorithms.

Definition 147 (RP) RP is the class of languages that have efficient one-sided-error randomized algorithms with no false positives. A language L is in RP if there is an efficient randomized algorithm A such that:

• if x ∈ L, Pr[A accepts x] ≥ c

• if x ∉ L, Pr[A rejects x] = 1

Here, c must be a constant greater than 0. Often, c is chosen to be 1/2.

Definition 148 (coRP) coRP is the class of languages that have efficient one-sided-error randomized algorithms with no false negatives. A language L is in coRP if there is an efficient randomized algorithm A such that:

• if x ∈ L, Pr[A accepts x] = 1

• if x ∉ L, Pr[A rejects x] ≥ c

Here, c must be a constant greater than 0. Often, c is chosen to be 1/2.

RP stands for randomized polynomial time. If a language L is in RP, then its complement language is in coRP – an algorithm for L with no false positives can be converted into an algorithm for the complement with no false negatives. The Fermat test produces no false negatives, so PSEUDOPRIMES ∈ coRP. Thus, the complement of the language

PSEUDOPRIMES = {m ∈ ℕ : a^{m−1} mod m ≡ 1 for at least half the witnesses 1 ≤ a ≤ m − 1}

is in RP.

The constant c in the definition of RP and coRP is arbitrary. With any c > 0, we can amplify the probability of success by repeatedly running the algorithm. For instance, if we have a randomized algorithm A with no false positives for a language L, we have:

x ∈ L ⟹ Pr[A accepts x] ≥ c
x ∉ L ⟹ Pr[A accepts x] = 0

We can construct a new algorithm B as follows:

Algorithm B(x):
    Run A on x twice
    If A accepts at least once, accept
    Else reject


B just runs A twice on the input, accepting if at least one run of A accepts; B only rejects if both runs of A reject. Thus, the probability that B rejects x is:

Pr[B rejects x] = Pr[A rejects x in run 1, A rejects x in run 2]
                = Pr[A rejects x in run 1] · Pr[A rejects x in run 2]
                = Pr[A rejects x]^2

Here, we have used the fact that the two runs of A are independent, so the probability of A rejecting twice is the product of the probabilities it rejects each time. This gives us:

x ∈ L ⟹ Pr[B rejects x] ≤ (1 − c)^2
x ∉ L ⟹ Pr[B rejects x] = 1

or equivalently:

x ∈ L ⟹ Pr[B accepts x] ≥ 1 − (1 − c)^2
x ∉ L ⟹ Pr[B accepts x] = 0

Repeating this reasoning, if we modify B to run A k times, we get:

x ∈ L ⟹ Pr[B accepts x] ≥ 1 − (1 − c)^k
x ∉ L ⟹ Pr[B accepts x] = 0

By applying a form of Bernoulli's inequality^59, (1 + x)^r ≤ e^{rx}, we get:

x ∈ L ⟹ Pr[B accepts x] ≥ 1 − e^{−ck}

If we want a particular lower bound 1 − ε for acceptance of true positives, we need

e^{−ck} ≤ ε
−ck ≤ ln ε
k ≥ −ln ε / c = ln(1/ε)/c

This is a constant for any constants c and ε. Thus, we can amplify the probability of accepting a true positive from c to 1 − ε for any ε with a constant ln(1/ε)/c number of repetitions. The same reasoning can be applied to amplify one-sided-error algorithms with no false negatives.
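As a quick illustration of this calculation (a small snippet of ours, not from the text), the number of repetitions needed for c = 1/2 and ε = 10^{-9} is:

    import math

    def repetitions_needed(c, eps):
        """Runs needed to amplify success probability from c to at least 1 - eps."""
        return math.ceil(math.log(1 / eps) / c)

    print(repetitions_needed(0.5, 1e-9))   # 42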

One-sided-error randomized algorithms are a special case of two-sided-error randomized algorithms. Such an algorithm for a language L can produce the wrong result for an input x whether or not x ∈ L. However, the probability of getting the wrong result is bounded by a constant.

Definition 149 (BPP) BPP is the class of languages that have efficient two-sided-error randomized algorithms. A language L is in BPP if there is an efficient randomized algorithm A such that:

• if x ∈ L, Pr[A accepts x] ≥ c

• if x ∉ L, Pr[A rejects x] ≥ c

Here, c must be a constant greater than 1/2, so that the algorithm produces the correct answer the majority of the time. Often, c is chosen to be 2/3 or 3/4.

BPP stands for bounded-error probabilistic polynomial time. Given the symmetric definition of a two-sided-error algorithm, it is clear that the class BPP is closed under complement – if a language L ∈ BPP, then its complement is in BPP as well.

Languages in RP and in coRP both trivially satisfy the conditions for BPP; for instance, we have the following for RP:

• if x ∈ L, Pr[A accepts x] ≥ c

• if x ∉ L, Pr[A rejects x] = 1 ≥ c

59 https://en.wikipedia.org/wiki/Bernoulli%27s_inequality

Thus, RP ∪ coRP ⊆ BPP. By similar reasoning, we can relate P to these classes as follows:

P ⊆ RP ⊆ BPP

P ⊆ coRP ⊆ BPP

As with one-sided-error algorithms, the probability of success for two-sided-error randomized algorithms can be amplified arbitrarily. However, we do not yet have the tools (page 220) we need to reason about how to do so, so we will come back to this later (page 231).

One-sided-error and two-sided-error randomized algorithms are known as Monte Carlo algorithms. Such algorithms have a bounded runtime, but they may produce the wrong answer. Contrast this with Las Vegas algorithms, which always produce the correct answer, but where the runtime bound only holds in expectation. We can define a complexity class for languages that have Las Vegas algorithms with expected runtime that is polynomial in the size of the input.

Definition 150 (ZPP) ZPP is the class of languages that have efficient Las Vegas algorithms. A language L is in ZPP if there is a randomized algorithm A such that:

• If x ∈ L, A always accepts x.

• If x ∉ L, A always rejects x.

• The expected runtime of A on input x is O(|x|^k) for some constant k.

There is a relationship between Monte Carlo and Las Vegas algorithms: if a language L has both a one-sided-error algorithm with no false positives and a one-sided-error algorithm with no false negatives, then it has a Las Vegas algorithm, and vice versa. In terms of complexity classes, L is in both RP and coRP if and only if L is in ZPP. This implies that

RP ∩ coRP = ZPP

We demonstrate this as follows. Suppose L is in RP ∩ coRP. Then it has an efficient, one-sided-error algorithm A with no false positives such that:

• if x ∈ L, Pr[A accepts x] ≥ 1/2

• if x ∉ L, Pr[A accepts x] = 0

Here, we have selected c = 1/2 for concreteness. L also has an efficient, one-sided algorithm B with no false negatives such that:

• if x ∈ L, Pr[B rejects x] = 0

• if x ∉ L, Pr[B rejects x] ≥ 1/2

We can construct a new algorithm C as follows:

Algorithm C(x):
    Repeat:
        Run A on x, accept if A accepts
        Run B on x, reject if B rejects

Since A only accepts x ∈ L and C only accepts when A does, C only accepts x ∈ L. Similarly, B only rejects x ∉ L, so C only rejects x ∉ L. Thus, C always produces the correct answer for a given input.

To show that L ∈ ZPP, we must also demonstrate that the expected runtime of C is polynomial. For each iteration i of C, we have:

• if x ∈ L, Pr[A accepts x in iteration i] ≥ 1/2

• if x ∉ L, Pr[B rejects x in iteration i] ≥ 1/2

Thus, if C gets to iteration i, the probability that C terminates in that iteration is at least 1/2. We can model the number of iterations as a random variable X, and X has the probability distribution:

Pr[X > k] ≤ (1/2)^k

This is similar to a geometric distribution, where

Pr[Y > k] = (1 − p)^k

for some p ∈ (0, 1]. The expected value of such a distribution is E[Y] = 1/p. This gives us:

E[X] ≤ 2

Thus, the expected number of iterations of C is no more than two, and since A and B are both efficient, calling them twice is also efficient.

We have shown that RP ∩ coRP ⊆ ZPP. We will proceed to show that ZPP ⊆ RP, meaning that if a language has an efficient Las Vegas algorithm, it also has an efficient one-sided-error algorithm with no false positives. Assume that C is a Las Vegas algorithm for L ∈ ZPP, with expected runtime of no more than |x|^k steps for an input x and some constant k. We construct a new algorithm A as follows:

Algorithm A(x):
    Run C on x for 2|x|^k steps (or until it halts, whichever comes first)
    Accept if C accepted, else reject

A is clearly efficient in the size of the input x. It also has no false positives, since it only accepts if C accepts within the first 2|x|^k steps, and C always produces the correct answer. All that is left is to show that if x ∈ L, Pr[A accepts x] ≥ 1/2 (again, we arbitrarily choose c = 1/2).

Let S be the number of steps before C accepts x. By assumption, we have

E[S] ≤ |x|^k

A runs C for 2|x|^k steps, so we need to bound the probability that C takes more than 2|x|^k steps on x. We have:

Pr[S > 2|x|^k] ≤ Pr[S ≥ 2|x|^k]
              ≤ E[S]/(2|x|^k)
              ≤ |x|^k/(2|x|^k)
              = 1/2

Here, we applied Markov's inequality in the second step. Then:

Pr[A accepts x] = Pr[S ≤ 2|x|^k]
                = 1 − Pr[S > 2|x|^k]
                ≥ 1/2

Thus, A is indeed a one-sided-error algorithm with no false positives, so L ∈ RP. A similar construction can be used to demonstrate that L ∈ coRP, allowing us to conclude that ZPP ⊆ RP ∩ coRP. Combined with our previous proof that RP ∩ coRP ⊆ ZPP, we have RP ∩ coRP = ZPP.


One final observation we will make is that if a language L is in RP, then it is also in NP. Since L ∈ RP, there is an efficient, one-sided-error randomized algorithm A to decide L. A is allowed to make random choices, and we can model each choice as a coin flip, i.e. being either 0 or 1 according to some probability distribution. We can represent the combination of these choices in a particular run of A as a binary string c. This enables us to write an efficient verifier V for L as follows:

Algorithm V(x, c = c_1 c_2 . . . c_m):
    Simulate the execution of A on x, taking choice c_i for the ith random decision in A
    Accept if A accepts, else reject

If x ∈ L, then Pr[A accepts x] ≥ 1/2, so at least half the possible sequences of random choices lead to A accepting x. On the other hand, if x ∉ L, then Pr[A rejects x] = 1, so all sequences of random choices lead to A rejecting x. Thus, V accepts at least half of all possible certificates when x ∈ L, and V rejects all certificates when x ∉ L, so V is a verifier for L. Since A is efficient, V is also efficient.

We summarize the known relationships between complexity classes as follows:

P ⊆ ZPP ⊆ RP ⊆ NP

P ⊆ ZPP ⊆ RP ⊆ BPP

P ⊆ ZPP ⊆ coRP ⊆ coNP

P ⊆ ZPP ⊆ coRP ⊆ BPP

These relationships are represented pictorially, with edges signifying containment, as follows:

[Figure: containment diagram with P at the bottom, ZPP above P, RP and coRP above ZPP, and NP, BPP, and coNP at the top, with NP above RP, coNP above coRP, and BPP above both RP and coRP.]

Here, coNP is the class of languages whose complements are in NP:

coNP = {L : the complement of L is in NP}

We do not know if any of the containments above are strict, and we do not know the relationships between NP, coNP, and BPP. It is commonly believed that P and BPP are equal and thus BPP is contained in NP ∩ coNP, that neither P nor BPP contain all of NP or all of coNP, and that NP and coNP are not equal. However, none of these conjectures has been proven^60.

60 It is known, however, that if P = NP, then P = BPP. This is because BPP is contained within the polynomial hierarchy^61, and one of the consequences of P = NP is the "collapse" of the hierarchy. The details are beyond the scope of this text.

61 https://en.wikipedia.org/wiki/Polynomial_hierarchy


CHAPTER TWENTYONE

MONTE CARLO METHODS AND CONCENTRATION BOUNDS

Some algorithms rely on repeated trials to compute an approximation of a result. Such algorithms are called Monte Carlo methods, which are distinct from Monte Carlo algorithms (page 215) – a Monte Carlo method does repeated sampling of a probability distribution, while a Monte Carlo algorithm is an algorithm that may produce the wrong result within some bounded probability. The former are commonly approximation algorithms for estimating a quantity, while the latter are exact algorithms that sometimes produce the wrong result. As we will see (page 231), there is a connection – repeated sampling (the strategy of a Monte Carlo method) can be used to amplify the probability of getting the correct result from a Monte Carlo algorithm.

As an example of a Monte Carlo method, we consider an algorithm for estimating the value of π, the area of a unit circle (a circle with a radius of one). If such a circle is located in the plane, centered at the origin, its top-right quadrant is as follows:

This quadrant falls within the square interval between (0, 0) and (1, 1). The area of the quadrant is π/4, while the area of the interval is 1. Thus, if we choose a random point in this interval, the probability that it lies within the quadrant of the circle is π/4^62. This motivates the following algorithm for estimating the value of π:

Algorithm ComputePi(n):
    count := 0
    For i = 1 to n:
        x := a value in [0, 1) chosen uniformly at random
        y := a value in [0, 1) chosen uniformly at random
        If √(x^2 + y^2) ≤ 1:
            count := count + 1
    Return 4 · count / n

The algorithm randomly chooses points between (0, 0) and (1, 1), counting how many of them fall within the unit circle, which is when the point's distance from the origin is at most one. We expect this number to be π/4 of the samples, so the algorithm returns four times the ratio of the points that fell within the circle as an estimate of π.

62 This follows from applying the tools of continuous probability, which we will not discuss in any detail in this text.
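A direct Python rendering of ComputePi (the name compute_pi is ours; comparing x^2 + y^2 ≤ 1 avoids the square root but tests the same condition):

    import random

    def compute_pi(n):
        """Estimate pi by sampling n random points in the unit square."""
        count = 0
        for _ in range(n):
            x = random.random()          # uniform in [0, 1)
            y = random.random()
            if x * x + y * y <= 1:       # point falls inside the quarter circle
                count += 1
        return 4 * count / n

    # compute_pi(1_000_000) typically returns a value close to 3.14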

We formally show that the algorithm returns the value of π in expectation. Let X be a random variable corresponding to the value returned, and let Y_i be an indicator random variable that is 1 if the ith point falls within the unit circle. As argued previously, we have:

Pr[Y_i = 1] = π/4

We also have that

X = (4/n) · (Y_1 + ··· + Y_n)

This leads to the following expected value for X:

E[X] = (4/n) · (E[Y_1] + E[Y_2] + ··· + E[Y_n])
     = (4/n) · (Pr[Y_1 = 1] + Pr[Y_2 = 1] + ··· + Pr[Y_n = 1])
     = (4/n) · (π/4 + π/4 + ··· + π/4)
     = π

The expected value of the algorithm's output is indeed π.

21.1 Chernoff Bounds

When estimating π, how likely are we to actually get a result that is close to the expected value? While we can apply Markov's inequality, the bound we get is very loose, and it does not give us any information about how the number of samples affects the quality of the estimate. The law of large numbers states that the actual result converges to the expected value as the number of samples n increases. But how fast does it converge? There are many types of concentration bounds that allow us to reason about the deviation of a random variable from its expectation; Markov's inequality is just the simplest one. Chernoff bounds are another tool. There are multiple variants of Chernoff bounds, but the ones we will use are as follows.

Let X = X_1 + ··· + X_n be the sum of independent indicator random variables, where the indicator X_i has the expectation

E[X_i] = Pr[X_i = 1] = p_i

Let μ be the expected value of X:

μ = E[X] = ∑_i E[X_i] = ∑_i p_i

Here, we've applied linearity of expectation to relate the expectation of X to that of the indicators X_i. The Chernoff bounds then are as follows.

Theorem 151 (Chernoff Bound – Upper Tail) Let X = X_1 + ··· + X_n, where the X_i are independent indicator random variables with E[X_i] = p_i, and μ = ∑_i p_i. Suppose we wish to bound the probability of X exceeding its expectation μ by at least a factor of 1 + δ, where δ > 0. Then

Pr[X ≥ (1 + δ)μ] ≤ (e^δ / (1 + δ)^{1+δ})^μ


Theorem 152 (Chernoff Bound – Lower Tail) Let X = X_1 + ··· + X_n, where the X_i are independent indicator random variables with E[X_i] = p_i, and μ = ∑_i p_i. Suppose we wish to bound the probability of X being below its expectation μ by at least a factor of 1 − δ, where 0 < δ < 1. Then

Pr[X ≤ (1 − δ)μ] ≤ (e^{−δ} / (1 − δ)^{1−δ})^μ

These inequalities can be unwieldy to work with, so we often use the following simpler, looser bounds.

Theorem 153 (Chernoff Bound – Simplified Upper Tail) Let X = X_1 + ··· + X_n, where the X_i are independent indicator random variables with E[X_i] = p_i, and μ = ∑_i p_i. Suppose we wish to bound the probability of X exceeding its expectation μ by at least a factor of 1 + δ, where δ > 0. Then

Pr[X ≥ (1 + δ)μ] ≤ e^{−δ^2 μ/(2+δ)}

Theorem 154 (Chernoff Bound – Simplified Lower Tail) Let X = X_1 + ··· + X_n, where the X_i are independent indicator random variables with E[X_i] = p_i, and μ = ∑_i p_i. Suppose we wish to bound the probability of X being below its expectation μ by at least a factor of 1 − δ, where 0 < δ < 1. Then

Pr[X ≤ (1 − δ)μ] ≤ e^{−δ^2 μ/2}

Before we proceed to make use of these bounds, we first prove that the unsimplified upper-tail and lower-tail bounds hold^63.

Proof 155 (Proof of Upper-tail Chernoff Bound) To demonstrate the upper-tail Chernoff bound, we make use of the Chernoff bounding technique – rather than reasoning about the random variable X directly, we instead reason about e^{tX}, since small deviations in X turn into large deviations in e^{tX}. We have that X ≥ (1 + δ)μ exactly when e^{tX} ≥ e^{t(1+δ)μ} for any t ≥ 0; we obtain this by raising e^t ≥ 1 to the power of both sides of the former inequality. Then

Pr[X ≥ (1 + δ)μ] = Pr[e^{tX} ≥ e^{t(1+δ)μ}]
                 ≤ E[e^{tX}] / e^{t(1+δ)μ}

where the latter step follows from Markov’s inequality. Continuing, we have

E[e^{tX}] / e^{t(1+δ)μ} = e^{−t(1+δ)μ} · E[e^{tX}]
                        = e^{−t(1+δ)μ} · E[e^{t(X_1+···+X_n)}]
                        = e^{−t(1+δ)μ} · E[∏_i e^{tX_i}]

We now make use of the fact that the X_i (and therefore the e^{tX_i}) are independent. For independent random variables Y and Z, we have E[YZ] = E[Y] · E[Z] (see the appendix (page 262) for a proof). Thus,

e^{−t(1+δ)μ} · E[∏_i e^{tX_i}] = e^{−t(1+δ)μ} ∏_i E[e^{tX_i}]

63 Refer to the appendix (page 259) for proofs of the simplified bounds.


The X_i are indicators, with Pr[X_i = 1] = p_i and Pr[X_i = 0] = 1 − p_i. When X_i = 1, e^{tX_i} = e^t, and when X_i = 0, e^{tX_i} = e^0 = 1. Thus, the distribution of e^{tX_i} is

Pr[e^{tX_i} = e^t] = p_i
Pr[e^{tX_i} = 1] = 1 − p_i

We can then compute the expectation of e^{tX_i}:

E[e^{tX_i}] = e^t · Pr[e^{tX_i} = e^t] + 1 · Pr[e^{tX_i} = 1]
            = e^t · p_i + 1 − p_i
            = 1 + p_i(e^t − 1)

Plugging this into the upper bound that resulted from Markov's inequality, we get

e^{−t(1+δ)μ} ∏_i E[e^{tX_i}] = e^{−t(1+δ)μ} ∏_i (1 + p_i(e^t − 1))

We can further simplify this expression by observing that 1 + x ≤ e^x for all x:

[Figure: a plot of e^x and 1 + x, showing that 1 + x ≤ e^x for all x.]

Using x = p_i(e^t − 1), we have

1 + p_i(e^t − 1) ≤ e^{p_i(e^t − 1)}


Applying this to our previously computed upper bound, we get

e^{−t(1+δ)μ} ∏_i (1 + p_i(e^t − 1)) ≤ e^{−t(1+δ)μ} ∏_i e^{p_i(e^t − 1)}
                                    = e^{−t(1+δ)μ} · e^{(e^t − 1) ∑_i p_i}
                                    = e^{−t(1+δ)μ} · e^{(e^t − 1)μ}
                                    = e^{μ(e^t − 1 − t(1+δ))}

We can now choose t to minimize the exponent, which in turn minimizes the value of the exponential itself. Let f(t) = e^t − 1 − t(1 + δ). Then we compute the derivative with respect to t and set that to zero to find an extremum:

f(t) = e^t − 1 − t(1 + δ)
f′(t) = e^t − (1 + δ) = 0
e^t = 1 + δ
t = ln(1 + δ)

Computing the second derivative f″(ln(1 + δ)) = e^{ln(1+δ)} = 1 + δ, we see that it is positive since δ > 0, therefore t = ln(1 + δ) is a minimum. Substituting it into our earlier expression, we obtain

e^{μ(e^t − 1 − t(1+δ))} ≤ e^{μ(e^{ln(1+δ)} − 1 − (1+δ) ln(1+δ))}
                       = e^{μ((1+δ) − 1 − (1+δ) ln(1+δ))}
                       = e^{μ(δ − (1+δ) ln(1+δ))}
                       = (e^{δ − (1+δ) ln(1+δ)})^μ
                       = (e^δ / e^{(1+δ) ln(1+δ)})^μ
                       = (e^δ / (e^{ln(1+δ)})^{1+δ})^μ
                       = (e^δ / (1 + δ)^{1+δ})^μ

Proof 156 (Proof of Lower-tail Chernoff Bound) The proof of the lower-tail Chernoff bound follows similar reasoning as that of the upper-tail bound. We have:

Pr[X ≤ (1 − δ)μ] = Pr[−X ≥ −(1 − δ)μ]
                 = Pr[e^{−tX} ≥ e^{−t(1−δ)μ}]
                 ≤ E[e^{−tX}] / e^{−t(1−δ)μ}

Here, we can apply Markov's inequality to the random variable e^{−tX} since it is nonnegative, unlike −X. Continuing, we have

E[e^{−tX}] / e^{−t(1−δ)μ} = e^{t(1−δ)μ} · E[e^{−tX}]
                          = e^{t(1−δ)μ} · E[e^{−t(X_1+···+X_n)}]
                          = e^{t(1−δ)μ} · E[∏_i e^{−tX_i}]
                          = e^{t(1−δ)μ} ∏_i E[e^{−tX_i}]

In the last step, we make use of the fact that the X_i and therefore the e^{−tX_i} are independent. The distribution of e^{−tX_i} is

Pr[e^{−tX_i} = e^{−t}] = p_i
Pr[e^{−tX_i} = 1] = 1 − p_i

Then the expectation of e^{−tX_i} is:

E[e^{−tX_i}] = e^{−t} · Pr[e^{−tX_i} = e^{−t}] + 1 · Pr[e^{−tX_i} = 1]
             = e^{−t} · p_i + 1 − p_i
             = 1 + p_i(e^{−t} − 1)

Plugging this into the bound we've computed so far, we get

e^{t(1−δ)μ} ∏_i E[e^{−tX_i}] = e^{t(1−δ)μ} ∏_i (1 + p_i(e^{−t} − 1))

As with the upper tail, we use the fact that 1 + x ≤ e^x for all x to simplify this, with x = p_i(e^{−t} − 1):

e^{t(1−δ)μ} ∏_i (1 + p_i(e^{−t} − 1)) ≤ e^{t(1−δ)μ} ∏_i e^{p_i(e^{−t} − 1)}
                                      = e^{t(1−δ)μ} · e^{(e^{−t} − 1) ∑_i p_i}
                                      = e^{t(1−δ)μ} · e^{(e^{−t} − 1)μ}
                                      = e^{μ(e^{−t} − 1 + t(1−δ))}

We choose t to minimize the exponent. Let f(t) = e^{−t} − 1 + t(1 − δ). Then we compute the derivative with respect to t and set that to zero to find an extremum:

f(t) = e^{−t} − 1 + t(1 − δ)
f′(t) = −e^{−t} + (1 − δ) = 0
e^{−t} = 1 − δ
t = −ln(1 − δ)

Computing the second derivative f″(−ln(1 − δ)) = e^{−(−ln(1−δ))} = 1 − δ, we see that it is positive since 0 < δ < 1, therefore t = −ln(1 − δ) is a minimum. Substituting it into our earlier expression, we obtain

e^{μ(e^{−t} − 1 + t(1−δ))} ≤ e^{μ(e^{ln(1−δ)} − 1 − (1−δ) ln(1−δ))}
                           = e^{μ((1−δ) − 1 − (1−δ) ln(1−δ))}
                           = e^{μ(−δ − (1−δ) ln(1−δ))}
                           = (e^{−δ − (1−δ) ln(1−δ)})^μ
                           = (e^{−δ} / e^{(1−δ) ln(1−δ)})^μ
                           = (e^{−δ} / (e^{ln(1−δ)})^{1−δ})^μ
                           = (e^{−δ} / (1 − δ)^{1−δ})^μ

In lieu of applying Chernoff bounds to the algorithm for estimating π, we consider a different example of flipping a coin. If we flip a biased coin with probability p of heads, we expect to see np heads. Let H be the total number of heads, and let H_i be an indicator variable corresponding to whether the ith flip is heads. We have Pr[H_i = 1] = p, and E[H] = np by linearity of expectation.

Suppose the coin is fair. What is the probability of getting at least six heads out of ten flips? This is a fractional deviation of δ = 6/5 − 1 = 0.2, and applying the upper-tail Chernoff bound gives us:

Pr[H ≥ (1 + 0.2) · 5] ≤ (e^{0.2} / 1.2^{1.2})^5 ≈ 0.9814^5 ≈ 0.91

What is the probability of getting at least 60 heads out of 100 flips, which is the same fractional deviation δ = 0.2 from the expectation? Applying the upper tail again, we get

Pr[H ≥ (1 + 0.2) · 50] ≤ (e^{0.2} / 1.2^{1.2})^{50} ≈ 0.9814^{50} ≈ 0.39

Thus, the probability of deviating by a factor of 1 + δ decreases significantly as the number of samples increases. For this particular example, we can compute the actual probabilities quite tediously from exact formulas for a binomial distribution^64, which yields 0.37 for getting six heads out of ten flips and 0.03 for 60 heads out of 100 flips. However, this approach becomes more and more expensive as n increases, and the Chernoff bound produces a reasonable result with much less work.
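These numbers can be checked with a few lines of Python (an illustrative snippet of ours; the exact tail probability sums binomial coefficients directly):

    import math

    def chernoff_upper(mu, delta):
        """Upper-tail Chernoff bound on Pr[X >= (1 + delta) * mu]."""
        return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

    def exact_binomial_tail(n, k):
        """Exact Pr[at least k heads in n fair coin flips]."""
        return sum(math.comb(n, j) for j in range(k, n + 1)) / 2 ** n

    print(chernoff_upper(5, 0.2), exact_binomial_tail(10, 6))     # ~0.91 vs ~0.377
    print(chernoff_upper(50, 0.2), exact_binomial_tail(100, 60))  # ~0.39 vs ~0.028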

64 https://en.wikipedia.org/wiki/Binomial_distribution


21.2 Hoeffding’s Inequality

Another concentration bound that can often produce a tighter result than Chernoff bounds is Hoeffding's inequality. For simplicity, we restrict ourselves to the special case^65 of independent indicator random variables, with expectation

Pr[X_i = 1] = p_i

for each indicator X_i.

Let p be the expected value of (1/n)X. We have

p = E[(1/n)X] = (1/n) ∑_i E[X_i] = (1/n) ∑_i p_i

Hoeffding's inequality places a limit on how much the actual value of (1/n)X may deviate from the expectation p.

Theorem 157 (Hoeffding's Inequality – Upper Tail) Let X = X_1 + ··· + X_n, where the X_i are independent indicator random variables with E[X_i] = p_i. Let

p = E[(1/n)X] = (1/n) ∑_i p_i

Let ε > 0 be a deviation from the expectation. Then

Pr[(1/n)X ≥ p + ε] ≤ e^{−2ε^2 n}

Theorem 158 (Hoeffding's Inequality – Lower Tail) Let X = X_1 + ··· + X_n, where the X_i are independent indicator random variables with E[X_i] = p_i. Let

p = E[(1/n)X] = (1/n) ∑_i p_i

Let ε > 0 be a deviation from the expectation. Then

Pr[(1/n)X ≤ p − ε] ≤ e^{−2ε^2 n}

Theorem 159 (Hoeffding's Inequality – Combined) Let X = X_1 + ⋅⋅⋅ + X_n, where the X_i are independent indicator random variables with E[X_i] = p_i. Let

p = E[(1/n)X] = (1/n) ∑_i p_i

65 See the appendix (page 266) for the general case of Hoeffding's inequality, as well as proofs of the special (page 261) and general (page 266) cases.


Let ε > 0 be a deviation from the expectation. Then

Pr[|(1/n)X − p| ≥ ε] ≤ 2e^{−2ε²n}

We consider again the example of flipping a fair coin. Let H be the total number of heads, and let H_i be an indicator variable corresponding to whether the ith flip is heads. We have Pr[H_i = 1] = 1/2, and E[(1/n)H] = 1/2 for any number of flips n.

What is the probability of getting at least six heads out of ten flips? This is a deviation of ε = 0.1, and applying the upper-tail Hoeffding's inequality gives us:

Pr[(1/n)H ≥ p + ε] = Pr[(1/n)H ≥ 0.5 + 0.1]
≤ e^{−2 ⋅ 0.1² ⋅ 10}
≈ 0.9802^{10}
≈ 0.82

What is the probability of getting at least 60 heads out of 100 flips, which is the same deviation ε = 0.1 from the expectation? Applying the upper tail again, we get

Pr[(1/n)H ≥ p + ε] ≤ e^{−2 ⋅ 0.1² ⋅ 100} ≈ 0.9802^{100} ≈ 0.14

This is a significantly tighter bound than that produced by the Chernoff bound.
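The contrast between the two bounds is easy to see numerically. Below is a small Python sketch (not from the text) that evaluates the Chernoff and Hoeffding upper-tail bounds side by side for the two coin examples:

```python
from math import e, exp

def chernoff_upper(mu, delta):
    # Unsimplified upper-tail Chernoff bound
    return (e**delta / (1 + delta)**(1 + delta))**mu

def hoeffding_upper(n, eps):
    # Upper-tail Hoeffding bound: Pr[X/n >= p + eps] <= e^{-2 eps^2 n}
    return exp(-2 * eps**2 * n)

for n in (10, 100):
    mu = n / 2
    print(n, chernoff_upper(mu, 0.2), hoeffding_upper(n, 0.1))  # ~0.91/0.82, ~0.39/0.14
```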

21.3 Polling

Rather than applying concentration bounds to compute the probability of a deviation for a specific sample size, we often wish to determine how many samples we need to be within a particular deviation with high confidence. One application of this is big data, where we have vast datasets that are too large to examine in their entirety, so we sample the data instead to estimate the quantities of interest. Similar to this is polling – outside of elections themselves, we typically do not have the resources to ask the entire population for their opinions, so we need to sample the populace to estimate the support for a particular candidate or political position. In both applications, we need to determine how many samples are needed to obtain a good estimate.

In general, a poll estimates the fraction of the population that supports a particular candidate by asking n randomly chosen people whether they support that candidate. Let X be a random variable corresponding to the number of people who answer this question in the affirmative. Then X/n is an estimate of the level of support in the full population.

A typical poll has both a confidence level and a margin of error – the latter corresponds to the deviation from the true fraction p of people who support the candidate, and the former corresponds to a bound on the probability that the estimate is within that deviation. For example, a 95% confidence level and a margin of error of ±2% requires that

Pr[|X/n − p| ≤ 0.02] ≥ 0.95

More generally, for a confidence level 1 − 𝛾 and margin of error 𝜀, we require

Pr[|X/n − p| ≤ ε] ≥ 1 − γ

or equivalently

Pr[|X/n − p| > ε] < γ


Formally, we define indicator variables 𝑋𝑖 as

X_i = 1 if person i supports the candidate, and 0 otherwise

for each person 𝑖 in the set that we poll. Then 𝑋 = 𝑋1 + ⋅ ⋅ ⋅ +𝑋𝑛 is the sum of independent indicator variables, with

μ = E[X] = ∑_i E[X_i] = np

21.3.1 Analysis with Chernoff Bounds

For an arbitrary margin of error 𝜀, we want to bound the probability

Pr[|X/n − p| > ε] = Pr[X/n − p > ε] + Pr[X/n − p < −ε]
= Pr[X/n > ε + p] + Pr[X/n < −ε + p]
= Pr[X > εn + pn] + Pr[X < −εn + pn]
= Pr[X > εn + μ] + Pr[X < −εn + μ]
= Pr[X > (1 + εn/μ)μ] + Pr[X < (1 − εn/μ)μ]
= Pr[X ≥ (1 + εn/μ)μ] + Pr[X ≤ (1 − εn/μ)μ] − Pr[X = (1 + εn/μ)μ] − Pr[X = (1 − εn/μ)μ]
≤ Pr[X ≥ (1 + εn/μ)μ] + Pr[X ≤ (1 − εn/μ)μ]

In the second-to-last step, we use the fact that X = x and X > x are disjoint events, since a random variable maps each sample point to a single value, and that (X ≥ x) = (X = x) ∪ (X > x).

We now have events that are in the right form to apply a Chernoff bound, with δ = εn/μ. We first apply the simplified upper-tail bound to the first term, obtaining

Pr[X ≥ (1 + εn/μ)μ] ≤ e^{−δ²μ/(2+δ)}   where δ = εn/μ
= e^{−(εn)²μ / (μ²(2 + εn/μ))}
= e^{−(εn)² / (μ(2 + εn/μ))}
= e^{−(εn)² / (2μ + εn)}
= e^{−(εn)² / (2np + εn)}
= e^{−ε²n / (2p + ε)}

We want this to be less than some value 𝛽:

e^{−ε²n/(2p+ε)} < β
1/β < e^{ε²n/(2p+ε)}
ln(1/β) < ε²n/(2p + ε)
n > ((2p + ε)/ε²) ln(1/β)


Unfortunately, this expression includes p, the quantity we're trying to estimate. However, we know that p ≤ 1, so we can set

n > ((2 + ε)/ε²) ln(1/β)

to ensure that Pr[𝑋 ≥ (1 + 𝜀𝑛/𝜇)𝜇] < 𝛽.

We can also apply the simplified lower-tail bound to the second term in our full expression above:

Pr[X ≤ (1 − εn/μ)μ] ≤ e^{−δ²μ/2}   where δ = εn/μ
= e^{−(εn)²μ/(2μ²)}
= e^{−(εn)²/(2μ)}
= e^{−(εn)²/(2np)}
= e^{−ε²n/(2p)}

As before, we want this to be less than some value β:

e^{−ε²n/(2p)} < β
1/β < e^{ε²n/(2p)}
ln(1/β) < ε²n/(2p)
n > (2p/ε²) ln(1/β)

We again observe that 𝑝 ≤ 1, so we can set

n > (2/ε²) ln(1/β)

to ensure that Pr[𝑋 ≤ (1 − 𝜀𝑛/𝜇)𝜇] < 𝛽.

To bound both terms simultaneously, we can choose 𝛽 = 𝛾/2, so that

Pr[𝑋 ≥ (1 + 𝜀𝑛/𝜇)𝜇] + Pr[𝑋 ≤ (1 − 𝜀𝑛/𝜇)𝜇] < 2𝛽 = 𝛾

To achieve this, we require

n > max( ((2 + ε)/ε²) ln(2/γ), (2/ε²) ln(2/γ) )
= ((2 + ε)/ε²) ln(2/γ)

For example, for a 95% confidence level and a margin of error of ±2%, we have ε = 0.02 and γ = 0.05. Plugging those values into the result above, we need no more than

((2 + ε)/ε²) ln(2/γ) = (2.02/0.02²) ln 40 ≈ 18629

samples to achieve the desired confidence level and margin of error. Observe that this does not depend on the total population size!


21.3.2 Analysis with Hoeffding’s Inequality

We can also apply Hoeffding’s inequality to polling. The combined inequality gives us

Pr[|(1/n)X − p| ≥ ε] ≤ 2e^{−2ε²n}

For a 95% confidence level and a margin of error of ±2%, we require that

Pr[|X/n − p| ≤ 0.02] ≥ 0.95

However, this isn't quite in the form where we can apply Hoeffding's inequality, so we need to do some manipulation first. We have:

Pr[|X/n − p| ≤ 0.02] = 1 − Pr[|X/n − p| > 0.02]
≥ 1 − Pr[|X/n − p| ≥ 0.02]

Hoeffding’s inequality gives us:

Pr[|X/n − p| ≥ 0.02] ≤ 2e^{−2 ⋅ 0.02² ⋅ n}

Substituting this into the above, we get:

Pr[|X/n − p| ≤ 0.02] ≥ 1 − 2e^{−2 ⋅ 0.02² ⋅ n}

We want this to be at least 0.95:

1 − 2e^{−2 ⋅ 0.02² ⋅ n} ≥ 0.95
2e^{−2 ⋅ 0.02² ⋅ n} ≤ 0.05
e^{2 ⋅ 0.02² ⋅ n} ≥ 40
2 ⋅ 0.02² ⋅ n ≥ ln 40
n ≥ ln 40 / (2 ⋅ 0.02²) ≈ 4611.1

Thus, we obtain the given confidence level and margin of error by polling at least 4612 people, which is a significantly better result than we obtain from the simplified Chernoff bounds.
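To make the comparison concrete, here is a small Python sketch (not from the text) that evaluates both sample-size formulas derived above for a given margin of error and confidence level:

```python
from math import ceil, log

def chernoff_samples(eps, gamma):
    # From the simplified Chernoff bounds: n > (2 + eps)/eps^2 * ln(2/gamma)
    return ceil((2 + eps) / eps**2 * log(2 / gamma))

def hoeffding_samples(eps, gamma):
    # From Hoeffding's inequality: n >= 1/(2 eps^2) * ln(2/gamma)
    return ceil(1 / (2 * eps**2) * log(2 / gamma))

# 95% confidence level, +/-2% margin of error
print(chernoff_samples(0.02, 0.05), hoeffding_samples(0.02, 0.05))
```

The Hoeffding-based formula requires roughly a quarter as many samples as the Chernoff-based one.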

For an arbitrary margin of error ±𝜀, we obtain:

Pr[|X/n − p| ≤ ε] = 1 − Pr[|X/n − p| > ε]
≥ 1 − Pr[|X/n − p| ≥ ε]
≥ 1 − 2e^{−2ε²n}


To achieve an arbitrary confidence level 1 − 𝛾, we need:

1 − 2e^{−2ε²n} ≥ 1 − γ
2e^{−2ε²n} ≤ γ
e^{2ε²n} ≥ 2/γ
2ε²n ≥ ln(2/γ)
n ≥ (1/(2ε²)) ln(2/γ)

This is a factor of four better than the result obtained from the simplified Chernoff bounds.

More generally, if we wish to gauge the level of support for m different candidates, the sampling theorem tells us that the number of samples required is logarithmic in m.

Theorem 160 (Sampling Theorem) Suppose n people are polled to ask which candidate they support, out of m possible candidates. Let X^(j) be the number of people who state that they support candidate j, and let p_j be the true level of support for that candidate. We wish to obtain

Pr[⋂_j (|X^(j)/n − p_j| ≤ ε)] ≥ 1 − γ

In other words, we desire a confidence level 1 − γ that all estimates X^(j)/n are within margin of error ±ε of the true values p_j. We obtain this when the number of samples n is

n ≥ (1/(2ε²)) ln(2m/γ)

The sampling theorem can be derived from applying the union bound (page 235), which we will discuss shortly.

In conclusion, when sampling from a large dataset, the number of samples required does depend on the desired accuracy of the estimation and the range size (i.e. the number of possible answers). But it does not depend on the population size. This makes sampling a powerful technique for dealing with big data, as long as we are willing to tolerate a small possibility of obtaining an inaccurate estimate.

21.4 Amplification for Two-Sided-Error Algorithms

Previously, we saw how to amplify the probability of success for one-sided-error randomized algorithms (page 213). We now consider amplification for two-sided-error algorithms (page 214). Recall that such an algorithm A for a language L has the following behavior:

• if 𝑥 ∈ 𝐿, Pr[𝐴 accepts 𝑥] ≥ 𝑐

• if 𝑥 ∉ 𝐿, Pr[𝐴 rejects 𝑥] ≥ 𝑐

Here, c must be a constant that is strictly greater than 1/2.

Unlike in the one-sided case with no false positives, we cannot just run the algorithm multiple times and observe if it accepts at least once. A two-sided-error algorithm can accept both inputs in and not in the language, and it can reject both such inputs as well. However, we observe that because c > 1/2, we expect to get the right answer the majority of the time when we run a two-sided-error algorithm on an input. More formally, suppose we run the algorithm n times. Let Y_i be an indicator random variable that is 1 if the algorithm accepts in the ith run. Then:

• if 𝑥 ∈ 𝐿, E[𝑌𝑖] = Pr[𝑌𝑖 = 1] ≥ 𝑐


• if 𝑥 ∉ 𝐿, E[𝑌𝑖] = Pr[𝑌𝑖 = 1] ≤ 1 − 𝑐

Let 𝑌 = 𝑌1 + ⋅ ⋅ ⋅ + 𝑌𝑛 be the total number of accepts out of 𝑛 trials. By linearity of expectation, we have:

• if x ∈ L, E[Y] ≥ cn > n/2 (since c > 1/2)

• if x ∉ L, E[Y] ≤ (1 − c)n < n/2 (since 1 − c < 1/2)

This motivates an amplification algorithm B that runs the original algorithm A multiple times and takes the majority vote:

Algorithm B(x):
    Run A on x repeatedly, for a total of n times
    If A accepts at least n/2 times, accept
    Else reject
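A minimal Python sketch of this majority-vote amplification is below; the two-sided-error procedure A here is a hypothetical stand-in that answers correctly with probability 3/4, not an algorithm from the text:

```python
import random

def amplify(A, x, n):
    # Run the two-sided-error algorithm A on x a total of n times,
    # and accept iff at least n/2 of the runs accept.
    accepts = sum(1 for _ in range(n) if A(x))
    return accepts >= n / 2

def A(x):
    # Hypothetical stand-in: gives the correct answer ("x is even") with probability 3/4.
    correct = (x % 2 == 0)
    return correct if random.random() < 0.75 else not correct

print(amplify(A, 42, 37))   # with n = 37 trials, the majority vote is rarely wrong
```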

Suppose we wish to obtain a bound that is within 𝛾 of 1:

• if 𝑥 ∈ 𝐿, Pr[𝐵 accepts 𝑥] ≥ 1 − 𝛾

• if 𝑥 ∉ 𝐿, Pr[𝐵 rejects 𝑥] ≥ 1 − 𝛾

What value for 𝑛 should we use in 𝐵?

We first consider the case where x ∈ L. The indicators Y_i are independent, allowing us to use Hoeffding's inequality on their sum Y. B accepts x when Y ≥ n/2, or equivalently, Y/n ≥ 1/2. We want

Pr[Y/n ≥ 1/2] ≥ 1 − γ

or equivalently,

Pr[Y/n < 1/2] ≤ Pr[Y/n ≤ 1/2]
= Pr[Y/n ≤ c − (c − 1/2)]
= Pr[Y/n ≤ c − ε]
≤ γ

where ε = c − 1/2. To apply Hoeffding's inequality, we need something of the form

Pr[Y/n ≤ p − ε]

where p = E[Y/n]. Unfortunately, we do not know the exact value of p; all we know is that p = E[Y/n] ≥ c. However, we know that because p ≥ c,

Pr[Y/n ≤ c − ε] ≤ Pr[Y/n ≤ p − ε]

In general, the event X ≤ a includes at most as many sample points as X ≤ b when a ≤ b; the latter includes all the outcomes in X ≤ a, as well as those in a < X ≤ b. We thus need only compute an upper bound on Pr[Y/n ≤ p − ε], and that same upper bound will apply to Pr[Y/n ≤ c − ε]. Taking ε = c − 1/2 and applying the lower-tail Hoeffding's inequality, we get:

Pr[Y/n ≤ p − ε] = Pr[Y/n ≤ p − (c − 1/2)]
≤ e^{−2(c−1/2)²n}


We want this to be bounded by γ:

e^{−2(c−1/2)²n} ≤ γ
e^{2(c−1/2)²n} ≥ 1/γ
2(c − 1/2)²n ≥ ln(1/γ)
n ≥ (1/(2(c − 1/2)²)) ln(1/γ)

As a concrete example, suppose c = 3/4, meaning that A accepts x ∈ L with probability at least 3/4. Suppose we want B to accept x ∈ L at least 99% of the time, giving us γ = 0.01. Then:

n ≥ (1/(2(3/4 − 1/2)²)) ln(1/0.01)
= (1/(2 ⋅ 0.25²)) ln 100
≈ 36.8

Thus, it is sufficient for 𝐵 to run 𝑛 ≥ 37 trials of 𝐴 on 𝑥.
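The number of trials for other values of c and γ can be computed the same way; a short Python helper (not from the text) makes this explicit:

```python
from math import ceil, log

def trials_needed(c, gamma):
    # n >= 1/(2 (c - 1/2)^2) * ln(1/gamma), from the lower-tail Hoeffding bound
    return ceil(1 / (2 * (c - 0.5)**2) * log(1 / gamma))

print(trials_needed(0.75, 0.01))   # 37
```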

We now consider x ∉ L. We want B to reject x with probability at least 1 − γ, or equivalently, B to accept x with probability at most γ:

Pr[Y/n ≥ 1/2] ≤ γ

Similar to before, we have

Pr[Y/n ≥ 1/2] = Pr[Y/n ≥ (1 − c) + (c − 1/2)]
= Pr[Y/n ≥ (1 − c) + ε]

with ε = c − 1/2. Since p = E[Y/n] ≤ 1 − c, we have

Pr[Y/n ≥ (1 − c) + ε] ≤ Pr[Y/n ≥ p + ε]

This follows from similar reasoning as earlier: an event X ≥ a contains no more sample points than X ≥ b when a ≥ b. With ε = c − 1/2, the upper-tail Hoeffding's inequality gives us:

Pr[Y/n ≥ p + ε] = Pr[Y/n ≥ p + (c − 1/2)]
≤ e^{−2(c−1/2)²n}

We want this to be no more than 𝛾, which leads to the same solution as before:

n ≥ (1/(2(c − 1/2)²)) ln(1/γ)

Using the same concrete example as before, if we want B to reject x ∉ L at least 99% of the time when A rejects x ∉ L with probability at least 3/4, it suffices for B to run n ≥ 37 trials of A on x.


21.5 Load Balancing

Job scheduling is another application where we can exploit randomness to construct a simple, highly effective algorithm. In this problem, we have k servers, and there are n jobs that need to be distributed to these servers. The goal is to balance the load among the servers, so that no one server is overloaded. This problem is very relevant to content-delivery networks – there may be on the order of millions or even billions of concurrent users, and each user needs to be routed to one of only hundreds or thousands of servers. Thus, we have n ≫ k in such a case.

One possible algorithm is to always send a job to the most lightly loaded server. However, this requires significant coordination – the scheduler must keep track of the load on each server, which requires extra communication, space, and computational resources. Instead, we consider a simple, randomized algorithm that just sends each job to a random server. The expected number of jobs on each server is n/k. But how likely are we to be close to this ideal, balanced load?

We start our analysis by defining random variables X^(j) corresponding to the number of jobs assigned to server j. We would like to demonstrate that the joint probability of X^(j) being close to n/k for all j is high. We first reason about the load on an individual server. In particular, we wish to compute a bound on

Pr[X^(j) ≥ n/k + c]

for some value c > 0, i.e. the probability that server j is overloaded by at least c jobs. Let X^(j)_i be an indicator random variable that is 1 if job i is sent to server j. Since the target server for job i is chosen uniformly at random out of k possible choices, we have

E[X^(j)_i] = Pr[X^(j)_i = 1] = 1/k

We also have X^(j) = X^(j)_1 + ⋅⋅⋅ + X^(j)_n, giving us

E[X^(j)] = ∑_i E[X^(j)_i] = n/k

as we stated before. Then:

Pr[X^(j) ≥ n/k + c] = Pr[(1/n)X^(j) ≥ 1/k + c/n]
≤ e^{−2(c/n)²n}
= e^{−2c²/n}

by the upper-tail Hoeffding’s inequality.

Now that we have a bound on an individual server being overloaded by c jobs, we wish to bound the probability that there is some server that is overloaded. The complement event is that no server is overloaded, so we can equivalently compute the joint probability that none of the servers is overloaded by c or more jobs:

Pr[X^(1) < n/k + c, . . . , X^(k) < n/k + c]

However, we cannot do so by simply multiplying the individual probabilities together – that only works if the probabilities are independent, and in this case, they are not. In particular, if one server is overloaded, some other server must be underloaded, so the random variables X^(j) are not independent.

Rather than trying to work with the complement event, we attempt to directly compute the probability that there is at least one overloaded server:

Pr[(X^(1) ≥ n/k + c) ∪ ⋅⋅⋅ ∪ (X^(k) ≥ n/k + c)]


More succinctly, we denote this as:

Pr[⋃_j (X^(j) ≥ n/k + c)]

We have a union of events, and we want to compute the probability of that union. The union bound allows us to compute a bound on that probability.

Theorem 161 (Union Bound) Let E_1, E_2, . . . , E_m be a sequence of events over the same sample space. Then

Pr[⋃_{i=1}^m E_i] ≤ ∑_{i=1}^m Pr[E_i]

In other words, the probability of the union is no more than the sum of the probabilities of the individual events.

To see why the union bound holds, consider the outcomes that are in the union event ∪_i E_i. The probability of this event is the sum of the probabilities of the individual outcomes in ∪_i E_i. In the maximal case, the events E_i are disjoint, so that an individual outcome is not a member of more than one E_i. Then the sum of the probabilities of the events E_i is also the sum of the probabilities of the individual outcomes in ∪_i E_i. In the general case, the E_i's may overlap:

(Figure: three overlapping events E_1, E_2, E_3.)

In this case, the sum ∑_i Pr[E_i] overcounts the probability Pr[∪_i E_i], since we "double count" outcomes that occur in more than one E_i. Thus, ∑_i Pr[E_i] is an upper bound on the probability Pr[∪_i E_i]. Returning to the job-scheduling problem, we have:

Pr[⋃_j (X^(j) ≥ n/k + c)] ≤ ∑_j Pr[X^(j) ≥ n/k + c]
≤ ∑_j e^{−2c²/n}
= k ⋅ e^{−2c²/n}
= e^{ln k} ⋅ e^{−2c²/n}
= e^{ln k − 2c²/n}


For c = √(n ln k), we get:

Pr[⋃_j (X^(j) ≥ n/k + √(n ln k))] ≤ e^{ln k − 2(√(n ln k))²/n}
= e^{ln k − 2(n ln k)/n}
= e^{−ln k}
= 1/e^{ln k}
= 1/k

With concrete values n = 10^10 and k = 1000, we compute the overload relative to the expected value as:

√(n ln k) / (n/k) = √(10^10 ln 1000) / 10^7 ≈ 0.026

This is an overhead of about 2.6% above the expected load. The probability that there is a server overloaded by at least 2.6% is bounded from above by 1/k = 0.001, or ≤0.1%. Thus, when there are n = 10^10 jobs and k = 1000 servers, the randomized algorithm has a high probability (≥99.9%) of producing a schedule where the servers all have a low overhead (≤2.6%).

Exercise 162 Previously, we demonstrated (page 230) that to use polling to achieve a confidence level 1 − γ and a margin of error ±ε,

n ≥ (1/(2ε²)) ln(2/γ)

samples are sufficient when there is a single candidate. We also saw the sampling theorem (page 231), which states that for m candidates,

n ≥ (1/(2ε²)) ln(2m/γ)

samples suffice to achieve a confidence level 1 − 𝛾 that all estimates 𝑋(𝑗)/𝑛 are within margin of error ±𝜀.

Use the union bound to demonstrate that this latter result follows from the result for a single candidate.


Part V

Cryptography


CHAPTER

TWENTYTWO

INTRODUCTION TO CRYPTOGRAPHY

Security and privacy are core principles in computing, enabling a wide range of applications including online commerce, social networking, wireless communication, and so on. Cryptography, which is concerned with techniques and protocols for secure communication, is fundamental to building systems that provide security and privacy. In this unit, we will examine several cryptographic protocols, which address the following needs:

• authentication: proving one’s identity

• privacy/confidentiality: ensuring that no one can read the message except the intended receiver

• integrity: guaranteeing that the received message has not been altered in any way

Our standard problem setup is that we have two parties who wish to communicate, traditionally named Alice and Bob. However, they are communicating over an insecure channel, and there is an eavesdropper Eve who can observe all their communication, and in some cases, can even modify the data in-flight. How can Alice and Bob communicate while achieving the goals of authentication, privacy, and integrity?

A central goal in designing a cryptosystem is Kerckhoff’s principle:

A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

Thus, we want to ensure that Alice and Bob can communicate securely even if Eve knows every detail about what protocol they are using, other than the key, a secret piece of knowledge that is never communicated over the insecure channel.

We refer to the message that Alice and Bob wish to communicate as the plaintext. We want to design a cryptosystem that involves encoding the message in such a way as to prevent Eve from recovering the plaintext, even with access to the ciphertext, the result of encoding the message. There are two levels of security around which we can design a cryptosystem:

• Information-theoretic, or unconditional, security: Eve cannot learn the secret message communicated between Alice and Bob, even with unbounded computational power.

• Computational, or conditional, security: to learn any information about the secret message, Eve will have to solve a computationally hard problem.

The cryptosystems we examine all rely to some extent on modular arithmetic. Before we proceed further, we review some basic details about modular arithmetic.


22.1 Review of Modular Arithmetic

Modular arithmetic is a mathematical system that maps the infinitely many integers to a finite set, those in {0, 1, . . . , n − 1} for some positive modulus n ∈ Z+. The core concept in this system is that of congruence: two integers a and b are said to be congruent modulo n, written as

𝑎 ≡ 𝑏 (mod 𝑛)

when 𝑎 and 𝑏 differ by an integer multiple of 𝑛:

∃𝑘 ∈ Z. 𝑎 − 𝑏 = 𝑘𝑛

Note that a and b need not be in the range [0, n); in fact, if a ≠ b, then at least one must be outside this range for a and b to be congruent modulo n. More importantly, for any integer i ∈ Z, there is exactly one integer j ∈ {0, 1, . . . , n − 1} such that i ≡ j (mod n). This is because the elements in this set are at most n − 1 apart, so no two elements differ by a multiple of n. At the same time, by Euclid's division lemma66, we know that there exist unique integers q and r such that

𝑖 = 𝑛𝑞 + 𝑟

where 0 ≤ r < n. Thus, each integer i ∈ Z is mapped to exactly one integer j ∈ {0, 1, . . . , n − 1} by the congruence relation modulo n.

Formally, we can define a set of equivalence classes, denoted by Z_n, corresponding to each element in {0, 1, . . . , n − 1}:

Z_n = {\bar{0}, \bar{1}, . . . , \overline{n − 1}}

Each class \bar{j} consists of the integers that are congruent to j modulo n:

\bar{j} = {j, j − n, j + n, j − 2n, j + 2n, . . .}

However, the overline notation here is cumbersome, so we usually elide it, making the equivalence classes implicit instead:

Z_n = {0, 1, . . . , n − 1}

We refer to determining the equivalence class of an integer i modulo n as reducing it modulo n. If we know what i is, we need only compute its remainder when divided by n to reduce it. More commonly, we have a complicated expression for i, consisting of additions, subtractions, multiplications, exponentiations, and so on. Rather than evaluating the expression directly, we can take advantage of properties of modular arithmetic to simplify the task of reducing the expression.

Property 163 Suppose a ≡ a′ (mod n) and b ≡ b′ (mod n) for a modulus n ≥ 1. Then

a + b ≡ a′ + b′ (mod n)

and

a − b ≡ a′ − b′ (mod n)

Proof 164 By definition of congruence, we have a − a′ = kn and b − b′ = mn for some integers k and m. Then

a + b = (kn + a′) + (mn + b′)
      = (k + m)n + a′ + b′

66 https://en.wikipedia.org/wiki/Euclidean_division


Since a + b and a′ + b′ differ by an integer (k + m) multiple of n, we conclude that a + b ≡ a′ + b′ (mod n). The proof for a − b ≡ a′ − b′ (mod n) is similar.

Property 165 Suppose a ≡ a′ (mod n) and b ≡ b′ (mod n) for a modulus n ≥ 1. Then

ab ≡ a′b′ (mod n)

Proof 166 By definition of congruence, we have a − a′ = kn and b − b′ = mn for some integers k and m. Then

ab = (kn + a′) ⋅ (mn + b′)
   = kmn² + a′mn + b′kn + a′b′
   = (kmn + a′m + b′k)n + a′b′

Since ab and a′b′ differ by an integer (kmn + a′m + b′k) multiple of n, we conclude that ab ≡ a′b′ (mod n).

Corollary 167 Suppose a ≡ b (mod n). Then for any integer k ≥ 0,

a^k ≡ b^k (mod n)

We can prove Corollary 167 by observing that a^k = a ⋅ a ⋅ ⋅⋅⋅ ⋅ a is the product of k copies of a and applying Property 165 along with induction over k. We leave the details as an exercise.

The following is an example that applies the properties above to reduce a complicated expression.

Example 168 Suppose we wish to find an element a ∈ Z_7 such that

(2^203 ⋅ 3^281 + 4^370)^376 ≡ a (mod 7)

Note that the properties above do not give us the ability to reduce any of the exponents modulo 7. (Later, we will see Fermat's little theorem (page 251), which does give us a means of simplifying exponents. We have also seen fast modular exponentiation (page 199), but we will not use that here.) We can use Corollary 167 once we reduce the base

2^203 ⋅ 3^281 + 4^370

which we can recursively reduce using the properties above.

Let's start with 2^203. Observe that 2^3 = 8 ≡ 1 (mod 7). Then

2^203 = 2^201 ⋅ 2^2
      = (2^3)^67 ⋅ 4
      ≡ 1^67 ⋅ 4 (mod 7)
      ≡ 4 (mod 7)


Similarly, 3^3 = 27 ≡ −1 (mod 7), so 3^6 ≡ 1 (mod 7). This gives us

3^281 = 3^276 ⋅ 3^3 ⋅ 3^2
      = (3^6)^46 ⋅ 3^3 ⋅ 3^2
      ≡ 1^46 ⋅ (−1) ⋅ 9 (mod 7)
      ≡ −9 (mod 7)
      ≡ 5 (mod 7)

We also have 4^3 = 2^6 = (2^3)^2 ≡ 1 (mod 7), so

4^370 = 4^369 ⋅ 4
      = (4^3)^123 ⋅ 4
      ≡ 4 (mod 7)

Combining these using the addition and multiplication properties above, we get

2^203 ⋅ 3^281 + 4^370 ≡ 4 ⋅ 5 + 4 (mod 7)
                      ≡ 24 (mod 7)
                      ≡ 3 (mod 7)

We can now reduce the full expression:

(2^203 ⋅ 3^281 + 4^370)^376 ≡ 3^376 (mod 7)
                            ≡ 3^372 ⋅ 3^4 (mod 7)
                            ≡ (3^6)^62 ⋅ (3^2)^2 (mod 7)
                            ≡ 1^62 ⋅ 9^2 (mod 7)
                            ≡ 1 ⋅ 2^2 (mod 7)
                            ≡ 4 (mod 7)

Thus, 𝑎 ≡ 4 (mod 7).
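This reduction can be double-checked directly with Python's built-in three-argument pow, which performs modular exponentiation (a quick sketch, not part of the text):

```python
# Verify Example 168: reduce the base, then the full expression, modulo 7
base = (pow(2, 203, 7) * pow(3, 281, 7) + pow(4, 370, 7)) % 7
print(base)                # 3
print(pow(base, 376, 7))   # 4
```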

22.1.1 Division and Modular Inverses

We have seen how to do addition, subtraction, multiplication, and exponentiation in modular arithmetic, as well as several properties that help in reducing a complicated expression using these operations. What about division? Division is not a closed operation over the set of integers, so it is perhaps not surprising that division is not always well-defined in modular arithmetic. However, for some combinations of n and a ∈ Z_n, we can determine a modular inverse that allows us to divide by a. Recall that in standard arithmetic, dividing by a number x is equivalent to multiplying by x^{−1}, the multiplicative inverse of x. For example:

21/4 = 21 ⋅ 4^{−1} = 21 ⋅ (1/4) = 21/4

In the same way, division by 𝑎 is defined modulo 𝑛 exactly when an inverse 𝑎−1 exists for 𝑎 modulo 𝑛.

Theorem 169 Let 𝑛 be a positive integer and 𝑎 be an element of Z+𝑛. An inverse of 𝑎 modulo 𝑛 is an element


𝑏 ∈ Z+𝑛 such that

𝑎 ⋅ 𝑏 ≡ 1 (mod 𝑛)

An inverse of 𝑎, denoted as 𝑎−1, exists modulo 𝑛 if and only if 𝑎 and 𝑛 are coprime, i.e. gcd(𝑎, 𝑛) = 1.

A modular inverse can be efficiently found using an extended version of Euclid’s algorithm (page 5).

Extended Euclidean Algorithm

The extended Euclidean algorithm can be used to find modular inverses where one exists. More generally, given x > y ≥ 0, the algorithm returns a triple (g, a, b) such that

𝑔 = 𝑎𝑥 + 𝑏𝑦

and 𝑔 = gcd(𝑥, 𝑦).

The algorithm is as follows:

Algorithm ExtendedEuclid(x, y):
    If y = 0 return (x, 1, 0)
    (g, a′, b′) := ExtendedEuclid(y, x mod y)
    Return (g, b′, a′ − b′ ⋅ ⌊x/y⌋)

The complexity bound is the same as for Euclid's algorithm (page 9) – O(log x) number of operations. We demonstrate that g = ax + by by (strong) induction over y:

• Base case: 𝑦 = 0. The algorithm returns (𝑔, 𝑎, 𝑏) = (𝑥, 1, 0), and we have

𝑎𝑥 + 𝑏𝑦 = 1 ⋅ 𝑥 + 0 ⋅ 0 = 𝑥 = 𝑔

• Inductive step: y > 0. Let x = qy + r, so that q = ⌊x/y⌋ and r = x mod y < y. Given arguments y and r, the recursion produces (g, a′, b′). By the inductive hypothesis, we assume that

g = a′y + b′r

The algorithm computes the return values 𝑎 and 𝑏 as

a = b′
b = a′ − b′q

Then

ax + by = b′x + (a′ − b′q)y
        = b′(qy + r) + (a′ − b′q)y
        = b′r + a′y
        = g

Thus, we conclude that 𝑔 = 𝑎𝑥 + 𝑏𝑦 as required.

When 𝑔 = 1, we have

𝑎𝑥 = 1 − 𝑏𝑦

𝑏𝑦 = 1 − 𝑎𝑥


which imply that

ax ≡ 1 (mod y)
by ≡ 1 (mod x)

by definition of modular arithmetic. Thus, the extended Euclidean algorithm computes a as the inverse of x modulo y and b as the inverse of y modulo x, when these inverses exist.

Example 170 We compute the inverse of 13 modulo 21 using the extended Euclidean algorithm. Since 13 and 21 are coprime, such an inverse must exist.

We keep track of the values of x, y, g, a, and b at each recursive step of the algorithm. First, we trace the algorithm from the initial x = 21, y = 13 down to the base case:

Step  x   y
0     21  13
1     13  8
2     8   5
3     5   3
4     3   2
5     2   1
6     1   0

We can then trace the algorithm back up, computing the return values 𝑔, 𝑎, 𝑏:

Step  x   y   g  a    b
6     1   0   1  1    0
5     2   1   1  0    1 − 0 ⋅ ⌊2/1⌋ = 1
4     3   2   1  1    0 − 1 ⋅ ⌊3/2⌋ = −1
3     5   3   1  −1   1 − (−1) ⋅ ⌊5/3⌋ = 2
2     8   5   1  2    −1 − 2 ⋅ ⌊8/5⌋ = −3
1     13  8   1  −3   2 − (−3) ⋅ ⌊13/8⌋ = 5
0     21  13  1  5    −3 − 5 ⋅ ⌊21/13⌋ = −8

Thus, we have by = −8 ⋅ 13 ≡ 1 (mod 21). Translating b to an element of Z_21, we get b = −8 ≡ 13 (mod 21), which means that 13 is its own inverse modulo 21. We can verify this:

13 ⋅ 13 = 169 = 8 ⋅ 21 + 1 ≡ 1 (mod 21)

Exercise 171 The extended Euclidean algorithm is a constructive proof that when gcd(a, n) = 1 for n ∈ Z+ and a ∈ Z+_n, an inverse of a exists modulo n. Show that no such inverse exists when gcd(a, n) > 1, completing the proof of Theorem 169.


22.2 One-time Pad

We now take a look at the first category of cryptosystems, that of information-theoretic security, which relies on one-time pads.

Definition 172 (One-time Pad) A one-time pad is an encryption technique that relies on a key with the following properties:

• The key is a random string at least as long as the plaintext.

• The key is preshared between the communicating parties over some secure channel.

• The key is only used once (hence the name one-time pad).

As an example of a scheme that uses a one-time pad, consider a plaintext message

𝑚 = 𝑚1𝑚2 . . .𝑚𝑛

where each symbol 𝑚𝑖 is a lowercase English letter. Let the secret key also be composed of lowercase letters:

𝑘 = 𝑘1𝑘2 . . . 𝑘𝑛

Here, the key is the same length as the message. We encrypt the message as

𝐸𝑘(𝑚) = 𝑐1𝑐2 . . . 𝑐𝑛

by taking the sum of each plaintext character and the corresponding key character modulo 26:

𝑐𝑖 ≡ 𝑚𝑖 + 𝑘𝑖 (mod 26)

We map between lowercase characters and integers modulo 26, with 0 taken to be the letter a, 1 to be the letter b, and so on. Then decryption is as follows:

𝐷𝑘(𝑐) = 𝑑1𝑑2 . . . 𝑑𝑛, where 𝑑𝑖 ≡ 𝑐𝑖 − 𝑘𝑖 (mod 26)

Applying both operations results in 𝐷𝑘(𝐸𝑘(𝑚)) = 𝑚, the original plaintext message.

As a concrete example, suppose 𝑚 = flower and 𝑘 = lafswl. Then the ciphertext is

𝐸𝑘(𝑚) = qltoac

and we have 𝐷𝑘(𝐸𝑘(𝑚)) = flower.
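The letter-by-letter encryption and decryption are easy to express in Python; the sketch below (not from the text) reproduces the flower/lafswl example:

```python
def shift(text, key, sign):
    # Shift each letter of text by the corresponding key letter, modulo 26.
    return "".join(
        chr((ord(t) - ord("a") + sign * (ord(k) - ord("a"))) % 26 + ord("a"))
        for t, k in zip(text, key)
    )

def encrypt(m, k): return shift(m, k, +1)
def decrypt(c, k): return shift(c, k, -1)

print(encrypt("flower", "lafswl"))                      # qltoac
print(decrypt(encrypt("flower", "lafswl"), "lafswl"))   # flower
```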

Observe that Eve has no means of learning any information about m from the ciphertext qltoac, other than that the message is six letters (and we can avoid even that by padding the message with random additional data). Even if Eve knows that the message is in English, she has no way of determining which six-letter word it is. For instance, the ciphertext qltoac could have been produced by the plaintext futile and key lragpy. In fact, for any six-letter sequence s, there is a key k such that E_k(s) = c for any six-letter ciphertext c. Without having access to either the plaintext or key directly, Eve cannot tell whether the original message is flower, futile, or some other six-letter sequence.

Thus, a one-time-pad scheme provides information-theoretic security – an adversary cannot recover information about the message that they do not already know. In fact, one-time-pad schemes are the only cryptosystems that provide this level of security. However, there are significant tradeoffs, which are exactly the core requirements of a one-time pad:

• The key must be preshared between the communicating parties through some other, secure channel.

• The key has to be as long as the message, which limits the amount of information that can be communicated given a preshared key.


• The key can only be used once.

These limitations make one-time pads costly to use in practice. Perhaps by relaxing some of these restrictions, we can obtain "good-enough" security at a lower cost? We first take a look at what happens when we reduce the key size – in fact, we will take this to the extreme, reducing our key size to a single symbol. This results in a scheme known as a Caesar cipher.

Definition 173 (Caesar Cipher) Let m = m_1 m_2 . . . m_n be a plaintext message. Let s be a single symbol to be used as the key. Then in a Caesar cipher, m is encrypted as

𝐸𝑠(𝑚) = 𝑐1𝑐2 . . . 𝑐𝑛, where 𝑐𝑖 ≡ 𝑚𝑖 + 𝑠 (mod 26)

and a ciphertext 𝑐 is decrypted as

𝐷𝑠(𝑐) = 𝑑1𝑑2 . . . 𝑑𝑛, where 𝑑𝑖 ≡ 𝑐𝑖 − 𝑠 (mod 26)

The result is 𝐷𝑠(𝐸𝑠(𝑚)) = 𝑚.

Observe that a Caesar cipher is similar to a one-time pad, except that the key only has one character of randomness as opposed to n; essentially, we have k_i = s for all i, i.e. a key where all the characters are the same.

Unfortunately, the Caesar cipher suffers from a fatal flaw in that it can be defeated by statistical analysis – in particular, the relative frequencies of letters in the ciphertext allow symbols to be mapped to the underlying message, using a frequency table of how common individual letters are in the language in which the message is written. In English, for instance, the letters e and t are most common. A full graph of frequencies of English letters in "average" English text is as follows:

(Figure: bar chart of the relative frequencies of the letters a through z in average English text.)

A Caesar cipher merely does a circular shift of these frequencies, making it straightforward to recover the key as the magnitude of that shift. The longer the message, the more likely the underlying frequencies match the average for the


language, and the more easily the scheme is broken.

Exercise 174 Suppose the result of applying a Caesar cipher produces the ciphertext

𝐸𝑠(𝑚) = cadcq

Use frequency analysis to determine both the original message 𝑚 and the key 𝑠.

Note that another weakness in the Caesar-cipher scheme as described above is that the key is restricted to one of twenty-six possibilities, making it trivial to brute force the mapping. However, the scheme can be tweaked to use a much larger character set, making it harder to brute force but still leaving it open to statistical attacks in the form of frequency analysis.

A scheme that compromises between a one-time pad and Caesar cipher, by making the key larger than a single character but smaller than the message size, still allows information to leak through statistical attacks. Such a scheme is essentially the same as reusing a one-time pad more than once. What can go wrong if we do so?

Suppose we have the following two plaintext messages

m = m_1 m_2 . . . m_n
m′ = m′_1 m′_2 . . . m′_n

where each character is a lowercase English letter. If we encode them both with the same key k = k_1 k_2 . . . k_n, we obtain the ciphertexts

E_k(m) = c_1 c_2 . . . c_n, where c_i ≡ m_i + k_i (mod 26)
E_k(m′) = c′_1 c′_2 . . . c′_n, where c′_i ≡ m′_i + k_i (mod 26)

If both these ciphertexts go out over the insecure channel, Eve can observe them both and compute their difference:

E_k(m) − E_k(m′) = (c_1 − c′_1)(c_2 − c′_2) . . . (c_n − c′_n)
                 = (m_1 − m′_1)(m_2 − m′_2) . . . (m_n − m′_n)
                 = m − m′

Here, all subtractions are done modulo 26. The end result is the character-by-character difference between the two messages (modulo 26). Unless the plaintext messages are random strings, this difference is not random! Again, statistical attacks can be used to obtain information about the original messages.

As a pictorial illustration of how the difference of two messages reveals information, the following is an image encoded with a random key under a one-time pad:

The result appears as just random noise, as we would expect. The following is another image encoded with the same key:


This result too appears as random noise. However, it is the same noise, and if we subtract the two ciphertexts, the noise all cancels out:

The end result clearly reveals information about the original images.

In summary, the only way to obtain information-theoretic security is by using a one-time-pad scheme, where the key is at least as long as the message and is truly used only once. A one-time-pad-like scheme that compromises on these two characteristics opens up the scheme to statistical attacks. The core issue is that plaintext messages are not random – they convey information between parties by virtue of not being random. In a one-time pad, the key is the actual source of the randomness in the ciphertext. And if we weaken the scheme by reducing the key size or reusing a key, we lose enough randomness to enable an adversary to learn information about the plaintext messages.


CHAPTER

TWENTYTHREE

DIFFIE-HELLMAN KEY EXCHANGE

Many encryption schemes, including a one-time pad, are symmetric, using the same key for both encryption and decryption. Such a scheme requires a preshared secret key that is known to both communicating parties. Ideally, we'd like the two parties to be able to establish a shared key even if all their communication is over an insecure channel. This is known as the key exchange problem.

One solution to the key-exchange problem is the Diffie-Hellman protocol. The central idea is that each party has its own secret key – this is a private key, since it is never shared with anyone. They each use their own private key to generate a public key, which they transmit to each other over the insecure channel. A public key is generated in such a way that recovering the private key would require solving a computationally hard problem, resulting in computational rather than information-theoretic security. Finally, each party uses its own private key and the other party's public key to obtain a shared secret.

Before we examine the details of the Diffie-Hellman protocol, we discuss some relevant concepts from modular arithmetic. Let Z_q refer to the set of integers between 0 and q − 1, inclusive:

Z_q = {0, 1, 2, . . . , q − 1}

We then define a generator of Z𝑝, where 𝑝 is prime, as follows:

Definition 175 (Generator) An element g ∈ Z_p, where p is prime, is a generator of Z_p if for every nonzero element x ∈ Z_p (i.e. every element in Z+_p), there exists a number i ∈ N such that:

g^i ≡ x (mod p)

In other words, 𝑔 is an 𝑖th root of 𝑥 over Z𝑝 for some natural number 𝑖.

As an example, 𝑔 = 2 is a generator of Z5:

2^0 = 1 ≡ 1 (mod 5)
2^1 = 2 ≡ 2 (mod 5)
2^2 = 4 ≡ 4 (mod 5)
2^3 = 8 ≡ 3 (mod 5)

Similarly, 𝑔 = 3 is a generator of Z7:

3^0 = 1 ≡ 1 (mod 7)
3^1 = 3 ≡ 3 (mod 7)
3^2 = 9 ≡ 2 (mod 7)
3^3 = 27 ≡ 6 (mod 7)
3^4 = 81 ≡ 4 (mod 7)
3^5 = 243 ≡ 5 (mod 7)


On the other hand, 𝑔 = 2 is not a generator of Z7:

2^0 = 1 ≡ 1 (mod 7)
2^1 = 2 ≡ 2 (mod 7)
2^2 = 4 ≡ 4 (mod 7)
2^3 = 8 ≡ 1 (mod 7)
2^4 = 16 ≡ 2 (mod 7)
2^5 = 32 ≡ 4 (mod 7)
. . .

We see that the powers of g = 2 are cyclic, without ever generating the elements {3, 5, 6} ⊆ Z+_7.
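Checking whether a given element generates Z_p is straightforward to automate; the following Python sketch (not from the text) simply collects all powers of g modulo p and compares them to the nonzero elements:

```python
def is_generator(g, p):
    # g generates Z_p (p prime) iff its powers hit every nonzero element mod p.
    return {pow(g, i, p) for i in range(p - 1)} == set(range(1, p))

print(is_generator(2, 5), is_generator(3, 7), is_generator(2, 7))  # True True False
```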

Theorem 176 If 𝑝 is prime, then Z𝑝 has at least one generator.

This theorem was first proved by Gauss67. Moreover, Z_p has φ(p − 1) generators when p is prime, where φ(n) is Euler's totient function68, making it relatively easy to find a generator over Z_p.

We now return to the Diffie-Hellman protocol, which is as follows:

Protocol 177 (Diffie-Hellman) Suppose two parties, henceforth referred to as Alice and Bob, wish to establish a shared secret key k but can only communicate over an insecure channel. Alice and Bob first establish public parameters p, a very large prime number, and g, a generator of Z_p. Then:

• Alice picks a random a ∈ Z+_p as her private key, computes A ≡ g^a (mod p) as her public key, and sends A to Bob over the insecure channel.

• Bob picks a random b ∈ Z+_p as his private key, computes B ≡ g^b (mod p) as his public key, and sends B to Alice over the insecure channel.

• Alice computes

  k ≡ B^a ≡ (g^b)^a ≡ g^{ab} (mod p)

  as the shared secret key.

• Bob computes

  k ≡ A^b ≡ (g^a)^b ≡ g^{ab} (mod p)

  as the shared secret key.

This protocol is efficient – suitable values of p and g can be obtained from public tables, and Alice and Bob each need to perform two modular exponentiations, which can be done efficiently (page 199). But is it secure? By Kerckhoff's principle, Eve knows how the protocol works, and she observes the values p, g, A ≡ g^a (mod p), and B ≡ g^b (mod p). However, she does not know a or b, since those values are never communicated and so are known only to Alice and Bob, respectively. Can Eve obtain k ≡ g^{ab} (mod p) from what she knows? The Diffie-Hellman assumption states that she cannot do so efficiently.

Claim 178 (Diffie-Hellman Assumption) There is no efficient algorithm that computes 𝑔𝑎𝑏 (mod 𝑝) given:

• a prime 𝑝,

67 https://en.wikipedia.org/wiki/Carl_Friedrich_Gauss
68 https://en.wikipedia.org/wiki/Euler%27s_totient_function


• a generator g of Z_p,

• g^a (mod p), and

• g^b (mod p).

What if, rather than attempting to recover g^{ab} (mod p) directly, Eve attempts to recover one of the private keys a or b? This entails solving the discrete logarithm problem.

Definition 179 (Discrete Logarithm) Let 𝑞 be a modulus and let 𝑔 be a generator of Z𝑞 . The discrete logarithmof 𝑥 ∈ Z+𝑞 with respect to the base 𝑔 is an integer 𝑖 such that

g^i ≡ x (mod q)

The discrete logarithm problem is the task of computing 𝑖, given 𝑞, 𝑔, and 𝑥.

If q is prime and g is a generator of Z_q, then every x ∈ Z+_q has a discrete log i with respect to base g such that 0 ≤ i < q − 1.69 Thus, a brute-force algorithm to compute the discrete log is to try every possible value in this range. However, such an algorithm requires checking O(q) possible values, and this is exponential in the size of the inputs, which take O(log q) bits to represent. The brute-force algorithm is thus inefficient.

Do better algorithms exist for computing a discrete log? Indeed they do, including the baby-step giant-step algorithm70, which requires O(√q) multiplications on numbers of size O(log q). However, as we saw in primality testing (page 196), this is still not polynomial in the input size O(log q). In fact, the discrete logarithm assumption states that no efficient algorithm exists.

Claim 180 (Discrete Logarithm Assumption) There is no efficient algorithm to compute 𝑖 given:

• 𝑞,

• a generator 𝑔 of Z𝑞 , and

• g^i (mod q),

where 0 ≤ 𝑖 < 𝑞 − 1.

Neither the Diffie-Hellman nor the discrete logarithm assumption has been proven. However, they have both held up in practice, as there is no known algorithm to defeat either assumption on classical computers. As we will briefly discuss later, the assumptions do not hold on quantum computers (page 256), but this is not yet a problem in practice as no scalable quantum computer exists.

69 This is a consequence of Fermat's little theorem (page 251), which states that x^{q−1} ≡ 1 (mod q) for prime q and x ∈ Z+_q.

70 https://en.wikipedia.org/wiki/Baby-step_giant-step


CHAPTER

TWENTYFOUR

RSA

The Diffie-Hellman protocol uses private and public keys for key exchange, resulting in a shared key known to both communicating parties, which can then be used in a symmetric encryption scheme. However, the concept of a public-key cryptosystem can also be used to formulate an asymmetric encryption protocol, which uses different keys for encryption and decryption. The RSA (Rivest-Shamir-Adleman) cryptosystem gives rise to one such protocol that is widely used in practice.

As with Diffie-Hellman, the RSA system relies on facts about modular arithmetic. Core to the working of RSA is Fermat's little theorem and its variants.

Theorem 181 (Fermat's Little Theorem) Let p be a prime number. Let a be any element of Z+_p, where

Z+_p = {1, 2, . . . , p − 1}

Then a^{p−1} ≡ 1 (mod p).

Corollary 182 Let p be a prime number. Then a^p ≡ a (mod p) for any element a ∈ Z_p, where

Z_p = {0, 1, . . . , p − 1}

As an example, let 𝑝 = 7. Then:

0^7 = 0 ≡ 0 (mod 7)
1^7 = 1 ≡ 1 (mod 7)
2^7 = 128 = 2 + 18 ⋅ 7 ≡ 2 (mod 7)
3^7 = 2187 = 3 + 312 ⋅ 7 ≡ 3 (mod 7)
4^7 = 16384 = 4 + 2340 ⋅ 7 ≡ 4 (mod 7)
5^7 = 78125 = 5 + 11160 ⋅ 7 ≡ 5 (mod 7)
6^7 = 279936 = 6 + 39990 ⋅ 7 ≡ 6 (mod 7)

Fermat’s little theorem can be extended to the product of two distinct primes.

Theorem 183 Let n = pq be a product of two distinct primes, i.e. p ≠ q. Let a be an element of Z_n. Then

a^{1+kφ(n)} ≡ a (mod n)

where k ∈ N and φ(n) is Euler's totient function, whose value is the number of elements in Z+_n that are coprime to n. For a product of two distinct primes n = pq,

φ(pq) = (p − 1)(q − 1)


Proof 184 We prove Theorem 183 using Fermat's little theorem. First, we show that for any a ∈ Z_n where n = pq is the product of two distinct primes p and q, if a ≡ b (mod p) and a ≡ b (mod q), then a ≡ b (mod n). By definition of modular arithmetic, we have

a = kp + b   (since a ≡ b (mod p))
a = mq + b   (since a ≡ b (mod q))

where 𝑘 and 𝑚 are integers. Then:

𝑘𝑝 + 𝑏 = 𝑚𝑞 + 𝑏

𝑘𝑝 = 𝑚𝑞

Since p divides kp, p also divides mq; however, since p is a distinct prime from q, this implies that p divides m (by Euclid's lemma71, which states that if a prime p divides ab where a, b ∈ Z, then p must divide at least one of a or b). Thus, m = rp for some integer r, and we have

𝑎 = 𝑚𝑞 + 𝑏

= 𝑟𝑝𝑞 + 𝑏

≡ 𝑏 (mod 𝑝𝑞)

Now consider a^{1+kφ(n)}. If a = 0, we trivially have

0^{1+kφ(n)} ≡ 0 (mod n)

If 𝑎 ≠ 0, we have:

a^{1+kφ(n)} = a^{1+k(p−1)(q−1)}
            = a ⋅ (a^{p−1})^{k(q−1)}
            ≡ a ⋅ 1^{k(q−1)} (mod p)   (by Fermat's little theorem)
            ≡ a (mod p)

Similarly:

a^{1+kφ(n)} = a^{1+k(p−1)(q−1)}
            = a ⋅ (a^{q−1})^{k(p−1)}
            ≡ a ⋅ 1^{k(p−1)} (mod q)   (by Fermat's little theorem)
            ≡ a (mod q)

Applying our earlier result, we conclude that a^{1+kφ(n)} ≡ a (mod n).

71 https://en.wikipedia.org/wiki/Euclid%27s_lemma

As an example, let 𝑛 = 6. We have 𝜑(𝑛) = 2. Then:

0^3 = 0 ≡ 0 (mod 6)
1^3 = 1 ≡ 1 (mod 6)
2^3 = 8 = 2 + 1 ⋅ 6 ≡ 2 (mod 6)
3^3 = 27 = 3 + 4 ⋅ 6 ≡ 3 (mod 6)
4^3 = 64 = 4 + 10 ⋅ 6 ≡ 4 (mod 6)
5^3 = 125 = 5 + 20 ⋅ 6 ≡ 5 (mod 6)


Proof of Fermat’s Little Theorem

Fermat's little theorem has many proofs; we take a look at a proof that relies solely on the existence of modular inverses (Theorem 169).

Let 𝑝 be prime, and let 𝑎 ∈ Z+𝑝 be an arbitrary nonzero element of Z𝑝. Consider the set of elements

S = {1a, 2a, . . . , (p − 1)a}

These elements are all nonzero and distinct when taken modulo 𝑝:

• Suppose ka ≡ 0 (mod p). Since a ∈ Z+_p, it has an inverse a^{−1} modulo p, so we can multiply both sides by a^{−1} to obtain k ≡ 0 (mod p). Since k ≢ 0 (mod p) for all elements in {1, 2, . . . , p − 1}, ka ≢ 0 (mod p) for all elements ka ∈ S.

• Suppose ka ≡ ma (mod p). Multiplying both sides by a^{−1}, we obtain k ≡ m (mod p). Since all pairs of elements in {1, 2, . . . , p − 1} are distinct modulo p, all pairs of elements in S = {1a, 2a, . . . , (p − 1)a} are also distinct modulo p.

Since there are p − 1 elements in S, and they are all nonzero and distinct modulo p, the elements of S when taken modulo p are exactly those in Z+_p. Thus, the products of the elements in each set are equivalent modulo p:

1a × 2a × ⋅⋅⋅ × (p − 1)a ≡ 1 × 2 × ⋅⋅⋅ × (p − 1) (mod p)
(1 × 2 × ⋅⋅⋅ × (p − 1)) ⋅ a^{p−1} ≡ 1 × 2 × ⋅⋅⋅ × (p − 1) (mod p)

Since p is prime, all elements in Z+_p have an inverse modulo p, so we multiply both sides above by (1^{−1} × 2^{−1} × ⋅⋅⋅ × (p − 1)^{−1}) to obtain

a^{p−1} ≡ 1 (mod p)

Now that we have established the underlying mathematical facts, we take a look at the RSA encryption protocol.

Protocol 185 (RSA Encryption) Suppose Bob wishes to send a message m to Alice, but the two can only communicate over an insecure channel.

• Alice chooses two very large, distinct primes p and q. We assume without loss of generality that m, when interpreted as an integer (using the bitstring representation of m), is smaller than n = pq; otherwise, m can be divided into smaller pieces that are then sent separately using this protocol.

• Alice chooses an element

𝑒 ∈ Z+𝑛 such that gcd(𝑒, 𝜑(𝑛)) = 1

as her public key, with the requirement that it be coprime with φ(n) = (p − 1)(q − 1). She computes the inverse of e modulo φ(n)

𝑑 ≡ 𝑒−1 (mod 𝜑(𝑛))

as her private key.

• Alice sends 𝑛 and 𝑒 to Bob.

• Bob computes

c ≡ m^e (mod n)


as the ciphertext and sends it to Alice.

• Alice computes

m′ ≡ c^d (mod n)

with the result that 𝑚′= 𝑚.
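The whole protocol fits in a few lines of Python. The sketch below (not from the text) uses deliberately tiny primes so the numbers are easy to follow; real RSA keys use primes of roughly a thousand bits or more:

```python
from math import gcd

p, q = 61, 53                  # toy primes
n = p * q                      # 3233, the public modulus
phi = (p - 1) * (q - 1)        # 3120

e = 17                         # public key, coprime with phi(n)
assert gcd(e, phi) == 1
d = pow(e, -1, phi)            # private key: e^{-1} mod phi(n) (Python 3.8+)

m = 65                         # message, as an integer smaller than n
c = pow(m, e, n)               # Bob encrypts: c = m^e mod n
print(pow(c, d, n))            # Alice decrypts: c^d mod n recovers 65
```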

This protocol is efficient. Alice can find large primes using an efficient primality test such as the Fermat test (page 197) – in fact, she need only do this once, reusing the same parameters n, e, d for future communication. Computing the inverse of e can be done efficiently using the extended Euclidean algorithm. Finally, modular exponentiation can also be done efficiently (page 199).

Does the protocol work? We have

m′ ≡ c^d ≡ m^{ed} (mod n)

Since 𝑑 ≡ 𝑒−1 (mod 𝜑(𝑛)), we have

ed ≡ 1 (mod φ(n))
ed = 1 + kφ(n)

by definition of modular arithmetic, where 𝑘 ∈ N. By Theorem 183,

m′ ≡ m^{ed} (mod n)
   ≡ m^{1+kφ(n)} (mod n)
   ≡ m (mod n)

Thus, Alice does indeed recover the intended message 𝑚.

Finally, is the protocol secure? The public information that Eve can observe consists of n, e, and c ≡ m^e (mod n). Eve does not know the private parameters p, q, d, which Alice has kept to herself. Can Eve recover m? The RSA assumption states that she cannot do so efficiently.

Claim 186 (RSA Assumption) There is no algorithm that efficiently computes 𝑚 given:

• a product of two distinct primes 𝑛,

• an element 𝑒 ∈ Z+𝑛 such that gcd(𝑒, 𝜑(𝑛)) = 1, and

• m^e (mod n).

Rather than trying to compute m directly, Eve could attempt to compute φ(n), which would allow her to compute d ≡ e^{−1} (mod φ(n)) and thus decrypt the ciphertext c ≡ m^e (mod n) using the same process as Alice. However, recovering φ(n) = (p − 1)(q − 1) is as hard as factoring n = pq, and the factorization hardness assumption states that she cannot do this efficiently.

Claim 187 (Factorization Hardness Assumption) There is no efficient algorithm to compute the prime factorization p_1, p_2, . . . , p_k of an integer n, where

n = p_1^{a_1} ⋅ p_2^{a_2} ⋅ ⋅⋅⋅ ⋅ p_k^{a_k}

for a_i ∈ Z+.

Like the Diffie-Hellman and discrete logarithm assumptions, the RSA and factorization hardness assumptions have not been proven, but they appear to hold in practice. And like the former two assumptions, the latter two do not hold on quantum computers (page 256); we will discuss this momentarily.


Exercise 188 Suppose that n = pq is a product of two distinct primes p and q. Demonstrate that obtaining φ(n) is as hard as obtaining p and q by showing that given φ(n), the prime factors p and q can be computed efficiently.

Exercise 189 The ElGamal encryption scheme relies on the hardness of the discrete logarithm problem to perform encryption, like Diffie-Hellman does for key exchange. The following describes how Bob can send an encrypted message to Alice using ElGamal. Assume that a large prime p and generator g of Z_p have already been established.

• Alice chooses a private key 𝑎 ∈ Z+𝑝 , computes 𝐴 ≡ 𝑔𝑎 (mod 𝑝) as her public key, and sends 𝐴 to Bob.

• Bob chooses a private key 𝑏 ∈ Z+𝑝 , computes 𝐵 ≡ 𝑔𝑏 (mod 𝑝) as his public key, and sends 𝐵 to Alice.

• Alice and Bob both compute the shared key 𝑘 ≡ 𝑔𝑎𝑏 (mod 𝑝).

• Bob encrypts the message 𝑚 as

𝑐 ≡ 𝑚 ⋅ 𝑘 ≡ 𝑚 ⋅ 𝑔𝑎𝑏 (mod 𝑝)

and sends the ciphertext 𝑐 to Alice.

a) Show how Alice can recover 𝑚 from the information she knows (𝑝, 𝑔, 𝑎, 𝐴, 𝐵, 𝑘, 𝑐).

b) Demonstrate that if Eve has an efficient algorithm for recovering 𝑚 from what she can observe (𝑝, 𝑔, 𝐴,𝐵, 𝑐), she also has an efficient method for breaking Diffie-Hellman key exchange.

24.1 RSA Signatures

Thus far, we have focused on the privacy and confidentiality provided by encryption protocols. We briefly consider authentication and integrity, in the form of a signature scheme using RSA. The goal here is not to keep a secret – rather, given a message, we want to verify the identity of the author as well as the integrity of its contents. The way we do so is to essentially run the RSA encryption protocol "backwards" – rather than having Bob run the encryption process on a plaintext message and Alice the decryption on a ciphertext, we will have Alice apply the decryption function to a message she wants to sign, and Bob will apply the encryption function to the resulting signed message.

Protocol 190 (RSA Signature) Suppose Alice wishes to send a message m to Bob and allow Bob to verify that he receives the intended message. As with the RSA encryption protocol (page 253), Alice computes (e, n) as her public key and sends them to Bob, and she computes d ≡ e^{−1} (mod φ(n)) as her private key. Then:

• Alice computes

s ≡ m^d (mod n)

and sends 𝑚 and 𝑠 to Bob.

• Bob computes

m′ ≡ s^e (mod n)

and verifies that the result is equal to 𝑚.
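Continuing the toy RSA sketch from the previous section (the same hypothetical parameters n = 3233, e = 17, d = 2753), signing and verification look like this in Python:

```python
n, e, d = 3233, 17, 2753      # toy RSA parameters from the earlier sketch

m = 65
s = pow(m, d, n)              # Alice signs: s = m^d mod n
print(pow(s, e, n) == m)      # Bob verifies: s^e mod n should equal m -> True
```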


The correctness of this scheme follows from the correctness of RSA encryption; we have:

m′ ≡ s^e (mod n)
   ≡ m^{ed} (mod n)
   ≡ m^{1+kφ(n)} (mod n)
   ≡ m (mod n)

Thus, if m′ = m, Bob can be assured that Alice sent the message – only she knows d, so only she can efficiently compute m^d (mod n). In essence, the pair m, m^d (mod n) acts as a certificate (page 142) that the sender knows the secret key d.

In practice, this scheme needs a few more details to avoid the possibility of spoofing, or having someone else send a message purporting to be from Alice. We leave these details as an exercise.

Exercise 191 The RSA signature scheme as described above is subject to spoofing – it is possible for a third party to produce a pair m, s such that s ≡ m^d (mod n).

a) Describe one way in which someone other than Alice can produce a matching pair 𝑚,𝑚𝑑 (mod 𝑛).

b) Propose a modification to the scheme that addresses this issue.

Hint: Think about adding some form of padding to the message to assist in verifying that Alice was the author.

24.2 Quantum Computers and Cryptography

The abstraction of standard Turing machines (page 61) does not seem to quite capture the operation of quantum computers, but there are other models such as quantum Turing machines72 and quantum circuits73 that do so. The standard Church-Turing thesis (page 83) still applies – anything that can be computed on a quantum computer can be computed on a Turing machine. However, a quantum computer is a probabilistic model, so the extended Church-Turing thesis (page 141) does not apply – we do not know whether quantum computers can solve problems more efficiently than so-called classical computers. There are problems that are known to have efficient probabilistic algorithms on quantum computers but do not have known, efficient probabilistic algorithms on classical computers. Discrete logarithm and integer factorization, the problems that are core to Diffie-Hellman and RSA, are two such problems; they are both efficiently solvable on quantum computers by Shor's algorithm74.

In practice, scalable quantum computers pose significant implementation challenges, and they are a long way from posing a risk to the security of Diffie-Hellman and RSA. As of this writing, the largest number known to have been factored by Shor's algorithm on a quantum computer is 35, which was accomplished in 2019. This follows previous records of 21 in 2012 and 15 in 2001. Compare this to factorization on classical computers, where the 829-bit RSA-250 number [75] was factored in 2020. Typical keys currently used in implementations of RSA have at least 2048 bits, and both quantum and classical computers are far from attacking numbers of this size. Regardless, post-quantum cryptography [76] is an active area of research, so that if quantum computers do eventually become powerful enough to attack Diffie-Hellman or RSA, other cryptosystems can be used that are more resilient against quantum attacks.

In complexity-theoretic terms, the decision versions of discrete logarithm and integer factorization are in the complexity class BQP, which is the quantum analogue of BPP; in other words, BQP is the class of languages that have

72 https://en.wikipedia.org/wiki/Quantum_Turing_machine
73 https://en.wikipedia.org/wiki/Quantum_circuit
74 https://en.wikipedia.org/wiki/Shor%27s_algorithm
75 https://en.wikipedia.org/wiki/RSA_numbers#RSA-250
76 https://en.wikipedia.org/wiki/Post-quantum_cryptography


efficient two-sided-error randomized algorithms (page 214) on quantum computers. We know that

BPP ⊆ BQP

but we do not know whether this containment is strict. And like the relationship between BPP and NP, most complexity theorists do not believe that BQP contains all of NP. Neither the decision version of discrete logarithm nor that of integer factorization is believed to be NP-Complete – they are believed (but not proven) to be NP-Intermediate, i.e. in the class NPI of languages that are in NP but neither in P nor NP-Complete. It is known that if P ≠ NP, then the class NPI is not empty.

On the other hand, if P = NP, then computational security is not possible – such security relies on the existence of hard problems that are efficiently verifiable, and no such problem exists if P = NP.


Part VI

Appendix


CHAPTER

TWENTYFIVE

APPENDIX

25.1 Proof of the Simplified Chernoff Bounds

We previously showed (page 220) that the unsimplified Chernoff bounds

    Pr[X ≥ (1 + δ)μ] ≤ (e^δ / (1 + δ)^(1+δ))^μ    for δ > 0

    Pr[X ≤ (1 − δ)μ] ≤ (e^(−δ) / (1 − δ)^(1−δ))^μ    for 0 < δ < 1

hold. We now demonstrate that the simplified bounds

    Pr[X ≥ (1 + δ)μ] ≤ e^(−δ²μ/(2+δ))    for δ > 0

    Pr[X ≤ (1 − δ)μ] ≤ e^(−δ²μ/2)    for 0 < δ < 1

follow from the unsimplified bounds.
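Before proceeding, here is a quick numerical sanity check (only an illustrative sketch, not part of the proof): since the simplified upper-tail bound is obtained by weakening the unsimplified one, it should never be the smaller of the two. A few sample values of δ and μ in Python:

    import math

    def unsimplified_upper(delta, mu):
        # (e^delta / (1 + delta)^(1 + delta))^mu
        return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

    def simplified_upper(delta, mu):
        # e^(-delta^2 * mu / (2 + delta))
        return math.exp(-delta ** 2 * mu / (2 + delta))

    for delta in (0.1, 0.5, 1.0, 2.0):
        for mu in (1, 10, 100):
            assert unsimplified_upper(delta, mu) <= simplified_upper(delta, mu)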

Proof 192  We first consider the upper-tail Chernoff bound. For δ > 0, we have

    Pr[X ≥ (1 + δ)μ] ≤ (e^δ / (1 + δ)^(1+δ))^μ
                     = (e^δ / (e^(ln(1+δ)))^(1+δ))^μ
                     = (e^δ / e^((1+δ) ln(1+δ)))^μ
                     = (e^(δ − (1+δ) ln(1+δ)))^μ

From a list of logarithmic identities [77], we obtain the inequality

    ln(1 + x) ≥ 2x / (2 + x)

for x > 0. This gives us

    −ln(1 + x) ≤ −2x / (2 + x)
    e^(−ln(1+x)) ≤ e^(−2x/(2+x))

Applying this to our Chernoff-bound expression with x = δ, we get

    Pr[X ≥ (1 + δ)μ] ≤ (e^(δ − (1+δ) ln(1+δ)))^μ
                     ≤ (e^(δ − (1+δ)·2δ/(2+δ)))^μ
                     = (e^((δ(2+δ) − 2δ(1+δ))/(2+δ)))^μ
                     = (e^((2δ + δ² − 2δ − 2δ²)/(2+δ)))^μ
                     = (e^(−δ²/(2+δ)))^μ
                     = e^(−δ²μ/(2+δ))

resulting in the simplified upper-tail bound.

For the lower-tail Chernoff bound, we have for 0 < δ < 1:

    Pr[X ≤ (1 − δ)μ] ≤ (e^(−δ) / (1 − δ)^(1−δ))^μ
                     = (e^(−δ) / (e^(ln(1−δ)))^(1−δ))^μ
                     = (e^(−δ) / e^((1−δ) ln(1−δ)))^μ
                     = (e^(−δ − (1−δ) ln(1−δ)))^μ

The Taylor series for the natural logarithm [78] gives us, for 0 < x < 1,

    −ln(1 − x) = x + x²/2 + x³/3 + x⁴/4 + . . . ≥ x

since every term of the series is positive. We use this to establish the inequality

    (1 − x) ln(1 − x) ≥ −x + x²/2    for 0 < x < 1

Let f(x) = (1 − x) ln(1 − x) + x − x²/2. Then f(0) = 0, and

    f′(x) = −ln(1 − x) − 1 + 1 − x = −ln(1 − x) − x ≥ 0

by the Taylor-series bound above, so f(x) ≥ 0 on the interval 0 < x < 1, which is exactly the claimed inequality. Applying it to our Chernoff-bound expression with x = δ, we obtain

    Pr[X ≤ (1 − δ)μ] ≤ (e^(−δ − (1−δ) ln(1−δ)))^μ
                     ≤ (e^(−δ + δ − δ²/2))^μ
                     = (e^(−δ²/2))^μ
                     = e^(−δ²μ/2)

completing our proof of the simplified lower-tail bound.

77 https://en.wikipedia.org/wiki/List_of_logarithmic_identities#Inequalities
78 https://en.wikipedia.org/wiki/Mercator_series

25.2 Proof of the Upper-Tail Hoeffding’s Inequality

The upper-tail Hoeffding's inequality we use (Theorem 157) states that for a sequence of independent indicator random variables X_1, . . . , X_n,

    Pr[(1/n)X ≥ p + ε] ≤ e^(−2ε²n)

where X = X_1 + ⋯ + X_n, E[X_i] = p_i, and E[(1/n)X] = (1/n) ∑_i p_i = p. We proceed to prove a more general variant of this bound:

Theorem 193  Let X = X_1 + ⋯ + X_n, where the X_i are independent random variables such that X_i ∈ [0, 1], and the X_i have expectations E[X_i] = p_i, respectively. Let

    p = E[(1/n)X] = (1/n) ∑_i p_i

Let ε > 0 be a deviation from the expectation. Then

    Pr[(1/n)X ≥ p + ε] ≤ exp(−2ε²n)


(Note that exp(x) is alternate notation for e^x.)
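As a rough empirical illustration (a sketch only, not part of the proof), we can estimate the tail probability for indicator variables by simulation and compare it against the bound exp(−2ε²n); the parameters below are arbitrary choices for the illustration.

    import math
    import random

    def empirical_tail(n, p, eps, trials=20000):
        # Estimate Pr[(1/n) X >= p + eps] for X a sum of n Bernoulli(p) indicators.
        exceed = 0
        for _ in range(trials):
            mean = sum(random.random() < p for _ in range(n)) / n
            if mean >= p + eps:
                exceed += 1
        return exceed / trials

    n, p, eps = 100, 0.5, 0.1
    print("empirical tail:  ", empirical_tail(n, p, eps))
    print("Hoeffding bound: ", math.exp(-2 * eps ** 2 * n))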

Here, we do not require that the X_i are indicators, just that they are in the range [0, 1]. First, we establish a preliminary fact about the expectation of the product of independent random variables.

Lemma 194  Let X and Y be independent random variables. Then

    E[XY] = E[X] · E[Y]

Proof 195  By definition of expectation, we have

    E[XY] = ∑_{x,y} xy · Pr[X = x, Y = y]
          = ∑_x ∑_y xy · Pr[X = x] · Pr[Y = y]
          = ∑_x (x · Pr[X = x] · ∑_y y · Pr[Y = y])
          = ∑_x x · Pr[X = x] · E[Y]
          = E[Y] · ∑_x x · Pr[X = x]
          = E[Y] · E[X]

In the second step, we used the fact that X and Y are independent, so that Pr[X = x, Y = y] = Pr[X = x] · Pr[Y = y].

By induction, we can conclude that

    E[∏_i X_i] = ∏_i E[X_i]

when the X_i are independent.

We now proceed to prove Theorem 193. We have

    Pr[(1/n)X ≥ p + ε] = Pr[X ≥ n(p + ε)]
                       = Pr[e^(sX) ≥ exp(sn(p + ε))]

The latter step holds for all s > 0. The process of working with the random variable e^(sX) rather than X directly is the Chernoff bounding technique; the idea is that small deviations in X turn into large deviations in e^(sX), so that Markov's inequality (Theorem 135) produces a more useful result. We will choose an appropriate value of s later. For any value of s, e^(sX) is nonnegative, so we can apply Markov's inequality to obtain:

    Pr[e^(sX) ≥ exp(sn(p + ε))] ≤ E[e^(sX)] / exp(sn(p + ε))
                                = exp(−sn(p + ε)) · E[e^(sX)]
                                = exp(−sn(p + ε)) · E[exp(s(X_1 + X_2 + ⋯ + X_n))]
                                = exp(−sn(p + ε)) · E[e^(sX_1) e^(sX_2) ⋯ e^(sX_n)]
                                = exp(−sn(p + ε)) ∏_i E[e^(sX_i)]

In the last step, we applied the fact that the X_i are independent to conclude that the random variables e^(sX_i) are also independent, and therefore the expectation of their product is the product of their expectations.


We now need to establish a bound on E[e^(sX_i)].

Lemma 196  Let X be a random variable such that X ∈ [0, 1] and E[X] = p. Then

    E[e^(sX)] ≤ exp(sp + s²/8)

for all s ∈ ℝ.

Proof 197  We first observe that because e^(sx) is a convex function [79], we can bound it from above on the interval [0, 1] by the line that passes through the endpoints 1 and e^s:

    [Figure: the graph of e^(sx) on the interval [0, 1], lying below the chord x·e^s − x + 1 that connects the endpoints (0, 1) and (1, e^s).]

This line has slope e^s − 1 and y-intercept 1, giving us x·e^s − x + 1. Thus,

    e^(sx) ≤ x·e^s − x + 1

in the interval 0 ≤ x ≤ 1. Then

    E[e^(sX)] ≤ E[X·e^s − X + 1]
              = (e^s − 1)·E[X] + 1
              = p·e^s − p + 1

where the second step follows from linearity of expectation. We now have an upper bound on E[e^(sX)], but it is not a convenient one – we would like it to be of the form e^α, so that we can better use it in proving Theorem 193,


for which we’ve already obtained a bound that contains an exponential. Using the fact that 𝑧 = 𝑒ln 𝑧 , we get

E[𝑒𝑠𝑋] ≤ 𝑝𝑒𝑠− 𝑝 + 1

= exp (ln(𝑝𝑒𝑠 − 𝑝 + 1))

To further bound this expression, let

𝜑(𝑠) = ln(𝑝𝑒𝑠 − 𝑝 + 1)

Then finding an upper bound for 𝜑(𝑠) will in turn allow us to bound 𝑒𝜑(𝑠). By the Lagrange form of Taylor’s

theorem80, we get

𝜑(𝑠) = 𝜑(0) + 𝑠𝜑′(0) + 1

2𝑠2𝜑′′(𝑣)

for some 𝑣 ∈ [0, 𝑠]. To differentiate 𝜑(𝑠), we let 𝑓(𝑠) = ln 𝑠 and 𝑔(𝑠) = 𝑝𝑒𝑠 − 𝑝+ 1, so that 𝜑(𝑠) = (𝑓 𝑔)(𝑠).

Then by the chain rule81, we have82

𝜑′(𝑠) = (𝑓 𝑔)′(𝑠)

= 𝑓′(𝑔(𝑠)) ⋅ 𝑔′(𝑠)

=1

𝑔(𝑠) ⋅ 𝑔′(𝑠) (since 𝑓

′(𝑠) = 𝑑

𝑑𝑠ln 𝑠 =

1𝑠 )

=1

𝑝𝑒𝑠 − 𝑝 + 1⋅ 𝑝𝑒

𝑠

=𝑝𝑒

𝑠

𝑝𝑒𝑠 − 𝑝 + 1

To compute 𝜑′′(𝑠), we now let 𝑓(𝑠) = 𝑝𝑒

𝑠, so that 𝜑′(𝑠) = 𝑓(𝑠)/𝑔(𝑠). Then by the quotient rule83, we get

𝜑′′(𝑠) = 𝑓

′(𝑠)𝑔(𝑠) − 𝑓(𝑠)𝑔′(𝑠)𝑔2(𝑠)

=𝑝𝑒

𝑠(𝑝𝑒𝑠 − 𝑝 + 1) − 𝑝𝑒𝑠𝑝𝑒

𝑠

(𝑝𝑒𝑠 − 𝑝 + 1)2

=𝑝𝑒

𝑠

𝑝𝑒𝑠 − 𝑝 + 1⋅(𝑝𝑒𝑠 − 𝑝 + 1) − 𝑝𝑒

𝑠

𝑝𝑒𝑠 − 𝑝 + 1

=𝑝𝑒

𝑠

𝑝𝑒𝑠 − 𝑝 + 1(1 −

𝑝𝑒𝑠

𝑝𝑒𝑠 − 𝑝 + 1)

= 𝑡(1 − 𝑡)

where 𝑡 = 𝑝𝑒𝑠

𝑝𝑒𝑠−𝑝+1. Observe that 𝑡(1 − 𝑡) is a parabola with a maximum at 𝑡 = 1

2:

ℎ(𝑡) = 𝑡(1 − 𝑡) = 𝑡 − 𝑡2

ℎ′(𝑡) = 1 − 2𝑡

ℎ′′(𝑡) = −2

Setting ℎ′(𝑡) = 1 − 2𝑡 = 0, we get that 𝑡 = 1

2is an extremum, and since the second derivative ℎ

′′( 12) is negative,


it is a maximum. Thus, we have

    φ″(s) = t(1 − t) ≤ (1/2)(1 − 1/2) = 1/4

We can now plug everything into our result from applying Taylor's theorem:

    φ(s) = φ(0) + sφ′(0) + (1/2)s²φ″(v)
         = ln(p − p + 1) + s · p/(p − p + 1) + (1/2)s²φ″(v)
         = sp + (1/2)s²φ″(v)
         ≤ sp + (1/2)s² · (1/4)
         = sp + s²/8

Thus,

    E[e^(sX)] ≤ e^(φ(s)) ≤ exp(sp + s²/8)

as claimed.

79 https://en.wikipedia.org/wiki/Convex_function
80 https://en.wikipedia.org/wiki/Taylor%27s_theorem
81 https://en.wikipedia.org/wiki/Chain_rule
82 The function g(s) = p·e^s − p + 1 does not have any real roots for p ∈ [0, 1], so it is safe to divide by it.
83 https://en.wikipedia.org/wiki/Quotient_rule
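As a quick numerical check of Lemma 196 (again only an illustrative sketch, not part of the proof), consider an indicator random variable X with E[X] = p, for which E[e^(sX)] = p·e^s + (1 − p) exactly; this quantity should never exceed exp(sp + s²/8):

    import math

    for p in (0.1, 0.3, 0.5, 0.9):
        for s in (-2.0, -0.5, 0.5, 2.0):
            mgf = p * math.exp(s) + (1 - p)        # E[e^{sX}] for an indicator X
            bound = math.exp(s * p + s ** 2 / 8)   # the bound from Lemma 196
            assert mgf <= bound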

Continuing our proof of Theorem 193, we have

    Pr[e^(sX) ≥ exp(sn(p + ε))] ≤ exp(−sn(p + ε)) ∏_i E[e^(sX_i)]

Applying Lemma 196, we get

    Pr[e^(sX) ≥ exp(sn(p + ε))] ≤ exp(−sn(p + ε)) ∏_i E[e^(sX_i)]
                                ≤ exp(−sn(p + ε)) ∏_i exp(sp_i + s²/8)
                                = exp(−sn(p + ε)) · exp(∑_i (sp_i + s²/8))
                                = exp(−sn(p + ε)) · exp(snp + s²n/8)
                                = exp(−snε + s²n/8)

We now choose s to minimize the exponent r(s) = −snε + s²n/8. We have:

    r(s) = −snε + (n/8)s²
    r′(s) = −nε + (n/8)·2s = −nε + (n/4)s
    r″(s) = n/4


We have another parabola, and since the second derivative is positive, we obtain a minimum for r(s) at

    r′(s) = −nε + (n/4)s = 0
    s = 4ε

Then

    r(4ε) = −4nε² + (n/8)·16ε²
          = −4nε² + 2nε²
          = −2nε²

Thus,

    Pr[(1/n)X ≥ p + ε] ≤ Pr[e^(sX) ≥ exp(sn(p + ε))]
                       ≤ exp(−2nε²)

completing our proof of Theorem 193.

25.3 General Case of Hoeffding’s Inequality

We can further generalize Theorem 193 to the case where the individual random variables X_i are in the range [a_i, b_i]. We start by generalizing Lemma 196 to obtain Hoeffding's lemma [84].

Lemma 198 (Hoeffding's Lemma)  Let X be a random variable such that X ∈ [a, b], where a < b, and E[X] = p. Then

    E[e^(sX)] ≤ exp(sp + s²(b − a)²/8)

for all s ∈ ℝ.

Proof 199  Let X′ = (X − a)/(b − a). Then X′ is a random variable such that X′ ∈ [0, 1], and

    E[X′] = E[(X − a)/(b − a)] = (p − a)/(b − a)

by linearity of expectation. Let p′ = (p − a)/(b − a). Applying Lemma 196 to X′, we get

    E[e^(s′X′)] ≤ exp(s′p′ + s′²/8)

Now consider e^(sX). We have X = a + (b − a)X′, so

    E[e^(sX)] = E[exp(sa + s(b − a)X′)]
              = e^(sa) · E[exp(s(b − a)X′)]

84 https://en.wikipedia.org/wiki/Hoeffding%27s_lemma


Let s′ = s(b − a). From the result above, we have

    e^(sa) · E[exp(s(b − a)X′)] = e^(sa) · E[e^(s′X′)]
                                ≤ e^(sa) · exp(s′p′ + s′²/8)
                                = exp(sa + s′p′ + s′²/8)

Substituting p′ = (p − a)/(b − a) and s′ = s(b − a), we obtain the following for the exponent:

    sa + s′p′ + (1/8)s′² = sa + s(b − a)·(p − a)/(b − a) + (1/8)s²(b − a)²
                         = sa + s(p − a) + (1/8)s²(b − a)²
                         = sp + (1/8)s²(b − a)²

Thus,

    E[e^(sX)] ≤ exp(sa + s′p′ + s′²/8) = exp(sp + s²(b − a)²/8)

as claimed.

We can now derive the general case of Hoeffding's inequality [85].

Theorem 200 (Hoeffding's Inequality – General Case)  Let X = X_1 + ⋯ + X_n, where the X_i are independent random variables such that X_i ∈ [a_i, b_i] for a_i < b_i and E[X_i] = p_i. Let

    p = E[(1/n)X] = (1/n) ∑_i p_i

Let ε > 0 be a deviation from the expectation. Then

    Pr[(1/n)X ≥ p + ε] ≤ exp(−2n²ε² / ∑_i (b_i − a_i)²)

Proof 201  We proceed as in the proof of Theorem 193 to obtain

    Pr[(1/n)X ≥ p + ε] = Pr[X ≥ n(p + ε)]
                       = Pr[e^(sX) ≥ exp(sn(p + ε))]
                       ≤ E[e^(sX)] / exp(sn(p + ε))    (Markov's inequality)
                       = exp(−sn(p + ε)) · E[e^(sX)]
                       = exp(−sn(p + ε)) · E[exp(s(X_1 + X_2 + ⋯ + X_n))]
                       = exp(−sn(p + ε)) · E[e^(sX_1) e^(sX_2) ⋯ e^(sX_n)]
                       = exp(−sn(p + ε)) ∏_i E[e^(sX_i)]    (independence of the X_i)

85 https://en.wikipedia.org/wiki/Hoeffding%27s_inequality


Applying Hoeffding’s lemma (Lemma 198), we obtain

exp (−𝑠𝑛(𝑝 + 𝜀))∏𝑖

E[𝑒𝑠𝑋𝑖] ≤ exp (−𝑠𝑛(𝑝 + 𝜀))∏𝑖

exp (𝑠𝑝𝑖 + 𝑠2(𝑏𝑖 − 𝑎𝑖)2/8)

= exp (−𝑠𝑛(𝑝 + 𝜀)) ⋅ exp(∑𝑖

(𝑠𝑝𝑖 + 𝑠2(𝑏𝑖 − 𝑎𝑖)2/8))

= exp (−𝑠𝑛(𝑝 + 𝜀)) ⋅ exp(𝑠𝑛𝑝 + 𝑠2

8∑𝑖

(𝑏𝑖 − 𝑎𝑖)2)

= exp(−𝑠𝑛𝜀 + 𝑠2

8∑𝑖

(𝑏𝑖 − 𝑎𝑖)2)

We choose 𝑠 to minimize the exponent 𝑟(𝑠):

𝑟(𝑠) = −𝑠𝑛𝜀 + 𝑠2

8∑𝑖

(𝑏𝑖 − 𝑎𝑖)2

𝑟′(𝑠) = −𝑛𝜀 + 𝑠

4∑𝑖

(𝑏𝑖 − 𝑎𝑖)2

𝑟′′(𝑠) = 1

4∑𝑖

(𝑏𝑖 − 𝑎𝑖)2

Since a_i < b_i, we have b_i − a_i > 0 for all i, so that r″(s) is positive. Thus, we obtain a minimum for r(s) when r′(s) = 0:

    r′(s) = −nε + (s/4) ∑_i (b_i − a_i)² = 0
    s = 4nε / ∑_i (b_i − a_i)²

Then

    r(4nε / ∑_i (b_i − a_i)²) = −nε · 4nε / ∑_i (b_i − a_i)² + (16n²ε² / (∑_i (b_i − a_i)²)²) · (1/8) ∑_i (b_i − a_i)²
                              = −4n²ε² / ∑_i (b_i − a_i)² + 2n²ε² / ∑_i (b_i − a_i)²
                              = −2n²ε² / ∑_i (b_i − a_i)²

Thus,

    Pr[(1/n)X ≥ p + ε] ≤ Pr[e^(sX) ≥ exp(sn(p + ε))]
                       ≤ exp(−2n²ε² / ∑_i (b_i − a_i)²)

completing our proof of the general case of Hoeffding’s inequality.
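As a final illustration (a sketch with arbitrarily chosen ranges, not part of the proof), the general bound specializes to Theorem 193 when every X_i lies in [0, 1], since then ∑_i (b_i − a_i)² = n and the exponent −2n²ε²/n reduces to −2nε²; wider ranges only weaken the bound:

    import math

    def general_hoeffding_bound(eps, ranges):
        # exp(-2 n^2 eps^2 / sum_i (b_i - a_i)^2) for X_i in [a_i, b_i]
        n = len(ranges)
        denom = sum((b - a) ** 2 for a, b in ranges)
        return math.exp(-2 * n ** 2 * eps ** 2 / denom)

    eps, n = 0.1, 100
    unit_ranges = [(0.0, 1.0)] * n   # the setting of Theorem 193
    wide_ranges = [(0.0, 2.0)] * n   # wider ranges give a weaker (larger) bound
    print(general_hoeffding_bound(eps, unit_ranges), math.exp(-2 * n * eps ** 2))
    print(general_hoeffding_bound(eps, wide_ranges))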


Part VII

About


CHAPTER

TWENTYSIX

ABOUT

This text was originally written for EECS 376, the Foundations of Computer Science course at the University of Michigan, by Amir Kamil [86] in Fall 2020. This is version 0.3 of the text.

This text is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license [87].

86 https://web.eecs.umich.edu/~akamil
87 https://creativecommons.org/licenses/by-sa/4.0/

