
Classical and Quantum Computation

A. Yu. Kitaev, A. H. Shen, M. N. Vyalyi

Contents

Foreword ix

Notation xiii

Introduction 1
Part 1. Classical Computation 9
1. Turing machines 9
1.1. Definition of a Turing machine 10
1.2. Computable functions and decidable predicates 11
1.3. Turing's thesis and universal machines 12
1.4. Complexity classes 14
2. Boolean circuits 17
2.1. Definitions. Complete bases 17
2.2. Circuits versus Turing machines 20
2.3. Basic algorithms. Depth, space and width 23
3. The class NP: Reducibility and completeness 28
3.1. Nondeterministic Turing machines 28
3.2. Reducibility and NP-completeness 30
4. Probabilistic algorithms and the class BPP 36
4.1. Definitions. Amplification of probability 36
4.2. Primality testing 38
4.3. BPP and circuit complexity 42
5. The hierarchy of complexity classes 45
5.1. Games machines play 45
5.2. The class PSPACE 48


Part 2. Quantum Computation 53
6. Definitions and notation 55
6.1. The tensor product 55
6.2. Linear algebra in Dirac's notation 55
6.3. Quantum gates and circuits 58
7. Correspondence between classical and quantum computation 60
8. Bases for quantum circuits 65
8.1. Exact realization 65
8.2. Approximate realization 71
8.3. Efficient approximation over a complete basis 75
9. Definition of Quantum Computation. Examples 82
9.1. Computation by quantum circuits 82
9.2. Quantum search: Grover's algorithm 83
9.3. A universal quantum circuit 88
9.4. Quantum algorithms and the class BQP 90
10. Quantum probability 92
10.1. Probability for state vectors 92
10.2. Mixed states (density matrices) 94
10.3. Distance functions for density matrices 98
11. Physically realizable transformations of density matrices 100
11.1. Physically realizable superoperators: characterization 100
11.2. Calculation of the probability for quantum computation 102
11.3. Decoherence 102
11.4. Measurements 105
11.5. The superoperator norm 108
12. Measuring operators 112
12.1. Definition and examples 112
12.2. General properties 114
12.3. Garbage removal and composition of measurements 115
13. Quantum algorithms for Abelian groups 116
13.1. The problem of hidden subgroup in $(\mathbb{Z}_2)^k$; Simon's algorithm 117
13.2. Factoring and finding the period for raising to a power 119
13.3. Reduction of factoring to period finding 120
13.4. Quantum algorithm for finding the period: the basic idea 122
13.5. The phase estimation procedure 125
13.6. Discussion of the algorithm 130
13.7. Parallelized version of phase estimation. Applications 131
13.8. The hidden subgroup problem for $\mathbb{Z}^k$ 135
14. The quantum analogue of NP: the class BQNP 138


14.1. Modification of classical definitions 138
14.2. Quantum definition by analogy 139
14.3. Complete problems 141
14.4. Local Hamiltonian is BQNP-complete 144
14.5. The place of BQNP among other complexity classes 150
15. Classical and quantum codes 151
15.1. Classical codes 153
15.2. Examples of classical codes 154
15.3. Linear codes 155
15.4. Error models for quantum codes 156
15.5. Definition of quantum error correction 158
15.6. Shor's code 161
15.7. The Pauli operators and symplectic transformations 163
15.8. Symplectic (stabilizer) codes 167
15.9. Toric code 170
15.10. Error correction for symplectic codes 172
15.11. Anyons (an example based on the toric code) 173
Part 3. Solutions 177
S1. Problems of Section 1 177













Appendix A. Elementary Number Theory 237
A.1. Modular arithmetic and rings 237
A.2. Greatest common divisor and unique factorization 239
A.3. Chinese remainder theorem 241
A.4. The structure of finite Abelian groups 243
A.5. The structure of the group $(\mathbb{Z}/q\mathbb{Z})^*$ 245
A.6. Euclid's algorithm 247


A.7. Continued fractions 248

Bibliography 253

Index 257

Foreword

In recent years interest in what is called "quantum computers" has grown extraordinarily. The idea of using the possibilities of quantum mechanics in organizing computation looks all the more attractive now that experimental work has begun in this area.

However, the prospects for physical realization of quantum computers are presently entirely unclear. Most likely this will be a matter of several decades. The fundamental achievements in this area bear at present a purely mathematical character.

This book is intended for a first acquaintance with the mathematical theory of quantum computation. For the convenience of the reader, we give at the outset a brief introduction to the classical theory of computational complexity. The second part includes the descriptions of basic effective quantum algorithms and an introduction to quantum codes.

The book is based on material from the course "Classical and quantum computations", given by A. Shen (classical computations) and A. Kitaev (quantum computations) at the Independent Moscow University in Spring of 1998. In preparing the book we also used materials from the course Physics 229, Advanced Mathematical Methods of Physics (Quantum Computation), given by John Preskill and A. Kitaev at the California Institute of Technology in 1998-99 (solutions to some problems included in the course were proposed by Andrew Landahl). The original version of this book was published in Russian [37], but the present edition extends it in many ways.

The prerequisites for reading this book are modest. In essence, it is enough to know the basics of linear algebra (as studied in a standard university course), elementary probability, basic notions of group theory, and a few concepts from the theory of algorithms (some computer programming experience may do as well as the formal knowledge). Some topics require an acquaintance with Lie groups and homology of manifolds, but only at the level of definitions.

To reduce the amount of information the reader needs to digest at the first reading, part of the material is given in the form of problems and solutions. Each problem is assigned a grade according to its difficulty: 1 for an exercise in use of definitions, 2 for a problem that requires some work, 3 for a difficult problem which requires a nontrivial idea. (Of course, the difficulty of a problem is a subjective thing. Also, if several problems are based on the same idea, only the first of them is marked as difficult.) The grade appears in square brackets before the problem number. Some problems are marked with an exclamation sign, which indicates that they are almost as important as the main text. Thus, [1!] means an easy but important exercise, whereas [3] is a difficult problem which is safe to skip.

Further reading

In this book we focus on algorithm complexity (in particular, for quantum algorithms), while many related things are not covered. As a general reference on quantum information theory we recommend the book by Michael Nielsen and Isaac Chuang [51], which includes such topics as the von Neumann entropy, quantum communication channels, quantum cryptography, fault-tolerant computation, and various proposed schemes for the realization of a quantum computer. Another book on quantum computation and information was written by Josef Gruska [30]. Most original papers on the subject can be found in the electronic archive at http://arXiv.org, section "Quantum Physics" (quant-ph).

Acknowledgements

A.K. thanks Michael Freedman and John Preskill for many inspiring discussions on the topics included in this book. We are grateful to Andrew Landahl for providing the solution to Problem 3.6 and pointing to some inconsistencies in the original manuscript. Among other people who have helped us to improve the book are David DiVincenzo and Barbara Terhal. Thanks to the people at AMS, and especially to our patient editor Sergei Gelfand and the copy-editor Natalya Pluzhnikov, for their help in bringing this book into reality.

The book was written while A.K. was a member of Microsoft Research and Caltech, and while A.S. and M.V. were members of Independent Moscow University. The preparation of the original Russian version was started while all three of us were working at IMU, and A.K. was a member of L. D. Landau Institute for Theoretical Physics.

A.K. gratefully acknowledges the support from the National Science Foundation through Caltech's Institute for Quantum Information. M.V. acknowledges the support from the Russian Foundation for Basic Research under grant 02-01-00547.

Notation

$\vee$                            disjunction (logical OR)
$\wedge$                          conjunction (logical AND)
$\neg$                            negation
$\oplus$                          addition modulo 2 (and also the direct sum of linear subspaces)
$\widehat{\oplus}$                controlled NOT gate (p. 62)
␣                                 blank symbol in the alphabet of a Turing machine
$\delta(\cdot,\cdot)$             transition function of a Turing machine
$\delta_{jk}$                     Kronecker symbol
$\chi_S(\cdot)$                   characteristic function of the set $S$
$f_\oplus$                        invertible function corresponding to the Boolean function $f$ (p. 61)
$\overline{x_{n-1}\cdots x_0}$    number represented by binary digits $x_{n-1},\dots,x_0$
$\gcd(x,y)$                       greatest common divisor of $x$ and $y$
$a \bmod q$                       residue of $a$ modulo $q$
$\frac{a}{b}$                     representation of the rational number $a/b$ in the form of an irreducible fraction
$a \mid b$                        $a$ divides $b$
$a \equiv b \pmod{q}$             $a$ is congruent to $b$ modulo $q$
$A \Rightarrow B$                 $A$ implies $B$
$A \Leftrightarrow B$             $A$ is logically equivalent to $B$
$L_1 \propto L_2$                 Karp reduction of predicates ($L_1$ can be reduced to $L_2$ (p. 31))
$\lfloor x \rfloor$               the greatest integer not exceeding $x$
$\lceil x \rceil$                 the least integer not less than $x$


$A^*$                             set of all finite words in the alphabet $A$
$E^*$                             group of characters on the Abelian group $E$, i.e., $\mathrm{Hom}(E, \mathbf{U}(1))$
$z^*$                             complex conjugate of $z$
$\mathcal{M}^*$                   space of linear functionals on the space $\mathcal{M}$
$\langle\xi|$                     bra-vector (p. 56)
$|\xi\rangle$                     ket-vector (p. 56)
$\langle\xi|\eta\rangle$          inner product
$A^\dagger$                       Hermitian adjoint operator
$\widehat{G}$                     unitary operator corresponding to the permutation $G$ (p. 61)
$I_{\mathcal{L}}$                 identity operator on the space $\mathcal{L}$
$\Pi_{\mathcal{M}}$               projection (the operator of projecting onto the subspace $\mathcal{M}$)
$\mathrm{Tr}_{\mathcal{F}} A$     partial trace of the operator $A$ over the space (tensor factor) $\mathcal{F}$ (p. 96)
$A \cdot B$                       superoperator $\rho \mapsto A\rho B$ (p. 108)
$\mathcal{M}^{\otimes n}$         $n$-th tensor degree of $\mathcal{M}$
$\mathbb{C}(a, b, \dots)$         space generated by the vectors $a, b, \dots$
$\Lambda(U)$                      operator $U$ with quantum control (p. 65)
$U[A]$                            application of the operator $U$ to a quantum register (set of qubits) $A$ (p. 58)
$\mathcal{E}[A]$, $\mathcal{E}(n,k)$   error spaces (p. 156)
$\sigma(\alpha_1,\beta_1,\dots,\alpha_n,\beta_n)$   basis operators on the space $\mathcal{B}^{\otimes n}$ (p. 162)
$\mathrm{SympCode}(F,\mu)$        symplectic code (p. 168)
$|\cdot|$                         cardinality of a set or modulus of a number
$\|\cdot\|$                       norm of a vector (p. 71) or operator norm (p. 71)
$\|\cdot\|_{\mathrm{tr}}$         trace norm (p. 98)
$\|\cdot\|_{\diamond}$            superoperator norm (p. 110)
$\Pr[A]$                          probability of the event $A$
$P(\cdot\,|\,\cdot)$              conditional probability (in various contexts)
$P(\rho, \mathcal{M})$            quantum probability (p. 95)
$f(n) = O(g(n))$                  there exist numbers $C$ and $n_0$ such that $f(n) \le C g(n)$ for all $n \ge n_0$
$f(n) = \Omega(g(n))$             there exist numbers $C$ and $n_0$ such that $f(n) \ge C g(n)$ for all $n \ge n_0$
$f(n) = \Theta(g(n))$             $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ at the same time
$f(n) = \mathrm{poly}(n)$         means the same as $f(n) = n^{O(1)}$
$\mathrm{poly}(n,m)$              abbreviation for $\mathrm{poly}(n+m)$


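$\mathbb{N}$                      set of natural numbers, i.e., $\{0, 1, 2, \dots\}$
$\mathbb{Z}$                      set of integers
$\mathbb{R}$                      set of real numbers
$\mathbb{C}$                      set of complex numbers
$\mathbb{B}$                      classical bit (set $\{0, 1\}$)
$\mathcal{B}$                     quantum bit (qubit, space $\mathbb{C}^2$; p. 53)
$\mathbb{F}_q$                    finite field of $q$ elements
$\mathbb{Z}/n\mathbb{Z}$          ring of residues modulo $n$
$\mathbb{Z}_n$                    additive group of the ring $\mathbb{Z}/n\mathbb{Z}$
$(\mathbb{Z}/n\mathbb{Z})^*$      multiplicative group of invertible elements of $\mathbb{Z}/n\mathbb{Z}$
$\mathrm{Sp}_2(n)$                symplectic group of order $n$ over the field $\mathbb{F}_2$ (p. 165)
$\mathrm{ESp}_2(n)$               extended symplectic group of order $n$ over the field $\mathbb{F}_2$ (p. 164)
$\mathbf{L}(\mathcal{N})$         space of linear operators on $\mathcal{N}$
$\mathbf{L}(\mathcal{N},\mathcal{M})$   space of linear operators from $\mathcal{N}$ to $\mathcal{M}$
$\mathbf{U}(\mathcal{M})$         group of unitary operators in the space $\mathcal{M}$
$\mathbf{SU}(\mathcal{M})$        special unitary group in the space $\mathcal{M}$
$\mathbf{SO}(\mathcal{M})$        special orthogonal group in the Euclidean space $\mathcal{M}$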

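Notation for matrices:

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad K = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix};$$

Pauli matrices:

$$\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$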

Notation for complexity classes:

NC (p. 23)         NP (p. 28)            BQP (p. 91)
P (p. 15)          MA (p. 138)           BQNP (p. 139)
BPP (p. 37)        $\Sigma_k$ (p. 46)    PSPACE (p. 15)
PP (p. 92)         $\Pi_k$ (p. 46)       EXPTIME (p. 22)
P/poly (p. 20)

Introduction

All computers, beginning with Babbage's unconstructed "analytical machine"¹ and ending with the Cray, are based on the very same principles. From the logical point of view a computer consists of bits (variables, taking the values 0 and 1), and a program, that is, a sequence of operations, each using some bits. Of course, the newest computers work faster than old ones, but progress in this direction is limited. It is hard to imagine that the size of a transistor or another element will ever be smaller than $10^{-8}$ cm (the diameter of the hydrogen atom) or that the clock frequency will be greater than $10^{15}$ Hz (the frequency of atomic transitions), so that even the supercomputers of the future will not be able to solve computational problems having exponential complexity. Let us consider, for example, the problem of factoring an integer number $x$ into primes. The obvious method is to attempt to divide $x$ by all numbers from 2 to $\sqrt{x}$. If $x$ has $n$ digits (as written in the binary form), we need to go through $\sim \sqrt{x} \sim 2^{n/2}$ trials. There exists an ingenious algorithm that solves the same problem in approximately $\exp(c\,n^{1/3})$ steps ($c$ is a constant). But even so, to factor a number of a million digits, a time equal to the age of the Universe would not suffice. (There may exist more effective algorithms, but it seems impossible to dispense with the exponential.)
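The "obvious method" can be made concrete with a short sketch. The function name `trial_division` and the choice to count one trial per attempted divisor are illustrative assumptions, not part of the text; the point is only that for a prime input the number of trials grows like $\sqrt{x} \sim 2^{n/2}$.

```python
def trial_division(x):
    """Factor x >= 2 by trying divisors d = 2, 3, ... while d*d <= x.

    Returns (list_of_prime_factors, number_of_trial_divisions).
    """
    factors = []
    trials = 0
    d = 2
    while d * d <= x:
        trials += 1
        if x % d == 0:
            factors.append(d)   # d is prime: all smaller factors were removed
            x //= d
        else:
            d += 1
    if x > 1:
        factors.append(x)       # whatever remains is a prime factor
    return factors, trials


if __name__ == "__main__":
    # Primes are the worst case: every candidate up to sqrt(x) is tried.
    for p in (10007, 1000003, 2**31 - 1):
        _, trials = trial_division(p)
        print(p.bit_length(), "bits:", trials, "trial divisions")
```

Doubling the number of bits of a prime input roughly squares the number of trial divisions, which is the exponential growth the paragraph refers to.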

¹ Charles Babbage began his work on the "analytical machine" project in 1833. In contrast to calculating devices that already existed at the time, his was supposed to be a universal computer. Babbage devoted his whole life to its development, but was not successful in realizing his dream. (A simpler, nonuniversal machine was partially constructed. In fact, this smaller project could have been completed: in 1991 the machine was produced in accordance with Babbage's design.)

There is, however, another way of speeding up the calculation process for several special classes of problems. The situation is such that ordinary computers do not employ all the possibilities that are offered to us by nature. This assertion may seem extremely obvious: in nature there exists a multitude of processes that are unlike operations with zeros and ones. We might attempt to use those processes for the creation of an analog computer. For example, interference of light can be used to compute the Fourier transform. However, in most cases the gain in speed is not major, i.e., it depends weakly on the size of the device. The reason lies in the fact that the equations of classical physics (for example, Maxwell's equations) can be effectively solved on an ordinary digital computer. What does "effectively" mean? The calculation of an interference pattern may require more time than the real experiment by a factor of a million, because the speed of light is great and the wave length is small. However, as the size of the modelled physical system gets bigger, the required number of computational operations grows at a moderate rate: as the size raised to some power or, as is customarily said in complexity theory, polynomially. (As a rule, the number of operations is proportional to the quantity $Vt$, where $V$ is the volume and $t$ is the time.) Thus we see that classical physics is too "simple" from the computational point of view.

Quantum mechanics is more interesting from this perspective. Let us consider, for example, a system of $n$ spins. Each spin has two so-called basis states (0 = "spin up" and 1 = "spin down"), and the whole system has $2^n$ basis states $|x_1,\dots,x_n\rangle$ (each of the variables $x_1,\dots,x_n$ takes values 0 or 1). By a general principle of quantum mechanics, $\sum_{x_1,\dots,x_n} c_{x_1,\dots,x_n} |x_1,\dots,x_n\rangle$ is also a possible state of the system; here $c_{x_1,\dots,x_n}$ are complex numbers called amplitudes. The summation sign must be understood as a pure formality. In fact, the "sum" (also called a superposition) represents a new mathematical object: a vector in a $2^n$-dimensional complex vector space. Physically, $|c_{x_1,\dots,x_n}|^2$ is the probability to find the system in the basis state $|x_1,\dots,x_n\rangle$ by a measurement of the values of the variables $x_j$. (We note that such a measurement destroys the superposition.) For this to make sense, the formula $\sum_{x_1,\dots,x_n} |c_{x_1,\dots,x_n}|^2 = 1$ must hold. Therefore, the general state of the system (i.e., a superposition) is a unit vector in the $2^n$-dimensional complex space. A state change over a specified time interval is described by a unitary matrix of size $2^n \times 2^n$. If the time interval is very small ($\ll \hbar/J$, where $J$ is the energy of spin-spin interaction and $\hbar$ is Planck's constant), then this matrix is rather easily constructed; each of its elements is easily calculated knowing the interaction between the spins. If, however, we want to compute the change of the state over a large time interval, then it is necessary to multiply such matrices. For this purpose an exponentially large number of operations is needed. Despite much effort, no method has been found to simplify this computation (except for some special cases). Most plausibly, simulation of quantum mechanics is indeed an exponentially hard computational problem. One may think this is unfortunate, but let us take a different point of view: quantum mechanics being hard means it is powerful. Indeed, a quantum system effectively "solves" a complex computational problem: it models its very self.
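As a rough illustration of the state described above, the following sketch stores the $2^n$ amplitudes of an $n$-spin system in a plain list, normalizes them, and computes the measurement probabilities. The helper names (`basis_labels`, `normalize`, `probabilities`) and the randomly chosen amplitudes are assumptions made for this example only.

```python
import math
import random
from itertools import product


def basis_labels(n):
    """All 2**n basis states (x_1, ..., x_n), each x_j being 0 or 1."""
    return list(product((0, 1), repeat=n))


def normalize(amplitudes):
    """Rescale amplitudes so that the squared moduli sum to 1."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in amplitudes))
    return [c / norm for c in amplitudes]


def probabilities(amplitudes):
    """Probability |c_x|^2 of finding each basis state in a measurement."""
    return [abs(c) ** 2 for c in amplitudes]


n = 3
rng = random.Random(0)  # arbitrary amplitudes, fixed seed for repeatability
state = normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1))
                   for _ in range(2 ** n)])
probs = probabilities(state)

if __name__ == "__main__":
    for label, p in zip(basis_labels(n), probs):
        print(label, round(p, 4))
    # A classical description needs 2**n complex numbers:
    for spins in (10, 20, 30):
        print(spins, "spins ->", 2 ** spins, "amplitudes")
```

The last loop shows why direct simulation becomes costly: each additional spin doubles the number of amplitudes that must be stored and updated.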

Can we use quantum systems for solving other computational problems? What would be a mathematical model of a quantum computer that is just as independent of physical realization as are models of classical computation?² It seems that these questions were first posed in 1980 in the book by Yu. I. Manin [49]. They were also discussed in the works of R. Feynman [23, 24] and other authors. In 1985 D. Deutsch [20] proposed a concrete mathematical model, the quantum Turing machine, and in 1989 an equivalent but more convenient model, the quantum circuit [21] (the latter was largely based on Feynman's ideas).

What exactly is a quantum circuit? Suppose that we have $N$ spins, each located in a separate numbered compartment and completely isolated from the surrounding world. At each moment of time (as a computer clock ticks) we choose, at our discretion, any two spins and act on them with an arbitrary $4 \times 4$ unitary matrix. A sequence of such operations is called a quantum circuit. Each operation is determined by a pair of integers, indexing the spins, and sixteen complex numbers (the matrix entries). So a quantum circuit is a kind of computer program, which can be represented as text and written on paper. The word "quantum" refers to the way this program is executed.
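The description of a circuit as a list of (pair of spins, $4 \times 4$ matrix) entries can be sketched directly. The code below is a minimal classical simulation under assumptions of our own: spin $j$ is stored as bit $j$ of the list index, the matrix acts on the pair in the order $|x_j x_k\rangle$, and the names `apply_two_spin_gate`, `run_circuit`, `H_ON_FIRST` and `CNOT` are illustrative.

```python
import math


def apply_two_spin_gate(state, j, k, U):
    """Apply a 4x4 matrix U to spins j and k of a state vector.

    state: list of 2**N amplitudes; bit j of the index is the value of spin j.
    U[row][col]: row/col index is 2*x_j + x_k for the pair (x_j, x_k).
    """
    if j == k:
        raise ValueError("the two spins must be different")
    new_state = [0j] * len(state)
    mask = (1 << j) | (1 << k)
    for index, amp in enumerate(state):
        if amp == 0:
            continue
        col = 2 * ((index >> j) & 1) + ((index >> k) & 1)
        base = index & ~mask                 # all other spins unchanged
        for row in range(4):
            yj, yk = row >> 1, row & 1
            new_state[base | (yj << j) | (yk << k)] += U[row][col] * amp
    return new_state


def run_circuit(state, circuit):
    """A circuit is a list of (j, k, U) entries applied in order."""
    for j, k, U in circuit:
        state = apply_two_spin_gate(state, j, k, U)
    return state


s = 1 / math.sqrt(2)
H_ON_FIRST = [[s, 0, s, 0], [0, s, 0, s], [s, 0, -s, 0], [0, s, 0, -s]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

N = 3
initial = [0j] * (2 ** N)
initial[0] = 1                               # the state |0, 0, 0>
final = run_circuit(initial, [(0, 2, H_ON_FIRST), (0, 2, CNOT)])
```

In this example the two gates act on spins 0 and 2 and leave spin 1 untouched; the result is an equal superposition of "both spins 0" and "both spins 1". The simulation itself takes time proportional to $2^N$ per gate, which is exactly the cost a quantum device would avoid.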

Let us try to use a quantum circuit for calculating a function $F\colon \mathbb{B}^n \to \mathbb{B}^m$, where $\mathbb{B} = \{0, 1\}$ is the set of values of a classical bit.³ It is necessary to be able to enter the initial data, perform the computations, and read out the result. Input into a quantum computer is a sequence $(x_1,\dots,x_n)$ of zeros and ones, meaning that we prepare an initial state $|x_1,\dots,x_n,0,\dots,0\rangle$. (The amount of initial data, $n$, is usually smaller than the overall number of "memory cells," i.e., of spins, $N$. The remaining cells are filled with zeros.) The initial data are fed into a quantum circuit, which depends on the problem being solved, but not on the specific initial data. The circuit turns the initial state into a new quantum state,

$$|\psi(x_1,\dots,x_n)\rangle = \sum_{y_1,\dots,y_N} c_{y_1,\dots,y_N}(x_1,\dots,x_n)\,|y_1,\dots,y_N\rangle,$$

which depends on $(x_1,\dots,x_n)$. It is now necessary to read out the result. If the circuit were classical (and correctly designed to compute $F$), we would expect to find the answer in the first $m$ bits of the sequence $(y_1,\dots,y_N)$, i.e., we seek $(y_1,\dots,y_m) = F(x_1,\dots,x_n)$. To determine the actual result in the quantum case, the values of all spins should be measured. The measurement may produce any sequence of zeros and ones $(y_1,\dots,y_N)$, the probability of obtaining such a sequence being equal to $\bigl|c_{y_1,\dots,y_N}(x_1,\dots,x_n)\bigr|^2$. A quantum circuit is "correct" for a given function $F$ if the correct answer $(y_1,\dots,y_m) = F(x_1,\dots,x_n)$ is obtained with probability that is sufficiently close to 1. By repeating the computation several times and choosing the answer that is encountered most frequently, we can reduce the probability of an error to be as small as we want.

² The standard mathematical model of an ordinary computer is the Turing machine. Most other models in use are polynomially equivalent to this one and to each other, i.e., a problem that is solvable in $L$ steps in one model will be solvable in $cL^k$ steps in another model, where $c$ and $k$ are constants.

³ Any computational problem can be posed in this way. For example, if we wish to solve the problem of factoring an integer into primes, then $(x_1,\dots,x_n) = x$ (in binary notation) and $F(x)$ is a list of prime factors (in some binary code).
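The read-out step with repetition can be illustrated by a small classical simulation. This is only a sketch: the outcome distribution `example_probs` is made up for the example (the correct answer is assumed to be outcome 1, with probability 0.7), and `measure` and `majority_answer` are hypothetical helper names.

```python
import random
from collections import Counter


def measure(probabilities, rng):
    """Sample one measurement outcome (an index) with the given probabilities."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]


def majority_answer(probabilities, repetitions, rng):
    """Repeat the measurement and return the most frequent outcome."""
    counts = Counter(measure(probabilities, rng) for _ in range(repetitions))
    return counts.most_common(1)[0][0]


# Assumed distribution over four outcomes; outcome 1 plays the role of F(x).
example_probs = [0.1, 0.7, 0.1, 0.1]

if __name__ == "__main__":
    rng = random.Random(1)
    single_runs = [measure(example_probs, rng) for _ in range(1000)]
    print("single-run error rate:", 1 - single_runs.count(1) / 1000)
    print("majority of 101 runs:", majority_answer(example_probs, 101, rng))
```

A single run is wrong about 30% of the time in this made-up example, while the majority over many runs is wrong only if a rare outcome happens to dominate, an event whose probability falls off rapidly with the number of repetitions.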

We have just formulated (omitting some details) a mathematical model of quantum computation. Now, two questions arise naturally.

1. For which problems does quantum computation have an advantage in comparison with classical?
2. What system can be used for the physical realization of a quantum computer? (This does not necessarily have to be a system of spins.)

With regard to the first question we now know the following. First, on a quantum computer it is possible to model an arbitrary quantum system in polynomially many steps. This will allow us (when quantum computers become available) to predict the properties of molecules and crystals and to design microscopic electronic devices, say, 100 atoms in size. Presently such devices lie at the edge of technological possibility, but in the future they will likely be common elements of ordinary computers. So, a quantum computer will not be a thing to have in every home or office, but it will be used to make such things.

A second example is factoring integers into primes and analogous number-theoretic problems. In 1994 P. Shor [62] found a quantum algorithm⁴ which factors an $n$-digit integer in about $n^3$ steps. This beautiful result could have an outcome that is more harmful than useful: factoring allows one to break the most commonly used cryptosystem (RSA), to forge electronic signatures, etc. (But anyway, building a quantum computer is such a difficult task that cryptography users may have good sleep, at least for the next 10 years.) The method at the core of Shor's algorithms deals with Abelian groups. Some non-Abelian generalizations have been found, but it remains to be seen if they can be applied to any practical problem.

⁴ Without going into detail, a quantum algorithm is much the same thing as a quantum circuit. The difference lies in the fact that a circuit is defined for problems of fixed size ($n = \mathrm{const}$), whereas an algorithm applies to any $n$.

A third example is a search for a needed entry in an unsorted database. Here the gain is not so significant: to locate one entry in $N$ we need about $\sqrt{N}$ steps on a quantum computer, compared to $N$ steps on a classical one.
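The $\sqrt{N}$ scaling can be checked numerically with a small classical simulation of the standard textbook form of Grover's iteration (a sign flip on the marked entry followed by inversion about the mean). This passage does not spell the iteration out, so the procedure below and the iteration count $\lfloor (\pi/4)\sqrt{N} \rfloor$ are taken from the usual presentation and should be read as assumptions; the function names are illustrative.

```python
import math


def grover_iterations(num_entries):
    """Usual choice of iteration count, about (pi/4) * sqrt(N)."""
    return math.floor(math.pi / 4 * math.sqrt(num_entries))


def grover_success_probability(num_entries, marked, iterations):
    """Simulate the amplitudes classically; return P(measuring `marked`)."""
    amp = [1 / math.sqrt(num_entries)] * num_entries   # uniform superposition
    for _ in range(iterations):
        amp[marked] = -amp[marked]                     # oracle: flip the sign
        mean = sum(amp) / num_entries
        amp = [2 * mean - a for a in amp]              # inversion about the mean
    return amp[marked] ** 2


if __name__ == "__main__":
    for size in (16, 1024):
        k = grover_iterations(size)
        p = grover_success_probability(size, marked=3, iterations=k)
        print(size, "entries:", k, "iterations, success probability", round(p, 4))
```

For 1024 entries only 25 iterations are used, against roughly a thousand look-ups classically, which is the quadratic (not exponential) gain mentioned above.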

As of this writing, these are all known examples, not because quantum computers are useless for other problems, but because their theory has not been worked out yet. We can hope that there will soon appear new mathematical ideas that will lead to new quantum algorithms.

The physical realization of a quantum computer is an exceedingly interesting, but difficult problem. Only a few years ago doubts were expressed about its solvability in principle. The trouble is that an arbitrary unitary transformation can be realized only with certain accuracy. Apart from that, a system of spins or a similar quantum system cannot be fully protected from the disturbances of the surrounding environment. All this leads to errors that accumulate in the computational process. In $L \sim \delta^{-1}$ steps (where $\delta$ is the precision of each unitary transformation) the probability of an error will be of the order of 1, which renders the computation useless. In part this difficulty can be overcome using quantum error-correcting codes. In 1996 P. Shor [65] proposed a scheme of error correction in the quantum computing process (fault-tolerant quantum computation). The original method was not optimal but it was soon improved by a number of authors. The end result amounts to the following. There exists some threshold value $\delta_0$ such that for any precision $\delta < \delta_0$ arbitrarily long quantum computation is possible. However, for $\delta > \delta_0$ errors accumulate faster than we can succeed in correcting them. By various estimates, $\delta_0$ lies in the interval from $10^{-6}$ to $10^{-2}$ (the exact value depends on the character of the disturbances and the circuit that is used for error correction).
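The estimate $L \sim \delta^{-1}$ can be illustrated with a deliberately simplified model. Assume, purely for the sake of the example, that each step fails independently with probability $\delta$ (the text speaks of the precision of a unitary transformation, which is a different and subtler notion); then the chance of at least one failure in $L$ steps is $1 - (1-\delta)^L$. The function name `failure_probability` is hypothetical.

```python
def failure_probability(delta, steps):
    """P(at least one failure) if each step fails independently with prob. delta."""
    return 1.0 - (1.0 - delta) ** steps


if __name__ == "__main__":
    delta = 1e-3
    for steps in (10, 100, 1000, 10000):
        print(steps, "steps:", round(failure_probability(delta, steps), 4))
```

In this toy model the failure probability is close to $L\delta$ while $L \ll 1/\delta$ and reaches order 1 (about $1 - 1/e$) at $L = 1/\delta$, in line with the statement in the text.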

So, there are no obstacles in principle for the realization of a quantum computer. However, the problem is so difficult that it can be compared to the problem of controlled thermonuclear synthesis. In fact, it is essential to simultaneously satisfy several almost contradictory demands:

1. The elements of a quantum computer, the quantum bits (spins or something similar), must be isolated from one another and from the environment.
2. It is essential to have the possibility to act selectively on each pair of quantum bits (at least, on each neighboring pair). Generally, one needs to implement several types of elementary operations (called quantum gates) described by different unitary operators.
3. Each of the gates must be realized with precision $\delta < \delta_0$ (see above).
4. The quantum gates must be sufficiently nontrivial, so that any other operator is, in a certain sense, expressible in terms of them.


At the present time there exist several approaches to the problem of realizing a quantum computer.

1. Individual atoms or ions. This first-proposed and best-developed idea exists in several variants. For representing a quantum bit one can employ both the usual electron levels and the levels of fine and hyperfine structures. There is an experimental technique for keeping an individual ion or atom in the trap of a steady magnetic or alternating electric field for a reasonably long time (of the order of 1 hour). The ion can be "cooled down" (i.e., its vibrational motion eliminated) with the aid of a laser beam. Selecting the duration and frequency of the laser pulses, it is possible to prepare an arbitrary superposition of the ground and excited states. In this way it is rather easy to control individual ions. Within the trap, one can also place two or more ions at distances of several microns one from another, and control each of them individually. However, it is rather difficult to choreograph the interactions between the ions. To this end it has been proposed that collective vibrational modes (ordinary mechanical vibrations with a frequency of several MHz) be used. Dipole-dipole interactions could also be used, with the advantage of being a lot faster. A second method (for neutral atoms) is as follows: place atoms into separate electromagnetic resonators that are coupled to one another (at the moment it is unclear how to achieve this technically). Finally, a third method: using several laser beams, one can create a periodic potential ("optical lattice") which traps unexcited atoms. However, an atom in an excited state can move freely. Thus, by exciting one of the atoms for a certain time, one lets it move around and interact with its neighbors. This field of experimental physics is now developing rapidly and seems to be very promising.

2. Nuclear magnetic resonance. In a molecule with several different nuclear spins, an arbitrary unitary transformation can be realized by a succession of magnetic field pulses. This has been tested experimentally at room temperature. However, for the preparation of a suitable initial state, a temperature $< 10^{-3}$ K is required. Apart from difficulties with the cooling, undesirable interactions between the molecules increase dramatically as the liquid freezes. In addition, it is nearly impossible to address a given spin selectively if the molecule has several spins of the same kind.

3. Superconducting granules and "quantum dots". Under supercool temperatures, the unique degree of freedom of a small (submicron size) superconducting granule is its charge. It can change in magnitude by a multiple of two electron charges (since electrons in a superconductor are bound in pairs). Changing the external electric potential, one can achieve a situation where two charge states have almost the same energy. These two states can be used as basis states of a quantum bit. The granules interact with each other by means of Josephson junctions and mutual electric capacitance. This interaction can be controlled. A quantum dot is a microstructure which can contain a few electrons or even a single electron. The spin of this electron can be used as a qubit. The difficulty is that one needs to control each granule or quantum dot individually with high precision. This seems harder than in the case of free atoms, because all atoms of the same type are identical while parameters of fabricated structures fluctuate. This approach may eventually succeed, but a new technology is required for its realization.

4. Anyons. Anyons are quasi-particles (excitations) in certain two-dimensional quantum systems, e.g., in a two-dimensional electron liquid in a magnetic field. What makes them special is their topological properties, which are stable to moderate variation of system parameters. One of the authors (A.K.) considers this approach especially interesting (in view of it being his own invention, cf. [35]), so that we will describe it in more detail. (At a more abstract level, the connection between quantum computation and topology was discussed by M. Freedman [25].)

The fundamental difficulty in constructing a quantum computer is the necessity for realizing unitary transformations with precision $\delta < \delta_0$, where $\delta_0$ is between $10^{-2}$ and $10^{-6}$. To achieve this it is necessary, as a rule, to control the parameters of the system with still greater precision. However, we can imagine a situation where high precision is achieved automatically, i.e., where error correction occurs on the physical level. An example is given by two-dimensional systems with anyonic excitations.

All particles in three-dimensional space are either bosons or fermions. The wave function of bosons does not change if the particles are permuted. The wave function of fermions is multiplied by $-1$ under a transposition of two particles. In any case, the system is unchanged when each of the particles is returned to its prior position. In two-dimensional systems, more complex behavior is possible. Note, however, that the discussion is not about fundamental particles, such as an electron, but about excitations ("defects") in a two-dimensional electron liquid. Such excitations can move, transform to each other, etc., just like "genuine" particles.⁵ However, excitations in the two-dimensional electron liquid display some unusual properties. An excitation can have a fractional charge (for example, 1/3 of the charge of an electron). If one excitation makes a full turn around another, the state of the surrounding electron liquid changes in a precisely defined manner that depends on the types of the excitations and on the topology of the path, but not on the specific trajectory. In the simplest case, the wave function gets multiplied by a number (which is equal to $e^{2\pi i/3}$ for anyons in the two-dimensional electron liquid in a magnetic field at the filling factor 1/3). Excitations with such properties are called Abelian anyons. Another example of Abelian anyons is described (in a mathematical language) in Section 15.11.

⁵ Fundamental particles can also be considered as excitations in the vacuum which is, actually, a nontrivial quantum system. The difference is that the vacuum is unique, whereas the electron liquid and other "quantum media" can be designed to meet our needs.

More interesting are non-Abelian anyons, which have not yet been observed
experimentally. (Theory predicts their existence in a two-dimensional
electron liquid in a magnetic field at the filling factor 5/2.) In the presence
of non-Abelian anyons, the state of the surrounding electron liquid is
degenerate, the multiplicity of the degeneracy depending on the number of
anyons. In other words, there exist not one, but many states, which can
display arbitrary quantum superpositions. It is utterly impossible to act on
such a superposition without moving the anyons, so the system is ideally
protected from perturbations. If one anyon is moved around another, the
superposition undergoes a certain unitary transformation. This transformation
is absolutely precise. (An error can occur only if the anyon "gets out of
hand" as a result of quantum tunneling.)

At first glance, the design using anyons seems least realistic. Firstly,
Abelian anyons will not do for quantum computation, and non-Abelian ones
are still awaiting experimental discovery. Secondly, in order to realize a quantum
computer, it is necessary to control (i.e., detect and drag along a specified
path) each excitation in the system, and the excitations will probably be a
fraction of a micron apart from each other. This is an exceedingly complex technical
problem. However, taking into account the high demands for precision, it
may not be at all easier to realize any of the other approaches we have mentioned.
Beyond that, the idea of topological quantum computation, lying at
the foundation of the anyonic approach, might be realized by other means.
For example, a quantum degree of freedom protected from perturbation
might appear at the end of a "quantum wire" (a one-dimensional conductor
with an odd number of propagating electronic modes, placed in contact
with a three-dimensional superconductor).

Thus, the idea of a quantum computer looks so very attractive, and
so very unreal. It is likely that the design of an ordinary computer was
perceived in just that way at the time of Charles Babbage, whose invention
was realized only a hundred years later. We may hope that in our time
science and industry will develop faster, so that we will not have to wait
that long. Perhaps a couple of fresh ideas plus a few years for working out
a new technology will do.

Part 1

Classical Computation

1. Turing machines

Note. In this section we address the abstract notion of computability, of
which we only need a few basic properties. Therefore our exposition here
is very brief. For the most part, the omitted details are simply exercises
in programming a Turing machine, which is but a primitive programming
language. A little programming experience (in any language) suffices to
see that these tasks are doable but tedious.

Informally, an algorithm is a set of instructions; using it, "we need only to
carry out what is prescribed as if we were robots: neither understanding, nor
cleverness, nor imagination is required of us" [39]. Applying an algorithm
to its input (initial data) we get some output (result). (It is quite possible
that computation never terminates for some inputs; in this case we get no
result.)

Usually inputs and outputs are strings. A string is a finite sequence
of symbols (characters, letters) taken from some finite alphabet. Therefore,
before asking for an algorithm that, say, factors polynomials with integer
coefficients, we should specify the encoding, i.e., specify some alphabet $A$
and the representation of polynomials by strings over $A$. For example, each
polynomial may be represented by a string formed by digits, the letter x, and
the signs +, − and ×. In the answer, two factors can be separated by a special
delimiter, etc.

One should be careful here because sometimes the encoding becomes
really important. For example, if we represent large integers as bit strings (in
binary), it is rather easy to compare them (to find which of two given integers
is larger), but multiplication is more difficult. On the other hand, if an
integer is represented by its remainders modulo different primes $p_1, p_2, \dots, p_n$
(using the Chinese remainder theorem; see Theorem A.5 in Appendix A), it
is easy to multiply them, but comparison is more difficult. So we will specify
the encoding in case of doubt.
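The effect of the encoding can be seen in a small sketch. It assumes a hypothetical fixed list of primes (`PRIMES`) and illustrative helper names; it is not taken from the text. Multiplication acts independently on each remainder, whereas recovering the integer itself (needed, e.g., for comparison) requires the Chinese remainder theorem.

```python
from math import prod

# Hypothetical choice of distinct primes; integers below their product
# (here 3 * 5 * 7 * 11 = 1155) are represented uniquely.
PRIMES = (3, 5, 7, 11)

def to_residues(x, primes=PRIMES):
    """Represent x by its remainders modulo each prime."""
    return tuple(x % p for p in primes)

def multiply(a, b, primes=PRIMES):
    """Multiply two residue representations component by component."""
    return tuple((ai * bi) % p for ai, bi, p in zip(a, b, primes))

def from_residues(r, primes=PRIMES):
    """Recover the unique x in [0, prod(primes)) with the given remainders."""
    m = prod(primes)
    x = 0
    for ri, p in zip(r, primes):
        mi = m // p
        # pow(mi, -1, p) is the inverse of mi modulo p (Python 3.8+)
        x += ri * mi * pow(mi, -1, p)
    return x % m

a, b = to_residues(23), to_residues(45)
product_value = from_residues(multiply(a, b))  # 23 * 45, valid since it is below 1155
```

Deciding which of two residue tuples represents the larger integer is not possible component by component; in this sketch one would first call `from_residues` on both.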

We now give a formal definition of an algorithm.

1.1. Definition of a Turing machine.

Definition 1.1. A Turing machine (TM) consists of the following components:

- a finite set $S$ called the alphabet;
- an element $\sqcup \in S$ (blank symbol);
- a subset $A \subset S$ called the external alphabet; we assume that the blank
  symbol does not belong to $A$;
- a finite set $Q$ whose elements are called states of the TM;
- an initial state $q_0 \in Q$;
- a transition function, specifically, a partial function

(1.1) $\delta : Q \times S \to Q \times S \times \{-1, 0, 1\}$.

(The term "partial function" means that the domain of $\delta$ is actually a subset
of $Q \times S$. A function that is defined everywhere is called total.)

Note that there are infinitely many Turing machines, each representing a
particular algorithm. Thus the above components are more like a computer
program. We now describe the "hardware" such programs run on.

A Turing machine has a tape that is divided into cells. Each cell carries
one symbol from the machine alphabet $S$. We assume that the tape is
infinite to the right. Therefore, the content of the tape is an infinite sequence
$\sigma = s_0, s_1, \dots$ (where $s_i \in S$).

A Turing machine also has a read-write head that moves along the tape
and changes symbols: if we denote its position by $p = 0, 1, 2, \dots$, the head
can read the symbol $s_p$ and write another symbol in its place.

[Figure: the tape cells $s_0, s_1, \dots, s_p, \dots$ with cell numbers $0, 1, \dots, p, \dots$; the head is positioned over cell $p$.]

The behavior of a Turing machine is determined by a control device,
which is a finite-state automaton. At each step of the computation this
device is in some state $q \in Q$. The state $q$ and the symbol $s_p$ under the
head determine the action performed by the TM: the value of the transition
function, $\delta(q, s_p) = (q', s', \Delta p)$, contains the new state $q'$, the new symbol
$s'$, and the shift $\Delta p$ (for example, $\Delta p = -1$ means that the head moves to
the left).

More formally, the configuration of a TM is a triple $\langle \sigma; p; q \rangle$, where $\sigma$
is an infinite sequence $s_0, \dots, s_n, \dots$ of elements of $S$, $p$ is a nonnegative
integer, and $q \in Q$. At each step the TM changes its configuration $\langle \sigma; p; q \rangle$
as follows:

(a) it reads the symbol $s_p$;

(b) it computes the value of the transition function: $\delta(q, s_p) = (q', s', \Delta p)$
(if $\delta(q, s_p)$ is undefined, the TM stops);

(c) it writes the symbol $s'$ in cell $p$ of the tape, moves the head by $\Delta p$, and
passes to state $q'$. In other words, the new configuration of the TM is
the triple $\langle s_0, \dots, s_{p-1}, s', s_{p+1}, \dots;\ p + \Delta p;\ q' \rangle$. (If $p + \Delta p < 0$, the TM
stops.)

Perhaps everyone would agree that these actions require neither cleverness
nor imagination.

It remains to define how the input is given to the TM and how the result
is obtained. Inputs and outputs are strings over $A$. An input string $\alpha$ is
written on the tape and is padded by blanks. Initially the head is at the
left end of the tape; the initial state is $q_0$. Thus the initial configuration is
$\langle \alpha \sqcup \sqcup \dots;\ 0;\ q_0 \rangle$ (with $\sqcup$ denoting the blank). Subsequently, the configuration is
transformed step by step using the rules described above, and we get the sequence
of configurations

$\langle \alpha \sqcup \sqcup \dots;\ 0;\ q_0 \rangle,\ \langle \sigma_1; p_1; q_1 \rangle,\ \langle \sigma_2; p_2; q_2 \rangle,\ \dots$

As we have said, this process terminates if $\delta$ is undefined or the head bumps
into the (left) boundary of the tape ($p + \Delta p < 0$). After that, we read
the tape from left to right (starting from the left end) until we reach some
symbol that does not belong to $A$. The string before that symbol will be
the output of the TM.
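The definition can be turned into a short simulator. The following is a minimal sketch under stated assumptions: the transition function is a Python dictionary (a missing key means "undefined"), the blank is written as `"_"`, and a step budget `max_steps` is added only so that the sketch always returns; none of these names come from the text. We also assume that the symbol is written before the machine stops at the left boundary.

```python
BLANK = "_"  # stand-in for the blank symbol

def run_tm(delta, q0, external_alphabet, input_string, max_steps=10_000):
    """Run a single-tape TM and return its output string.

    delta maps (state, symbol) to (new_state, new_symbol, shift) with
    shift in {-1, 0, 1}; a missing key means delta is undefined there.
    """
    tape = list(input_string)
    p, q = 0, q0
    for _ in range(max_steps):
        if p == len(tape):
            tape.append(BLANK)        # the tape is infinite to the right
        key = (q, tape[p])
        if key not in delta:          # delta undefined: the TM stops
            break
        q_new, s_new, shift = delta[key]
        tape[p] = s_new               # write the new symbol
        if p + shift < 0:             # head bumps into the left boundary: stop
            break
        p, q = p + shift, q_new
    else:
        raise RuntimeError("step budget exhausted; the TM may not terminate")
    # Output: the longest prefix of the tape consisting of external symbols.
    output = []
    for s in tape:
        if s not in external_alphabet:
            break
        output.append(s)
    return "".join(output)

# A one-state machine that flips every bit and stops at the first blank.
flip = {
    ("q0", "0"): ("q0", "1", 1),
    ("q0", "1"): ("q0", "0", 1),
}
result = run_tm(flip, "q0", {"0", "1"}, "1011")
```

Representing $\delta$ as a dictionary makes partiality explicit: the machine halts exactly when the lookup fails.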

1.2. Computable functions and decidable predicates. Every Turing
machine $M$ computes a partial function $\varphi_M : A^* \to A^*$, where $A^*$ is the set
of all strings over $A$. By definition, $\varphi_M(\alpha)$ is the output string for input $\alpha$.
The value $\varphi_M(\alpha)$ is undefined if the computation never terminates.

Definition 1.2. A partial function $f$ from $A^*$ to $A^*$ is computable if there
exists a Turing machine $M$ such that $\varphi_M = f$. In this case we say that $f$ is
computed by $M$.

Not all functions are computable because the set of all functions of type
$A^* \to A^*$ is uncountable, while the set of all Turing machines is countable.
For concrete examples of noncomputable functions see Problems 1.3–1.5.


By a predicate we mean a function with Boolean values: 1 ("true") or
0 ("false"). Informally, a predicate is a property that can be true or false.
Normally we consider predicates whose domain is the set $A^*$ of all strings
over some alphabet $A$. Such predicates can be identified with subsets of $A^*$:
a predicate $P$ corresponds to the set $\{x : P(x)\}$, i.e., the set of strings $x$ for
which $P(x)$ is true. Subsets of $A^*$ are also called languages over $A$.

As has been said, a predicate $P$ is a function $A^* \to \{0, 1\}$. A predicate is
called decidable if this function is computable. In other words, a predicate
$P$ is decidable if there exists a Turing machine that answers the question "is
$P(\alpha)$ true?" for any $\alpha \in A^*$, giving either 1 ("yes") or 0 ("no"). (Note that
this machine must terminate for any $\alpha \in A^*$.)

The notions of a computable function and a decidable predicate can be
extended to functions and predicates in several variables in a natural way.
For example, we can fix some separator symbol # that does not belong to
$A$ and consider a Turing machine $M$ with external alphabet $A \cup \{\#\}$. Then
a partial function $\varphi_{M,n} : (A^*)^n \to A^*$ is defined as follows:

$\varphi_{M,n}(\alpha_1, \dots, \alpha_n)$ = output of $M$ for the input $\alpha_1 \# \alpha_2 \# \cdots \# \alpha_n$.

The value $\varphi_{M,n}(\alpha_1, \dots, \alpha_n)$ is undefined if the computation never terminates
or the output string does not belong to $A^*$.

Definition 1.3. A partial function $f$ from $(A^*)^n$ to $A^*$ is computable if
there is a Turing machine $M$ such that $\varphi_{M,n} = f$.

The definition of a decidable predicate can be given in the same way.

We say that a Turing machine works in time $T(n)$ if it performs at most
$T(n)$ steps for any input of size $n$. Analogously, a Turing machine $M$ works
in space $s(n)$ if it visits at most $s(n)$ cells for any computation on inputs of
size $n$.

1.3. Turing's thesis and universal machines. Obviously a TM is an
algorithm in the informal sense. The converse assertion is called the Turing
thesis:

"Any algorithm can be realized by a Turing machine."

It is also called the Church thesis because Church gave an alternative
definition of computable functions that is formally equivalent to Turing's
definition. Note that the Church-Turing thesis is not a mathematical theorem,
but rather a statement about our informal notion of algorithm, or the
physical reality this notion is based upon. Thus the Church-Turing thesis
cannot be proved, but it is supported by empirical evidence. In the early
days of the mathematical theory of computation (the 1930s), different definitions of
algorithm were proposed (Turing machine, Post machine, Church's lambda-calculus,
Gödel's theory of recursive functions), but they all turned out to
be equivalent to each other. The reader can find a detailed exposition of the
theory of algorithms in [5, 39, 40, 48, 54, 60, 61].

We make some informal remarks about the capabilities of Turing machines.
A Turing machine behaves like a person with a restricted memory,
pencil, eraser, and a notebook with an infinite number of pages. Pages are
of fixed size; therefore there are finitely many possible variants of filling a
page, and these variants can be considered as letters of the alphabet of a
TM. The person can work with one page at a time but can then move to the
previous or to the next page. When turning a page, the person has a finite
amount of information (corresponding to the state of the TM) in her head.

The input string is written on the first several pages of the notebook (one
letter per page); the output should be written in a similar way. The computation
terminates when the notebook is closed (the head crosses the left
boundary) or when the person does not know what to do next ($\delta$ is undefined).

Think about yourself in such a situation. It is easy to realize that by
memorizing a few letters near the head you can perform any action in a
fixed-size neighborhood of the head. You can also put extra information
(in addition to letters from the external alphabet) on pages. This means
that you extend the tape alphabet by taking the Cartesian product with
some other finite set that represents possible notes. You can leaf through
the notebook until you find a note that is needed. You can create a free cell
by moving all information along the tape. You can memorize symbols and
then copy them onto free pages of the notebook. Extra space on pages may
also be used to store auxiliary strings of arbitrary length (like the initial
word, they are written one symbol per page). These auxiliary strings can
be processed by "subroutines". In particular, auxiliary strings can be used
to implement counters (integers that can be incremented and decremented).
Using counters, we can address a memory cell by its number, etc.

Note that in the definition of computability for functions of type $A^* \to A^*$
we restrict neither the number of auxiliary tape symbols (the set of
symbols $S$ could be much bigger than $A$) nor the number of states. It
is easy to see, however, that one auxiliary symbol (the blank) is enough.
Indeed, we can represent each letter from $S \setminus A$ by a combination of blanks
and nonblanks. (The details are left to the reader as an exercise.)

Since a Turing machine is a finite object (according to Definition 1.1),
it can be encoded by a string. (Note that Turing machines with arbitrary
numbers of states and alphabets of any size can be encoded by strings over a
fixed alphabet.) Then for any fixed alphabet $A$ we can consider a universal
Turing machine $U$. Its input is a pair $([M], x)$, where $[M]$ is the encoding
of a machine $M$ with external alphabet $A$, and $x$ is a string over $A$. The
output of $U$ is $\varphi_M(x)$. Thus $U$ computes the function $u$ defined as follows:

$u([M], x) = \varphi_M(x)$.

This function is universal for the class of computable functions of type
$A^* \to A^*$ in the following sense: for any computable function $f : A^* \to A^*$,
there exists some $M$ such that $u([M], x) = f(x)$ for all $x \in A^*$. (The equality
actually means that either both $u([M], x)$ and $f(x)$ are undefined, or they are
defined and equal. Sometimes the notation $u([M], x) \simeq f(x)$ is used to stress
that both expressions can be undefined.)

The existence of a universal machine $U$ is a consequence of the Church-Turing
thesis since our description of Turing machines was algorithmic. But,
unlike the Church-Turing thesis, this is also a mathematical theorem: the
machine $U$ can be constructed explicitly and proved to compute the function
$u$. The construction is straightforward but boring. It can be explained as
follows: the notebook begins with pages where instructions (i.e., $[M]$) are
written; the input string $x$ follows the instructions. The universal machine
interprets the instructions in the following way: it marks the current page,
goes back to the beginning of the tape, finds the instruction that matches the
current state and the current symbol, then returns to the current page and
performs the action required. Actually, the situation is a bit more complex:
both the current state and the current symbol of $M$ have to be represented
in $U$ by several symbols on the tape (because the number of states of the
universal machine $U$ is fixed whereas the alphabet and the number of states
of the simulated machine $M$ are arbitrary). Therefore, we need subroutines
to move strings along the tape, compare them with instructions, etc.

1.4. Complexity classes. The computability of a function does not guarantee
that we can compute it in practice: an algorithm computing it can
require too much time or space. So from the practical viewpoint we are
interested in effective algorithms.

The idea of an effective algorithm can be formalized in different ways,
leading to different complexity classes. Probably the most important is the
class of polynomial algorithms.

We say that a function $f(n)$ is of polynomial growth if $f(n) \le c n^d$
for some constants $c$, $d$ and for all sufficiently large $n$. (Notation: $f(n) = \mathrm{poly}(n)$.)

Let $\mathbb{B}$ be the set $\{0, 1\}$. In the sequel we usually consider functions and
predicates defined on $\mathbb{B}^*$, i.e., on binary strings.


Definition 1.4. A function $F$ on $\mathbb{B}^*$ is computable in polynomial time if
there exists a Turing machine that computes it in time $T(n) = \mathrm{poly}(n)$,
where $n$ is the length of the input. If $F$ is a predicate, we say that it is
decidable in polynomial time.

The class of all functions computable in polynomial time, or all predicates
decidable in polynomial time (sometimes we call them "polynomial predicates"
for brevity), is denoted by P. (Some other complexity classes considered
below are only defined for predicates.) Note that if $F$ is computable
in polynomial time, then $|F(x)| = \mathrm{poly}(|x|)$, since the output length cannot
exceed the maximum of the input length and the number of computation
steps. (Here $|t|$ stands for the length of the string $t$.)

Computability in polynomial time is still a theoretical notion: if the
degree of the polynomial is large (or the constant $c$ is large), an algorithm
running in polynomial time may be quite impractical.

One may use other computational models instead of Turing machines to
define the class P. For example, we may use a usual programming language
dealing with integer variables, if we require that all integers used in the
program have at most $\mathrm{poly}(n)$ bits.

In speaking about polynomial time computation, one should be careful
about encoding. For example, it is easy to see that the predicate that is true
for all unary representations of prime numbers (i.e., strings $1 \dots 1$ whose
length $N$ is a prime number) is polynomial. Indeed, the obvious algorithm
that tries to divide $N$ by all numbers $\le \sqrt{N}$ runs in polynomial time, namely,
$\mathrm{poly}(N)$. On the other hand, we do not know whether the predicate $P(x)$ =
"$x$ is a binary representation of a prime number" is polynomial or not. For
this to be true, there should exist an algorithm with running time $\mathrm{poly}(n)$,
where $n = \lfloor \log_2 N \rfloor + 1$ is the length of the binary string $x$. (A probabilistic
polynomial algorithm for this problem is known; see below.)
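The gap between the two encodings can be made concrete with a small sketch of the "obvious algorithm". The function name and the sample value $N = 10007$ are illustrative choices, not taken from the text; the point is only to compare the number of trial divisions with the two input lengths.

```python
def is_prime_trial_division(N):
    """Return (is_prime, number_of_trial_divisions) for the obvious algorithm."""
    if N < 2:
        return False, 0
    divisions = 0
    d = 2
    while d * d <= N:          # try all divisors d <= sqrt(N)
        divisions += 1
        if N % d == 0:
            return False, divisions
        d += 1
    return True, divisions

N = 10007                       # a prime, chosen only for illustration
prime, steps = is_prime_trial_division(N)
unary_length = N                # length of the input 11...1
binary_length = N.bit_length()  # equals floor(log2 N) + 1
```

Here the number of divisions is about $\sqrt{N}$: small compared with the unary input length $N$, but roughly $2^{n/2}$ in terms of the binary input length $n$, which is why this algorithm says nothing about the binary version of the predicate.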

Definition 1.5. A function (predicate) $F$ on $\mathbb{B}^*$ is computable (decidable)
in polynomial space if there exists a Turing machine that computes $F$ and
runs in space $s(n) = \mathrm{poly}(n)$, where $n$ is the length of the input.

The class of all functions (predicates) computable (decidable) in polynomial
space is called PSPACE.

Note that any machine that runs in polynomial time also runs in polynomial
space; therefore P $\subseteq$ PSPACE. Most experts believe that this inclusion
is strict, i.e., P $\ne$ PSPACE, although nobody has succeeded in proving it so
far. This is a famous open problem.

Problems

[1] 1.1. Construct a Turing machine that reverses its input (e.g., produces
"0010111" from "1110100").

[1] 1.2. Construct a Turing machine that adds two numbers written in
binary. (Assume that the numbers are separated by a special symbol "+"
that belongs to the external alphabet of the TM.)

[3!] 1.3 ("The halting problem is undecidable"). Prove that there is no algorithm
that determines, for a given Turing machine and input string, whether
the machine terminates at that input or not.

[2] 1.4. Prove that there is no algorithm that enumerates all Turing machines
that do not halt when started with the empty tape.

(Informally, enumeration is a process which produces one element of a set
after another so that every element is included in this list. Exact definition:
a set $X \subseteq A^*$ is called enumerable if it is the set of all possible outputs of
some Turing machine $E$.)

[3] 1.5. Let $T(n)$ be the maximum number of steps performed by a Turing
machine with $\le n$ states and $\le n$ symbols before it terminates starting
with the empty tape. Prove that the function $T(n)$ grows faster than any
computable total function $b(n)$, i.e., $\lim_{n \to \infty} T(n)/b(n) = \infty$.

The mode of operation of Turing machines is rather limited and can be
extended in different ways. For example, one can consider multitape Turing
machines that have a finite number of tapes. Each tape has its own head
that can read and write symbols on the tape. There are two special tapes:
an input read-only tape, and an output write-only tape (after writing a
symbol the head moves to the right). A $k$-tape machine has an input tape,
an output tape, and $k$ work tapes.

At each step the machine reads symbols on all of the tapes (except for
the output tape), and its action depends upon these symbols and the current
state. This action is determined by a transition function that says what the
next state is, what symbols should be written on each work tape, what
movement is prescribed for each head (except for the output one), and what
symbol should be written on the output tape (it is possible also that no
symbol is written; in this case the output head does not move).

Initially all work tapes are empty, and the input string is written on the
input tape. The output is the content of the output tape after the TM halts
(this happens when the transition function is undefined or when one of the
heads moves past the left end of its tape).

More tapes allow a Turing machine to work faster; however, the difference
is not so great, as the following problems show.


[2] 1.6. Prove that a 2-tape Turing machine working in time $T(n)$ for inputs
of length $n$ can be simulated by an ordinary Turing machine working in time
$O(T^2(n))$.

[3] 1.7. Prove that a 3-tape Turing machine working in time $T(n)$ for inputs
of length $n$ can be simulated by a 2-tape machine working in time
$O(T(n) \log T(n))$.

[3] 1.8. Let $M$ be a (single-tape) Turing machine that duplicates the input
string (e.g., produces "blabla" from "bla"). Let $T(n)$ be its maximum running
time when processing input strings of length $n$. Prove that $T(n) \ge \varepsilon n^2$
for some $\varepsilon$ and for all $n$. What can be said about $T'(n)$, the minimum
running time for inputs of length $n$?

[2] 1.9. Consider a programming language that includes 100 integer variables,
the constant 0, increment and decrement statements, and conditions
of type "variable = 0". One may use if-then-else and while constructs,
but recursion is not allowed. Prove that any computable function of type
$\mathbb{Z} \to \mathbb{Z}$ has a program in this language.

2. Boolean circuits

2.1. Definitions. Complete bases. A Boolean circuit is a representation
of a given Boolean function as a composition of other Boolean functions.

By a Boolean function of $n$ variables we mean a function of type $\mathbb{B}^n \to \mathbb{B}$.
(For $n = 0$ we get two constants 0 and 1.) Assume that some set $\mathcal{A}$ of
Boolean functions (basis) is fixed. It may contain functions with different
arity (number of arguments).

A circuit $C$ over $\mathcal{A}$ is a sequence of assignments. These assignments involve
$n$ input variables $x_1, \dots, x_n$ and several auxiliary variables $y_1, \dots, y_m$;
the $j$-th assignment has the form $y_j := f_j(u_1, \dots, u_r)$. Here $f_j$ is some function
from $\mathcal{A}$, and each of the variables $u_1, \dots, u_r$ is either an input variable
or an auxiliary variable that precedes $y_j$. The latter requirement guarantees
that the right-hand side of the assignment is defined when we perform it (we
assume that the values of the input variables are defined at the beginning;
then we start defining $y_1$, $y_2$, etc.).

The value of the last auxiliary variable is the result of computation.
A circuit with $n$ input variables $x_1, \dots, x_n$ computes a Boolean function
$F : \mathbb{B}^n \to \mathbb{B}$ if the result of computation is equal to $F(x_1, \dots, x_n)$ for any
values of $x_1, \dots, x_n$.

If we select $m$ auxiliary variables (instead of one) to be the output, we
get the definition of the computation of a function $F : \mathbb{B}^n \to \mathbb{B}^m$ by a circuit.
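The definition of a circuit as a sequence of assignments can be read as a tiny straight-line program. The sketch below is one possible rendering; the tuple format `(name, function, argument_names)` and the function `eval_circuit` are illustrative assumptions, not notation from the text.

```python
def eval_circuit(assignments, inputs, outputs=None):
    """Evaluate a circuit given as a list of (name, function, argument_names).

    inputs maps input variable names to 0 or 1. Each argument must be an input
    variable or an auxiliary variable assigned earlier; otherwise the lookup fails.
    """
    values = dict(inputs)
    for name, func, args in assignments:
        values[name] = func(*(values[a] for a in args))
    if outputs is None:
        # By default the last auxiliary variable is the result.
        return values[assignments[-1][0]]
    return tuple(values[o] for o in outputs)

def NOT(a):
    return 1 - a

def OR(a, b):
    return a | b

def AND(a, b):
    return a & b

# XOR(x1, x2) as a circuit of size 4 over the basis {NOT, OR, AND}.
xor_circuit = [
    ("y1", OR, ("x1", "x2")),
    ("y2", AND, ("x1", "x2")),
    ("y3", NOT, ("y2",)),
    ("y4", AND, ("y1", "y3")),
]
value = eval_circuit(xor_circuit, {"x1": 1, "x2": 0})
```

Passing `outputs=("y1", "y2")` selects several auxiliary variables as the output, which corresponds to computing a function with several output bits.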


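[Figure: circuit diagram with inputs $y_1, y_0, x_1, x_0$ at the top, gates labeled $\wedge$ and $\oplus$, and outputs $z_2, z_1, z_0$.]

Fig. 2.1. Circuit over the basis $\{\wedge, \oplus\}$ for the addition of two 2-digit
numbers: $z_2 z_1 z_0 = x_1 x_0 + y_1 y_0$.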
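For concreteness, here is one possible circuit of this kind written as a sequence of assignments, using only $\wedge$ (`&`) and $\oplus$ (`^`). It is a sketch and need not coincide gate for gate with the circuit drawn in the figure; the auxiliary names `c0`, `t`, `c1`, `c2` are ours.

```python
def add_two_bit(x1, x0, y1, y0):
    """Add the 2-digit binary numbers x1x0 and y1y0 using only AND and XOR gates."""
    z0 = x0 ^ y0      # lowest digit of the sum
    c0 = x0 & y0      # carry out of the lowest position
    t = x1 ^ y1       # sum of the high digits without the carry
    z1 = t ^ c0       # middle digit of the sum
    c1 = x1 & y1      # carry produced by the high digits alone
    c2 = t & c0       # carry produced when the incoming carry is propagated
    z2 = c1 ^ c2      # c1 and c2 are never both 1, so XOR acts as OR here
    return z2, z1, z0

z2, z1, z0 = add_two_bit(1, 1, 0, 1)   # 3 + 1
```

The circuit has size 7 (three conjunctions and four additions modulo 2).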

A circuit can also be represented by an acyclic directed graph (as in
Figure 2.1), in which vertices of in-degree 0 (inputs; we put them on top
of the figure) are labeled by input variables; all other vertices (gates) are
labeled with functions from $\mathcal{A}$ in such a way that the in-degree of a vertex
matches the arity of the function placed at that vertex, and each incoming
edge is linked to one of the function arguments. Some vertices are called
output vertices. Since the graph is acyclic, any value assignment to the
input variables can be extended (uniquely) to a consistent set of values for
all vertices. Therefore, the set of values at the output vertices is a function
of input values. This function is computed by a circuit. It is easy to see
that this representation of a circuit can be transformed into a sequence of
assignments, and vice versa. (We will not use this representation much, but
it explains the name "circuit".)

A circuit for a Boolean function is called a formula if each auxiliary
variable, except the last one, is used (i.e., appears on the right-hand side of
an assignment) exactly once. The graph of a formula is a tree whose leaves
are labeled by input variables; each label may appear any number of times.
(In a general circuit an auxiliary variable may be used more than once, in
which case the out-degree of the corresponding vertex is more than 1.)

Why the name "formula"? If each auxiliary variable is used only once, we
can replace it by its definition. Performing all these "inline substitutions",
we get an expression for $f$ that contains only input variables, functions from
the basis, and parentheses. The size of this expression approximately equals
the total length of all assignments. (It is important that each auxiliary
variable is used only once; otherwise we would need to replace all occurrences
of each auxiliary variable by their definitions, and the size might increase
exponentially.)


A basis $\mathcal{A}$ is called complete if, for any Boolean function $f$, there is a
circuit over $\mathcal{A}$ that computes $f$. (It is easy to see that in this case any
function of type $\mathbb{B}^n \to \mathbb{B}^m$ can be computed by an appropriate circuit.)

The most common basis contains the following three functions:

$\mathrm{NOT}(x) = \neg x$, $\quad \mathrm{OR}(x_1, x_2) = x_1 \vee x_2$, $\quad \mathrm{AND}(x_1, x_2) = x_1 \wedge x_2$

(negation, disjunction, conjunction). Here are the value tables for these
functions:

  x   ¬x        x1  x2  x1 ∨ x2        x1  x2  x1 ∧ x2
  0   1         0   0      0           0   0      0
  1   0         0   1      1           0   1      0
                1   0      1           1   0      0
                1   1      1           1   1      1

Theorem 2.1. The basis $\{\mathrm{NOT}, \mathrm{OR}, \mathrm{AND}\} = \{\neg, \vee, \wedge\}$ is complete.

Proof. Any Boolean function of $n$ arguments is determined by its value
table, which contains $2^n$ rows. Each row contains the values of the arguments
and the corresponding value of the function.

If the function takes value 1 only once, it can be computed by a conjunction
of literals; each literal is either a variable or the negation of a variable.
For example, if $f(x_1, x_2, x_3)$ is true (equals 1) only for $x_1 = 1$, $x_2 = 0$, $x_3 = 1$,
then

$f(x_1, x_2, x_3) = x_1 \wedge \neg x_2 \wedge x_3$

(the conjunction is associative, so we omit parentheses; the order of literals
is also unimportant).

In the general case, a function $f$ can be represented in the form

(2.1) $f(x) = \bigvee_{\{u :\, f(u) = 1\}} \chi_u(x)$,

where $u = (u_1, \dots, u_n)$, and $\chi_u$ is the function such that $\chi_u(x) = 1$ if $x = u$,
and $\chi_u(x) = 0$ otherwise. □
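The proof is constructive, and the construction can be sketched in a few lines. The helper names `dnf_terms`, `chi`, and `eval_dnf`, and the sample function `majority`, are illustrative assumptions; the code simply follows the representation of $f$ as a disjunction, over the points where $f$ equals 1, of conjunctions of literals.

```python
from itertools import product

def dnf_terms(f, n):
    """Points u with f(u) = 1; each of them contributes one conjunction."""
    return [u for u in product((0, 1), repeat=n) if f(*u) == 1]

def chi(u, x):
    """Conjunction of literals equal to 1 exactly when x = u.

    The i-th literal is x_i if u_i = 1 and NOT x_i if u_i = 0.
    """
    result = 1
    for ui, xi in zip(u, x):
        literal = xi if ui == 1 else 1 - xi
        result &= literal
    return result

def eval_dnf(terms, x):
    """Disjunction of the conjunctions chi(u, x) over all listed points u."""
    result = 0
    for u in terms:
        result |= chi(u, x)
    return result

def majority(a, b, c):
    return int(a + b + c >= 2)

terms = dnf_terms(majority, 3)
```

The number of conjunctions equals the number of rows of the value table where the function is 1, so this circuit can have size exponential in $n$.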

A representation of type (2.1) is called a disjunctive normal form (DNF).
By definition, a DNF is a disjunction of conjunctions of literals. Later
we will also need the conjunctive normal form (CNF), a conjunction of
disjunctions of literals. Any Boolean function can be represented by a CNF.
This fact is dual to Theorem 2.1 and can be proved in a dual way (we start
with functions that have only one zero in the table). Or we can represent
$\neg f$ by a DNF and then get a CNF for $f$ by negation using De Morgan's
identities

$x \wedge y = \neg(\neg x \vee \neg y)$, $\quad x \vee y = \neg(\neg x \wedge \neg y)$.

These identities show that the basis $\{\neg, \vee, \wedge\}$ is redundant: the subsets
$\{\neg, \vee\}$ and $\{\neg, \wedge\}$ also constitute complete bases. Another useful example
of a complete basis is $\{\wedge, \oplus\}$.
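Both remarks can be checked mechanically. The sketch below, with illustrative names of our own choosing, builds a CNF in the dual way (one disjunction for each zero of the value table) and expresses conjunction and disjunction through each other with the help of negation, as in De Morgan's identities.

```python
from itertools import product

def cnf_clauses(f, n):
    """Points u with f(u) = 0; each of them contributes one disjunction."""
    return [u for u in product((0, 1), repeat=n) if f(*u) == 0]

def clause(u, x):
    """Disjunction of literals equal to 0 exactly when x = u.

    The i-th literal is NOT x_i if u_i = 1 and x_i if u_i = 0.
    """
    result = 0
    for ui, xi in zip(u, x):
        result |= (1 - xi) if ui == 1 else xi
    return result

def eval_cnf(clauses, x):
    """Conjunction of the disjunctions clause(u, x) over all listed points u."""
    result = 1
    for u in clauses:
        result &= clause(u, x)
    return result

def and_via_not_or(x, y):
    """x AND y written over the basis {NOT, OR}."""
    return 1 - ((1 - x) | (1 - y))

def or_via_not_and(x, y):
    """x OR y written over the basis {NOT, AND}."""
    return 1 - ((1 - x) & (1 - y))

def parity(a, b, c):
    return a ^ b ^ c

clauses = cnf_clauses(parity, 3)
```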

The number of assignments in a circuit is called its size. The minimal
size of a circuit over $\mathcal{A}$ that computes a given function $f$ is called the circuit
complexity of $f$ (with respect to the basis $\mathcal{A}$) and is denoted by $c_{\mathcal{A}}(f)$. The
value of $c_{\mathcal{A}}(f)$ depends on $\mathcal{A}$, but the transition from one finite complete
basis to another changes the circuit complexity by at most a constant factor:
if $\mathcal{A}_1$ and $\mathcal{A}_2$ are two finite complete bases, then $c_{\mathcal{A}_1}(f) = O(c_{\mathcal{A}_2}(f))$ and vice
versa. Indeed, each $\mathcal{A}_2$-assignment can be replaced by $O(1)$ $\mathcal{A}_1$-assignments
since $\mathcal{A}_1$ is a complete basis.

We are interested in asymptotic estimates for circuit complexity (up to
an $O(1)$-factor); therefore the particular choice of a complete basis is not
important. We use the notation $c(f)$ for the circuit complexity of $f$ with
respect to some finite complete basis.

[2] Problem 2.1. Construct an algorithm that determines whether a given
set of Boolean functions $\mathcal{A}$ constitutes a complete basis. (Functions are
represented by tables.)

[2!] Problem 2.2. Let $c_n$ be the maximum complexity $c(f)$ for Boolean
functions $f$ in $n$ variables. Prove that $1.99^n < c_n < 2.01^n$ for sufficiently
large $n$.

2.2. Circuits versus Turing machines. Any predicate $F$ on $\mathbb{B}^*$ can be
restricted to strings of fixed length $n$, giving rise to the Boolean function

$F_n(x_1, \dots, x_n) = F(x_1 x_2 \cdots x_n)$.

Thus $F$ may be regarded as the sequence of Boolean functions $F_0, F_1, F_2, \dots$.

Similarly, in most cases of practical importance a (partial) function of
type $F : \mathbb{B}^* \to \mathbb{B}^*$ can be represented by a sequence of (partial) functions
$F_n : \mathbb{B}^n \to \mathbb{B}^{p(n)}$, where $p(n)$ is a polynomial with integer coefficients. We
will focus on predicates for simplicity, though.

Definition 2.1. A predicate $F$ belongs to the class P/poly ("nonuniform
P") if

$c(F_n) = \mathrm{poly}(n)$.

(The term "nonuniform" indicates that a separate procedure, i.e., a Boolean
circuit, is used to perform computation with input strings of each individual
length.)

Theorem 2.2. P $\subseteq$ P/poly.


Proof. Let $F$ be a predicate decidable in polynomial time. We have to
prove that $F \in$ P/poly. Let $M$ be a TM that computes $F$ and runs in
polynomial time (and therefore in polynomial space). The computation by
$M$ on some input $x$ of length $n$ can be represented as a space-time diagram $\Gamma$
that is a rectangular table of size $T \times s$, where $T = \mathrm{poly}(n)$ and $s = \mathrm{poly}(n)$.

[Diagram: the table $\Gamma$ with rows $t = 0, 1, \dots, j, j+1, \dots, T$ and $s$ cells in each row; the cell $\Gamma_{j+1,k}$ in row $t = j+1$ lies directly below the cells $\Gamma_{j,k-1}$, $\Gamma_{j,k}$, $\Gamma_{j,k+1}$ of row $t = j$.]

In this table the $j$-th row represents the configuration of $M$ after $j$ steps:
$\Gamma_{j,k}$ corresponds to cell $k$ at time $j$ and consists of two parts: the symbol
on the tape and the state of the TM if its head is in the $k$-th cell (or a special
symbol $\varnothing$ if it is not). In other words, all $\Gamma_{j,k}$ belong to $S \times (\{\varnothing\} \cup Q)$.
(Only one entry in a row can have the second component different from $\varnothing$.)
For simplicity we assume that after the computation stops all subsequent
rows in the table repeat the last row of the computation.

There are lo al rules that determine the ontents of a ell �

j+1;k

if we

know the ontents of three neighboring ells in row j, i.e., �

j;k�1

, �

j;k

and

�

j;k+1

. Indeed, the head speed is at most one ell per step, so no other ell

an in uen e �

j+1;k

. Rules for boundary ells are somewhat spe ial; they

take into a ount that the head annot be lo ated outside the table.
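The local rules can be made concrete in a few lines. The following Python sketch is an illustration only, under assumptions that are not in the text: a cell is a pair (symbol, state), with `None` playing the role of the special "no head" symbol; the transition table `delta` and the bit-flipping machine in the usage lines are hypothetical; and the head is assumed never to step outside the table.

```python
from typing import Dict, List, Optional, Tuple

# A cell is (tape symbol, state if the head is here, else None).
Cell = Tuple[str, Optional[str]]
# Hypothetical transition table: (state, symbol) -> (new state, new symbol, move),
# where move is -1 (left), 0 (stay) or +1 (right). A missing key means "halt".
Delta = Dict[Tuple[str, str], Tuple[str, str, int]]


def next_cell(left: Optional[Cell], mid: Cell, right: Optional[Cell],
              delta: Delta) -> Cell:
    """Compute the cell below `mid` from the three cells above it."""
    sym, state = mid
    if state is not None:                      # the head is on this cell
        rule = delta.get((state, sym))
        if rule is None:                       # machine has halted: row repeats
            return mid
        new_state, new_sym, move = rule
        return (new_sym, new_state if move == 0 else None)
    # The head may arrive from the left (moving +1) or from the right (moving -1).
    for neighbor, needed_move in ((left, +1), (right, -1)):
        if neighbor is not None and neighbor[1] is not None:
            rule = delta.get((neighbor[1], neighbor[0]))
            if rule is not None and rule[2] == needed_move:
                return (sym, rule[0])
    return (sym, None)                         # nothing changes here


def next_row(row: List[Cell], delta: Delta) -> List[Cell]:
    """Apply the local rule to every cell; boundary cells get None neighbors."""
    s = len(row)
    return [next_cell(row[k - 1] if k > 0 else None,
                      row[k],
                      row[k + 1] if k + 1 < s else None,
                      delta)
            for k in range(s)]


# Usage: a toy machine that flips bits while moving right and halts on "_".
flip = {("q", "0"): ("q", "1", +1), ("q", "1"): ("q", "0", +1)}
row = [("1", "q"), ("0", None), ("_", None)]
for _ in range(3):
    row = next_row(row, flip)
print(row)
```

Each call to `next_cell` looks at a constant amount of data, which is the property the circuit construction relies on.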

Now we construct a circuit that computes $F(x)$ for inputs $x$ of length $n$. The contents of each table cell can be encoded by a constant (i.e., independent of $n$) number of Boolean variables. These variables (for all cells) will be the auxiliary variables of the circuit.

Each variable encoding the cell $\Gamma_{j+1,k}$ depends only on the variables that encode $\Gamma_{j,k-1}$, $\Gamma_{j,k}$, and $\Gamma_{j,k+1}$. This dependence is a Boolean function with a constant number of arguments. These functions can be computed by circuits of size $O(1)$. Combining these circuits, we obtain a circuit that computes all of the variables which encode the state of every cell. The size of this circuit is $O(sT) \cdot O(1) = \mathrm{poly}(n)$.

It remains to note that the variables in row 0 are determined by the input string, and this dependence leads to additional $\mathrm{poly}(n)$ assignments. Similarly, to find out the result of $M$ it is enough to look at the symbol written in the 0-th cell of the tape at the end of the computation. So the output is a function of $\Gamma_{T,0}$ and can be computed by an additional $O(1)$-size circuit. Finally we get a $\mathrm{poly}(n)$-size circuit that simulates the behavior of $M$ for inputs of length $n$ and therefore computes the Boolean function $F_n$. □

Remark 2.1. The class P/poly is bigger than P. Indeed, let $\varphi : \mathbb{N} \to \mathbb{B}$ be an arbitrary function (maybe even a noncomputable one). Consider the predicate $F_\varphi$ such that $F_\varphi(x) = \varphi(|x|)$, where $|x|$ stands for the length of string $x$. The restriction of $F_\varphi$ to strings of length $n$ is a constant function (0 or 1), so the circuit complexity of $(F_\varphi)_n$ is $O(1)$. Therefore $F_\varphi$ for any $\varphi$ belongs to P/poly, although for a noncomputable $\varphi$ the predicate $F_\varphi$ is not computable and thus does not belong to P.

Remark 2.2. That said, P/poly seems to be a good approximation of P for many purposes. Indeed, the class P/poly is relatively small: out of $2^{2^n}$ Boolean functions in $n$ variables only $2^{\mathrm{poly}(n)}$ functions have polynomial circuit complexity (see solution to Problem 2.2). The difference between uniform and nonuniform computation is more important for bigger classes. For example, EXPTIME, the class of predicates decidable in time $2^{\mathrm{poly}(n)}$, is a nontrivial computational class. However, the nonuniform analog of this class includes all predicates!

The arguments used to prove Theorem 2.2 can also be used to prove the following criterion:

Theorem 2.3. $F$ belongs to P if and only if these conditions hold:

(1) $F \in$ P/poly;

(2) the functions $F_n$ are computed by polynomial-size circuits $C_n$ with the following property: there exists a TM that for each positive integer $n$ runs in time $\mathrm{poly}(n)$ and constructs the circuit $C_n$.

A sequence of circuits $C_n$ with this property is called polynomial-time uniform.

Note that the TM mentioned in (2) does not, strictly speaking, run in polynomial time: its running time is polynomial in $n$ but not in $\log n$ (the number of bits in the binary representation of $n$, i.e., the length of its input). Note also that we implicitly use some natural encoding for circuits when saying "TM constructs a circuit".

Proof. ($\Rightarrow$) The circuit for computing $F_n$ constructed in Theorem 2.2 has regular structure, and it is clear that the corresponding sequence of assignments can be produced in polynomial time when $n$ is known.

($\Leftarrow$) This is also simple. We compute the size of the input string $x$, then apply the TM to construct a circuit $C_{|x|}$ that computes $F_{|x|}$. Then we perform the assignments indicated in $C_{|x|}$, using $x$ as the input, and get $F(x)$. All these computations can be performed in polynomial (in $|x|$) time. □
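The ($\Leftarrow$) direction can be sketched in Python as follows. This is only an illustration under stated assumptions: the encoding of a circuit as a list of assignments `(op, argument indices)` over the basis NOT, AND, OR is one possible "natural encoding", and `build_or_circuit` is a hypothetical stand-in for the circuit-constructing TM (here it builds a chain of OR gates, so the predicate decided is "x contains a 1"); inputs are assumed to have length at least 1.

```python
from typing import List, Sequence, Tuple

# One assignment: (operation, indices of earlier variables it reads).
# Variables 0..n-1 are the inputs; the j-th assignment defines variable n+j.
Gate = Tuple[str, Tuple[int, ...]]


def eval_circuit(gates: List[Gate], x: Sequence[int]) -> int:
    """Perform the assignments in order and return the last variable."""
    values = list(x)
    for op, args in gates:
        a = [values[i] for i in args]
        if op == "NOT":
            values.append(1 - a[0])
        elif op == "AND":
            values.append(a[0] & a[1])
        elif op == "OR":
            values.append(a[0] | a[1])
        else:
            raise ValueError(f"unknown operation {op!r}")
    return values[-1]


def build_or_circuit(n: int) -> List[Gate]:
    """Hypothetical uniform family C_n: OR of n input bits, built in time O(n)."""
    if n < 1:
        raise ValueError("this sketch assumes n >= 1")
    if n == 1:
        return [("OR", (0, 0))]
    gates: List[Gate] = []
    acc = 0                                  # index of the running disjunction
    for i in range(1, n):
        gates.append(("OR", (acc, i)))
        acc = n + len(gates) - 1             # index of the variable just defined
    return gates


def decide(x: Sequence[int]) -> int:
    """Compute |x|, construct C_|x|, then evaluate it on x."""
    return eval_circuit(build_or_circuit(len(x)), x)


print(decide([0, 0, 1, 0]), decide([0, 0, 0]))
```

Both steps, construction and evaluation, take time polynomial in $|x|$ whenever the circuit family is polynomial-time uniform.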

[1] Problem 2.3. Prove that there exists a decidable predicate that belongs to P/poly but not to P.

2.3. Basic algorithms. Depth, space and width. We challenge the reader to study these topics by working on problems. (Solutions are also provided, of course.) In Problems 2.9-2.16 we introduce some basic algorithms which are used universally throughout this book. The algorithms are described in terms of uniform (i.e., effectively constructed) sequences of circuits. In this book we will be satisfied with polynomial-time uniformity; cf. Theorem 2.3. [This property is intuitive and usually easy to check. Another useful notion is logarithmic-space uniformity: the circuits should be constructed by a Turing machine with work space $O(\log n)$ (machines with limited work space are defined below; see 2.3.3). Most of the circuits we build satisfy this stronger condition, although the proof might not be so easy.]

2.3.1. Depth. In practice (e.g., when implementing circuits in hardware), size is not the only circuit parameter that counts. Another important parameter is depth. Roughly, it is the time that is needed to carry out all assignments in the circuit, if we can do more than one in parallel. Interestingly enough, it is also related to the space needed to perform the computation (see Problems 2.17 and 2.18). In general, there is a trade-off between size and depth. In our solutions we will be willing to increase the size of a circuit a little to gain a considerable reduction of the depth (see, e.g., Problem 2.14). As a result, we come up with circuits of polynomial size and poly-logarithmic depth. (With a certain notion of uniformity, the functions computed by such circuits form the so-called class NC, an interesting subclass of P.)

More formally, the depth of a Boolean circuit is the maximum number of gates on any path from the input to the output. The depth is $\le d$ if and only if one can arrange the gates into $d$ layers, so that the input bits of any gate at layer $j$ come from the layers $1, \dots, j-1$. For example, the circuit in Figure 2.1 has depth 3.
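As a small illustration of this definition, the sketch below computes the depth of a circuit given as a sequence of assignments. The encoding (a list of `(op, argument indices)` pairs, with inputs numbered first and the last assignment taken as the output) is an assumption made for this example, not notation from the text.

```python
from typing import List, Tuple

Gate = Tuple[str, Tuple[int, ...]]   # (operation, indices of variables it reads)


def circuit_depth(n_inputs: int, gates: List[Gate]) -> int:
    """Longest path, counted in gates, from any input to the last variable."""
    depth = [0] * n_inputs                    # input variables sit at depth 0
    for _op, args in gates:
        depth.append(1 + max(depth[i] for i in args))
    return depth[-1] if gates else 0


# OR of four bits as a chain (depth 3) and as a balanced tree (depth 2).
chain = [("OR", (0, 1)), ("OR", (4, 2)), ("OR", (5, 3))]
tree = [("OR", (0, 1)), ("OR", (2, 3)), ("OR", (4, 5))]
print(circuit_depth(4, chain), circuit_depth(4, tree))
```

The two circuits have the same size, which shows that depth is a separate parameter: it depends on how the gates are arranged, not only on how many there are.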

Unless stated explicitly otherwise, we assume that all gates have bounded fan-in, i.e., the number of input bits. (This is always the case when circuits are built over a finite basis. Unbounded fan-in can occur, for example, if one uses OR gates with an arbitrary number of inputs.) We also assume that the fan-out (the number of times an input or an auxiliary variable is used) is bounded.$^1$ If it is necessary to use the variable more times, one may insert additional "trivial gates" (identity transformations) into the circuit, at the cost of some increase in size and depth. Note that a formula is a circuit in which all auxiliary variables have fan-out 1, whereas the fan-out of the input variables is unbounded.

[1] 2.4. Let $C$ be an $O(\log n)$-depth circuit which computes some function $f : \mathbb{B}^n \to \mathbb{B}^m$. Prove that after eliminating all extra variables and gates in $C$ (those which are not connected to the output), we get a circuit of size $\mathrm{poly}(n+m)$.

[1!] 2.5. Let $C$ be a circuit (over some basis $\mathcal{B}$) which computes a function $f : \mathbb{B}^n \to \mathbb{B}$. Prove that $C$ can be converted into an equivalent (i.e., computing the same function) formula $C'$ of the same depth over the same basis. (It follows from the solution that the size of $C'$ does not exceed $c^d$, where $d$ is the depth and $c$ is the maximal fan-in.)

[3] 2.6 ("balanced formula"). Let $C$ be a formula of size $L$ over the basis $\{\mathrm{NOT}, \mathrm{OR}, \mathrm{AND}\}$ (with fan-in $\le 2$). Prove that it can be converted into an equivalent formula of depth $O(\log L)$ over the same basis.

(Therefore, it does not matter whether we define the formula complexity of a function in terms of size or in terms of depth. This is not true for the circuit complexity.)

[1] 2.7. Show that any function can be computed by a circuit of depth $\le 3$ with gates of type NOT, AND, and OR, if we allow AND- and OR-gates with arbitrary fan-in and fan-out.

[2] 2.8. By definition, the function PARITY is the sum of $n$ bits modulo 2. Suppose it is computed by a circuit of depth 3 containing NOT-gates, AND-gates and OR-gates with arbitrary fan-in. Show that the size of the circuit is exponential (at least $c^n$ for some $c > 1$ and for all $n$).

2.3.2. Basic algorithms.

[3!] 2.9. Comparison. Construct a circuit of size $O(n)$ and depth $O(\log n)$ that tells whether two $n$-bit integers are equal and if they are not, which one is greater.

[2] 2.10. Let $n = 2^l$. Construct circuits of size $O(n)$ and depth $O(\log n)$ for the solution of the following problems.

a) Access by index. Given an $n$-bit string $x = x_0 \cdots x_{n-1}$ (a table) and an $l$-bit number $j$ (an index), find $x_j$.

b) Search. Evaluate the disjunction $y = x_0 \vee \cdots \vee x_{n-1}$, and if it equals 1, find the smallest $j$ such that $x_j = 1$.

$^1$ This restriction is needed to allow conversion of Boolean circuits into quantum circuits without extra cost (see Section 7). However, it is in no way standard: in most studies in computational complexity, unbounded fan-out is assumed.

We now describe one rather general method of parallelizing computation. A finite-state automaton is a device with an input alphabet $A'$, an output alphabet $A''$, a set of states $Q$ and a transition function $D : Q \times A' \to Q \times A''$ (recall that such a device is a part of a Turing machine). It is initially set to some state $q_0 \in Q$. Then it receives input symbols $x_0, \dots, x_{m-1} \in A'$ and changes its state from $q_0$ to $q_1$ to $q_2$, etc., according to the rule$^2$

$$(q_{j+1}, y_j) = D(q_j, x_j).$$

The iterated application of $D$ defines a function $D^m : (q_0, x_0, \dots, x_{m-1}) \mapsto (q_m, y_0, \dots, y_{m-1})$. We may assume without loss of generality that $Q = \mathbb{B}^r$, $A' = \mathbb{B}^{r'}$, $A'' = \mathbb{B}^{r''}$; then $D : \mathbb{B}^{r+r'} \to \mathbb{B}^{r+r''}$ whereas $D^m : \mathbb{B}^{r+r'm} \to \mathbb{B}^{r+r''m}$.

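The work of the automaton can be represented by this diagram:

[Diagram: a chain of $m$ copies of $D$, read from right to left. The copy with index $j$ receives the state $q_j$ from the right and the input symbol $x_j$ from above; it passes the state $q_{j+1}$ to the left and the output symbol $y_j$ downward. The chain starts with $q_0$ on the right and ends with $q_m$ on the left.]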
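The rule $(q_{j+1}, y_j) = D(q_j, x_j)$ translates directly into a loop. The sketch below shows the sequential (non-parallel) evaluation of $D^m$; the function names and the choice of binary addition as the example automaton (state = carry bit, input symbol = a pair of bits, output symbol = a sum bit) are illustrative assumptions, not part of the text.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Q = TypeVar("Q")   # states
X = TypeVar("X")   # input symbols
Y = TypeVar("Y")   # output symbols


def run_automaton(D: Callable[[Q, X], Tuple[Q, Y]], q0: Q,
                  xs: Sequence[X]) -> Tuple[Q, List[Y]]:
    """Compute D^m: (q_0, x_0..x_{m-1}) -> (q_m, y_0..y_{m-1}) step by step."""
    q, ys = q0, []
    for x in xs:
        q, y = D(q, x)          # (q_{j+1}, y_j) = D(q_j, x_j)
        ys.append(y)
    return q, ys


def add_step(carry: int, bits: Tuple[int, int]) -> Tuple[int, int]:
    """One full-adder step: the state is the carry, the output is a sum bit."""
    total = bits[0] + bits[1] + carry
    return total >> 1, total & 1


# 6 + 3 with bits listed from the least significant position: 110 + 011.
carry, sum_bits = run_automaton(add_step, 0, [(0, 1), (1, 1), (1, 0)])
print(carry, sum_bits)   # carry 1 and sum bits [1, 0, 0], i.e. binary 1001 = 9
```

This loop has depth proportional to $m$, because each step waits for the previous state.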

[3!] 2.11 ("parallelization of iteration"). Let the integers $r$, $r'$, $r''$ and $m$ be fixed; set $k = r + r' + r''$. Construct a circuit of size $\exp(O(k))\,m$ and depth $O(k \log m)$ that receives a transition function $D : \mathbb{B}^{r+r'} \to \mathbb{B}^{r+r''}$ (as a value table), an initial state $q_0$ and input symbols $x_0, \dots, x_{m-1}$, and produces the output $(q_m, y_0, \dots, y_{m-1}) = D^m(q_0, x_0, \dots, x_{m-1})$.

[2!] 2.12. Addition. Construct a circuit of size $O(n)$ and depth $O(\log n)$ that adds two $n$-bit integers.

[3!] 2.13. The following two problems are closely related.

a) Iterated addition. Construct a circuit of size $O(nm)$ and depth $O(\log n + \log m)$ for the addition of $m$ $n$-digit numbers.

b) Multiplication. Construct a circuit with the same parameters for the multiplication of an $n$-digit number by an $m$-digit number.

$^2$ Our definition of an automaton is not standard. We require that the automaton reads and outputs one symbol at each step. Traditionally, an automaton is allowed to either read a symbol, or output a symbol, or both, depending on the current state. The operation of such a general automaton can be represented as the application of an automaton in our sense (with a suitable output alphabet $B$) followed by substitution of a word for each output symbol.


[3!] 2.14. Division. This problem also comes in two variants.

a) Compute the inverse of a real number $x = 1.x_1 x_2 \cdots = 1 + \sum_{j=1}^{\infty} 2^{-j} x_j$ with precision $2^{-n}$. By definition, this requires finding a number $z$ such that $|z - x^{-1}| \le 2^{-n}$. For this to be possible, $x$ must be known with precision $2^{-n}$ or better; let us assume that $x$ is represented by an $(n+1)$-digit number $x'$ such that $|x' - x| \le 2^{-(n+1)}$. Construct an $O(n^2 \log n)$-size, $O((\log n)^2)$-depth circuit for the solution of this problem.

b) Divide two integers with remainder: $(a, b) \mapsto \bigl(\lfloor a/b \rfloor, (a \bmod b)\bigr)$, where $0 \le a < 2^k b$ and $0 < b < 2^n$. In this case, construct a circuit of size $O(nk + k^2 \log k)$ and depth $O(\log n + (\log k)^2)$.

[2!] 2.15. Majority. The majority function $\mathrm{MAJ} : \mathbb{B}^n \to \mathbb{B}$ equals 1 for strings in which the number of 1s is greater than the number of 0s, and equals 0 elsewhere. Construct a circuit of size $O(n)$ and depth $O(\log n)$ that computes the majority function.

[3!] 2.16. Connecting path. Construct a circuit of size $O(n^3 \log n)$ and depth $O((\log n)^2)$ that checks whether two fixed vertices of an undirected graph are connected by a path. The graph has $n$ vertices, labeled $1, \dots, n$; there are $m = n(n-1)/2$ input variables $x_{ij}$ (where $i < j$) indicating whether there is an edge between $i$ and $j$.

2.3.3. Space and width. In the solution to the above problems we strove to provide parallel algorithms, which were described by circuits of poly-logarithmic depth. We now show that, if the circuit size is not taken into account, then uniform computation with poly-logarithmic depth$^3$ is equivalent to computation with poly-logarithmic space.

We are going to study computation with very limited space, so small that the input string would not fit into it. So, let us assume that the input string $x = x_0 \cdots x_{n-1}$ is stored in a supplementary read-only memory, and the Turing machine can access bits of $x$ by their numbers. We may think of the input string as a function $X : j \mapsto x_j$ computed by some external agent, called "oracle". The length of an "oracle query" $j$ is included into the machine work space, but the length of $x$ is not. This way we can access all input bits with space $O(\log n)$ (but no smaller).

Definition 2.2. Let $X : A^* \to A^*$ be a partial function. A Turing machine with oracle $X$ is an ordinary TM with a supplementary tape, in which it can write a string $z$ and have the value of $X(z)$ available for inspection at the next computation step.

In our case $X(z) = x_j$, where $z$ is the binary representation of $j$ ($0 \le j \le n-1$) by $\lceil \log_2 n \rceil$ digits; otherwise $X(z)$ is undefined.

$^3$ As is usual, we consider circuits over a finite complete basis; the fan-in and fan-out are bounded.

The computation of a function $f : \mathbb{B}^n \to \mathbb{B}^m$ by such a machine is defined as follows. Let $x$ ($|x| = n$) be the input. We write $n$ and another number $k$ ($0 \le k < m$) on the machine tape and run the machine. We allow it to query bits of $x$. When the computation is complete, the first cell of the tape must contain the $k$-th bit of $f(x)$. Note that if the work space is limited by $s$, then $n \le 2^s$ and $m \le 2^s$. The computation time is also bounded: the machine either stops within $\exp(O(s))$ steps, or enters a cycle and runs forever.

[2!] 2.17 ("Small depth circuits can be simulated in small space"). Prove that there exists a Turing machine that evaluates the output variables of a circuit of depth $d$ over a fixed basis, using work space $O(d + \log m)$ (where $m$ is the number of the output variables). The input to the machine consists of a description of the circuit and the values of its input variables.

[3!] 2.18 ("Computation with small space is parallelizable"). Let $M$ be a Turing machine. For each choice of $n, m, s$, let $f_{n,m,s} : \mathbb{B}^n \to \mathbb{B}^m$ be the function computed by the machine $M$ with space $s$ (it may be a partial function). Prove that there exists a family of $\exp(O(s))$-size, $O(s^2)$-depth circuits $C_{n,m,s}$ which compute the functions $f_{n,m,s}$.

(These circuits can be constructed by a TM with space $O(s)$, but we will not prove that.)

The reader might wonder why we discuss the space restriction in terms of Turing machines while the circuit language is apparently more convenient. So, let us ask this question: what is the circuit analog of the computation space? The obvious answer is that it is the circuit width.

Let $C$ be a circuit whose gates are arranged into $d$ layers. The width of $C$ is the maximum amount of information carried between the layers, not including the input variables. More formally, for each $l = 1, \dots, d$ we define $w_l$ to be the number of auxiliary variables from layers $1, \dots, l$ that are output variables or connected to some variables from layers $l+1, \dots, d$ (i.e., used in the right-hand side of the corresponding assignments). Then the width of $C$ is $w = \max\{w_1, \dots, w_d\}$.
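The formal definition of $w_l$ can be checked on small examples with a few lines of Python. The sketch below is only an illustration; the representation of a layered circuit by three dictionaries/sets (`layer`, `uses`, `outputs`) and the variable names are assumptions made for this example.

```python
from typing import Dict, List, Set


def circuit_width(layer: Dict[str, int], uses: Dict[str, List[str]],
                  outputs: Set[str]) -> int:
    """Width w = max_l w_l of a layered circuit.

    layer[v]  : layer number (1..d) of the auxiliary variable v
    uses[v]   : variables on the right-hand side of the assignment defining v
    outputs   : the auxiliary variables that are outputs of the circuit
    Input variables appear only inside `uses` and are never counted.
    """
    d = max(layer.values())
    widths = []
    for l in range(1, d + 1):
        live = {v for v, lv in layer.items() if lv <= l and v in outputs}
        for v, lv in layer.items():
            if lv > l:                       # v is computed after layer l ...
                for a in uses[v]:            # ... and reads a from layers 1..l
                    if a in layer and layer[a] <= l:
                        live.add(a)
        widths.append(len(live))             # this is w_l
    return max(widths)


# OR of x1..x4 as a chain: only one auxiliary value is carried at a time.
chain_layer = {"y1": 1, "y2": 2, "y3": 3}
chain_uses = {"y1": ["x1", "x2"], "y2": ["y1", "x3"], "y3": ["y2", "x4"]}
# The same function as a balanced tree: two values cross the first layer.
tree_layer = {"y1": 1, "y2": 1, "y3": 2}
tree_uses = {"y1": ["x1", "x2"], "y2": ["x3", "x4"], "y3": ["y1", "y2"]}
print(circuit_width(chain_layer, chain_uses, {"y3"}),
      circuit_width(tree_layer, tree_uses, {"y3"}))
```

In this toy comparison the deeper chain has the smaller width, because input variables are not counted and can be read whenever they are needed.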

But here comes a little surprise: any Boolean function can be computed by a circuit of bounded width (see Problem 2.19 below). Therefore the width is a rather meaningless parameter, unless we put some other restrictions on the circuit. To characterize computation with limited space (e.g., the class PSPACE), one has to use either Turing machines, or some class of circuits with regular structure.

[3] 2.19 (Barrington [8]). Let $C$ be a formula of depth $d$ that computes a Boolean function $f(x_1, \dots, x_n)$. Construct a circuit of size $\exp(O(d))$ and width $O(1)$ that computes the same function.

3. The class NP: Reducibility and completeness

3.1. Nondeterministic Turing machines. NP is the class of predicates recognizable in polynomial time by "nondeterministic Turing machines." (The word "nondeterministic" is not appropriate but widely used.)

The class NP is defined only for predicates. One says, for example, that "the property that a graph has a Hamiltonian cycle belongs to NP". (A Hamiltonian cycle is a cycle that traverses all vertices exactly once.)

We give several definitions of this class. The first uses nondeterministic Turing machines. A nondeterministic Turing machine (NTM) resembles an ordinary (deterministic) machine, but can nondeterministically choose one of several actions possible in a given configuration. More formally, a transition function of an NTM is multivalued: for each pair (state, symbol) there is a set of possible actions. Each action has the form (new state, new symbol, shift). If the set of possible actions has cardinality at most 1 for each state-symbol combination, we get an ordinary Turing machine.

A computational path of an NTM is determined by a choice of one of the possible transitions at each step; different paths are possible for the same input.

Definition 3.1. A predicate $L$ belongs to the class NP if there exists an NTM $M$ and a polynomial $p(n)$ such that

$L(x) = 1 \Rightarrow$ there exists a computational path that gives answer "yes" in time not exceeding $p(|x|)$;

$L(x) = 0 \Rightarrow$ (version 1) there is no path with this property; (version 2) ... and, moreover, there is no path (of any length) that gives answer "yes".

Remark 3.1. Versions 1 and 2 are equivalent. Indeed, let an NTM $M_1$ satisfy version 1 of the definition. To exclude "yes" answers for long computational paths, it suffices to simulate $M_1$ while counting its steps, and to abort the computation after $p(|x|)$ steps.

Remark 3.2. The argument in Remark 3.1 has a subtle error. If the coefficients of the polynomial $p$ are noncomputable, difficulties may arise when we have to compare the number of steps with the value of the polynomial. In order to avoid this complication we will add to Definition 3.1 an additional requirement: $p(n)$ has integer coefficients.

Remark 3.3. By definition P $\subseteq$ NP. Is this inclusion strict? Rather intense although unsuccessful attempts have been made over the past 30 years to prove the strictness. Recently S. Smale included the P $\stackrel{?}{=}$ NP problem in the list of most important mathematical problems for the new century (the other problems are the Riemann hypothesis and the Poincaré conjecture). More practical people can dream of the $1,000,000 that the Clay Institute offers for the solution of this problem.

Now we give another definition of the class NP, which looks more natural. It uses the notion of a polynomially decidable predicate of two variables: a predicate $R(x, y)$ (where $x$ and $y$ are strings) is polynomially decidable (decidable in polynomial time) if there is a (deterministic) TM that computes it in time $\mathrm{poly}(|x|, |y|)$ (which means $\mathrm{poly}(|x| + |y|)$ or $\mathrm{poly}(\max\{|x|, |y|\})$; these two expressions are equivalent).

Definition 3.2. A predicate $L$ belongs to the class NP if it can be represented as

$$L(x) = \exists y\, \bigl((|y| < q(|x|)) \wedge R(x, y)\bigr),$$

where $q$ is a polynomial (with integer coefficients), and $R$ is a predicate of two variables decidable in polynomial time.

Remark 3.4. Let $R(x, y)$ = "$y$ is a Hamiltonian cycle in the graph $x$". More precisely, we should say: "$x$ is a binary encoding of some graph, and $y$ is an encoding of a Hamiltonian cycle in that graph". Take $q(n) = n$. Then $L(x)$ means that graph $x$ has a Hamiltonian cycle. (We assume that the encoding of any cycle in a graph is shorter than the encoding of the graph itself.)
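To make the predicate $R(x, y)$ of Remark 3.4 tangible, here is a minimal Python sketch of a polynomial-time check. The concrete encoding is an assumption for illustration: the graph $x$ is given as a vertex count `n` and an edge list over vertices `0..n-1`, the candidate $y$ as a list of vertices in cycle order, and graphs are assumed to have at least 3 vertices.

```python
from typing import List, Sequence, Tuple


def is_hamiltonian_cycle(n: int, edges: Sequence[Tuple[int, int]],
                         cycle: List[int]) -> bool:
    """R(x, y): does `cycle` visit every vertex once, using only edges of the graph?"""
    # Every vertex must appear exactly once.
    if len(cycle) != n or set(cycle) != set(range(n)):
        return False
    edge_set = {frozenset(e) for e in edges}          # undirected edges
    # Consecutive vertices (cyclically) must be joined by an edge.
    return all(frozenset((cycle[i], cycle[(i + 1) % n])) in edge_set
               for i in range(n))


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_hamiltonian_cycle(4, square, [0, 1, 2, 3]))   # a valid certificate
print(is_hamiltonian_cycle(4, square, [0, 2, 1, 3]))   # uses the non-edge 0-2
```

Checking a proposed cycle is easy; the definition of NP says nothing about how hard it is to find one.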

Theorem 3.1. Definitions 3.1 and 3.2 are equivalent.

Proof. Definition 3.1 $\Rightarrow$ Definition 3.2. Let $M$ be an NTM and let $p(n)$ be the polynomial of the first definition. Consider the predicate $R(x, y)$ = "$y$ is a description of a computational path that starts with input $x$, ends with answer 'yes', and takes at most $p(|x|)$ steps". Such a description has length proportional to the computation time if an appropriate encoding is used (and even if we use a table as in the proof of Theorem 2.2, the description length is at most quadratic). Therefore for $q(n)$ in the second definition we can take $O(p(n))$ (or $O(p^2(n))$ if we use a less efficient encoding).

It remains to prove that the predicate $R$ belongs to P. This is almost obvious. We must check that we are presented with a valid description of some computational path (this is a polynomial task), and that this path starts with $x$, takes at most $p(|x|)$ steps, and ends with "yes" (that is also easy).

Definition 3.2 $\Rightarrow$ Definition 3.1. Let $R, q$ be as in Definition 3.2. We construct an NTM $M$ for Definition 3.1. $M$ works in two stages.

First, $M$ nondeterministically guesses $y$. More precisely, this means that $M$ goes to the end of the input string, moves one cell to the right, writes #, moves once more to the right, and then writes some string $y$ ($M$'s nondeterministic rules allow it to write any symbol and then move to the right, or finish writing). After that, the tape has the form $x\#y$ for some $y$, and $M$ goes to the second stage.

At the second stage $M$ checks that $|y| < q(|x|)$ (note that $M$ can write a very long $y$ for any given $x$) and computes $R(x, y)$ (using the polynomial algorithm that exists according to Definition 3.2). If $x \in L$, then there is some $y$ such that $|y| < q(|x|)$ and $R(x, y)$. Therefore $M$ has, for $x$, a computational path of polynomial length ending with "yes". If $x \notin L$, no computational path ends with "yes". □

Now we proceed to yet another description of NP that is just a reformulation of Definition 3.2, but which has a form that can be used to define other complexity classes.

Definition 3.3. Imagine two persons: King Arthur, whose mental abilities are polynomially bounded, and a wizard Merlin, who is intellectually omnipotent and knows everything. A is interested in some property $L(x)$ (he wants, for example, to be sure that some graph $x$ has a Hamiltonian cycle). M wants to convince A that $L(x)$ is true. But A does not trust M ("he is too clever to be loyal") and wants to make sure $L(x)$ is true, not just believe M.

So Arthur arranges that, after both he and Merlin see the input string $x$, M writes a note to A where he "proves" that $L(x)$ is true. Then A verifies this proof by some polynomial proof-checking procedure.

The proof-checking procedure is a polynomial predicate

$R(x, y)$ = "$y$ is a proof of $L(x)$".

It should satisfy two properties:

$L(x) = 1 \Rightarrow$ M can convince A that $L(x)$ is true by presenting some proof $y$ such that $R(x, y)$;

$L(x) = 0 \Rightarrow$ whatever M says, A is not convinced: $R(x, y)$ is false for any $y$.

Moreover, the proof $y$ should have polynomial (in $|x|$) length, otherwise A cannot check $R(x, y)$ in polynomial (in $|x|$) time.

In this way, we arrive precisely at Definition 3.2.

3.2. Reducibility and NP-completeness. The notion of reducibility allows us to verify that a predicate is at least as difficult as some other predicate.

Definition 3.4 (Karp reducibility). A predicate $L_1$ is reducible to a predicate $L_2$ if there exists a function $f \in$ P such that $L_1(x) = L_2(f(x))$ for any input string $x$.

We say that $f$ reduces $L_1$ to $L_2$. Notation: $L_1 \propto L_2$.

Karp reducibility is also called "polynomial reducibility" (or just "reducibility").

Lemma 3.2. Let $L_1 \propto L_2$. Then

(a) $L_2 \in$ P $\Rightarrow$ $L_1 \in$ P;

(b) $L_1 \notin$ P $\Rightarrow$ $L_2 \notin$ P;

(c) $L_2 \in$ NP $\Rightarrow$ $L_1 \in$ NP.

Proof. To prove (a) let us note that $|f(x)| = \mathrm{poly}(|x|)$ for any $f \in$ P. Therefore, we can decide $L_1(x)$ in polynomial time as follows: we compute $f(x)$ and then compute $L_2(f(x))$.

Part (b) follows from (a).

Part (c) is also simple. It can be explained in various ways. Using the Arthur-Merlin metaphor, we say that Merlin communicates to Arthur a proof that $L_2(f(x))$ is true (if it is true). Then Arthur computes $f(x)$ by himself and checks whether $L_2(f(x))$ is true, using Merlin's proof.

Using Definition 3.2, we can explain the same thing as follows:

$$L_1(x) \Leftrightarrow L_2(f(x)) \Leftrightarrow \exists y\, \bigl((|y| < q(|f(x)|)) \wedge R(f(x), y)\bigr) \Leftrightarrow \exists y\, \bigl((|y| < r(|x|)) \wedge R'(x, y)\bigr).$$

Here $R'(x, y)$ stands for $(|y| < q(|f(x)|)) \wedge R(f(x), y)$, and $r(n) = q(h(n))$, where $h(n)$ is a polynomial bound for the time needed to compute $f(x)$ for any string $x$ of length $n$ (and, therefore, $|f(x)| \le h(|x|)$ for any $x$). □

Definition 3.5. A predicate $L \in$ NP is NP-complete if any predicate in NP is reducible to it.

If some NP-complete predicate can be computed in time $T(n)$, then any NP-predicate can be computed in time $\mathrm{poly}(n) + T(\mathrm{poly}(n))$. Therefore, if some NP-complete predicate belongs to P, then P = NP. Put it this way: if P $\ne$ NP (which is probably true), then no NP-complete predicate belongs to P.

If we measure running time "up to a polynomial", then we can say that NP-complete predicates are the "most difficult" ones in NP.

The key result in computational complexity says that NP-complete predicates do exist. Here is one of them, called satisfiability: $\mathit{SAT}(x)$ means that $x$ is a propositional formula (containing Boolean variables and operations $\neg$, $\wedge$, and $\vee$) that is satisfiable, i.e., true for some values of the variables.

Theorem 3.3 (Cook, Levin).

(1) $\mathit{SAT} \in$ NP;

(2) $\mathit{SAT}$ is NP-complete.

Corollary. If $\mathit{SAT} \in$ P, then P = NP.

Proof of Theorem 3.3. (1) To convince Arthur that a formula is satisfiable, Merlin needs only to show him the values of the variables that make it true. Then Arthur can compute the value of the whole formula by himself.

(2) Let $L(x)$ be an NP-predicate and

$$L(x) = \exists y\, \bigl((|y| < q(|x|)) \wedge R(x, y)\bigr)$$

for some polynomial $q$ and some predicate $R$ decidable in polynomial time. We need to prove that $L$ is reducible to $\mathit{SAT}$. Let $M$ be a Turing machine that computes $R$ in polynomial time. Consider the computation table (see the proof of Theorem 2.2) for $M$ working on some input $x\#y$. We will use the same variables as in the proof of Theorem 2.2. These variables encode the contents of cells in the computation table.

Now we write a formula that says that values of variables form an encoding of a successful computation (with answer "yes"), starting with the input $x\#y$. To form a valid computation table, values should obey some local rules for each four cells configured as follows:

[Figure: three adjacent cells in one row of the computation table and the cell directly below the middle one.]

These local rules can be written as formulas in $4t$ variables (if $t$ variables are needed to encode one cell). We write the conjunction of these formulas and add formulas saying that the first line contains the input string $x$ followed by # and some binary string $y$, and that the last line contains the answer "yes".

The satisfying assignment for our formula will be an encoding of a successful computation of $M$ on input $x\#y$ (for some binary string $y$). On the other hand, any successful computation that uses at most $S$ tape cells and requires at most $T$ steps (where $T \times S$ is the size of the computation table that is encoded) can be transformed into a satisfying assignment.

Therefore, if we consider a computational table that is large enough to contain the computation of $R(x, y)$ for any $y$ such that $|y| < q(|x|)$, and write the corresponding formula as explained above, we get a polynomial-size formula that is satisfiable if and only if $L(x)$ is true. Therefore $L$ is reducible to $\mathit{SAT}$. □

Other examples of NP-complete problems (predicates) can be obtained using the following lemma.

Lemma 3.4. If $\mathit{SAT} \propto L$ and $L \in$ NP, then $L$ is NP-complete. More generally, if $L_1$ is NP-complete, $L_1 \propto L_2$, and $L_2 \in$ NP, then $L_2$ is NP-complete.

Proof. The reducibility relation is transitive: if $L_1 \propto L_2$ and $L_2 \propto L_3$, then $L_1 \propto L_3$. (Indeed, the composition of two functions from P belongs to P.) According to the hypothesis, any NP-problem is reducible to $L_1$, and $L_1$ is reducible to $L_2$. Therefore any NP-problem is reducible to $L_2$. □

Now let us consider the satisfiability problem restricted to 3-CNF. Recall that a CNF (conjunctive normal form) is a conjunction of clauses; each clause is a disjunction of literals; each literal is either a variable or a negation of a variable. If each clause contains at most three literals, we get a 3-CNF. By 3-$\mathit{SAT}$ we denote the following predicate:

3-$\mathit{SAT}(x)$ = "$x$ is a satisfiable 3-CNF".

Evidently, 3-$\mathit{SAT}$ is reducible to $\mathit{SAT}$ (because any 3-CNF is a formula). The next theorem shows that the reverse is also true: $\mathit{SAT}$ is reducible to 3-$\mathit{SAT}$. Therefore 3-$\mathit{SAT}$ is NP-complete (by Lemma 3.4).

Theorem 3.5. $\mathit{SAT} \propto$ 3-$\mathit{SAT}$.

Proof. For any propositional formula (and even for any circuit over the standard basis $\{\mathrm{AND}, \mathrm{OR}, \mathrm{NOT}\}$), we construct a 3-CNF that is satisfiable if and only if the given formula is satisfiable (the given circuit produces output 1 for some input).

Let $x_1, \dots, x_n$ be input variables of the circuit, and let $y_1, \dots, y_s$ be auxiliary variables (see the definition of a circuit). Each assignment involves at most three variables (1 on the left-hand side, and 1 or 2 on the right-hand side).

Now we construct a 3-CNF that has variables $x_1, \dots, x_n, y_1, \dots, y_s$ and is true if and only if the values of all $y_j$ are correctly computed (i.e., they coincide with the right-hand sides of the assignments) and the last variable is true. To this end, we replace each assignment by an equivalence (of Boolean expressions) and represent this equivalence as a 3-CNF:

$$\bigl(y \Leftrightarrow (x_1 \vee x_2)\bigr) = (x_1 \vee x_2 \vee \neg y) \wedge (\neg x_1 \vee x_2 \vee y) \wedge (x_1 \vee \neg x_2 \vee y) \wedge (\neg x_1 \vee \neg x_2 \vee y),$$

$$\bigl(y \Leftrightarrow (x_1 \wedge x_2)\bigr) = (x_1 \vee x_2 \vee \neg y) \wedge (\neg x_1 \vee x_2 \vee \neg y) \wedge (x_1 \vee \neg x_2 \vee \neg y) \wedge (\neg x_1 \vee \neg x_2 \vee y),$$

$$(y \Leftrightarrow \neg x) = (x \vee y) \wedge (\neg x \vee \neg y).$$

Finally, we take the conjunction of all these 3-CNF's and the variable $y_s$ (the latter represents the condition that the output of the circuit is 1).

Let us assume that the resulting 3-CNF is satisfied by some $x_1, \dots, x_n, y_1, \dots, y_s$. If we plug the same values of $x_1, \dots, x_n$ into the circuit, then the auxiliary variables will be equal to $y_1, \dots, y_s$, so the circuit output will be $y_s = 1$. Conversely, if the circuit produces 1 for some inputs, then the 3-CNF is satisfied by the same values of $x_1, \dots, x_n$ and appropriate values of the auxiliary variables.

So our transformation (of a circuit into a 3-CNF) is indeed a reduction of $\mathit{SAT}$ to 3-$\mathit{SAT}$. □
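The transformation in this proof is mechanical enough to write down as a short program. The Python sketch below is an illustration under assumptions of its own: variables are numbered $1, \dots, n$ for the inputs and $n+1, \dots, n+s$ for the auxiliary variables, a literal is a signed integer, and the exponential-time `satisfiable` helper exists only to check small examples.

```python
from itertools import product
from typing import List, Tuple

Gate = Tuple[str, Tuple[int, ...]]     # (operation, argument variable numbers)
Clause = Tuple[int, ...]               # literal k means variable k, -k its negation


def circuit_to_3cnf(n: int, gates: List[Gate]) -> List[Clause]:
    """Replace each assignment by the clauses of its equivalence; add y_s."""
    clauses: List[Clause] = []
    for j, (op, args) in enumerate(gates):
        y = n + 1 + j                  # the variable defined by this assignment
        if op == "NOT":
            (x,) = args
            clauses += [(x, y), (-x, -y)]
        elif op == "OR":
            x1, x2 = args
            clauses += [(x1, x2, -y), (-x1, x2, y), (x1, -x2, y), (-x1, -x2, y)]
        elif op == "AND":
            x1, x2 = args
            clauses += [(x1, x2, -y), (-x1, x2, -y), (x1, -x2, -y), (-x1, -x2, y)]
        else:
            raise ValueError(f"unknown operation {op!r}")
    clauses.append((n + len(gates),))  # the last variable must be true
    return clauses


def satisfiable(num_vars: int, clauses: List[Clause]) -> bool:
    """Brute-force check over all assignments (for tiny examples only)."""
    return any(all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
               for bits in product([False, True], repeat=num_vars))


# x1 AND (NOT x1) is unsatisfiable; x1 OR x2 is satisfiable.
contradiction = circuit_to_3cnf(1, [("NOT", (1,)), ("AND", (1, 2))])
disjunction = circuit_to_3cnf(2, [("OR", (1, 2))])
print(satisfiable(3, contradiction), satisfiable(3, disjunction))
```

The number of clauses is at most four per assignment plus one, so the 3-CNF has size linear in the size of the circuit.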

Here is another simple example of reduction.

ILP (integer linear programming). Given a system of linear inequalities with integer coefficients, is there an integer solution? (In other words, is the system consistent?)

In this problem the input is the coefficient matrix and the vector of the right-hand sides of the inequalities. It is not obvious that $\mathit{ILP} \in$ NP. Indeed, the solution might exist, but Merlin might not be able to communicate it to Arthur because it is not immediately clear that the number of bits needed to represent the solution is polynomial.

However, it is in fact true that, if a system of inequalities with integer coefficients has an integer solution, then it has an integer solution whose binary representation has size bounded by a polynomial in the bit size of the system; see [55, vol. 2, §17.1]. Therefore, $\mathit{ILP}$ is in NP.

Now we reduce 3-SAT to ILP. Assume that a 3-CNF is given. We construct a system of inequalities that has integer solutions if and only if the 3-CNF is satisfiable. For each Boolean variable $x_i$ we consider an integer variable $p_i$. The negation $\neg x_i$ corresponds to the expression $1 - p_i$. Each clause $X_j \vee X_k \vee X_m$ (where $X_j, X_k, X_m$ are literals) corresponds to the inequality $P_j + P_k + P_m \ge 1$, where $P_j, P_k, P_m$ are the expressions corresponding to $X_j, X_k, X_m$. It remains to add the inequalities $0 \le p_i \le 1$ for all $i$, and we get a system whose solutions are satisfying assignments for the given 3-CNF.
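As an illustration, the reduction of 3-SAT to ILP can be written out as follows. The encoding is an assumption made for this sketch: a literal is a signed integer, and each inequality is stored as a pair `(coeffs, bound)` meaning $\sum_i \mathit{coeffs}_i\, p_i \ge \mathit{bound}$. The function names are hypothetical.

```python
from itertools import product

def cnf_to_ilp(num_vars, clauses):
    """Build the inequalities for a CNF; each row (coeffs, bound) means coeffs . p >= bound."""
    rows = []
    for clause in clauses:
        coeffs = [0] * num_vars
        bound = 1
        for lit in clause:
            i = abs(lit) - 1
            if lit > 0:
                coeffs[i] += 1       # literal x_i contributes p_i
            else:
                coeffs[i] -= 1       # literal not-x_i contributes 1 - p_i;
                bound -= 1           # the constant 1 is moved to the right-hand side
        rows.append((coeffs, bound))
    for i in range(num_vars):
        lower = [0] * num_vars
        lower[i] = 1
        rows.append((lower, 0))      # p_i >= 0
        upper = [0] * num_vars
        upper[i] = -1
        rows.append((upper, -1))     # -p_i >= -1, i.e. p_i <= 1
    return rows

def feasible(rows, p):
    """Check whether the integer vector p satisfies every inequality."""
    return all(sum(c * v for c, v in zip(coeffs, p)) >= bound for coeffs, bound in rows)

def cnf_value(clauses, p):
    """Evaluate the CNF on a 0/1 vector p."""
    return all(any((p[abs(lit) - 1] == 1) == (lit > 0) for lit in c) for c in clauses)

example = [[1, -2, 3], [-1, 2, 3], [-1, -2, -3]]
rows = cnf_to_ilp(3, example)
for p in product([0, 1], repeat=3):
    print(p, feasible(rows, p), cnf_value(example, p))
```

On 0/1 vectors the two printed Booleans agree, and the bounds $0 \le p_i \le 1$ exclude every other integer vector.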


Remark 3.5. If we do not require the solution to be integer-valued, we get the standard linear programming problem. Polynomial algorithms for the solution of this problem (due to Khachiyan and Karmarkar) are described, e.g., in [55, vol. 1, §§13, 15.1].

An extensive list of NP-complete problems can be found in [29]. Usually NP-completeness is proved by some reduction. Here are several examples of NP-complete problems.

3-coloring. For a given graph $G$ determine whether it admits a 3-coloring. (By a 3-coloring we mean a coloring of the vertices with 3 colors such that each edge has endpoints of different colors.)

(It turns out that a similar 2-coloring problem can be solved in polynomial time.)

Clique. For a graph $G$ and an integer $k$ determine whether the graph has a $k$-clique (a set of $k$ vertices such that every two of its elements are connected by an edge).

Problems

[3] 3.1. Prove that one can check the satisfiability of a 2-CNF (a conjunction of disjunctions, each containing two literals) in polynomial time.

[2] 3.2. Prove that the problem of the existence of an Euler cycle in an undirected graph (an Euler cycle is a cycle that traverses each edge exactly once) belongs to P.

[1!] 3.3. Suppose we have an NP-oracle, a magic device that can immediately solve any instance of the SAT problem for us. In other words, for any propositional formula the oracle tells whether it is satisfiable or not. Prove that there is a polynomial-time algorithm that finds a satisfying assignment to a given formula by making a polynomial number of queries to the oracle. (A similar statement is true for the Hamiltonian cycle: finding a Hamiltonian cycle in a graph is at most polynomially harder than checking for its existence.)

[3!] 3.4. There are $n$ boys and $n$ girls. It is known which boys agree to dance with which girls and vice versa. We want to know whether there exists a perfect matching (the boys and the girls can dance in pairs so that everybody is satisfied). Prove that this problem belongs not only to NP (which is almost evident), but also to P.

3.5. Construct

[2!] (a) a polynomial reduction of the 3-SAT problem to the clique problem;

[3] (b) a polynomial reduction of 3-SAT to Clique that conserves the number of solutions (if a 3-CNF $F$ is transformed into a graph $H$ and an integer $k$, then the number of satisfying assignments for $F$ equals the number of $k$-cliques in $H$).

3.6. Construct

[2!] (a) a polynomial reduction of 3-SAT to 3-coloring;

[3] (b) the same as for (a), with the additional requirement that the number of satisfying assignments is one sixth of the number of 3-colorings of the corresponding graph.

[2] 3.7. A tile is a $(1 \times 1)$-square whose sides are marked with letters. We want to cover an $(n \times n)$-square with $n^2$ tiles; it is known which letters are allowed at the boundary of the $(n \times n)$-square and which letters can be adjacent.

The tiling problem requires us to determine, for a given set of tile types (containing at most poly$(n)$ types) and for given restrictions, whether or not there exists a tiling of the $(n \times n)$-square.

Prove that the tiling problem is NP-complete.

[1] 3.8. Prove that the predicate "$x$ is the binary representation of a composite integer" belongs to NP.

[3] 3.9. Prove that the predicate "$x$ is the binary representation of a prime integer" belongs to NP.

4. Probabilistic algorithms and the class BPP

4.1. Definitions. Amplification of probability. A probabilistic Turing machine (PTM) is somewhat similar to a nondeterministic one; the difference is that choice is produced by coin tossing, not by guessing. More formally, some (state, symbol) combinations have two associated actions, and the choice between them is made probabilistically. Each instance of this choice is controlled by a random bit. We assume that each random bit is 0 or 1 with probability 1/2 and that different random bits are independent.

(In fact we can replace 1/2 by, say, 1/3 and get almost the same definition; the class BPP (see below) remains the same. However, if we replace 1/2 by some noncomputable real $p$, we get a rather strange notion which is better avoided.)

For a given input string a PTM generates not a unique output string, but a probability distribution on the set of all strings (different values of the random bits lead to different computation outputs, and each possible output has a certain probability).


Definition 4.1. Let $\varepsilon$ be a constant such that $0 < \varepsilon < 1/2$. A predicate $L$ belongs to the class BPP if there exist a PTM $M$ and a polynomial $p(n)$ such that the machine $M$ running on input string $x$ always terminates after at most $p(|x|)$ steps, and

$L(x) = 1 \Rightarrow M$ gives the answer "yes" with probability $\ge 1 - \varepsilon$;

$L(x) = 0 \Rightarrow M$ gives the answer "yes" with probability $\le \varepsilon$.

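In this definition the admissible error probability $\varepsilon$ can be, say, 0.49 or $10^{-10}$; the class BPP will remain the same. Why? Assume that the PTM has probability of error at most $\varepsilon < 1/2$. Take $k$ copies of this machine, run them all for the same input (using independent random bits) and let them vote. Formally, we apply the majority function MAJ (see Problem 2.15) to the $k$ individual outputs. The "majority opinion" will be wrong with probability

$$p_{\mathrm{error}} \le \sum_{\substack{S \subseteq \{1,\dots,k\},\\ |S| \le k/2}} (1-\varepsilon)^{|S|}\,\varepsilon^{k-|S|}
= \bigl((1-\varepsilon)\varepsilon\bigr)^{k/2} \sum_{\substack{S \subseteq \{1,\dots,k\},\\ |S| \le k/2}} \Bigl(\frac{\varepsilon}{1-\varepsilon}\Bigr)^{k/2-|S|}
< \Bigl(\sqrt{(1-\varepsilon)\varepsilon}\Bigr)^{k}\, 2^{k} = \lambda^{k}, \qquad \text{where } \lambda = 2\sqrt{\varepsilon(1-\varepsilon)} < 1. \tag{4.1}$$

(Here $S$ ranges over the possible sets of copies that give the correct answer; the vote fails only if $|S| \le k/2$.)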

If $k$ is big enough, the effective error probability will be as small as we wish. This is called amplification of probability. To make the quantity $p_{\mathrm{error}}$ smaller than a given number $\varepsilon'$, we need to set $k = \Theta(\log(1/\varepsilon'))$. (Since $\varepsilon$ and $\varepsilon'$ are constants, $k$ does not depend on the input. Even if we require that $\varepsilon' = \exp(-p(n))$ for an arbitrary polynomial $p$, the composite TM still runs in polynomial time.)
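A quick numerical check of the amplification bound may be helpful. The sketch below assumes the worst case in which every copy errs with probability exactly $\varepsilon$, and compares the exact probability that at most $k/2$ copies are correct with the bound $\lambda^k$; the function names are illustrative only.

```python
from math import comb, sqrt

def majority_error(eps, k):
    """Exact probability that at most k/2 of k independent runs are correct,
    when each run is wrong with probability eps."""
    return sum(comb(k, s) * (1 - eps) ** s * eps ** (k - s)
               for s in range(k + 1) if 2 * s <= k)

def lambda_bound(eps, k):
    """The bound lambda^k with lambda = 2*sqrt(eps*(1 - eps))."""
    return (2 * sqrt(eps * (1 - eps))) ** k

for k in (1, 11, 51, 201):
    print(k, majority_error(0.3, k), lambda_bound(0.3, k))
```

Both columns decrease exponentially in $k$, which is why $k = \Theta(\log(1/\varepsilon'))$ repetitions suffice.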

Let us give an equivalent definition of the class BPP using predicates in two variables (this definition is similar to Definition 3.2).

Definition 4.2. A predicate $L$ belongs to BPP if there exist a polynomial $p$ and a predicate $R$, decidable in polynomial time, such that

$L(x) = 1 \Rightarrow$ the fraction of strings $r$ of length $p(|x|)$ satisfying $R(x, r)$ is greater than $1 - \varepsilon$;

$L(x) = 0 \Rightarrow$ the fraction of strings $r$ of length $p(|x|)$ satisfying $R(x, r)$ is less than $\varepsilon$.

Theorem 4.1. Definitions 4.1 and 4.2 are equivalent.

Proof. Definition 4.1 $\Rightarrow$ Definition 4.2. Let $R(x, r)$ be the following predicate: "$M$ says 'yes' for the input $x$ using $r_1,\dots,r_{p(n)}$ as the random bits" (we assume that a coin is tossed at each step of $M$). It is easy to see that the requirements of Definition 4.2 are satisfied.

Definition 4.2 $\Rightarrow$ Definition 4.1. Assume that $p$ and $R$ are given. Consider a PTM that (for input $x$) randomly chooses a string $r$ of length $p(|x|)$, making $p(|x|)$ coin tosses, and then computes $R(x, r)$. This machine satisfies Definition 4.1 (with a different polynomial $p'$ instead of $p$). □

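[Figure 4.1: the set $S$ drawn in the $(x, y)$ plane, with the regions $x \in L$ and $x \notin L$ marked along the $x$ axis.]

Fig. 4.1. The characteristic set of the predicate $R(x, y)$. Vertical lines represent the sets $S_x$.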

Definition 4.2 is illustrated in Figure 4.1. We represent a pair $(x, y)$ of strings as a point and draw the set $S = \{(x, y) : (|y| = p(|x|)) \wedge R(x, y)\}$. For each $x$ we consider the $x$-section of $S$ defined as $S_x = \{y : (x, y) \in S\}$. The set $S$ is rather special in the sense that, for any $x$, either $S_x$ is large (contains at least a $1 - \varepsilon$ fraction of all strings of length $p(|x|)$) or is small (contains at most an $\varepsilon$ fraction of them). Therefore all values of $x$ are divided into two categories: for one of them $L(x)$ is true and for the other $L(x)$ is false.

Remark 4.1. Probabilistic Turing machines (unlike nondeterministic ones, which depend on almighty Merlin for guessing the computational path) can be considered as real computing devices. Physical processes like the Nyquist noise or radioactive decay are believed to provide sources of random bits; in the latter case it is guaranteed by the very nature of quantum mechanics.

4.2. Primality testing. A classic example of a BPP problem is checking whether a given integer $q$ (represented by $n = \lceil \log_2 q \rceil$ bits) is prime or not. We call this problem Primality. We will describe a probabilistic primality test, called the Miller-Rabin test. For the reader's convenience all necessary facts from number theory are given in Appendix A.


4.2.1. Main idea. We begin with a much simpler "Fermat test", though its results are generally inconclusive. It is based on Fermat's little theorem (Theorem A.9), which says that

if $q$ is prime, then $a^{q-1} \equiv 1 \pmod q$ for $a \in \{1,\dots,q-1\}$.

We may regard $a$ as a (mod $q$)-residue and simply write $a^{q-1} = 1$, assuming that arithmetic operations are performed with residues rather than integers. So, the test is this: we pick a random $a$ and check whether $a^{q-1} = 1$. If this is true, then $q$ may be a prime; but if this is false, then $q$ is not a prime. Such an $a$ can be called a witness saying that $q$ is composite. This kind of evidence is indirect (it does not give us any factor of $q$) but usually easy to find: it often suffices to check $a = 2$. But we will do a better job if we sample $a$ from the uniform distribution over the set $\{1,\dots,q-1\}$ (i.e., each element of this set is taken with probability $1/(q-1)$).
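The Fermat test is short enough to write out. The following is a minimal sketch, assuming $q \ge 2$ and using Python's built-in three-argument `pow` for modular exponentiation; the names `is_fermat_witness` and `fermat_test` are made up for this illustration.

```python
import random

def is_fermat_witness(a, q):
    """True if a proves that q is composite, i.e. a^(q-1) != 1 (mod q)."""
    return pow(a, q - 1, q) != 1

def fermat_test(q, trials=10, rng=random):
    """Return False as soon as a witness is found; True means 'q may be prime'."""
    for _ in range(trials):
        a = rng.randrange(1, q)          # uniform over {1, ..., q-1}
        if is_fermat_witness(a, q):
            return False
    return True

print(is_fermat_witness(2, 15))    # a = 2 already shows that 15 is composite
print(is_fermat_witness(2, 341))   # 341 = 11 * 31 passes the test for a = 2 ...
print(is_fermat_witness(3, 341))   # ... but a = 3 is a witness
```

The pair of calls for 341 illustrates why a random choice of $a$ is preferable to a fixed one.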

Suppose $q$ is composite. We want to know if the test actually shows that with nonnegligible probability. Consider two cases.

1) $\gcd(a, q) = d \ne 1$. Then $a^{q-1} \equiv 0 \not\equiv 1 \pmod d$, therefore $a^{q-1} \not\equiv 1 \pmod q$. The test detects that $q$ is composite. Unfortunately, the probability to get such an $a$ is usually small.

2) $\gcd(a, q) = 1$, i.e., $a \in (\mathbb{Z}/q\mathbb{Z})^*$ (where $(\mathbb{Z}/q\mathbb{Z})^*$ denotes the group of invertible (mod $q$)-residues). This is the typical case; let us consider it more closely.

Lemma 4.2. If $a^{q-1} \ne 1$ for some element $a \in (\mathbb{Z}/q\mathbb{Z})^*$, then the Fermat test detects the compositeness of $q$ with probability $\ge 1/2$.

Proof. Let $G = (\mathbb{Z}/q\mathbb{Z})^*$. For any integer $m$ define the following set:

$$G_{(m)} = \{x \in G : x^m = 1\}. \tag{4.2}$$

This is a subgroup in $G$ (due to the identity $a^m b^m = (ab)^m$ for elements of an Abelian group). If $a^{q-1} \ne 1$ for some $a$, then $a \notin G_{(q-1)}$, therefore $G_{(q-1)} \ne G$. By Lagrange's theorem, the ratio $|G| / |G_{(q-1)}|$ is an integer, hence $|G| / |G_{(q-1)}| \ge 2$. It follows that $a^{q-1} \ne 1$ for at least half of $a \in (\mathbb{Z}/q\mathbb{Z})^*$. And, as we already know, $a^{q-1} \ne 1$ for all $a \notin (\mathbb{Z}/q\mathbb{Z})^*$. □

Is it actually possible that $q$ is composite but $a^{q-1} = 1$ for all invertible residues $a$? Such numbers $q$ are rare, but they exist (they are called Carmichael numbers). Example: $q = 561 = 3 \cdot 11 \cdot 17$. Note that the numbers $3 - 1$, $11 - 1$ and $17 - 1$ divide $q - 1$. Therefore $a^{q-1} = 1$ for any $a \in (\mathbb{Z}/q\mathbb{Z})^* \cong \mathbb{Z}_{3-1} \times \mathbb{Z}_{11-1} \times \mathbb{Z}_{17-1}$.
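The claim about $q = 561$ is easy to confirm by brute force. The helper below is a hypothetical name introduced only for this check; it tests $a^{q-1} = 1$ for every invertible residue.

```python
from math import gcd

def passes_fermat_for_all_units(q):
    """True if a^(q-1) = 1 (mod q) for every a in {1, ..., q-1} coprime to q."""
    return all(pow(a, q - 1, q) == 1 for a in range(1, q) if gcd(a, q) == 1)

print(passes_fermat_for_all_units(561))            # composite, yet no invertible witness
print([(561 - 1) % (p - 1) for p in (3, 11, 17)])  # p - 1 divides q - 1 for each prime factor
print(passes_fermat_for_all_units(15))             # an ordinary composite number fails
```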

We see that the Fermat test alone is not sufficient to detect a composite number. The Miller-Rabin test uses yet another type of witnesses for the compositeness: if $b^2 \equiv 1 \pmod q$ and $b \not\equiv \pm 1 \pmod q$ for some $b$, then $q$ is composite. Indeed, in this case $b^2 - 1 = (b - 1)(b + 1)$ is a multiple of $q$ but $b - 1$ and $b + 1$ are not, therefore $q$ has nontrivial factors in common with both $b - 1$ and $b + 1$.

4.2.2. Required subroutines and their complexity. Addition (or subtraction) of $n$-bit integers is done by an $O(n)$-size circuit; multiplication and division are performed by $O(n^2)$-size circuits. These estimates refer to the standard algorithms learned in school, though they are not the most efficient for large integers. In the solutions to Problems 2.12, 2.13 and 2.14 we described alternative algorithms, which are much better in terms of the circuit depth, but slightly worse in terms of the size. If only the size is important, the standard addition algorithm is optimal, but the ones for the multiplication and division are not. For example, an $O(n \log n \log\log n)$-size circuit for the multiplication exists; see [5, Sec. 7.5] or [43, vol. 2, Sec. 4.3.3].

Euclid's algorithm for $\gcd(a, b)$ has complexity $O(n^3)$, but there is also a so-called "binary" gcd algorithm (see [43, vol. 2, Sec. 4.5.2]) of complexity $O(n^2)$. It does not solve the equation $xa + yb = \gcd(a, b)$, though.

We will also use modular arithmetic. It is clear that the addition of (mod $q$)-residues is done by a circuit of size $O(n)$, whereas modular multiplication can be reduced to integer multiplication and division; therefore it is performed by a circuit of size $O(n^2)$ (by the standard technique). To invert a residue $a \in (\mathbb{Z}/q\mathbb{Z})^*$, we need to find an integer $x$ such that $xa \equiv 1 \pmod q$, i.e., $xa + yq = 1$. This is done by the extended Euclid's algorithm, which has complexity $O(n^3)$.
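For concreteness, here is one common iterative form of the extended Euclid's algorithm together with the inversion of a residue. It is a sketch, not the circuit construction discussed in the text; `extended_gcd` and `inverse_mod` are names chosen for this example.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) and x*a + y*b = d."""
    # Invariant: the current a equals x0*A + y0*B and the current b equals x1*A + y1*B,
    # where A, B are the original arguments.
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        quotient = a // b
        a, b = b, a - quotient * b
        x0, x1 = x1, x0 - quotient * x1
        y0, y1 = y1, y0 - quotient * y1
    return a, x0, y0

def inverse_mod(a, q):
    """Return x in {0, ..., q-1} with x*a = 1 (mod q); a must be invertible mod q."""
    d, x, _ = extended_gcd(a % q, q)
    if d != 1:
        raise ValueError("a is not invertible modulo q")
    return x % q

print(extended_gcd(240, 46))
print(inverse_mod(3, 7))
```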

It is also possible to compute $(a^m \bmod q)$ by a polynomial time algorithm. (Note that we speak about an algorithm that is polynomial in the length $n$ of its input $(a, m, q)$; therefore performing $m$ multiplications is out of the question. Note also that the size of the integer $a^m$ is exponential.) But we can compute $(a^{2^k} \bmod q)$ for $k = 1, 2, \dots, \lfloor \log_2 m \rfloor$ by repeated squaring, applying the (mod $q$) reduction at each step. Then we multiply some of the results in such a way that the powers, i.e., the numbers $2^k$, add to $m$ (using the binary representation of $m$). This takes $O(\log m) = O(n)$ modular multiplications, which translates to circuit size $O(n^3)$.
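Repeated squaring can be sketched as follows. The function name `power_mod` is an illustrative choice; Python's built-in `pow(a, m, q)` computes the same value and is used here only as a cross-check.

```python
def power_mod(a, m, q):
    """Compute a^m mod q with O(log m) modular multiplications (m >= 0, q >= 1)."""
    result = 1 % q
    square = a % q                  # holds a^(2^k) mod q, starting with k = 0
    while m > 0:
        if m & 1:                   # the current binary digit of m is 1:
            result = (result * square) % q   # include a^(2^k) in the product
        square = (square * square) % q       # pass from a^(2^k) to a^(2^(k+1))
        m >>= 1
    return result

print(power_mod(2, 560, 561), pow(2, 560, 561))
```

Every intermediate value stays below $q^2$, so the exponential size of $a^m$ never materializes.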

4.2.3. The algorithm. Assume that a number $q$ is given.

Step 1. If $q$ is even (and $q \ne 2$), then $q$ is composite. If $q$ is odd, we proceed to Step 2.

Step 2. Let $q - 1 = 2^k l$, where $k > 0$, and $l$ is odd.

Step 3. We choose a random $a \in \{1, \dots, q - 1\}$.

Step 4. We compute $a^l, a^{2l}, a^{4l}, \dots, a^{2^k l} = a^{q-1}$ modulo $q$.

Test 1. If $a^{q-1} \ne 1$ (where modular arithmetic is assumed), then $q$ is composite.

Test 2. If the sequence $a^l, a^{2l}, \dots, a^{2^k l}$ (Step 4) contains a 1 that is preceded by anything except $\pm 1$, then $q$ is composite. In other words, if there exists $j$ such that $a^{2^j l} \ne \pm 1$ but $a^{2^{j+1} l} = 1$, then $q$ is composite.

In all other cases the algorithm says that "$q$ is prime" (though in fact it is not guaranteed).
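Steps 1 to 4 and Tests 1 and 2 translate almost line by line into code. The sketch below separates a single round with a given $a$ (which is deterministic and easy to test) from the random choice of $a$; the function names, the treatment of $q < 2$, and the string answers are conventions of this example only.

```python
import random

def miller_rabin_round(q, a):
    """One round of the test for a fixed a in {1, ..., q-1}."""
    if q == 2:
        return "prime"
    if q < 2 or q % 2 == 0:            # Step 1
        return "composite"
    k, l = 0, q - 1                    # Step 2: q - 1 = 2^k * l with l odd
    while l % 2 == 0:
        k += 1
        l //= 2
    seq = [pow(a, l, q)]               # Step 4: a^l, a^(2l), ..., a^(2^k l) modulo q
    for _ in range(k):
        seq.append(seq[-1] * seq[-1] % q)
    if seq[-1] != 1:                   # Test 1
        return "composite"
    for prev, cur in zip(seq, seq[1:]):
        if cur == 1 and prev not in (1, q - 1):   # Test 2: a 1 preceded by something other than +-1
            return "composite"
    return "prime"

def miller_rabin(q, rounds=2, rng=random):
    """Repeat the round with independent random a (Step 3)."""
    if q < 2:
        return "composite"             # convention of this sketch
    for _ in range(rounds):
        if miller_rabin_round(q, rng.randrange(1, q)) == "composite":
            return "composite"
    return "prime"

print(miller_rabin_round(561, 2))      # the Carmichael number 561 is caught by Test 2
print(miller_rabin(97))
```

An answer "composite" is always correct; an answer "prime" may be wrong, with a probability that decreases as `rounds` grows.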

Theorem 4.3. If $q$ is prime, then the algorithm always (with probability 1) gives the answer "prime".

If $q$ is composite, then the answer "composite" is given with probability at least 1/2.

Remark 4.2. To get a probabilistic algorithm in the sense of Definition 4.1, we repeat this test twice: the probability of an error (a composite number being undetected twice) is at most $1/4 < 1/2$.

Proof of Theorem 4.3. As we have seen, the algorithm always gives the answer "prime" for prime $q$.

Assume that $q$ is composite (and odd). If $\gcd(a, q) > 1$ then Test 1 shows that $q$ is composite. So, we may assume that $a$ is uniformly distributed over the group $G = (\mathbb{Z}/q\mathbb{Z})^*$. We consider two major cases.

Case A: $q = p^\alpha$, where $p$ is an odd prime, and $\alpha > 1$. We show that there is an invertible (mod $q$)-residue $x$ such that $x^{q-1} \ne 1$, namely $x = (1 + p^{\alpha-1}) \bmod q$. Indeed, $x^{-1} = (1 - p^{\alpha-1}) \bmod q$, and$^4$

$$x^{q-1} \equiv (1 + p^{\alpha-1})^{q-1} = 1 + (q - 1)p^{\alpha-1} + \text{higher powers of } p \equiv 1 - p^{\alpha-1} \not\equiv 1 \pmod q.$$

Therefore Test 1 detects the compositeness of $q$ with probability $\ge 1/2$ (due to Lemma 4.2).

Case B: $q$ has at least two distinct prime factors. Then $q = uv$, where $u$ and $v$ are odd numbers, $u, v > 1$, and $\gcd(u, v) = 1$. The Chinese remainder theorem (Theorem A.5) says that the group $G = (\mathbb{Z}/q\mathbb{Z})^*$ is isomorphic to the direct product $U \times V$, where $U = (\mathbb{Z}/u\mathbb{Z})^*$ and $V = (\mathbb{Z}/v\mathbb{Z})^*$, and that each element $x \in G$ corresponds to the pair $\bigl((x \bmod u), (x \bmod v)\bigr)$.

For any $m$ we define the following subgroup (cf. formula (4.2)):

$$G^{(m)} = \{x^m : x \in G\} = \operatorname{Im} \varphi_m, \qquad \text{where } \varphi_m : x \mapsto x^m.$$

Note that $G^{(m)} = \{1\}$ if and only if $G_{(m)} = G$; this is yet another way to formulate the condition that $a^m = 1$ for all $a \in G$. Also note that if $a$ is uniformly distributed over $G$, then $a^m$ is uniformly distributed over $G^{(m)}$. Indeed, the map $\varphi_m : x \mapsto x^m$ is a group homomorphism; therefore the number of pre-images is the same for all elements of $G^{(m)}$. It is clear that $G^{(m)} \cong U^{(m)} \times V^{(m)}$. Now we have two subcases.

$^4$ A similar argument is used to prove that the group $(\mathbb{Z}/p^\alpha\mathbb{Z})^*$ is cyclic; see Theorem A.11.

Case 1. $U^{(q-1)} \ne \{1\}$ or $V^{(q-1)} \ne \{1\}$. This condition implies that $G^{(q-1)} \ne \{1\}$, so that Test 1 detects $q$ being composite with probability at least 1/2.

Case 2. $U^{(q-1)} = \{1\}$ and $V^{(q-1)} = \{1\}$. In this case Test 1 is always passed, so we have to study Test 2. Let us define two sequences of subgroups:

$$U^{(l)} \supseteq U^{(2l)} \supseteq \dots \supseteq U^{(2^k l)} = \{1\}, \qquad V^{(l)} \supseteq V^{(2l)} \supseteq \dots \supseteq V^{(2^k l)} = \{1\}.$$

Note that $U^{(l)} \ne \{1\}$ and $V^{(l)} \ne \{1\}$. Specifically, both $U^{(l)}$ and $V^{(l)}$ contain the residues that correspond to $-1$. Indeed, both $U$ and $V$ contain $-1$, and $(-1)^l = -1$ since $l$ is odd.

Going from right to left, we find the first place where one of the sets $U^{(m)}$, $V^{(m)}$ contains an element different from 1. In other words, we find $t = 2^s l$ such that $0 \le s < k$, $U^{(2t)} = \{1\}$, $V^{(2t)} = \{1\}$, and either $U^{(t)} \ne \{1\}$ or $V^{(t)} \ne \{1\}$.

We will prove that Test 2 shows (with probability at least 1/2) that $q$ is composite. By our assumption $a^{2t} = 1$, so we need to know the probability of the event $a^t \ne \pm 1$. Let us consider two possibilities.

Case 2a. One of the sets $U^{(t)}$, $V^{(t)}$ equals $\{1\}$ (for example, let $U^{(t)} = \{1\}$). This means that for any $a$ the pair $\bigl((a^t \bmod u), (a^t \bmod v)\bigr)$ has 1 as the first component. Therefore $a^t \ne -1$, since $-1$ is represented by the pair $(-1, -1)$.

At the same time, $V^{(t)} \ne \{1\}$; therefore the probability that $a^t$ has 1 in the second component is at most 1/2 (by Lagrange's theorem; cf. proof of Lemma 4.2). Thus Test 2 says "composite" with probability at least 1/2.

Case 2b. Both sets $U^{(t)}$ and $V^{(t)}$ contain at least two elements: $|U^{(t)}| = c \ge 2$, $|V^{(t)}| = d \ge 2$. In this case $a^t$ has 1 in the first component with probability $1/c$ (there are $c$ equiprobable possibilities) and has 1 in the second component with probability $1/d$. These events are independent due to the Chinese remainder theorem, so the probability of the event $a^t = 1$ is $1/cd$. For similar reasons the probability of the event $a^t = -1$ is either 0 or $1/cd$. In any case the probability of the event $a^t = \pm 1$ is at most $2/cd \le 1/2$. □

4.3. BPP and circuit complexity.

Theorem 4.4. BPP $\subseteq$ P/poly.

Proof. Let $L(x)$ be a BPP-predicate, and $M$ a probabilistic TM that decides $L(x)$ with probability at least $1 - \varepsilon$. By running $M$ repeatedly we can decrease the error probability. Recall that a polynomial number of repetitions leads to an exponential decrease. Therefore we can construct a polynomial probabilistic TM $M'$ that decides $L(x)$ with error probability less than $\varepsilon' < 1/2^n$ for inputs $x$ of length $n$.

The machine $M'$ uses a random string $r$ (one random bit for each step). For each input $x$ of length $n$, the fraction of strings $r$ that lead to an incorrect answer is less than $1/2^n$. Therefore the overall fraction of "bad" pairs $(x, r)$ among all such pairs is less than $1/2^n$. [If one represents the set of all pairs $(x, r)$ as a unit square, the "bad" subset has area $< 1/2^n$.] It follows that there exists $r = r_*$ such that the fraction of bad pairs $(x, r)$ is less than $1/2^n$ among all pairs with $r = r_*$. However, there are only $2^n$ such pairs (corresponding to $2^n$ possibilities for $x$). The only way the fraction of bad pairs can be smaller than $1/2^n$ is that there are no bad pairs at all! Thus we conclude that there is a particular string $r_*$ that produces correct answers for all $x$ of length $n$.

The machine $M'$ can be transformed into a polynomial-size circuit with input $(x, r)$. Then we fix the value of $r$ (by setting $r = r_*$) and obtain a polynomial-size circuit with input $x$ that decides $L(x)$ for all $n$-bit strings $x$. □

This is a typical nonconstructive existence proof: we know that a "good" string $r_*$ exists (by probability counting) but have no means of finding it, apart from exhaustive search.

Remark 4.3. It might well be the case that BPP = P. Let us explain briefly the motivation of this conjecture and why it is hard to prove.

Speaking about proved results, it is clear that BPP $\subseteq$ PSPACE. Indeed, the algorithm that counts all values of the string $r$ that lead to the answer "yes" runs in polynomial space. Note that the running time of this algorithm is $2^N \operatorname{poly}(n)$, where $N = |r|$ is the number of random bits.

On the other hand, there is empirical evidence that probabilistic algorithms usually work well with pseudo-random bits instead of truly random ones. So attempts have been made to construct a mathematical theory of pseudo-random numbers. The general idea is as follows. A pseudo-random generator is a function $g : \mathbb{B}^l \to \mathbb{B}^L$ which transforms short truly random strings (of length $l$, which can be as small as $O(\log L)$) into much longer pseudo-random strings (of length $L$). "Pseudo-random" means that any computational device with limited resources (say, any Boolean circuit of a given size $N$ computing a function $F : \mathbb{B}^L \to \mathbb{B}$) is unable to distinguish between truly random and pseudo-random strings of length $L$. Specifically, we require that

$$\Bigl|\Pr\bigl[F(g(x)) = 1\bigr] - \Pr\bigl[F(y) = 1\bigr]\Bigr| \le \delta, \qquad x \in \mathbb{B}^l,\ y \in \mathbb{B}^L,$$

for some constant $\delta < 1/2$, where $x$ and $y$ are sampled from the uniform distributions. (In this definition the important parameters are $l$ and $N$, while $L$ should fit the number of input bits of the circuit. For simplicity let us require that $L = N$: it will not hurt if the pseudo-random generator produces some extra bits.)

It is easy to show that pseudo-random generators $g : \mathbb{B}^{O(\log L)} \to \mathbb{B}^L$ exist: if we choose the function $g$ randomly, it fulfills the above condition with high probability. What we actually need is a sequence of efficiently computable pseudo-random generators $g_L : \mathbb{B}^{l(L)} \to \mathbb{B}^L$, where $l(L) = O(\log L)$. If such pseudo-random generators exist, we can use their output instead of truly random bits in any probabilistic algorithm. The definition of pseudo-randomness guarantees that this will work, provided the running time of the algorithm is limited by $c\sqrt{L}$ (for a suitable constant $c$) and the error probability $\varepsilon$ is smaller than $1/2 - \delta$. (With pseudo-random bits the error probability will be $\varepsilon + \delta$, which is still less than 1/2. The estimate $c\sqrt{L}$ comes from the simulation of a Turing machine by Boolean circuits, see Theorem 2.2.) Thus we decrease the number of needed genuine random bits from $L$ to $l$. Then we can derandomize the algorithm by trying all $2^l$ possibilities. If $l = O(\log L)$, the resulting computation has polynomial complexity.

We do not know whether efficiently computable pseudo-random generators exist. The trouble is that the definition has to be satisfied for "any Boolean circuit of a given size"; this condition is extremely hard to deal with. But we may try to reduce this problem to a more fundamental one: obtaining lower bounds for the circuit complexity of Boolean functions. Even this idea, which sets the most difficult part of the problem aside, took many years to realize. Much work in this area was done in the 1980's and early 1990's, but the results were not as strong as needed. Recently there has been dramatic progress leading to more efficient constructions of pseudo-random generators and new derandomization techniques. It has been proved that BPP = P if there is an algorithm with running time $\exp(O(n))$ that computes a sequence of functions with circuit complexity $\exp(\Omega(n))$ [26]. We also mention a new work [69] in which pseudo-random generators are constructed from arbitrary hard functions in an optimal (up to a polynomial) way.


5. The hierarchy of complexity classes

Recall that we identify languages (sets of strings) and predicates (and $x \in L$ means $L(x) = 1$).

Definition 5.1. Let $A$ be some class of languages. The dual class co-$A$ consists of the complements of all languages in $A$. Formally,

$$L \in \text{co-}A \iff (\mathbb{B}^* \setminus L) \in A.$$

It follows immediately from the definitions that P = co-P, BPP = co-BPP, PSPACE = co-PSPACE.

5.1. Games machines play. Consider a game with two players called White (W) and Black (B). A string $x$ is shown to both players. After that, players alternately choose binary strings: W starts with some string $w_1$, B replies with $b_1$, then W says $w_2$, etc. Each string has length polynomial in $|x|$. Each player is allowed to see the strings already chosen by his opponent.

The game is completed after some prescribed number of steps, and the referee, who knows $x$ and all the strings and who acts according to a polynomial-time algorithm, declares the winner. In other words, there is a predicate $W(x, w_1, b_1, w_2, b_2, \dots)$ that is true when W is the winner, and we assume that this predicate belongs to P. If this predicate is false, B is the winner (there are no ties). This predicate (together with polynomial bounds for the length of strings and the number of steps) determines the game.

Let us note that in fact the termination rule can be more complicated, but we always assume that the number of moves is bounded by a polynomial. Therefore we can "pad" the game with dummy moves that are ignored by the referee and consider only games where the number of moves is known in advance and is a polynomial in the input length.

Since this game is finite and has no ties, for each $x$ either B or W has a winning strategy. (One can formally prove this using induction over the number of moves.) Therefore, any game determines two complementary sets,

$$L_w = \{x : \text{W has a winning strategy}\}, \qquad L_b = \{x : \text{B has a winning strategy}\}.$$

Many complexity classes can be defined as classes formed by the sets $L_w$ (or $L_b$) for some classes of games. Let us give several examples.

P: the sets $L_w$ (or $L_b$) for games of zero length (the referee declares the winner after he sees the input).

NP: the sets $L_w$ for games that are finished after W's first move. In other words, NP-sets are sets of the form

$$\{x : \exists w_1\, W(x, w_1)\}.$$

co-NP: the sets $L_b$ for games that are finished after W's first move. In other words, co-NP-sets are sets of the form

$$\{x : \forall w_1\, B(x, w_1)\}$$

(here $B = \neg W$ means that B wins the game).

$\Sigma_2$: the sets $L_w$ for games where W and B make one move each and then the referee declares the winner. In other words, $\Sigma_2$-sets are sets of the form

$$\{x : \exists w_1\, \forall b_1\, W(x, w_1, b_1)\}$$

(W can make a winning move $w_1$ after which any move $b_1$ of B makes B lose).

$\Pi_2$: the sets $L_b$ for the same class of games, i.e., the sets of the form

$$\{x : \forall w_1\, \exists b_1\, B(x, w_1, b_1)\}.$$

...

$\Sigma_k$: the sets $L_w$ for games of length $k$ (the last move is made by W if $k$ is odd or by B if $k$ is even), i.e., the sets

$$\{x : \exists w_1\, \forall b_1 \dots Q_k y_k\, W(x, w_1, b_1, \dots)\}$$

(if $k$ is even, then $Q_k = \forall$, $y_k = b_{k/2}$; if $k$ is odd, then $Q_k = \exists$, $y_k = w_{(k+1)/2}$).

$\Pi_k$: the sets $L_b$ for the same class of games, i.e., the sets

$$\{x : \forall w_1\, \exists b_1 \dots Q_k y_k\, B(x, w_1, b_1, \dots)\}$$

(if $k$ is even, then $Q_k = \exists$, $y_k = b_{k/2}$; if $k$ is odd, then $Q_k = \forall$, $y_k = w_{(k+1)/2}$).

...

Evidently, complements of $\Sigma_k$-sets are $\Pi_k$-sets and vice versa: $\Sigma_k = \text{co-}\Pi_k$, $\Pi_k = \text{co-}\Sigma_k$.

Theorem 5.1 (Lautemann [46]). BPP $\subseteq \Sigma_2 \cap \Pi_2$.

Proof. Since BPP = co-BPP, it suffices to show that BPP $\subseteq \Sigma_2$.

Let us assume that $L \in$ BPP. Then there exist a predicate $R$ (computable in polynomial time) and a polynomial $p$ such that the fraction $|S_x| / 2^N$, where

$$S_x = \{y \in \mathbb{B}^N : R(x, y)\}, \qquad N = p(|x|), \tag{5.1}$$

is either large (greater than $1 - \varepsilon$ for $x \in L$) or small (less than $\varepsilon$ for $x \notin L$).

To show that $L \in \Sigma_2$, we need to reformulate the property "$X$ is a large subset of $G$" (where $G$ is the set of all strings $y$ of length $N$) using existential and universal quantifiers.

This could be done if we impose a group structure on $G$. Any group structure will work if the group operations are polynomial-time computable. For example, we can consider an additive group formed by bit strings of a given length with bit-wise addition modulo 2.

The property that distinguishes large sets from small ones is the following: "several copies of $X$ shifted by some elements cover $G$", i.e.,

$$\exists\, g_1, \dots, g_m \ \Bigl(\bigcup_i (g_i + X) = G\Bigr), \tag{5.2}$$

where "+" denotes the group operation. To choose an appropriate value for $m$, let us see when (5.2) is guaranteed to be true (or false).

It is obvious that condition (5.2) is false if

$$m|X| < |G|. \tag{5.3}$$

On the other hand, (5.2) is true if for independent random $g_1, \dots, g_m \in G$ the probability of the event $\bigcup_i (g_i + X) = G$ is positive; in other words, if $\Pr\bigl[\bigcup_i (g_i + X) \ne G\bigr] < 1$. Let us estimate this probability.

The probability that a random shift $g + X$ does not contain a fixed element $u \in G$ (for a given $X$ and random $g$) is $1 - |X|/|G|$. When $g_1, \dots, g_m$ are chosen independently, the corresponding sets $g_1 + X, \dots, g_m + X$ do not cover $u$ with probability $(1 - |X|/|G|)^m$. Summing these probabilities over all $u \in G$, we see that the probability of the event $\bigcup_i (g_i + X) \ne G$ does not exceed $|G|(1 - |X|/|G|)^m$.

Thus condition (5.2) is true if

$$|G| \bigl(1 - |X|/|G|\bigr)^m < 1. \tag{5.4}$$
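The two regimes can be tried out on a toy example. The sketch below takes $G$ to be the $N$-bit strings under bitwise XOR, represented as Python integers; this representation and the helper names are assumptions made for the illustration.

```python
import random

def covers(X, shifts, N):
    """True if the union of the shifted copies g + X (bitwise XOR) is all of G = {0,1}^N."""
    covered = set()
    for g in shifts:
        covered.update(g ^ y for y in X)
    return len(covered) == 2 ** N

def failure_bound(size_X, N, m):
    """The estimate |G| (1 - |X|/|G|)^m for the probability that m random shifts fail to cover G."""
    G = 2 ** N
    return G * (1 - size_X / G) ** m

N = 8
rng = random.Random(0)
large_X = rng.sample(range(2 ** N), 230)      # about 90% of G
small_X = rng.sample(range(2 ** N), 10)       # about 4% of G
m = 6
shifts = [rng.randrange(2 ** N) for _ in range(m)]
print(covers(large_X, shifts, N), failure_bound(len(large_X), N, m))
print(covers(small_X, shifts, N), m * len(small_X) < 2 ** N)
```

For the small set, $m|X| < |G|$ holds, so no choice of shifts can cover $G$; for the large set, the printed bound is below 1, so some covering choice of shifts must exist.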

Let us now apply these results to the set $X = S_x$ (see formula (5.1)). We want to satisfy (5.3) and (5.4) when $|S_x|/2^N < \varepsilon$ (i.e., $x \notin L$) and when $|S_x|/2^N > 1 - \varepsilon$ (i.e., $x \in L$), respectively. Thus we get the inequalities $\varepsilon m < 1$ and $2^N \varepsilon^m < 1$, which should be satisfied simultaneously by a suitable choice of $m$. This is not always possible if $N$ and $\varepsilon$ are fixed. Fortunately, we have some flexibility in the choice of these parameters. Using "amplification of probability" by repeating the computation $k$ times, we increase $N$ by a factor of $k$, while decreasing $\varepsilon$ exponentially. Let the initial value of $\varepsilon$ be a constant, and $\lambda$ given by (4.1). The amplification changes $N$ and $\varepsilon$ to $N' = kN$ and $\varepsilon' = \lambda^k$. Thus we need to solve the system

$$\lambda^k m < 1, \qquad 2^{kN} \lambda^{km} < 1$$

by adjusting $m$ and $k$. It is obvious that there is a solution with $m = O(N)$ and $k = O(\log N)$.
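One explicit way to pick the parameters is the following. The second inequality is $(2^N \lambda^m)^k < 1$, so it holds as soon as $m > N / \log_2(1/\lambda)$; the first one then holds for $k > \log_2 m / \log_2(1/\lambda)$. The function below is a sketch of this choice (floating-point arithmetic, hypothetical name `choose_parameters`).

```python
from math import floor, log2, sqrt

def choose_parameters(N, eps):
    """Return (k, m) with lambda^k * m < 1 and 2^(k*N) * lambda^(k*m) < 1,
    where lambda = 2*sqrt(eps*(1 - eps)) and 0 < eps < 1/2."""
    lam = 2 * sqrt(eps * (1 - eps))
    rate = log2(1 / lam)                 # positive because lambda < 1
    m = floor(N / rate) + 1              # m > N / rate, so 2^N * lambda^m < 1
    k = floor(log2(m) / rate) + 1        # k > log2(m) / rate, so lambda^k * m < 1
    return k, m

for N in (10, 100, 1000):
    print(N, choose_parameters(N, 0.25))
```

For a constant $\varepsilon$ this gives $m = O(N)$ and $k = O(\log N)$, in agreement with the text.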

We have proved that $x \in L$ is equivalent to the following $\Sigma_2$-condition:

$$\exists\, g_1, \dots, g_m \ \forall y \ \Bigl(\bigl(|y| = p'(|x|)\bigr) \Rightarrow \bigl((y \in g_1 + S'_x) \vee \dots \vee (y \in g_m + S'_x)\bigr)\Bigr).$$

Here $p'(|x|) = k\,p(|x|)$ ($k$ and $m$ also depend on $|x|$), whereas $S'_x$ is the "amplified" version of $S_x$.

In other words, we have constructed a game where W names $m$ strings (group elements) $g_1, \dots, g_m$, and B chooses some string $y$. If $y$ is covered by some $g_i + S'_x$ (which is easy to check: it means that $y - g_i$ belongs to $S'_x$), then W wins; otherwise B wins. In this game W has a winning strategy if and only if $S'_x$ is big, i.e., if $x \in L$. □

5.2. The class PSPACE. This class contains predicates that can be computed by a TM running in polynomial (in the input length) space. The class PSPACE also has a game-theoretic description:

Theorem 5.2. $L \in \mathrm{PSPACE}$ if and only if there exists a polynomial game such that

$$L = \{x : \text{W has a winning strategy for input } x\}.$$

By a polynomial game we mean a game where the number of moves is bounded by a polynomial (in the length of the input), players' moves are strings of polynomial length, and the referee's algorithm runs in polynomial time.

Proof. ($\Leftarrow$) We show that a language determined by a game belongs to PSPACE. Let the number of turns be $p(|x|)$. We construct a sequence of machines $M_1, \dots, M_{p(|x|)}$. Each $M_k$ gets a prefix $x, w_1, b_1, \dots$ of the play that includes $k$ moves and determines who has the winning strategy in the remaining game.

The machine $M_{p(|x|)}$ just computes the predicate $W(x, w_1, \dots)$ using the referee's algorithm. The machine $M_k$ tries all possibilities for the next move and consults $M_{k+1}$ to determine the final result of the game for each of them. Then $M_k$ gives an answer according to the following rule, which says whether W wins. If it is W's turn, then it suffices to find a single move for which $M_{k+1}$ declares W to be the winner. If it is B's turn, then W needs to win after all possible moves of B.

The machine $M_0$ says who is the winner before the game starts and therefore decides $L(x)$. Each machine in the sequence $M_1, \dots, M_{p(|x|)}$ uses only a small (polynomially bounded) amount of memory, so that the composite machine runs in polynomial space. (Note that the computation time is exponential since each of the $M_k$ calls $M_{k+1}$ many times.)
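The recursive structure of the machines $M_k$ can be sketched as follows. This is only an illustration under simplifying assumptions of ours: all moves come from one small finite set `moves`, W moves at even positions and B at odd ones, and the hypothetical callback `referee` plays the role of the predicate $W(x, w_1, b_1, \dots)$.

```python
def w_has_winning_strategy(prefix, num_moves, moves, referee):
    """Analogue of M_k, where k = len(prefix) moves have been made.

    Only the current prefix is stored at each recursion level, so the memory
    is proportional to num_moves, while the running time is exponential.
    """
    k = len(prefix)
    if k == num_moves:
        return referee(prefix)           # the last machine asks the referee
    outcomes = (
        w_has_winning_strategy(prefix + (mv,), num_moves, moves, referee)
        for mv in moves
    )
    if k % 2 == 0:
        return any(outcomes)             # W's turn: one good move suffices
    return all(outcomes)                 # B's turn: W must win after every move

def parity_referee(play):
    """Toy referee: W wins if and only if the sum of all moves is even."""
    return sum(play) % 2 == 0
```

In the toy parity game with moves 0 and 1, the player who moves last controls the parity: W wins the three-move game and loses the four-move game.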


($\Rightarrow$) Let $M$ be a machine that decides the predicate $L$ and runs in polynomial space $s$. We may assume that the computation time is bounded by $2^{O(s)}$. Indeed, there are $2^{O(s)}$ different configurations, and after visiting the same configuration twice the computation repeats itself, i.e., the computation becomes cyclic.

[To see why there are at most $2^{O(s)}$ configurations, note that a configuration is determined by the head position (in $\{0, 1, \dots, s\}$), the internal state (there are $|Q|$ of them) and the contents of the $s$ cells of the tape ($|A|^s$ possibilities, where $A$ is the alphabet of the TM); therefore the total number of configurations is $|A|^s \cdot |Q| \cdot s = 2^{O(s)}$.]

Therefore we may assume without loss of generality that the running time of $M$ on input $x$ is bounded by $2^q$, where $q = \mathrm{poly}(|x|)$.

In the description of the game given below we assume that the TM keeps its configuration unchanged after the computation terminates.

During the game, W claims that $M$'s result for an input string $x$ is "yes", and B wants to disprove this. The rules of the game allow W to win if $M(x)$ is indeed "yes" and allow B to win if $M(x)$ is not "yes".

In his first move W declares the configuration of $M$ after $2^{q-1}$ steps, dividing the computation into two parts. B can choose any of the parts: either the time interval $[0, 2^{q-1}]$ or the interval $[2^{q-1}, 2^q]$. (B tries to catch W by choosing the interval where W is cheating.) Then W declares the configuration of $M$ at the middle of the interval chosen by B and divides this interval into two halves, B selects one of the halves, W declares the configuration of $M$ at the middle, etc.

The game ends when the length of the interval becomes equal to 1. Then the referee checks whether the configurations corresponding to the ends of this interval match (the second is obtained from the first according to $M$'s rules). If they match, then W wins; otherwise B wins.

If $M$'s output on $x$ is really "yes", then W wins if he is honest and declares the actual configuration of $M$. If $M$'s output is "no", then W is forced to cheat: his claim is incorrect for (at least) one of the halves. If B selects this half at each move, then B can finally catch W "on the spot" and win. □

[2!] Problem 5.1. Prove that any predicate $L(x)$ that is recognized by a nondeterministic machine in space $s = \mathrm{poly}(|x|)$ belongs to PSPACE. (A predicate $L$ is recognized by an NTM $M$ in space $s(|x|)$ if for any $x \in L$ there exists a computational path of $M$ that gives the answer "yes" using at most $s(|x|)$ cells and, for each $x \notin L$, no computational path of $M$ ends with "yes".)


Theorem 5.2 shows that all the classes $\Sigma_k$, $\Pi_k$ are subsets of PSPACE. Relations between these classes are represented by the inclusion diagram in Figure 5.1. The smallest class is P (games of length 0); P is contained in both classes NP and co-NP (which correspond to games of length 1); classes NP and co-NP are contained in $\Sigma_2$ and $\Pi_2$ (games with two moves) and so on. We get the class PSPACE by allowing the number of moves in a game to be polynomial in $|x|$.

[Figure 5.1: diagram with the nodes P, BPP, NP, co-NP, NP $\cap$ co-NP, $\Sigma_2$, $\Pi_2$, $\Sigma_2 \cap \Pi_2$, $\dots$, PSPACE.]

Fig. 5.1. Inclusion diagram for computational classes. An arrow from A to B means that B is a subset of A.

We do not know whether the inclusions in the diagram are strict. Computer scientists have been working hard for several decades trying to prove at least something about these classes, but the problem remains open. It is possible that P = PSPACE (though this seems very unlikely). It is also possible that PSPACE = EXPTIME, where EXPTIME is the class of languages decidable in time $2^{\mathrm{poly}(n)}$. Note, however, that P $\neq$ EXPTIME; one can prove this by a rather simple "diagonalization argument" (cf. solution to Problem 1.3).

[3!] Problem 5.2. A Turing machine with oracle for language $L$ uses a decision procedure for $L$ as an external subroutine (cf. Definition 2.2). The machine has a supplementary oracle tape, where it can write strings and then ask the "oracle" whether the string written on the oracle tape belongs to $L$.

Prove that any language that is decidable in polynomial time by a TM with oracle for some $L \in \Sigma_k$ (or $L \in \Pi_k$) belongs to $\Sigma_{k+1} \cap \Pi_{k+1}$.

The class PSPACE has complete problems (to which any problem from PSPACE is reducible). Here is one of them.

The TQBF Problem is given by the predicate

TQBF$(x)$ $\Leftrightarrow$ $x$ is a True Quantified Boolean Formula, i.e., a true statement of type $Q_1 y_1 \dots Q_n y_n\, F(y_1, \dots, y_n)$, where the variables $y_i$ range over $\mathbb{B} = \{0, 1\}$, $F$ is some propositional formula (involving $y_1, \dots, y_n$, $\neg$, $\wedge$, $\vee$), and $Q_i$ is either $\forall$ or $\exists$.

By definition, $\forall y\, A(y)$ means $(A(0) \wedge A(1))$ and $\exists y\, A(y)$ means $(A(0) \vee A(1))$.
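The definition of the quantifiers translates directly into a recursive evaluation procedure. The following is a minimal sketch under our own encoding: the quantifier prefix is a string over the letters `'A'` (for $\forall$) and `'E'` (for $\exists$), and the propositional part $F$ is an ordinary Python function of a tuple of bits.

```python
def eval_qbf(quantifiers, formula, assignment=()):
    """Decide Q_1 y_1 ... Q_n y_n F(y_1, ..., y_n) by expanding quantifiers.

    forall y A(y) is evaluated as A(0) and A(1); exists y A(y) as A(0) or A(1).
    The recursion depth is n, so the memory used is polynomial, although the
    running time is exponential in n.
    """
    i = len(assignment)
    if i == len(quantifiers):
        return bool(formula(assignment))
    branches = (
        eval_qbf(quantifiers, formula, assignment + (bit,)) for bit in (0, 1)
    )
    if quantifiers[i] == 'A':
        return all(branches)
    return any(branches)

def differ(y):
    """Propositional part F(y_1, y_2) = (y_1 != y_2)."""
    return y[0] != y[1]
```

For example, $\forall y_1 \exists y_2\, (y_1 \neq y_2)$ is true, whereas $\exists y_1 \forall y_2\, (y_1 \neq y_2)$ is false.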

Theorem 5.3. TQBF is PSPACE-complete.

Proof. We reduce an arbitrary language $L \in \mathrm{PSPACE}$ to TQBF. Using Theorem 5.2, we construct a game that corresponds to $L$. Then we convert a TM that computes the result of the game (a predicate $W$) into a circuit. Moves of the players are encoded by Boolean variables. Then the existence of the winning strategy for W can be represented by a quantified Boolean formula

$$\exists w_1^1\, \exists w_1^2 \dots \exists w_1^{p(|x|)}\ \forall b_1^1 \dots \forall b_1^{p(|x|)}\ \exists w_2^1 \dots \exists w_2^{p(|x|)} \dots\ S(x, w_1^1, w_1^2, \dots),$$

where $S(\cdot)$ denotes the Boolean function computed by the circuit. (Boolean variables $w_1^1, \dots, w_1^{p(|x|)}$ encode the first move of W, variables $b_1^1, \dots, b_1^{p(|x|)}$ encode B's answer, $w_2^1, \dots, w_2^{p(|x|)}$ encode the second move of W, etc.)

In order to convert $S$ into a Boolean formula, recall that a circuit is a sequence of assignments $y_i := R_i$ that determine the values of auxiliary Boolean variables $y_i$. Then we can replace $S(\cdot)$ by a formula

$$\exists y_1, \dots, \exists y_s\ \bigl( (y_1 \Leftrightarrow R_1) \wedge \dots \wedge (y_s \Leftrightarrow R_s) \wedge y_s \bigr),$$

where $s$ is the size of the circuit.

After this substitution we obtain a quantified Boolean formula which is true if and only if $x \in L$. □

Remark 5.1. Note the similarity between Theorem 5.3 (which is about polynomial space computation) and Problems 2.17 and 2.18 (which are basically about poly-logarithmic space computation). Also note that a polynomial-size quantified Boolean formula may be regarded as a polynomial depth circuit (though of very special structure): the $\forall$ and $\exists$ quantifiers are similar to the $\wedge$ and $\vee$ gates. It is not surprising that the solutions are based on much the same ideas. In particular, the reduction from NTM to TQBF is similar to the parallel simulation of a finite-state automaton (see Problem 2.11). However, in the case of TQBF we could afford reasoning at a more abstract level: with a greater amount of computational resources we were sure that all bookkeeping constructions could be implemented. This is one of the reasons why "big" computational classes (like PSPACE) are popular among computer scientists, in spite of being apparently impractical. In fact, it is sometimes easier to prove something about big classes, and then scale down the problem parameters while fixing some details.

Part 2

Quantum Computation

As already mentioned in the introduction, ordinary computers do not employ all possibilities offered by Nature. Their internal work is based on operations with 0s and 1s, while in Nature there is the possibility of performing unitary transformations, i.e., of operating on an infinite set.¹ This possibility is described by quantum mechanics. Devices (real or imaginary) using this possibility are called quantum computers.

It is not clear a priori whether the computational power is really increased in passing from Boolean functions to unitary transformations on finite-dimensional spaces. However, there is strong evidence that such an increase is actually achieved. For example, consider the factoring problem: decomposition of an integer into prime factors. No polynomial algorithm is known for solving this problem on ordinary computers. But for quantum computers, such algorithms do exist.

Ordinary computers operate with states built from a finite number of bits. Each bit may exist in one of the two states, 0 or 1. The state of the whole system is given by specifying the values of all the bits. Therefore, the set of states $\mathbb{B}^n = \{0, 1\}^n$ is finite and has cardinality $2^n$.

¹ Of course, actual infinity does not occur in Nature. In the given case the essential fact is that a unitary transformation can be performed only with some precision; for details see Section 8.

A quantum computer works with a finite set of objects called qubits. Each qubit has two separate states, also denoted by 0 and 1 (if we think of qubits as spins, then these states are "spin up" and "spin down"). The $2^n$ assignments of individual states to each qubit do not yield all possible states of the system, but they form a basis in a space of states. Arbitrary linear combinations of the basis states, with complex coefficients, are also possible. We will denote the basis states by $|x_1, \dots, x_n\rangle$, where $x_j \in \mathbb{B}$, or by $|x\rangle$, where $x \in \mathbb{B}^n$. An arbitrary state of the system may be represented in the form²

$$|\psi\rangle = \sum_{(x_1, \dots, x_n) \in \mathbb{B}^n} c_{x_1, \dots, x_n}\, |x_1, \dots, x_n\rangle, \qquad \text{where} \quad \sum_{(x_1, \dots, x_n) \in \mathbb{B}^n} |c_{x_1, \dots, x_n}|^2 = 1.$$

The state space for such a system is a linear space of dimension $2^n$ over the field $\mathbb{C}$ of complex numbers.

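State of an ordinary computer: $n$ bits $x_1, x_2, \dots, x_n$, where $x_j \in \mathbb{B}$.

State of a quantum computer: $n$ qubits;
basis states: $|x_1, x_2, \dots, x_n\rangle$, where $x_j \in \mathbb{B}$;
arbitrary state: $\sum_{x \in \mathbb{B}^n} c_x |x\rangle$, where $\sum_{x \in \mathbb{B}^n} |c_x|^2 = 1$.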

One detail to add: if we multiply the vector $\sum_x c_x |x\rangle$ by a phase factor $e^{i\varphi}$ ($\varphi$ real), we obtain a physically indistinguishable state. Therefore, a state of a quantum computer is a unit vector defined up to a phase factor.

Computation may be imagined as a sequence of transformations on the set of states of the system. Let us describe which transformations are possible in the classical and in the quantum case:

Classical case: transformations are functions from $\mathbb{B}^n$ to $\mathbb{B}^n$.

Quantum case: transformations are unitary operators, i.e., operators that preserve the squared length $\sum_{x \in \mathbb{B}^n} |c_x|^2$ of each vector $\sum_{x \in \mathbb{B}^n} c_x |x\rangle$.

Remark. All that has been said pertains only to isolated systems. A real quantum computer is (will be) a part of a larger system (the Universe), interacting with the remaining world. Quantum states and transformations of open systems will be considered in Sections 10, 11.

Now we must give a formal definition of quantum computation. As in the classical case, one can define both quantum Turing machines and quantum circuits. We choose the second approach, which is more convenient for a number of reasons.

² The brackets $|\dots\rangle$ in the notation $|\psi\rangle$ do not signify any operation on the object $\psi$; they merely indicate that $\psi$ represents a vector.


6. Definitions and notation

6.1. The tensor product. A system of $n$ qubits has a state space $\mathbb{C}^{2^n}$, which can be represented as a tensor product, $\mathbb{C}^2 \otimes \dots \otimes \mathbb{C}^2 = (\mathbb{C}^2)^{\otimes n}$. Each factor corresponds to the space of a single qubit.

The tensor product of linear spaces $\mathcal{L}$ and $\mathcal{M}$ can be defined as an arbitrary space $\mathcal{N}$ of dimension $(\dim \mathcal{L})(\dim \mathcal{M})$. The idea is that if $\mathcal{L}$ and $\mathcal{M}$ are endowed with some bases, $\{e_1, \dots, e_l\} \subseteq \mathcal{L}$ and $\{f_1, \dots, f_m\} \subseteq \mathcal{M}$, then $\mathcal{N}$ possesses a standard basis whose elements are associated with pairs $(e_j, f_k)$. We denote these elements by $e_j \otimes f_k$. Using this standard basis

$$\bigl\{ e_j \otimes f_k : j = 1, \dots, l;\ k = 1, \dots, m \bigr\},$$

one can define the tensor product of two arbitrary vectors, $u = \sum_j u_j e_j$ and $v = \sum_k v_k f_k$ ($u_j, v_k \in \mathbb{C}$), in such a way that the map $\otimes : (u, v) \mapsto u \otimes v$ is linear in both $u$ and $v$:

$$u \otimes v = \sum_{j,k} (u_j v_k)\, e_j \otimes f_k. \tag{6.1}$$

We will mostly use this "pedestrian" definition of the tensor product, although it is not invariant, i.e., it depends on the choice of bases in $\mathcal{L}$ and $\mathcal{M}$. An invariant definition is abstract and hard to grasp, but it is indispensable if we really want to prove something. The tensor product of two spaces, $\mathcal{L}$ and $\mathcal{M}$, is a space $\mathcal{N} = \mathcal{L} \otimes \mathcal{M}$, together with a bilinear map $H : \mathcal{L} \times \mathcal{M} \to \mathcal{N}$ (also denoted by $\otimes$, i.e., $u \otimes v \stackrel{\text{def}}{=} H(u, v)$) which satisfy the following universality property:

for any space $\mathcal{F}$ and any bilinear function $F : \mathcal{L} \times \mathcal{M} \to \mathcal{F}$, there is a unique linear function $G : \mathcal{L} \otimes \mathcal{M} \to \mathcal{F}$ such that $F(u, v) = G(u \otimes v)$ (for every pair $u \in \mathcal{L}$, $v \in \mathcal{M}$).

[1] Problem 6.1. Prove that the tensor product defined in the "pedestrian" way (i.e., using bases) satisfies the universality property.

[1] Problem 6.2. Consider two linear maps, $A : \mathcal{L} \to \mathcal{L}'$ and $B : \mathcal{M} \to \mathcal{M}'$. Prove that there is a unique linear map $C = A \otimes B : \mathcal{L} \otimes \mathcal{M} \to \mathcal{L}' \otimes \mathcal{M}'$ such that $C(u \otimes v) = A(u) \otimes B(v)$ for any $u \in \mathcal{L}$, $v \in \mathcal{M}$.

[2] Problem 6.3. Show that the pair $(\mathcal{N}, H)$ in the abstract definition of the tensor product is unique. (Figure out in what sense.)

6.2. Linear algebra in Dirac's notation. In our case, there is a prechosen classical basis: $\{|0\rangle, |1\rangle\}$ for $\mathbb{C}^2$, and $\{|x_1, \dots, x_n\rangle : x_j \in \mathbb{B}\}$ for $(\mathbb{C}^2)^{\otimes n}$. The space $\mathbb{C}^2$ furnished with a basis is denoted by $\mathcal{B}$. The basis is considered orthonormal, which yields an inner product on the space of states. The coefficients $c_{x_1 \dots x_n}$ of the decomposition of a vector $|\psi\rangle$ relative to this basis are called amplitudes. Their physical meaning is that the square of the absolute value $|c_{x_1 \dots x_n}|^2$ of the amplitude is interpreted as the probability of finding the system in the given state of the basis. As must be the case, the sum of the probabilities is equal to 1 since the length of the vector is assumed to be 1. (Probabilities will be fully discussed later; for some time we will be occupied with linear algebra, namely with studying unitary operators on the space $\mathcal{B}^{\otimes n}$.)

We will use (and have already used) notation customary in physics, introduced by Dirac, for vectors and the inner product. Vectors are denoted like this: $|\psi\rangle$; the inner product of two vectors is denoted by $\langle \xi | \eta \rangle$. If $|\xi\rangle = \sum_x a_x |x\rangle$ and $|\eta\rangle = \sum_x b_x |x\rangle$, then $\langle \xi | \eta \rangle = \sum_x a_x^* b_x$. (From now on, $a^*$ stands for the complex conjugate.) In the notation for vectors the brackets are needed only "for elegance": they indicate the type of the object and offer a symmetric designation (see below). In place of $|\psi\rangle$ we could simply write $\psi$, even though this is not customary. Thus $|\xi_1 + \xi_2\rangle = |\xi_1\rangle + |\xi_2\rangle$, and both expressions mean $\xi_1 + \xi_2$.

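The inner product is Hermitian. It is conjugate-linear in the first argument³ and linear in the second, i.e.,

$$\langle \xi_1 + \xi_2 | \eta \rangle = \langle \xi_1 | \eta \rangle + \langle \xi_2 | \eta \rangle, \qquad \langle \xi | \eta_1 + \eta_2 \rangle = \langle \xi | \eta_1 \rangle + \langle \xi | \eta_2 \rangle,$$

$$\langle c\,\xi | \eta \rangle = c^* \langle \xi | \eta \rangle, \qquad \langle \xi | c\,\eta \rangle = c\, \langle \xi | \eta \rangle.$$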

If we take the left half of the inner product symbol, we get a bra-vector $\langle \xi |$, i.e., a linear functional on ket-vectors (i.e., the vectors from our space). Bra- and ket-vectors are in a one-to-one correspondence to one another. (Nonetheless, it is necessary to distinguish them in some way, and it is just for this purpose that the angle brackets were introduced.) Because of the conjugate linearity of the inner product with respect to the first argument, we have the equation $\langle c\,\xi | = c^* \langle \xi |$. A bra-vector may be written as a row, and a ket-vector as a column (so as to be able to multiply it on the left by a matrix):

$$\langle \xi | = c_0^* \langle 0 | + c_1^* \langle 1 | = (c_0^*,\ c_1^*), \qquad |\xi\rangle = c_0 |0\rangle + c_1 |1\rangle = \begin{pmatrix} c_0 \\ c_1 \end{pmatrix}.$$

The notation $\langle \xi | A | \eta \rangle$, where $A$ is a linear operator, can be interpreted in two ways: either as the product of the bra-vector $\langle \xi |$ and the ket-vector $A |\eta\rangle$, or as the product of $\langle \xi | A$ and $|\eta\rangle$. The first interpretation is completely clear, whereas the second should be viewed as a definition of the linear functional $\langle \psi | = \langle \xi | A$. The corresponding ket-vector $|\psi\rangle$ is related to $|\xi\rangle$ by a linear operator $A^\dagger$ called Hermitian adjoint to $A$. Thus we write $|\psi\rangle = A^\dagger |\xi\rangle$, $\langle \psi | = \langle A^\dagger \xi |$, so the defining property of $A^\dagger$ is

$$\langle A^\dagger \xi | \eta \rangle = \langle \xi | A | \eta \rangle.$$

³ Note that mathematicians often define the inner product to be conjugate-linear in the second argument.

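Operators can be specified as matrices relative to the classical basis (or any other orthonormal basis),

$$A = \sum_{j,k} a_{jk}\, |j\rangle\langle k|, \qquad \text{where } a_{jk} = \langle j | A | k \rangle.$$

It is clear that $|j\rangle\langle k|$ is a linear operator: $\bigl( |j\rangle\langle k| \bigr) |\xi\rangle = \langle k | \xi \rangle\, |j\rangle$.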

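The set of linear operators on a space $\mathcal{M}$ is denoted by $\mathbf{L}(\mathcal{M})$. Sometimes we will have to consider linear maps between different spaces, say, from $\mathcal{N}$ to $\mathcal{M}$. The space of such maps is denoted by $\mathbf{L}(\mathcal{N}, \mathcal{M})$. It is naturally isomorphic to $\mathcal{M} \otimes \mathcal{N}^*$: the isomorphism takes an operator $\sum a_{jk} |j\rangle\langle k| \in \mathbf{L}(\mathcal{N}, \mathcal{M})$ to the vector $\sum a_{jk}\, |j\rangle \otimes \langle k| \in \mathcal{M} \otimes \mathcal{N}^*$.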

A unitary operator on a space $\mathcal{M}$ is an invertible operator that preserves the inner product. The condition

$$\langle \eta | \xi \rangle = \langle U \eta | U | \xi \rangle = \langle \eta | U^\dagger U | \xi \rangle$$

is equivalent to $U^\dagger U = I$ (where $I$ is the identity operator). Since the space $\mathcal{M}$ has finite dimension, the above condition implies that $|\det U| = 1$, so the existence of $U^{-1}$ follows automatically. Unitary operators are also characterized by the property $U^{-1} = U^\dagger$. The set of unitary operators is denoted by $\mathbf{U}(\mathcal{M})$.

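Our definition of the inner product in $\mathcal{B}^{\otimes n}$ is consistent with the tensor product:

$$\bigl( \langle \xi_1 | \otimes \langle \xi_2 | \bigr) \bigl( |\eta_1\rangle \otimes |\eta_2\rangle \bigr) = \langle \xi_1 | \eta_1 \rangle \langle \xi_2 | \eta_2 \rangle.$$

Later on we will use the tensor product of operators (cf. Problem 6.2). It is an operator acting on the tensor product of the spaces on which the factors act. The action is defined by the rule

$$(A \otimes B)\, |\xi\rangle \otimes |\eta\rangle = A |\xi\rangle \otimes B |\eta\rangle.$$

If the operators are given in the matrix form relative to some basis, i.e.,

$$A = \sum_{j,k} a_{jk}\, |j\rangle\langle k|, \qquad B = \sum_{j,k} b_{jk}\, |j\rangle\langle k|,$$

then the matrix elements of the operator $C = A \otimes B$ have the form $c_{(jk)(lm)} = a_{jl}\, b_{km}$.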
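The formula for the matrix elements of $A \otimes B$ and the rule $(A \otimes B)(u \otimes v) = Au \otimes Bv$ are easy to check on small examples. Below is a sketch in plain Python (the helper names `kron`, `kron_vec` and `matvec` are ours); matrices are lists of rows, and the pairs $(j,k)$ are ordered lexicographically, so that the pair $(j,k)$ has index $j \cdot \dim + k$.

```python
def kron(A, B):
    """Matrix of A (x) B: the entry in row (j,k), column (l,m) is a_jl * b_km."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def kron_vec(u, v):
    """Coordinates of u (x) v in the basis e_j (x) f_k, as in formula (6.1)."""
    return [uj * vk for uj in u for vk in v]

def matvec(M, v):
    """Apply the matrix M to the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
u = [1, -1]
v = [2, 5]

lhs = matvec(kron(A, B), kron_vec(u, v))      # (A (x) B)(u (x) v)
rhs = kron_vec(matvec(A, u), matvec(B, v))    # (A u) (x) (B v)
```

Both `lhs` and `rhs` evaluate to the same vector, in agreement with the defining rule.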

[2!] Problem 6.4. Let $X : \mathcal{L} \to \mathcal{N}$ be a linear map. Prove that the operators $X X^\dagger$ and $X^\dagger X$ have the same set of nonzero eigenvalues $\lambda_j^2$, counted with multiplicities. (The numbers $\lambda_j > 0$ are called the singular values of $X$.) Moreover, the following singular value decomposition holds:

$$X = \sum_j \lambda_j\, |\xi_j\rangle\langle \eta_j|, \tag{6.2}$$

where $\{|\xi_j\rangle\}$, $\{|\eta_j\rangle\}$ are orthonormal eigenvector systems for $X X^\dagger$ and $X^\dagger X$, respectively. (There is freedom in the choice of each system, but if one is fixed, the other is determined uniquely.)

6.3. Quantum gates and circuits. Computation consists of transformations, regarded as elementary and performed one at a time.

Elementary transformation in the classical case: a map from $\mathbb{B}^n$ to $\mathbb{B}^n$ which alters and depends upon a small number (not depending on $n$) of bits; the remaining bits are not used.

Elementary transformation in the quantum case: the tensor product of an arbitrary unitary operator acting on a small number ($r = O(1)$) of qubits, denoted altogether by $\mathcal{B}^{\otimes r}$, and the identity operator acting on the remaining qubits.

The tensor product of an operator $U$ acting on an ordered set $A$ of qubits and the identity operator acting on the remaining qubits is denoted by $U[A]$. In this situation, we say that the operator $U$ is applied to the register $A$. This definition is somewhat vague, but the formal construction of the operator $U[A]$ is pretty straightforward.

First, let us define $X[A]$ when $A$ consists of just one qubit, say $p$. In this case, $X[p] = I_{\mathcal{B}^{\otimes (p-1)}} \otimes X \otimes I_{\mathcal{B}^{\otimes (n-p)}}$. Note that $X[p]$ and $Y[q]$ commute if $p \neq q$. In the general case $A = (p_1, \dots, p_r)$, we can represent $U$ as follows:

$$U = \sum_{j_1, \dots, j_r,\, k_1, \dots, k_r} u_{j_1, \dots, j_r;\ k_1, \dots, k_r}\, \bigl( |j_1\rangle\langle k_1| \bigr) \otimes \dots \otimes \bigl( |j_r\rangle\langle k_r| \bigr).$$

Actually, all we need is a representation of the form

$$U = \sum_m X_{m,1} \otimes \dots \otimes X_{m,r}, \tag{6.3}$$

where $X_{m,1}, \dots, X_{m,r} \in \mathbf{L}(\mathcal{B})$ are arbitrary one-qubit operators. Then, by definition,

$$U[p_1, \dots, p_r] = \sum_m X_{m,1}[p_1] \cdots X_{m,r}[p_r]. \tag{6.4}$$

The result does not depend on the choice of the representation (6.3) due to the universality property of the tensor product (see Section 6.1). In the case at hand, we have a multilinear map $F(X_{m,1}, \dots, X_{m,r}) = X_{m,1}[p_1] \cdots X_{m,r}[p_r]$, whereas the corresponding linear map

$$G : U \mapsto U[p_1, \dots, p_r] \ : \ \mathbf{L}(\mathcal{B}^{\otimes r}) \to \mathbf{L}(\mathcal{B}^{\otimes n})$$

is given by (6.4).


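Example 6.1. Let $U = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix}$. Then the operators $U[1]$ and $U[2]$, acting on the space $\mathcal{B}^{\otimes 2}$, are represented by these matrices:

$$U[1] = \begin{pmatrix} u_{00} & 0 & u_{01} & 0 \\ 0 & u_{00} & 0 & u_{01} \\ u_{10} & 0 & u_{11} & 0 \\ 0 & u_{10} & 0 & u_{11} \end{pmatrix}, \qquad U[2] = \begin{pmatrix} u_{00} & u_{01} & 0 & 0 \\ u_{10} & u_{11} & 0 & 0 \\ 0 & 0 & u_{00} & u_{01} \\ 0 & 0 & u_{10} & u_{11} \end{pmatrix}.$$

The rows and columns are associated with the basis vectors arranged in the lexicographic order: $|00\rangle, |01\rangle, |10\rangle, |11\rangle$.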
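The matrices of Example 6.1 can be reproduced mechanically from the formula $X[p] = I_{\mathcal{B}^{\otimes (p-1)}} \otimes X \otimes I_{\mathcal{B}^{\otimes (n-p)}}$. The sketch below is in plain Python; the helper names `kron`, `identity`, `on_qubit` and `matmul` are ours, and the integers 1, 2, 3, 4 are stand-ins for $u_{00}, u_{01}, u_{10}, u_{11}$.

```python
def kron(A, B):
    """Matrix of the tensor product A (x) B (lexicographic ordering)."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def identity(dim):
    """Identity matrix of the given dimension."""
    return [[1 if i == j else 0 for j in range(dim)] for i in range(dim)]

def on_qubit(X, p, n):
    """X[p] on n qubits: identity on p-1 qubits, then X, then n-p identities."""
    left = identity(2 ** (p - 1))
    right = identity(2 ** (n - p))
    return kron(kron(left, X), right)

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

U = [[1, 2], [3, 4]]        # stand-ins for u00, u01, u10, u11
U1 = on_qubit(U, 1, 2)      # U[1] = U (x) I
U2 = on_qubit(U, 2, 2)      # U[2] = I (x) U
```

The same helper also illustrates the remark that operators applied to different qubits commute: `matmul(U1, U2)` and `matmul(U2, U1)` coincide.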

[1] Problem 6.5.

a) Let $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. Write the matrix of the operator $H[2]$ acting on the space $\mathcal{B}^{\otimes 3}$.

b) Let $U$ be an arbitrary two-qubit operator with matrix elements $u_{jk} = \langle j | U | k \rangle$, where $j, k \in \{00, 01, 10, 11\}$. Write the matrix for $U[3, 1]$.

At this point the computational complexity begins, which makes quantum computers so powerful. Let $U$ act on two qubits, i.e., $U$ is a $4 \times 4$ matrix. Then $U[1, 2] = U \otimes I$ is a matrix of size $2^n \times 2^n$ that (after a suitable reordering of the basis vectors) consists of $2^{n-2}$ copies of $U$ placed along the principal diagonal. This matrix represents one elementary step. When we apply several such operators to various pairs of qubits, the result will appear considerably more complicated. There is no obvious way of determining this result, apart from direct multiplication of the corresponding matrices. Inasmuch as the size of the matrices is exponentially large, exponential time is required for their multiplication.

We remark, however, that the calculation of matrix elements is possible with polynomially bounded memory. Suppose we need to find the matrix element $U_{xy}$ of the operator

$$U = U^{(l)}[j_l, k_l]\; U^{(l-1)}[j_{l-1}, k_{l-1}] \cdots U^{(2)}[j_2, k_2]\; U^{(1)}[j_1, k_1].$$

It is obvious that

$$\bigl( U^{(l)} \cdots U^{(1)} \bigr)_{x_l x_0} = \sum_{x_{l-1}, \dots, x_1} U^{(l)}_{x_l x_{l-1}} \cdots U^{(1)}_{x_1 x_0}.$$

(Here $x_0, \dots, x_l$ are $n$-bit strings.) To compute this sum, it suffices to allocate $l - 1$ registers for keeping the current values of $x_{l-1}, \dots, x_1$, one register for keeping the partial sum, and some constant number of registers for the calculation of the product $U^{(l)}_{x_l x_{l-1}} \cdots U^{(1)}_{x_1 x_0}$.
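The sum over intermediate strings can be organized as a recursion that stores one $n$-bit string per gate and never builds a $2^n \times 2^n$ matrix. The following sketch is ours: a gate is a triple `(U, j, k)` with a $4 \times 4$ matrix `U` applied to qubits `j`, `k` (numbered from 1), and the names `gate_entry` and `circuit_entry` are hypothetical. The running time is of course still exponential.

```python
from itertools import product

def gate_entry(U, j, k, x, y):
    """Matrix element <x| U[j,k] |y> for n-bit tuples x and y."""
    for pos in range(len(x)):
        if pos + 1 not in (j, k) and x[pos] != y[pos]:
            return 0                      # U[j,k] is the identity on other qubits
    row = 2 * x[j - 1] + x[k - 1]
    col = 2 * y[j - 1] + y[k - 1]
    return U[row][col]

def circuit_entry(gates, x_out, x_in):
    """<x_out| U^(l) ... U^(1) |x_in>; gates are listed in order of application."""
    n = len(x_in)

    def go(i, x_prev):
        # Sum over all paths from x_prev to x_out through gates i, i+1, ...
        U, j, k = gates[i]
        if i == len(gates) - 1:
            return gate_entry(U, j, k, x_out, x_prev)
        total = 0
        for x_next in product((0, 1), repeat=n):
            amplitude = gate_entry(U, j, k, x_next, x_prev)
            if amplitude != 0:
                total += amplitude * go(i + 1, x_next)
        return total

    return go(0, tuple(x_in))

# Permutation matrix of (a, b) -> (a, a XOR b), used here as a test gate.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
swap_13 = [(CNOT, 1, 3), (CNOT, 3, 1), (CNOT, 1, 3)]
```

As a usage example, the three gates in `swap_13` exchange qubits 1 and 3, so the matrix element between $|011\rangle$ and $|110\rangle$ equals 1, while the diagonal element at $|110\rangle$ equals 0.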

Definition 6.1 (Quantum circuit). Let $\mathcal{A}$ be a fixed set of unitary operators. (We call $\mathcal{A}$ a basis, or a gate set, whereas its elements are called gates.) A quantum circuit over the basis $\mathcal{A}$ is a sequence $U_1[A_1], \dots, U_L[A_L]$, where $U_j \in \mathcal{A}$, and $A_j$ is an ordered set of qubits.

The operator realized by the circuit is $U = U_L[A_L] \cdots U_1[A_1]$ ($U : \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$). The number $L$ is called the size of the circuit.

We usually assume that $\mathcal{A}$ is closed under inversion: if $X \in \mathcal{A}$, then $X^{-1} \in \mathcal{A}$. In this case $U$ and $U^{-1}$ are realized by circuits of the same size.

Note that several gates, say $U_{j_1}, \dots, U_{j_s}$, can be applied simultaneously to disjoint sets of qubits (such that $A_{j_a} \cap A_{j_b} = \emptyset$ if $a \neq b$). We say that a circuit has depth $\le d$ if it can be arranged in $d$ layers of simultaneously applied gates. The depth can also be characterized as the maximum length of a path from input to output. (By a path we mean a sequence of gates $U_{k_1}, \dots, U_{k_d}$ ($k_1 < \dots < k_d$) such that each pair of adjacent gates, $k_l$ and $k_{l+1}$, share a qubit they act upon, but no other gate acts on this qubit between the applications of $U_{k_l}$ and $U_{k_{l+1}}$.)

Definition 6.1 is not perfect because it ignores the possibility of using additional qubits (ancillas) in the computational process. Therefore we give yet another definition.

(Operator realized by a quantum circuit using ancillas). This is an operator $U : \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$ such that the product

$$W = U_L[A_L] \cdots U_1[A_1],$$

acting on $N$ qubits ($N \ge n$), satisfies the condition $W \bigl( |\xi\rangle \otimes |0^{N-n}\rangle \bigr) = (U |\xi\rangle) \otimes |0^{N-n}\rangle$ for any vector $|\xi\rangle \in \mathcal{B}^{\otimes n}$.

In this manner we "borrow" additional memory, filled with zeros, that we must ultimately return to its prior state. What sense does such a definition make? Why is it necessary to insist that the additional qubits return to the state $|0^{N-n}\rangle$? Actually, this condition is rather technical. However, it is important that at the end of the computation the quantum state is a product state, i.e., has the form $|\xi'\rangle \otimes |\eta'\rangle$ (with arbitrary $|\eta'\rangle$). If this is the case, then the first subsystem will be in the specified state $|\xi'\rangle$, so that the second subsystem (the added memory) may be forgotten. In the opposite case, the joint state of the two subsystems will be entangled, so that the first subsystem cannot be separated from the second.

7. Correspondence between classical and quantum computation

Quantum computation is supposed to be more general than classical computation. However, quantum circuits do not include Boolean circuits as a special case. Therefore some work is required to specialize the definition of a quantum circuit and prove that the resulting computational model is equivalent to the Boolean circuit model.

The classical analogue of a unitary operator is an invertible map on a finite set, i.e., a permutation. An arbitrary permutation $G : \mathbb{B}^k \to \mathbb{B}^k$ corresponds naturally to a unitary operator $\widehat{G}$ on the space $\mathcal{B}^{\otimes k}$ acting according to the rule

$$\widehat{G} |x\rangle \stackrel{\text{def}}{=} |Gx\rangle.$$

By analogy with Definition 6.1, we may define reversible classical circuits, which realize permutations.

Definition 7.1 (Reversible classical circuit). Let $\mathcal{A}$ be a set of permutations of the form $G : \mathbb{B}^k \to \mathbb{B}^k$. (The set $\mathcal{A}$ is called a basis; its elements are called gates.) A reversible classical circuit over the basis $\mathcal{A}$ is a sequence of permutations $G_1[A_1], \dots, G_l[A_l]$, where $A_j$ is a set of bits and $G_j \in \mathcal{A}$.

(Permutation realized by a reversible circuit). This is the product of permutations $G_l[A_l] \cdots G_1[A_1]$.

(Permutation realized by a reversible circuit using ancillas). This is a permutation $G$ such that the product of permutations

$$W = G_l[A_l] \cdots G_1[A_1]$$

(acting on $N$ bits, $N \ge n$) satisfies the condition $W(x, 0^{N-n}) = (Gx, 0^{N-n})$ for arbitrary $x \in \mathbb{B}^n$.

In what cases can a function given by a Boolean circuit be realized by a reversible circuit? Reversible circuits realize only permutations, i.e., invertible functions. This difficulty can be overcome in this way: instead of computing a general Boolean function $F : \mathbb{B}^n \to \mathbb{B}^m$, we compute the permutation $F_\oplus : \mathbb{B}^{n+m} \to \mathbb{B}^{n+m}$ given by the formula $F_\oplus(x, y) = (x, y \oplus F(x))$ (here $\oplus$ denotes bitwise addition modulo 2). Then $F_\oplus(x, 0) = (x, F(x))$ contains the value of $F(x)$ we need.

Note that two-bit permutation gates do not suffice to realize all functions of the form $F_\oplus$. It turns out that any permutation on two-bit states, $g : \mathbb{B}^2 \to \mathbb{B}^2$, is a linear (more precisely, affine) function under the natural identification of the set $\mathbb{B}$ with the two-element field $\mathbb{F}_2$: $g(x, y) = (ax \oplus by \oplus c,\ dx \oplus ey \oplus f)$, where $a, b, c, d, e, f \in \mathbb{F}_2$. Therefore all functions realized by reversible circuits over the basis of permutations on two bits are linear.
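Since there are only 24 permutations of $\mathbb{B}^2$, the claim can be verified by exhaustive enumeration. The sketch below (with our own helper names) lists all maps of the form $(x, y) \mapsto (ax \oplus by \oplus c,\ dx \oplus ey \oplus f)$ and checks that every permutation occurs among them; each map is represented by the tuple of its values on the four points.

```python
from itertools import permutations, product

POINTS = list(product((0, 1), repeat=2))     # (0,0), (0,1), (1,0), (1,1)

def affine_maps():
    """All maps (x, y) -> (ax ^ by ^ c, dx ^ ey ^ f) over the field F_2."""
    maps = set()
    for a, b, c, d, e, f in product((0, 1), repeat=6):
        values = tuple((a * x ^ b * y ^ c, d * x ^ e * y ^ f)
                       for x, y in POINTS)
        maps.add(values)
    return maps

def all_permutations():
    """All 24 permutations of B^2, each given by its tuple of values."""
    return set(permutations(POINTS))

every_permutation_is_affine = all_permutations() <= affine_maps()
```

The check succeeds, which matches the count: there are 6 invertible $2 \times 2$ matrices over $\mathbb{F}_2$ and 4 possible shifts, giving $6 \cdot 4 = 24$ invertible affine maps.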

However, permutations on three bits already suffice to realize any permutation. In fact, the following two functions form a complete basis for reversible circuits: negation $\neg$ and the Toffoli gate, $\wedge_\oplus : (x, y, z) \mapsto (x, y, z \oplus xy)$. Here we mean realization using ancillas, i.e., it is allowed to borrow bits in the state 0 under the condition that they return to the same state after the computation is done.


Lemma 7.1. Let a function $F : \mathbb{B}^n \to \mathbb{B}^m$ be realized by a Boolean circuit of size $L$ and depth $d$ over some basis $\mathcal{A}$ (the fan-in and fan-out being bounded by a constant). Then we can realize a map of the form $(x, 0) \mapsto (F(x), G(x))$ by a reversible circuit of size $O(L)$ and depth $O(d)$ over the basis $\mathcal{A}_\oplus$ consisting of the functions $f_\oplus$ ($f \in \mathcal{A}$) and the function $\hat{\oplus} : (x, y) \mapsto (x, x \oplus y)$.

Remark 7.1. In addition to the "useful" result $F(x)$, the indicated map produces some "garbage" $G(x)$.

Remark 7.2. The gate $\hat{\oplus}$ is usually called "Controlled NOT" for reasons that will become clear later. Note that $\hat{\oplus} = I_\oplus$, where $I$ is the identity map on a single bit. The essential meaning of the operation $\hat{\oplus}$ is reversible copying of the bit $x$ (if the initial value of $y$ is 0).

Remark 7.3. The gate $\hat{\oplus}$ allows one to interchange bits in memory, since the function $(\leftrightarrow) : (a, b) \mapsto (b, a)$ can be represented as follows:

$$(\leftrightarrow)[j, k] = \hat{\oplus}[j, k]\; \hat{\oplus}[k, j]\; \hat{\oplus}[j, k].$$
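The identity of Remark 7.3 can be checked on all inputs. In the sketch below (function names are ours) the Controlled NOT gate applied to the bits $j, k$ replaces bit $k$ by the sum modulo 2 of bits $j$ and $k$; bits are numbered from 1.

```python
from itertools import product

def cnot(bits, j, k):
    """Controlled NOT on bits j, k: (x_j, x_k) -> (x_j, x_j XOR x_k)."""
    out = list(bits)
    out[k - 1] ^= out[j - 1]
    return tuple(out)

def swap_by_cnots(bits, j, k):
    """Three Controlled NOT gates with alternating roles of j and k."""
    return cnot(cnot(cnot(bits, j, k), k, j), j, k)

def swap(bits, j, k):
    """Direct interchange of bits j and k, for comparison."""
    out = list(bits)
    out[j - 1], out[k - 1] = out[k - 1], out[j - 1]
    return tuple(out)

identity_holds = all(
    swap_by_cnots(bits, 1, 3) == swap(bits, 1, 3)
    for bits in product((0, 1), repeat=3)
)
```

Tracing one pair $(a, b)$ by hand gives $(a, b) \mapsto (a, a \oplus b) \mapsto (b, a \oplus b) \mapsto (b, a)$.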

Proof of Lemma 7.1. Consider the Boolean circuit that computes $F$. Let the input variables be $x_1, \dots, x_n$, and the auxiliary variables (including the result bits) $x_{n+1}, \dots, x_{n+L}$. The reversible circuit we are to construct will also have $n + L$ bits; the bits $x_{n+1}, \dots, x_{n+L}$ are initialized by 0. Each assignment in the original (Boolean) circuit has the form $x_{n+k} := f_k(x_{j_k}, \dots, x_{l_k})$, $f_k \in \mathcal{A}$, $j_k, \dots, l_k < n + k$. In the corresponding reversible circuit, the analogue of this assignment will be the action of the permutation $(f_k)_\oplus$, i.e., $x_{n+k} := x_{n+k} \oplus f_k(x_{j_k}, \dots, x_{l_k})$.

Since the initial values of the auxiliary variables were equal to 0, their final values will be just as in the original circuit. In order to obtain the required form of the result, it remains to change the positions of the bits.

In this argument, we may assume that the original circuit has a layered structure, so that several assignments can occur simultaneously. However, the concurrent assignments should not share their input variables. If this is not the case, we need to insert explicit copy gates between the layers; each copy gate will be replaced by $\hat{\oplus}$ in the reversible circuit. This results in a depth increase by at most a constant factor, due to the bounded fan-out condition.

The entire omputational pro ess is onveniently represented by the fol-

lowing diagram (above the re tangles is written the number of bits and,

inside, their ontent).


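\[
\begin{array}{c}
\underbrace{x}_{n}\quad \underbrace{0}_{L-m}\quad \underbrace{0}_{m} \\
\downarrow\ \text{assignments by the circuit} \\
x \quad x_{n+1}\,\dots\,x_{n+L-m} \quad F(x) \\
\downarrow\ \text{permutation of bits} \\
F(x) \quad G(x)
\end{array}
\qquad\square
\]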

Lemma 7.2 (Garbage removal). Under the conditions of Lemma 7.1, one can realize the function $F_{\oplus}$ by a reversible circuit of size $O(L+n+m)$ and depth $O(d)$ using ancillas.

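Proof. We perform the computation from the proof of Lemma 7.1, add each bit of the result to the corresponding bit of $y$ with $\hat{\oplus}$, and undo the above computation.

\[
\begin{array}{c}
\underbrace{x}_{n}\quad \underbrace{0}_{L}\quad \underbrace{y}_{m} \\
\downarrow\ \text{computation by the circuit from the proof of Lemma 7.1} \\
\underbrace{F(x)}_{m}\quad \underbrace{G(x)}_{L+n-m}\quad \underbrace{y}_{m} \\
\downarrow\ \text{addition of } F(x) \text{ to } y \text{ modulo 2} \\
F(x)\quad G(x)\quad F(x)\oplus y \\
\downarrow\ \text{reversal of the computation that was done in the first step} \\
x\quad 0\quad F(x)\oplus y
\end{array}
\qquad\square
\]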
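The compute, copy, undo pattern of this proof can be illustrated on a toy circuit. This is only a sketch under stated assumptions: the example function $F(x_1,x_2,x_3) = (x_1\wedge x_2)\oplus x_3$, the bit layout, and the helper names `run`, `compute` and `f_xor` are all hypothetical and chosen for illustration.

```python
from itertools import product


def run(gates, bits):
    """Apply gates in order; a gate (f, inputs, target) XORs f(inputs) into the target bit.

    Such a gate is its own inverse as long as the target is not one of its inputs.
    """
    for f, inputs, target in gates:
        bits[target] ^= f(*(bits[i] for i in inputs))


AND = lambda a, b: a & b
XOR = lambda a, b: a ^ b
COPY = lambda a: a

# Hypothetical layout: bits 0..2 hold x, bit 3 is an auxiliary bit,
# bit 4 receives F(x), bit 5 holds y.
compute = [(AND, (0, 1), 3), (XOR, (3, 2), 4)]


def f_xor(x, y):
    """Map (x, 0, 0, y) to (x, 0, 0, y XOR F(x)), leaving no garbage behind."""
    bits = list(x) + [0, 0, y]
    run(compute, bits)              # forward computation, leaves garbage in bit 3
    run([(COPY, (4,), 5)], bits)    # add the result to y (a controlled NOT)
    run(compute[::-1], bits)        # undo the computation in reverse order
    return bits


table = {(x, y): f_xor(x, y) for x in product((0, 1), repeat=3) for y in (0, 1)}
```

Every entry of `table` has the auxiliary bits restored to 0, which is the point of the lemma: only $y$ changes.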

Remark 7.4. Reversible computation provides an answer to the following question: how much energy is required to compute a given Boolean function [10, 45]? Theoretically, reversible operations can be performed at no energy cost. On the other hand, irreversible operations, like bit erasure, pose a fundamental problem. When such an operation is performed, two different logical states (0 and 1) become identical (0). However, physical laws on a micro-scale are reversible. The solution to this apparent paradox is that the difference between the initial states, 0 and 1, is converted into a difference between two physical states that both correspond to the same logical value 0. This may be interpreted as an increase in disorder (entropy) in physical degrees of freedom beyond our control, which eventually appears in the surrounding environment in the form of heat. The amount of energy required to erase a single bit is very small ($kT\ln 2$), but still nonzero. The theoretical energy cost of information erasure on a hard disk of capacity 1 gigabyte is equal to $2.4\cdot 10^{-11}$ Joules, which corresponds to the energy spent in moving the disk head by a fraction of the size of an atom. This is many orders of magnitude smaller than the actual displacement of the head through formatting.
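The order of magnitude quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes room temperature $T = 300$ K and a gigabyte of $2^{30}$ bytes; neither value is stated in the text, so the last digit of the result depends on these assumptions.

```python
import math

K_BOLTZMANN = 1.380649e-23      # Boltzmann constant in J/K
TEMPERATURE = 300.0             # assumed room temperature in kelvin
BITS_PER_GIGABYTE = 8 * 2**30   # assumed binary gigabyte

# Cost kT ln 2 of erasing one bit, and of erasing the whole disk.
energy_per_bit = K_BOLTZMANN * TEMPERATURE * math.log(2)
energy_per_gigabyte = BITS_PER_GIGABYTE * energy_per_bit

print(f"per bit:      {energy_per_bit:.3e} J")
print(f"per gigabyte: {energy_per_gigabyte:.3e} J")
```

Under these assumptions the total lands in the low $10^{-11}$ J range, in agreement with the figure in the remark.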


On the other hand, if the capacity of disks were to continue growing as fast as now, then at the end of the twenty-third century formatting of a hard disk would require as much energy as the Sun generates in a year.

The garbage removal lemma shows that it is possible to avoid such losses of energy connected with irreversible operations.

It is likewise possible to show that an arbitrary computation performed with memory $s$ can be realized in a reversible manner through the use of memory not exceeding $s^{O(1)}$. We will give a sketch of the proof. However, the reader should keep in mind that computation with bounded space is not easily defined in terms of circuits. Indeed, if a circuit is allowed to be exponentially large (though of polynomial "width"), it can contain the value table of the desired function, which makes its computation trivial. Therefore, a rigorous proof should either deal with circuits of some regular structure, or involve a definition of a reversible Turing machine.

An arbitrary computation with a given memory $s$ can be reduced to solving a $\operatorname{poly}(s)$-size instance of TQBF, since TQBF is PSPACE-complete. We will show how to compute reversibly, with a small amount of memory, the value of the formula

\[ \exists x_1\,\forall y_1 \cdots \exists x_M\,\forall y_M\; f(x_1,y_1,\dots,x_M,y_M,z), \tag{7.1} \]

where $f(\cdot)$ is computed by a Boolean circuit of size $L$. Actually, in this case the value of the formula (7.1) can be represented by a reversible circuit with $O(L+M)$ bits. The computation will be organized recursively, beginning with the innermost quantifiers.

In order to compute $\forall x\,F(x,z)$, we compute $F(0,z)$ and put the result into a supplementary bit. Then we compute $F(1,z)$ and put the result into another bit. Next we compute $\forall x\,F(x,z) = F(0,z)\wedge F(1,z)$ and save the result in a third bit. In order to remove the garbage, we undo all calculations, except for the final step.

Dealing with the formula $\exists x\,F(x,y)$ in a similar manner, we arrive at the following result: adding a quantifier in one Boolean variable increases the required memory by at most a constant number of bits.

In conclusion, we formulate a theorem on computation of reversible functions, which is a direct generalization of Lemma 7.2.

Theorem 7.3. Let $F$ and $F^{-1}$ be computed by Boolean circuits of size $\le L$ and depth $\le d$. Then $F$ can be realized by a reversible circuit of size $O(L+n)$ and depth $O(d)$ using ancillas.

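Proof. The computation is performed according to the following scheme (for simplicity we do not show the ancillas that are used in the computation from Lemma 7.2).

\[
\begin{array}{c}
\underbrace{x}_{n}\quad \underbrace{0}_{n} \\
\downarrow\ \text{computation of } F_{\oplus} \text{ by the circuit from the proof of Lemma 7.2} \\
x\quad F(x) \\
\downarrow\ \text{permutation of bits} \\
F(x)\quad x \\
\downarrow\ \text{applying } (F^{-1})_{\oplus} \text{ (by the circuit from the proof of Lemma 7.2) yields } x\oplus F^{-1}(F(x)) = 0 \text{ in the right register} \\
F(x)\quad 0
\end{array}
\qquad\square
\]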

[1!] Problem 7.1. Prove that negation and the Toffoli gate form a complete basis for reversible circuits.

8. Bases for quantum circuits

How do we choose a basis (gate set) for computation by quantum circuits? There are uncountably many unitary operators. So, either a complete basis must contain an infinite (uncountable) number of gates, or else we have to weaken the condition of exact realization of an operator by a circuit, changing it to a condition of approximate realization. We will examine both possibilities.

8.1. Exact realization.

Theorem 8.1. The basis consisting of all one-qubit and two-qubit unitary operators allows the realization of an arbitrary unitary operator.

The rest of the section constitutes a proof of this theorem.

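8.1.1. Operators with quantum control.

Definition 8.1. For each operator $U\colon \mathcal{B}^{\otimes n}\to\mathcal{B}^{\otimes n}$, an operator $\Lambda(U)\colon \mathcal{B}\otimes\mathcal{B}^{\otimes n}\to\mathcal{B}\otimes\mathcal{B}^{\otimes n}$ ("controlled $U$") is defined by the following relations:

\[
\Lambda(U)\,|0\rangle\otimes|\xi\rangle = |0\rangle\otimes|\xi\rangle, \qquad
\Lambda(U)\,|1\rangle\otimes|\xi\rangle = |1\rangle\otimes U|\xi\rangle. \tag{8.1}
\]

Graphically, we represent an operator $\Lambda(X)$ as shown in the figure. [Figure: a box labeled $X$ on the bottom line, controlled from the top line.] The top line corresponds to the control qubit (the first tensor factor in (8.1)) while the bottom line represents the other qubits. The direction of the arrows corresponds to the order in which operators act on an input vector. For example, in Figure 8.1 (see below) the first operator is $\Lambda(Y^{-1})$. In this book, we draw arrows from right to left, which is consistent with the convention that $AB|\xi\rangle$ means "take $|\xi\rangle$, apply $B$, then apply $A$".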


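We will also need operators with several controlling qubits:

\[
\Lambda^k(U)\,|x_1,\dots,x_k\rangle\otimes|\xi\rangle =
\begin{cases}
|x_1,\dots,x_k\rangle\otimes|\xi\rangle & \text{if } x_1\cdots x_k = 0,\\
|x_1,\dots,x_k\rangle\otimes U|\xi\rangle & \text{if } x_1\cdots x_k = 1.
\end{cases} \tag{8.2}
\]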

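Example 8.1. Let $\sigma^x \stackrel{\mathrm{def}}{=} \hat{\neg} = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}$. Then $\Lambda(\sigma^x) = \hat{\oplus}$, and $\Lambda^2(\sigma^x) = \wedge_{\oplus}$ (the Toffoli gate).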

8.1.2. The realization of the Toffoli gate. Now we construct the Toffoli gate using transformations on two qubits. To start, we find a pair of operators that satisfy the relation $XYX^{-1}Y^{-1} = i\sigma^x$. For example, the following pair will do:

\[
X = \frac{1}{\sqrt{2}}\begin{pmatrix} -i & -1\\ 1 & i \end{pmatrix}, \qquad
Y = \begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix}. \tag{8.3}
\]

Let us clarify the geometric meaning of this construction. The unitary group $\mathbf{U}(2)$ acts on three-dimensional Euclidean space. To define this action, we note that $2\times 2$ Hermitian matrices with zero trace form a three-dimensional Euclidean space: the inner product between $A$ and $B$ is given by $\frac{1}{2}\operatorname{Tr}(AB)$ and an orthonormal basis is formed by the Pauli matrices

\[
\sigma^x = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \qquad
\sigma^y = \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix}, \qquad
\sigma^z = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}. \tag{8.4}
\]

A unitary operator $U\in\mathbf{U}(2)$ acts on this space by the rule $U\colon E\mapsto UEU^{-1}$. It is possible to show (see [44, §11.12]) that the action we have just defined yields an isomorphism $\mathbf{U}(2)/\mathbf{U}(1)\cong\mathbf{SO}(3)$, where $\mathbf{U}(1) = \{c\in\mathbb{C} : |c| = 1\}$ is the subgroup of phase shifts, and $\mathbf{SO}(3)$ is the group of rotations of three-dimensional space (i.e., the group of orthogonal transformations with determinant 1).

Under this action, $\sigma^x$ corresponds to a rotation about the $x$ axis by $180^\circ$, $X$ to a rotation about the vector $(0,1,1)$ by $180^\circ$, and $Y$ to a rotation about the $y$ axis by $180^\circ$.

Shown in Figure 8.1 is a circuit that realizes the Toffoli gate by using the operators $\Lambda(X)$, $\Lambda(Y)$ and $\Lambda^2(-i)$. The last of these is a phase shift (multiplication by $-i$) controlled by two bits.

Let us test this circuit. Suppose the input vector is $|a,b,c\rangle = |a\rangle\otimes|b\rangle\otimes|c\rangle$, where $a,b,c\in\mathbb{B}$. If $a = b = 1$, then the operator $-iXYX^{-1}Y^{-1} = \sigma^x$ is applied to $|c\rangle$, which changes $|0\rangle$ to $|1\rangle$ and vice versa. However, if at least one of the controlling bits is 0, then $|c\rangle$ is multiplied by the identity operator. This is exactly how the Toffoli gate acts on basis vectors. This action extends to the whole space $\mathcal{B}^{\otimes 3}$ by linearity.
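The test described above can also be carried out numerically. The following NumPy sketch builds the five controlled gates as $8\times 8$ matrices and compares their product with the Toffoli permutation matrix. Two things are assumptions of the sketch rather than statements of the text: that $X$ is controlled by qubit $a$ and $Y$ by qubit $b$, and the helper names `kron`, `ctrl_a`, `ctrl_b`. The compensating phase is read off from the commutator rather than hard-coded.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
X = np.array([[-1j, -1], [1, 1j]]) / np.sqrt(2)   # entries as transcribed in (8.3)
Y = np.array([[0, 1], [-1, 0]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)              # projector onto |0>
P1 = np.diag([0, 1]).astype(complex)              # projector onto |1>


def kron(*ops):
    """Tensor product of several operators, leftmost factor first."""
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out


def ctrl_a(U):
    """U on qubit c controlled by qubit a (qubit order a, b, c)."""
    return kron(P0, I2, I2) + kron(P1, I2, U)


def ctrl_b(U):
    """U on qubit c controlled by qubit b."""
    return kron(I2, P0, I2) + kron(I2, P1, U)


Xi, Yi = X.conj().T, Y.conj().T                   # inverses of unitary matrices
comm = X @ Y @ Xi @ Yi                            # group commutator
phase = (sx @ np.linalg.inv(comm))[0, 0]          # scalar with phase * comm = sigma^x

# Phase shift controlled by both a and b, then the four controlled gates.
phase_gate = kron(np.diag([1, 1, 1, phase]), I2)
circuit = phase_gate @ ctrl_a(X) @ ctrl_b(Y) @ ctrl_a(Xi) @ ctrl_b(Yi)

toffoli = np.eye(8, dtype=complex)
toffoli[[6, 7]] = toffoli[[7, 6]]                 # swap |110> and |111>
```

With the entries exactly as transcribed here, the computed commutator $XYX^{-1}Y^{-1}$ comes out as $-i\sigma^x$ (the reversed order $YXY^{-1}X^{-1}$ gives $i\sigma^x$), so the compensating phase found by the sketch is $+i$. The structure of the circuit is unaffected by this sign.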


[Figure 8.1: circuit diagram with gate labels $X^{-1}$, $Y^{-1}$, $X$, $Y$ and $-i$.]

Fig. 8.1. Implementation of the Toffoli gate.

8.1.3. The realization of $\Lambda^k(U)$ for $U\in\mathbf{U}(\mathcal{B})$. Let $U$ be a unitary operator acting on one qubit. We will show how to realize the operator $\Lambda^k(U)$ for arbitrary $k$ by acting only on pairs of qubits. Our first solution uses ancillas. We actually construct an operator $W$ which acts on the space of $N+1$ qubits $\mathcal{B}^{\otimes(N+1)}$ and satisfies the condition

\[ W\bigl(|\xi\rangle\otimes|0^{N-k}\rangle\bigr) = \bigl(\Lambda^k(U)|\xi\rangle\bigr)\otimes|0^{N-k}\rangle. \]

(Caution: this condition does not mean that $W = \Lambda^k(U)\otimes I$.)

There exists a reversible circuit $P$ of size $O(k)$ and depth $O(\log k)$ that computes the product of $k$ input bits $x_1\cdots x_k$ (the result being a single bit), and also produces some garbage $G(x_1,\dots,x_k)$ ($N-1$ bits). It is represented graphically in Figure 8.2 (written above each box is the number of bits in the corresponding memory segment).

[Figure 8.2: the circuit $P$ maps the segments $x_1,\dots,x_k$ ($k$ bits) and $0$ ($N-k$ bits) to the segments $x_1 x_2\cdots x_k$ (1 bit) and $G(x_1,\dots,x_k)$ ($N-1$ bits).]

Fig. 8.2

Figure 8.3 shows how to construct the operator $W$ using the circuit $P$ and an operator with one controlling qubit. The circuit $P$ is applied first, followed by the reverse circuit $P^{-1}$, so that all $N$ bits return to their initial state. In the meantime, the first bit (the top line in Figure 8.3) takes the value $x_1\cdots x_k$. It is used as the control qubit for $\Lambda(U)$, whereas qubit $N+1$ is the target. The circuit in the figure can also be described by the equation $W = P^{-1}\Lambda(U)P$ or, more explicitly,

\[
W[\underbrace{1,\dots,k,N+1}_{\Lambda^k(U)},\ \underbrace{k+1,\dots,N}_{\text{ancillas}}]
= P^{-1}[1,\dots,N]\;\Lambda(U)[1,N+1]\;P[1,\dots,N].
\]

[Figure 8.3: the circuit $P$ acts on the lines $x_1,\dots,x_k$ and the ancillas, producing $x_1 x_2\cdots x_k$ on the top line and $G(x)$ on the remaining lines; $U$ is applied to the last qubit under control of the top line; $P^{-1}$ then restores $x_1,\dots,x_k$ and the zero ancillas.]

Fig. 8.3. Implementation of the operator $\Lambda^k(U)$ using ancillas.

The use of ancillas can be avoided at the cost of an increase in the circuit size. Let us consider the operator $\Lambda^k(i\sigma^x)$ first. A circuit $C_k$ for the realization of this operator can be constructed recursively: it consists of two copies of the circuit $C_{\lceil k/2\rceil}$, two copies of the circuit $C_{\lfloor k/2\rfloor}$, and a constant number of one-qubit gates. Therefore we get a recurrence relation for the circuit size, $L_k = 2L_{\lfloor k/2\rfloor} + 2L_{\lceil k/2\rceil} + c$, so that $L_k = O(k^2)$. The concrete construction is shown in Figure 8.4 (cf. Figure 8.1). We again use the operators $X$ and $Y$ (see (8.3)) satisfying $XYX^{-1}Y^{-1} = i\sigma^x$. Now we apply them with multiple control qubits: $Y$ is controlled by the qubits $1,\dots,\lceil k/2\rceil$, whereas $X$ is controlled by the qubits $\lceil k/2\rceil+1,\dots,k$. It remains to notice that $X$ and $Y$ are conjugate to $i\sigma^x$, i.e., $X = V(i\sigma^x)V^{-1}$, $Y = W(i\sigma^x)W^{-1}$ for some unitary $V$ and $W$. Hence $\Lambda^b(X)$ and $\Lambda^a(Y)$ (where $a = \lceil k/2\rceil$, $b = \lfloor k/2\rfloor$) can be obtained if we conjugate $\Lambda^b(i\sigma^x)$ and $\Lambda^a(i\sigma^x)$ by $V$ and $W$ (resp.) applied on the last qubit.
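The quadratic growth claimed for the recurrence can be checked numerically. In the sketch below the additive constant `C = 4` and the base case $L_1 = 1$ are hypothetical choices made only for illustration; the text fixes neither value.

```python
from functools import lru_cache

C = 4  # hypothetical constant number of one-qubit gates added per recursion step


@lru_cache(maxsize=None)
def circuit_size(k):
    """Size L_k given by L_k = 2 L_floor(k/2) + 2 L_ceil(k/2) + C."""
    if k == 1:
        return 1  # assumption: a singly controlled gate counts as one two-qubit gate
    return 2 * circuit_size(k // 2) + 2 * circuit_size((k + 1) // 2) + C


# The ratio L_k / k^2 stays bounded, which is what L_k = O(k^2) means.
ratios = [circuit_size(k) / k**2 for k in range(1, 1025)]
print(max(ratios))
```

For $k$ a power of two the recurrence solves exactly to $L_k = (1 + C/3)\,k^2 - C/3$ under these assumptions, which gives the constant behind the $O(k^2)$ bound.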

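[Figure 8.4: circuit on the control lines $x_1, x_2,\dots,x_{\lceil k/2\rceil}$ and $x_{\lceil k/2\rceil+1},\dots,x_k$ with the gates $X^{-1}$, $Y^{-1}$, $X$, $Y$ acting on the target qubit.]

Fig. 8.4. Implementation of the operator $\Lambda^k(i\sigma^x)$ without ancillas.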

The operator $\Lambda^k(Z)$ for an arbitrary $Z\in\mathbf{SU}(2)$ can be realized by two applications of $\Lambda^k(\sigma^x)$ and four applications of one-qubit gates, as in the solution to Problem 8.2 (see Figure S8.1a). Note that one copy of $\Lambda^k(\sigma^x)$ can be replaced by $\Lambda^k(i\sigma^x)$, and the other by $\Lambda^k(-i\sigma^x)$.

Consider now the general case, $\Lambda^k(U)$, where $U\in\mathbf{U}(2)$. Let $U = U_0 = e^{i\varphi_1}Z_0$, where $Z_0\in\mathbf{SU}(2)$. Then $\Lambda^k(e^{i\varphi_1}) = \Lambda^{k-1}(U_1)$, where $U_1 = \Lambda(e^{i\varphi_1})\in\mathbf{U}(2)$. Thus we have

\[ \Lambda^k(U)[1,\dots,k,k+1] = \Lambda^{k-1}(U_1)[1,\dots,k]\;\Lambda^k(Z_0)[1,\dots,k,k+1]. \]

We proceed by induction, obtaining the equation

\[
\Lambda^k(U)[1,\dots,k,k+1] = U_k[1]\;\Lambda^1(Z_{k-1})[1,2]\,\cdots\,\Lambda^k(Z_0)[1,\dots,k,k+1]. \tag{8.5}
\]

It is represented graphically in Figure 8.5. The size of the resulting circuit is $O(n^3)$. (This construction can be made more efficient; see Problem 8.6.)

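[Figure 8.5: circuit on the lines $x_1, x_2,\dots,x_{k-1}, x_k$ and the target qubit, with the gates $U_k$, $Z_{k-1}$, \dots, $Z_2$, $Z_1$, $Z_0$.]

Fig. 8.5. Ancilla-free realization of $\Lambda^k(U)$, $U\in\mathbf{U}(2)$.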

8.1.4. The realization of an arbitrary operator. We continue the proof of Theorem 8.1. The action of $\Lambda^k(U)$ may be described as follows: the operator $U$ acts on the subspace generated by the vectors $|1,\dots,1,0\rangle$ and $|1,\dots,1,1\rangle$, and the identity operator acts on the orthogonal complement of this subspace. Our next task is to realize a similar operator in which a nontrivial action is carried out on the subspace spanned by an arbitrary pair of basis vectors. Suppose we want to realize an arbitrary operator on the subspace spanned by $|x\rangle$ and $|y\rangle$, where $x = (x_1,\dots,x_n)$, $y = (y_1,\dots,y_n)$, $x_j, y_j\in\mathbb{B}$. Let $f$ be a permutation such that $f(x) = (1,\dots,1,0)$, $f(y) = (1,\dots,1,1)$. We may assume that $f$ is linear, i.e., $f\colon x\mapsto Ax+b$, where $A$ is an invertible matrix, and $b$ is a vector over the two-element field $\mathbb{F}_2$. Such permutations can be realized by reversible circuits over the basis $\{\neg, \hat{\oplus}\}$ without ancillas. Then the operator we need is represented in the form $\hat{f}^{-1}\Lambda^{n-1}(U)\hat{f}$. (Recall that $\hat{f}$ is the operator corresponding to the permutation $f$.)

Therefore we can act arbitrarily on pairs of basis vectors. Since we only used circuits of size $\operatorname{poly}(n)$, the constructed actions are realized efficiently. The final part in the proof of Theorem 8.1 is not efficient. Now we forget about qubits (i.e., the tensor product structure of our space), so we just have a Hilbert space of dimension $M = 2^n$. We want to represent an arbitrary unitary operator $U$ by the actions on pairs of basis vectors. This will be polynomial in $M$, hence exponential in $n$.

Lemma 8.2. An arbitrary unitary operator $U$ on the space $\mathbb{C}^M$ can be represented as a product of $M(M-1)/2$ matrices of the form

\[
\begin{pmatrix}
1 &        &   &   &        &   \\
  & \ddots &   &   &        &   \\
  &        & a & b &        &   \\
  &        & c & d &        &   \\
  &        &   &   & \ddots &   \\
  &        &   &   &        & 1
\end{pmatrix},
\qquad\text{where } \begin{pmatrix} a & b\\ c & d \end{pmatrix}\in\mathbf{U}(2)
\text{ and all entries not shown are } 0. \tag{8.6}
\]

Proof. First we note that for any numbers $c_1, c_2$ there exists a $2\times 2$ unitary matrix $V$ such that

\[ V\begin{pmatrix} c_1\\ c_2 \end{pmatrix} = \begin{pmatrix} \sqrt{|c_1|^2+|c_2|^2}\\ 0 \end{pmatrix}. \]

Consequently, for a unit vector $|\xi\rangle\in\mathbb{C}^M$ there exists a sequence of unitary operators $V^{(1)},\dots,V^{(M-1)}$ such that $V^{(1)}\cdots V^{(M-1)}|\xi\rangle = |1\rangle$, where $V^{(s)}$ acts on the subspace $\mathbb{C}(|s\rangle,|s+1\rangle)$ (as the matrix (8.6)) and leaves the remaining basis vectors unchanged.

Now let an $M\times M$ unitary matrix $U$ be given. Multiplying $U^{-1}$ on the left by suitable matrices $U^{(1,1)},\dots,U^{(1,M-1)}$, we can transform the first column into the vector $|1\rangle$. Since the columns remain orthogonal, the first row becomes $\langle 1|$. Acting in the same way on the remaining columns, we obtain a set of matrices $U^{(j,s)}$, $1\le j\le s\le M-1$ (where $U^{(j,s)}$ acts on $|s\rangle$ and $|s+1\rangle$), satisfying the condition

\[
U^{(M-1,M-1)}\bigl(U^{(M-2,M-2)}U^{(M-2,M-1)}\bigr)\cdots\bigl(U^{(1,1)}\cdots U^{(1,M-1)}\bigr)U^{-1} = I.
\]

This proof is constructive, i.e., it provides an algorithm for finding the matrices $U^{(j,s)}$. The running time of this algorithm depends on $M$ and another parameter $\delta$, the precision of arithmetic operations with real numbers. Specifically, the algorithm complexity is $O(M^3)\cdot\operatorname{poly}(\log(1/\delta))$. $\square$
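The constructive proof translates almost directly into code. The following NumPy sketch uses floating-point arithmetic and 0-based indices, and the function names `two_level_factors` and `embed` are illustrative. One detail is a choice of the sketch: after the elimination a single phase remains in the last diagonal entry, and it is absorbed into the last $2\times 2$ block, which is allowed because the blocks range over $\mathbf{U}(2)$.

```python
import numpy as np


def two_level_factors(U):
    """Return pairs (s, V): a 2x2 unitary V acting on basis vectors s and s+1.

    Applied in order to U^{-1}, the factors reduce it to the identity,
    so U equals the product of the factors (last factor leftmost).
    """
    M = U.shape[0]
    A = U.conj().T.astype(complex)            # start from U^{-1}
    factors = []
    for j in range(M - 1):                    # clear column j below the diagonal
        for s in range(M - 2, j - 1, -1):     # work upward from the bottom rows
            c1, c2 = A[s, j], A[s + 1, j]
            r = np.hypot(abs(c1), abs(c2))
            if r < 1e-15:
                V = np.eye(2, dtype=complex)
            else:                             # V maps (c1, c2) to (r, 0)
                V = np.array([[c1.conjugate(), c2.conjugate()], [-c2, c1]]) / r
            A[[s, s + 1], :] = V @ A[[s, s + 1], :]
            factors.append((s, V))
    # A is now diag(1, ..., 1, phase); absorb the phase into the last factor.
    phase = A[M - 1, M - 1]
    s, V = factors[-1]
    factors[-1] = (s, np.diag([1, 1 / phase]) @ V)
    return factors


def embed(s, V, M):
    """Matrix of the form (8.6) with the block V in rows and columns s, s+1."""
    G = np.eye(M, dtype=complex)
    G[s:s + 2, s:s + 2] = V
    return G


# Example: a random 4x4 unitary (two qubits), taken as the Q factor of a QR decomposition.
rng = np.random.default_rng(0)
Z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
U, _ = np.linalg.qr(Z)

factors = two_level_factors(U)
product = np.eye(4, dtype=complex)
for s, V in factors:
    product = embed(s, V, 4) @ product
```

For $M = 4$ the sketch returns $M(M-1)/2 = 6$ factors, and `product` agrees with `U` up to rounding error.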

Problems

[2!] 8.1. Prove that any operator $U\in\mathbf{U}(\mathcal{B})$ can be realized (without ancillas) by a constant size circuit over the basis $\{\Lambda(e^{i\varphi}) : \varphi\in\mathbb{R}\}\cup\{H\}$.

[1!] 8.2. Prove that any operator of the form $\Lambda(U)$, $U\in\mathbf{U}(\mathcal{B})$, can be realized (without ancillas) by a constant size circuit over the basis of one-qubit gates and the gate $\Lambda(\sigma^x)$.

(Therefore, this basis allows the realization of an arbitrary unitary operator. Indeed, in the proof of Theorem 8.1 we only use gates of the form $\Lambda(U)$ and one-qubit gates.)

[2!] 8.3. Suppose that a basis $\mathcal{A}$ is closed under inversion and allows the realization of any one-qubit operator up to a phase factor (e.g., $\mathcal{A} = \mathbf{SU}(2)$). Prove that the multiplication by a phase factor can be realized over $\mathcal{A}$ using one ancilla.

[2!] 8.4. Suppose that a unitary operator $U\colon\mathcal{B}^{\otimes n}\to\mathcal{B}^{\otimes n}$ satisfies the condition $U|0\rangle = |0\rangle$. Construct a circuit of size $6n+1$ realizing $\Lambda(U)$ over the basis $\{U, \Lambda^2(\sigma^x)\}$, using ancillas. (The gate $U$ should be applied only once.)

[3] 8.5. Realize the operator $\Lambda^k(\sigma^x)$ ($k\ge 2$) by a circuit of size $O(k)$ consisting of Toffoli gates. It is allowed to use a constant number of ancillas in such a way that their initial state does not matter, i.e., the circuit should actually realize the operator $W = \Lambda^k(\sigma^x)\otimes I_{\mathcal{B}^{\otimes r}}$, where $r = \mathrm{const}$.

[2] 8.6. Realize the operator $\Lambda^k(U)$ by a circuit of size $O(k^2)$ over the basis of all two-qubit gates. The use of ancillas is not allowed.

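8.2. Approximate realization. We now pass to finite bases. In this case it is only possible to obtain an approximate representation of operators as products of basis elements. In order to define the approximate realization, we need a norm on the operator space.

On the space of vectors there is the Euclidean norm $\bigl\||\xi\rangle\bigr\| = \sqrt{\langle\xi|\xi\rangle}$. By the definition of a norm, it satisfies the following conditions:

\[
\bigl\||\xi\rangle\bigr\|
\begin{cases}
= 0 & \text{if } |\xi\rangle = 0,\\
> 0 & \text{if } |\xi\rangle \ne 0,
\end{cases} \tag{8.7}
\]
\[ \bigl\||\xi\rangle + |\eta\rangle\bigr\| \le \bigl\||\xi\rangle\bigr\| + \bigl\||\eta\rangle\bigr\|, \tag{8.8} \]
\[ \bigl\|c|\xi\rangle\bigr\| = |c|\,\bigl\||\xi\rangle\bigr\|. \tag{8.9} \]

We now introduce a norm on the space of operators $\mathbf{L}(\mathcal{N})$.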

Definition 8.2. The norm of an operator $X$ (the so-called operator norm; in general, there are others) is

\[ \|X\| = \sup_{|\xi\rangle\ne 0}\frac{\bigl\|X|\xi\rangle\bigr\|}{\bigl\||\xi\rangle\bigr\|}. \]

We note that $\|X\|^2$ is the largest eigenvalue of the operator $X^{\dagger}X$.

This norm possesses all the properties of norms indicated above and, beyond these, several special properties:

\[ \|XY\| \le \|X\|\,\|Y\|, \tag{8.10} \]
\[ \|X^{\dagger}\| = \|X\|, \tag{8.11} \]
\[ \|X\otimes Y\| = \|X\|\,\|Y\|, \tag{8.12} \]
\[ \|U\| = 1 \quad\text{if } U \text{ is unitary}. \tag{8.13} \]
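These properties are easy to spot-check numerically. The sketch below relies on the fact that `np.linalg.norm(X, 2)` returns the largest singular value of a matrix, which is the operator norm of Definition 8.2. The random test matrices and the names `op_norm` and `random_matrix` are illustrative, and a check on a few samples is of course no substitute for a proof.

```python
import numpy as np

rng = np.random.default_rng(0)


def op_norm(X):
    """Operator norm of X, i.e. its largest singular value."""
    return np.linalg.norm(X, 2)


def random_matrix(n):
    """Random complex n x n matrix with Gaussian entries."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


X, Y = random_matrix(4), random_matrix(4)
U, _ = np.linalg.qr(random_matrix(4))    # the Q factor of a QR decomposition is unitary

largest_eig = np.linalg.eigvalsh(X.conj().T @ X).max()
checks = {
    "norm^2 is the largest eigenvalue": np.isclose(op_norm(X) ** 2, largest_eig),
    "(8.10)": op_norm(X @ Y) <= op_norm(X) * op_norm(Y) + 1e-12,
    "(8.11)": np.isclose(op_norm(X.conj().T), op_norm(X)),
    "(8.12)": np.isclose(op_norm(np.kron(X, Y)), op_norm(X) * op_norm(Y)),
    "(8.13)": np.isclose(op_norm(U), 1.0),
}
```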

Now we give the definition of approximate realization. If the operator in question is $U$, then its approximate realization will be denoted by $\tilde{U}$.

Definition 8.3. The operator $\tilde{U}$ approximates the operator $U$ with precision $\delta$ if

\[ \|\tilde{U} - U\| \le \delta. \tag{8.14} \]

This definition has two noteworthy corollaries. First, if $\tilde{U}$ approximates $U$ with precision $\delta$, then $\tilde{U}^{-1}$ approximates $U^{-1}$ with the same precision $\delta$. Indeed, if we multiply the expression $\tilde{U} - U$ by $\tilde{U}^{-1}$ on the left and by $U^{-1}$ on the right, the norm does not increase (due to the properties (8.10) and (8.13)). Thus we obtain a corollary of the inequality (8.14): $\|U^{-1} - \tilde{U}^{-1}\| \le \delta$.

The second property is as follows. Consider the product of several operators, $U = U_L\cdots U_2U_1$. If each $U_k$ has an approximation $\tilde{U}_k$ with precision $\delta_k$, then the product of these approximations, $\tilde{U} = \tilde{U}_L\cdots\tilde{U}_2\tilde{U}_1$, approximates $U$ with precision $\sum_k\delta_k$ (i.e., errors accumulate linearly):

\[ \bigl\|\tilde{U}_L\cdots\tilde{U}_2\tilde{U}_1 - U_L\cdots U_2U_1\bigr\| \le \sum_j\delta_j. \]

It suffices to look at the example with two operators:

\[
\begin{aligned}
\bigl\|\tilde{U}_2\tilde{U}_1 - U_2U_1\bigr\|
&= \bigl\|\tilde{U}_2(\tilde{U}_1 - U_1) + (\tilde{U}_2 - U_2)U_1\bigr\| \\
&\le \bigl\|\tilde{U}_2(\tilde{U}_1 - U_1)\bigr\| + \bigl\|(\tilde{U}_2 - U_2)U_1\bigr\| \\
&\le \|\tilde{U}_2\|\,\|\tilde{U}_1 - U_1\| + \|\tilde{U}_2 - U_2\|\,\|U_1\| \\
&= \|\tilde{U}_1 - U_1\| + \|\tilde{U}_2 - U_2\|.
\end{aligned}
\]

Note that we have used the fact that the norm of a unitary operator is 1. (With nonunitary operators, the approximation errors could accumulate much faster, e.g., exponentially.)
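The linear accumulation of errors can be observed on random data. In the sketch below each gate is perturbed by multiplying it with $\exp(i\varepsilon H)$ for a random Hermitian $H$, so the approximations stay unitary. The perturbation model, the sizes `L = 20` and `n = 4`, and the helper names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)


def random_unitary(n):
    """Random unitary, taken as the Q factor of a random complex matrix."""
    Z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(Z)
    return Q


def perturb(U, eps):
    """Unitary close to U: U times exp(i * eps * H) for a random Hermitian H."""
    Z = rng.normal(size=U.shape) + 1j * rng.normal(size=U.shape)
    H = (Z + Z.conj().T) / 2
    w, V = np.linalg.eigh(H)
    return U @ (V * np.exp(1j * eps * w)) @ V.conj().T


def op_norm(X):
    """Operator norm (largest singular value)."""
    return np.linalg.norm(X, 2)


L, n = 20, 4
gates = [random_unitary(n) for _ in range(L)]           # U_1, ..., U_L
approx = [perturb(U, 1e-3) for U in gates]              # their approximations
deltas = [op_norm(Ut - U) for U, Ut in zip(gates, approx)]

exact = np.linalg.multi_dot(gates[::-1])                # U_L ... U_2 U_1
noisy = np.linalg.multi_dot(approx[::-1])
total_error = op_norm(noisy - exact)
print(total_error, sum(deltas))
```

The printed total error stays below the sum of the individual precisions, as the inequality above requires.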

Remark 8.1. Every model that aims at solving computational problems by real physical processes has to be scrutinized for stability to approximation errors. (In real life the parameters of any physical process can be given only with certain precision.) In particular, computation with exponential error accumulation is almost definitely useless from the practical point of view.

Now we generalize Definition 8.3 to allow the use of ancillas.

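Definition 8.4. The operator $U\colon\mathcal{B}^{\otimes n}\to\mathcal{B}^{\otimes n}$ is approximated by the operator $\tilde{U}\colon\mathcal{B}^{\otimes N}\to\mathcal{B}^{\otimes N}$ with precision $\delta$ using ancillas if, for arbitrary $|\xi\rangle$ in $\mathcal{B}^{\otimes n}$, the inequality

\[
\Bigl\|\tilde{U}\bigl(|\xi\rangle\otimes|0^{N-n}\rangle\bigr) - U|\xi\rangle\otimes|0^{N-n}\rangle\Bigr\| \le \delta\,\bigl\||\xi\rangle\bigr\| \tag{8.15}
\]

is satisfied.

We can formulate this definition in another way. Let us introduce a linear map $V\colon\mathcal{B}^{\otimes n}\to\mathcal{B}^{\otimes N}$ which acts by the rule $V\colon|\xi\rangle\mapsto|\xi\rangle\otimes|0^{N-n}\rangle$. The map $V$ is not unitary, but isometric, i.e., $V^{\dagger}V = I_{\mathcal{B}^{\otimes n}}$. The condition from the last definition may be written as

\[ \|\tilde{U}V - VU\| \le \delta. \tag{8.16} \]

The basic properties of approximation remain true for approximation using ancillas (which, of course, should be verified; see Problem 8.8).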

What bases allow the realization of an arbitrary unitary operator with arbitrary precision? What is the size of the circuit that is needed to achieve a given precision $\delta$? How can this circuit be constructed efficiently? Unfortunately, we cannot give a universal answer to these questions. In constructing quantum algorithms, we will use the following (widely adopted) standard basis.

Definition 8.5. The basis $\mathcal{Q} = \{H, K, K^{-1}, \Lambda(\sigma^x), \Lambda^2(\sigma^x)\}$, where

\[
H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}, \qquad
K = \begin{pmatrix} 1 & 0\\ 0 & i \end{pmatrix},
\]

is called standard.

Theorem 8.3. Any unitary operator $U$ on a fixed number of qubits can be realized with precision $\delta$ by a $\operatorname{poly}(\log(1/\delta))$-size, $\operatorname{poly}(\log\log(1/\delta))$-depth circuit over the standard basis, using ancillas. There is a polynomial algorithm that constructs this circuit from the description of $U$.

This theorem will be proved and generalized in Section 13; see Theorem 13.5 on page 134 for a sharper result. The proof is based on a so-called phase estimation procedure (cf. Problem 13.4 on the quantum Fourier transform).

As far as general bases are concerned, we will use the following definition.

Definition 8.6. Let $\mathcal{A}$ be a gate set that is closed under inversion. We call $\mathcal{A}$ a complete basis (or a universal gate set) if the applications of its elements generate a dense subgroup in the group $\mathbf{U}(\mathcal{B}^{\otimes k})/\mathbf{U}(1)$ for some $k\ge 2$. (Here $\mathbf{U}(1)$ corresponds to multiplication by phase factors.)

(The phase factors are unimportant from the physical point of view, as well as for the definition of quantum computation that will be given in Section 9. If we really need to realize phase shifts, we can use the result of Problem 8.3.)

Remark 8.2. Why don't we accept ancillas in the definition of a complete basis? Indeed, it seems more natural to call a basis $\mathcal{A}$ complete if any unitary operator $U$ can be realized with an arbitrary precision $\delta$ by a quantum circuit over this basis, using ancillas. Unfortunately, with this definition it is not clear how to estimate the size of the circuit in question. On the contrary, Definition 8.6 provides a rather general way of obtaining such an estimate; see Section 8.3. It is not known whether the two definitions of a complete basis are equivalent.

Remark 8.3. There is yet another definition of a complete basis, which is based on an even more general notion of realization of a unitary operator than the realization using ancillas. A basis is called complete if it can effect an arbitrary unitary operator on "encoded qubits" with any given precision (see Section 15 for exposition of quantum codes). The idea is that the quantum state of each qubit is represented by a state of several qubits; it is even permitted to have multiple representations of the same state.⁴ This situation is characterized by an isometric map $V\colon\mathcal{B}\otimes\mathcal{F}\to\mathcal{B}^{\otimes k}$, in which case we say that a single logical qubit is represented by $k$ physical qubits (the space $\mathcal{F}$ corresponds to the nonuniqueness of the representation). The gates of the basis act on physical qubits, whereas the operator we want to realize acts on logical qubits.

In such a general model, it is again possible to estimate the size of the circuit that is needed to achieve the given precision $\delta$. Moreover, the gates of the basis can be specified with a constant precision $\delta_0$, yet arbitrarily accurate realization is possible. This fundamental result is called the threshold theorem for fault-tolerant quantum computation [65, 42, 3, 36].

Theorem 8.4 (cf. [36]). The standard basis $\mathcal{Q}$ is complete.

Note that this theorem does not follow from Theorem 8.3. The proof of the theorem is contained in the solutions to Problems 8.10–8.12.

Remark 8.4. If we remove the Toffoli gate from the basis $\mathcal{Q}$, it ceases to be complete. However, many important computations can be done even with such a reduced basis. In particular, as will be evident later, error-correcting circuits for quantum codes can be realized without the Toffoli gate.

⁴ Traditionally, a quantum code is defined as a single preferred representation, whereas the other representations are regarded as the preferred one, subjected to a "correctable error". Whatever the terminology, multiple representations allow us to perform computation with inaccurate gates. Such gates introduce "errors", or uncertainty in the resulting state, but one can arrange that it is only the choice of representation that is uncertain, the encoded state remaining intact.

Problems

[1] 8.7. Prove the properties (8.10)–(8.13) of the operator norm.

[1] 8.8. Prove the two basic properties of approximation with ancillas:

a) If $\tilde{U}$ approximates $U$ with precision $\delta$, then $\tilde{U}^{-1}$ approximates $U^{-1}$ with the same precision $\delta$.

b) If unitary operators $\tilde{U}_k$ approximate unitary operators $U_k$ ($1\le k\le L$) with precision $\delta_k$, then $\tilde{U}_L\cdots\tilde{U}_1$ approximates $U_L\cdots U_1$ with precision $\sum_k\delta_k$.

[3] 8.9. Suppose that a unitary operator $\tilde{U}$ approximates a unitary operator $U$ with precision $\delta$, using ancillas. Prove that there exists an operator $W$ that realizes $U$ precisely (i.e., the equality $W\bigl(|\xi\rangle\otimes|0^{N-n}\rangle\bigr) = \bigl(U|\xi\rangle\bigr)\otimes|0^{N-n}\rangle$ holds) and satisfies $\|W - \tilde{U}\| \le O(\delta)$.

[2!] 8.10. Suppose $X$ and $Y$ are noncommuting elements of the group $\mathbf{SO}(3)$ that rotate by angles incommensurate with $\pi$. Prove that the group generated by $X$ and $Y$ is an everywhere dense subset of $\mathbf{SO}(3)$.

[3!] 8.11. Let $\mathcal{M}$ be a Hilbert space of finite dimension $M\ge 3$. Consider the subgroup $H\subset\mathbf{U}(\mathcal{M})$, the stabilizer of the 1-dimensional subspace generated by some unit vector $|\xi\rangle\in\mathcal{M}$. Let $V$ be an arbitrary unitary operator not fixing the subspace $\mathbb{C}(|\xi\rangle)$. Prove that the set of operators $H\cup V^{-1}HV$ generates the whole group $\mathbf{U}(\mathcal{M})$.

(Note that under the conditions of this problem $\mathbf{U}(\mathcal{M})$ and $H$ may be factored by the subgroup of phase shifts $\mathbf{U}(1)$.)

[3!] 8.12. Prove that the applications of the operators from the standard basis generate an everywhere dense subset of $\mathbf{U}(\mathcal{B}^{\otimes 3})/\mathbf{U}(1)$.

[2] 8.13. Let $R = -i\exp(\pi i\alpha\sigma^x)$, $\alpha$ irrational. Prove that the negation $\sigma^x$, the Deutsch gate $\Lambda^2(R)$ and its inverse⁵ form a complete basis.

8.3. Efficient approximation over a complete basis. How can one estimate the complexity of realizing a unitary operator $U$ over a complete basis $\mathcal{A}$ with a given precision $\delta$? How can the corresponding circuit be constructed efficiently? These questions arise if we want to simulate circuits over another basis $\mathcal{C}$ by circuits over $\mathcal{A}$. We would like to prove that such simulation does not increase the size of the circuit too much. In this regard, we may assume that $U\in\mathcal{C}$ is fixed, while $\delta$ tends to zero.

⁵ The inverse of the Deutsch gate is not really necessary; it is included solely to conform to Definition 8.6.

Let $U\colon\mathcal{B}^{\otimes n}\to\mathcal{B}^{\otimes n}$ be an arbitrary unitary operator. It can be represented by a matrix with complex entries, where each entry is a pair of real numbers, and each number is an infinite sequence of binary digits. We set the question of computing these digits aside. Instead, we assume that each matrix entry is specified with a suitable precision, namely, $\delta/2^{n+1}$. In this case the overall error in $U$, measured by the operator norm, does not exceed $\delta/2$. (Taking this input error into account, the algorithm itself should work with precision $\delta/2$, but we will rather ignore such details.)

The problem can be divided into two parts. First, we realize $U$ over the infinite basis $\mathcal{A}_0$ that consists of all one-qubit and two-qubit gates. Second, we approximate each gate $V$ of the resulting circuit $C_0$ by a circuit $C$ over the basis $\mathcal{A}$. The first part is mostly done in the proof of Theorem 8.1; we just need to add some details. By examining the proof, we find that the circuit $C_0$ has size $L_0 = \exp(O(n))$. If we want to represent $U$ with precision $\delta$, we need to compute all gates of the circuit with precision $\delta' = \delta/L_0 = \exp(-O(n))\,\delta$, which amounts to computing the entries of the corresponding matrices with precision $\delta'' = \delta'/2^n = \exp(-O(n))\,\delta$. This can be done in time $T = \exp(O(n))\cdot\operatorname{poly}(\log(1/\delta))$. The presence of the exponential factor should not bother us, since in the practical application $U$ is fixed, and so is $n$. Thus the first part is finished and we proceed to the second part.

8.3.1. Initial (nonconstructive) stage. Let us consider the problem of approximating an element $V\in\mathbf{U}(\mathcal{B}^{\otimes 2})\subseteq\mathbf{U}(\mathcal{B}^{\otimes k})$ by a circuit $C$ over the basis $\mathcal{A}$ with precision $\delta$ (the number $k$ came from Definition 8.6). We are looking for approximations up to a phase factor; therefore we may assume that $V\in\mathbf{SU}(M)$, $\mathcal{A}\subseteq\mathbf{SU}(M)$, where $M = 2^k$. Then the circuit $C$ is simply a sequence of elements $U_1,\dots,U_L\in\mathcal{A}$ such that $\|V - U_L\cdots U_1\| \le \delta$. Definition 8.6 guarantees that such a sequence exists, but its minimum length $L$ can be arbitrarily large (e.g., in the case where the elements of $\mathcal{A}$ are very close to the identity). So, before we can construct $C$ in an effective fashion, some initial setup is required. We will now describe it briefly. In this description, we refer to some new concepts and facts that will be explained later.

First, we generate sufficiently many products of elements of $\mathcal{A}$ so that they form an $\varepsilon$-net, where $\varepsilon$ is a suitable constant independent of $M$. This may take an arbitrarily long time. The net may come out too crowded, but we can make it "$\alpha$-sparse" ($\alpha = \mathrm{const}$) by removing redundant points. Such a net has at most $\exp(O(M^2))$ elements; see Problem 8.14. (This bound is tight. For sufficiently small $\varepsilon$, any $\varepsilon$-net has at least $\exp(\Omega(M^2))$ elements; this is due to the fact that $\mathbf{SU}(M)$ is a manifold of dimension $M^2-1$.)

Then we consider the net as a new basis. It is possible to obtain an upper bound for the approximation complexity relative to this basis, but not to the original one. As a part of our main theorem we will prove that any $\varepsilon$-net in $\mathbf{SU}(M)$ generates a dense subgroup, provided $\varepsilon$ is small enough.⁶

It is time to recall some basic concepts from geometry. A distance function on a set $S$ is a function $d\colon S\times S\to\mathbb{R}$ such that (i) $d(x,x) = 0$, (ii) $d(x,y) > 0$ if $x\ne y$, (iii) $d(x,y) = d(y,x)$, and (iv) $d(x,z) \le d(x,y) + d(y,z)$. A $\delta$-net for $R\subseteq S$ is a set $\Gamma\subseteq S$ whose $\delta$-neighborhood contains $R$, i.e., for any $x\in R$ there is a $y\in\Gamma$ such that $d(x,y)\le\delta$. We say that $\Gamma$ has no extra points if any point of $\Gamma$ belongs to the $\delta$-neighborhood of $R$. The net $\Gamma$ is called $\alpha$-sparse ($0<\alpha<1$) if it has no extra points, and the distance between any two distinct points of $\Gamma$ is greater than $\alpha\delta$.

The group $\mathbf{SU}(M)$ is equipped with the distance given by the operator norm, $d(U,V) = \|U - V\|$. Note that the diameter of $\mathbf{SU}(M)$ (the maximum possible distance) is 2. Let $r > \delta > 0$. Then an $(r,\delta)$-net in $\mathbf{SU}(M)$ is a $\delta$-net for the $r$-neighborhood of the identity; this neighborhood is denoted by $S_r$. The ratio $q = r/\delta > 1$, called quality, will play an important role in our arguments.

[2!] Problem 8.14 (sparse nets).

a) (removing redundant points). Let $\Gamma$ be a $\delta$-net for $R \subseteq S$. Prove that there is a subset $\Gamma_\alpha \subseteq \Gamma$ which is an $\alpha$-sparse $\delta/(1-\alpha)$-net for $R$.

b) (few points are left). Prove that any $\alpha$-sparse $(r, r/q)$-net in $SU(M)$ has at most $(q/\alpha)^{O(M^2)}$ elements.
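The idea behind part (a) can be tried out numerically. The sketch below is only an illustration under assumptions of our own: it works in the Euclidean plane rather than in $SU(M)$, the region $R$ is a finite random sample, and the function names (`sparsify`, `is_net`) are hypothetical. It first discards extra points and then greedily keeps a point only if it is farther than $\alpha\delta'$ from every point kept so far, where $\delta' = \delta/(1-\alpha)$.

```python
import math
import random

def dist(p, q):
    """Euclidean distance in the plane (a stand-in for the operator-norm distance)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def is_net(net, region, delta):
    """True if every point of the region is within delta of some net point."""
    return all(any(dist(x, y) <= delta for y in net) for x in region)

def sparsify(net, region, delta, alpha):
    """Greedily select an alpha-sparse (delta/(1-alpha))-net from a delta-net."""
    delta_new = delta / (1.0 - alpha)
    # Remove extra points: keep only net points within delta of the region.
    candidates = [y for y in net if any(dist(x, y) <= delta for x in region)]
    kept = []
    for y in candidates:
        # Keep y only if it is not too close to an already selected point.
        if all(dist(y, z) > alpha * delta_new for z in kept):
            kept.append(y)
    return kept, delta_new

random.seed(0)
delta, alpha = 0.1, 0.5
region = [(random.random(), random.random()) for _ in range(200)]
# A deliberately crowded delta-net: three jittered copies of every region point.
net = [(x + random.uniform(-0.05, 0.05), y + random.uniform(-0.05, 0.05))
       for (x, y) in region for _ in range(3)]
sparse, delta_new = sparsify(net, region, delta, alpha)
print(len(net), "points reduced to", len(sparse))
```

Every discarded point lies within $\alpha\delta'$ of a kept one, so the covering radius grows from $\delta$ to at most $\delta + \alpha\delta' = \delta'$, which is the bound stated in the problem.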

8.3.2. Main theorem. Let $\mathcal{A} \subseteq SU(M)$ be a finite subset which is closed under inversion. The elements of $\mathcal{A}$ will be called generators. They can be treated in two ways: as elements of $SU(M)$ (represented by matrices with a suitable precision), or as abstract symbols, i.e., references to the initially specified elements. A circuit is a sequence of such symbols, i.e., a word in the alphabet "$\mathcal{A}$". (We indicate the abstract use of generators by quotation marks, e.g., "$U$" $\in$ "$\mathcal{A}$".) The words constitute the free group $F[$"$\mathcal{A}$"$]$. The product of two words is obtained by concatenation, whereas the inverse of "$U_1$" $\cdots$ "$U_n$" is "$U_n^{-1}$" $\cdots$ "$U_1^{-1}$".

Theorem 8.5. There exists a universal constant $\varepsilon > 0$ such that for any $\nu > 0$ the following is true:

1. For any $M$, an $\varepsilon$-net $\mathcal{A} \subseteq SU(M)$ (closed under inversion), a number $\delta > 0$, and an element $V \in SU(M)$, there is a sequence of generators $U_1, \dots, U_L \in \mathcal{A}$, $L = O\bigl((\log(1/\delta))^{3+\nu}\bigr)$, such that $\|V - U_L \cdots U_1\| \le \delta$.

2. Assuming that the net $\mathcal{A}$ is not too redundant, $|\mathcal{A}| = \exp(O(M^2))$, the function $(M, \delta, \mathcal{A}, V) \mapsto$ "$U_L$" $\cdots$ "$U_1$" can be computed by an algorithm with running time $T = \exp(O(M^2))\,(\log(1/\delta))^3 + O(L)$.

$^6$ A more general statement is true: there is a universal constant $\varepsilon' > 0$ such that any $\varepsilon'$-net in any compact semisimple Lie group generates a dense subgroup, where the distance is measured by the operator norm for the adjoint action [27].

Corollary 8.5.1. Let $\mathcal{A}$ be a complete basis, and $\mathcal{C}$ an arbitrary finite basis. Then any circuit $C$ of size $L$ and depth $d$ over the basis $\mathcal{C}$ can be simulated by a circuit $C'$ of size $L' = O\bigl(L\,(\log(L/\delta))^{c}\bigr)$ and depth $d' = O\bigl(d\,(\log(L/\delta))^{c}\bigr)$ over the basis $\mathcal{A}$. (Here $c = 3 + \nu$, whereas $\delta$ denotes the simulation precision: $C$ realizes a unitary operator $U$, $C'$ realizes $U'$, and $\|U - U'\| \le \delta$.)

The corollary is obvious: each gate of $C$ should be approximated with precision $\delta/L$. The simulation is very efficient in terms of size, but it is not so good in terms of depth. In a common situation $d \sim (\log L)^k$, $k \ge 1$, so that $d' = O(d^{1+c/k})$ (assuming that $\delta = \mathrm{const}$).
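The depth estimate follows from a one-line substitution. The derivation below is a sketch: $c = 3+\nu$ is the exponent from Corollary 8.5.1, and we assume, as in the text, that $\delta$ is constant and $d \sim (\log L)^k$, so that $\log(L/\delta) = O(\log L) = O(d^{1/k})$.

```latex
\[
  d' \;=\; O\bigl(d\,(\log(L/\delta))^{c}\bigr)
     \;=\; O\bigl(d\,(d^{1/k})^{c}\bigr)
     \;=\; O\bigl(d^{\,1+c/k}\bigr).
\]
```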

Remark 8.5. The upper bound on the number of generators $L$ in Theorem 8.5 can be improved if we drop the second condition (the existence of an efficient algorithm). Let $\mathcal{A} \subseteq SU(M)$ be an arbitrary subset that is closed under inversion and generates a dense subgroup in $SU(M)$. Then any element $U \in SU(M)$ can be approximated with precision $\delta$ by a product of $L \le C(\mathcal{A}) \log(1/\delta)$ generators [31]. On the other hand, the lower bound $L \ge \Omega(M^2) \log(1/\delta) / \log|\mathcal{A}|$ follows from a volume consideration.

Remark 8.6. The presence of the exponential $\exp(O(M^2))$ in the algorithm complexity bound is rather disturbing (recall that $M = 2^k$, where $k$ is the number of qubits). As far as the asymptotic behavior at $\delta \to 0$ is concerned, it seems possible to make the computation polynomial in $M$, that is, the exponential may become an additive term rather than a factor. (To this end, one may try to use bases in the tangent space instead of nets; the reader is welcome to explore this idea.) However, it is a challenge to eliminate the exponential altogether. This may only be possible if one changes the assumptions of the theorem, e.g., by saying that products of $\mathrm{poly}(M)$ elements from $\mathcal{A}$ constitute an $\varepsilon$-net (rather than $\mathcal{A}$ being an $\varepsilon$-net itself). Such a basis $\mathcal{A}$ can consist of only $\mathrm{poly}(M)$ elements, so it is reasonable to ask whether there is an approximation algorithm with running time $\mathrm{poly}(M \log(1/\delta))$. This appears to be a difficult question in global unitary geometry.

8.3.3. Idea of the proof and geometric lemmas. The proof of Theorem 8.5 is based on four geometric properties of the group $SU(M)$ endowed with the operator norm distance $d$. First, the distance is biinvariant, i.e., $d(WU, WV) = d(U,V) = d(UW, VW)$. The second property is the result of Problem 8.14b; this is the only place where the number $M$ comes in. The remaining two properties are related to the group commutator, $[\![U,V]\!] = UVU^{-1}V^{-1}$. Let us consider the application of the group multiplication and the group commutator to a pair of subsets $A, B \subseteq SU(M)$,

\[ AB = \{UV : U \in A,\ V \in B\}, \qquad [\![A,B]\!] = \{[\![U,V]\!] : U \in A,\ V \in B\}. \]

Then the following inclusions hold:

\[ [\![S_a, S_b]\!] \subseteq S_{2ab}, \tag{8.17} \]

\[ S_{ab/4} \subseteq [\![S_a, S_b]\!]\, S_{O(ab(a+b))}. \tag{8.18} \]

(Recall that $S_r$ denotes the $r$-neighborhood of the identity. The right-hand side of the last inclusion represents the $O(ab(a+b))$-neighborhood of $[\![S_a, S_b]\!]$. Indeed, the $r$-neighborhood of any set $T$ can be expressed as $TS_r$.)

The inclusion (8.17) is easy to prove:

\[ \bigl\| [\![U,V]\!] - I \bigr\| = \| UV - VU \| = \bigl\| (U - I)(V - I) - (V - I)(U - I) \bigr\| \le 2\, \|U - I\|\, \|V - I\|. \]
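The bound behind (8.17) is easy to test numerically for $M = 2$. The following is a minimal sketch, not part of the original argument: it samples random elements of $SU(2)$ in the closed form $\exp(i(x\sigma^x + y\sigma^y + z\sigma^z))$ and computes the operator norm of a $2 \times 2$ matrix from the eigenvalues of $A^\dagger A$. All function names are hypothetical.

```python
import math
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def minus_identity(A):
    return [[A[i][j] - (1 if i == j else 0) for j in range(2)] for i in range(2)]

def op_norm(A):
    """Operator norm of a 2x2 matrix: square root of the largest eigenvalue of A^dagger A."""
    G = matmul(dagger(A), A)
    tr = (G[0][0] + G[1][1]).real
    det = (G[0][0] * G[1][1] - G[0][1] * G[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)
    return math.sqrt(max((tr + math.sqrt(disc)) / 2, 0.0))

def random_su2(scale, rng):
    """exp(i(x sx + y sy + z sz)) with x, y, z uniform in [-scale, scale]."""
    x, y, z = (rng.uniform(-scale, scale) for _ in range(3))
    t = math.sqrt(x * x + y * y + z * z)
    s = math.sin(t) / t if t > 0 else 1.0
    c = math.cos(t)
    return [[complex(c, z * s), complex(y * s, x * s)],
            [complex(-y * s, x * s), complex(c, -z * s)]]

def commutator(U, V):
    """Group commutator U V U^{-1} V^{-1}; for unitary matrices the inverse is the adjoint."""
    return matmul(matmul(U, V), matmul(dagger(U), dagger(V)))

def worst_violation(trials=1000, seed=1):
    """Largest value of lhs - rhs over random pairs (0.0 if the bound always holds)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        U, V = random_su2(1.0, rng), random_su2(1.0, rng)
        lhs = op_norm(minus_identity(commutator(U, V)))
        rhs = 2 * op_norm(minus_identity(U)) * op_norm(minus_identity(V))
        worst = max(worst, lhs - rhs)
    return worst

print("worst violation:", worst_violation())
```

A check of this kind only samples the inequality; the proof is the algebraic identity displayed above.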

Formula (8.18) can be proved by approximating the group commutator by the corresponding Lie algebra bracket.$^7$ In our case, the Lie algebra is formed by skew-Hermitian $M \times M$ matrices with zero trace,

\[ su(M) = \{ X : X^\dagger = -X,\ \operatorname{Tr} X = 0 \}, \qquad [X,Y] = XY - YX. \]

The exponential map $\exp : su(M) \to SU(M)$ is simply the matrix exponentiation.

The eigenvalues of $X \in su(M)$ have the form

\[ \mathrm{eigenvalues}(X) = \{ i x_1, \dots, i x_M \}, \qquad \sum_{k=1}^{M} x_k = 0, \quad x_k \in \mathbb{R}. \]

Using a basis in which $X$ is diagonal, one can see that if $\|X\| = t \le \pi$ then $\|\exp(X) - I\| = 2\sin(t/2)$. Therefore $\|\exp(X) - I\| \approx \|X\|$ if either of these numbers is small. Let $R_t$ denote the $t$-neighborhood of 0 in $su(M)$ (with respect to the operator norm). The map $\exp : R_t \to S_{2\sin(t/2)}$ is bijective for $t < \pi$. So we may represent group elements near the identity as $\exp(X)$ and try to replace $[\![\exp(X), \exp(Y)]\!]$ by $\exp([X,Y])$; see inequality (8.21) below. Thus the inclusion (8.18) can be obtained from the result of the following problem.

[3!] Problem 8.15. Prove that $R_{ab/4} \subseteq [R_a, R_b] \subseteq R_{2ab}$.

$^7$ We assume that the reader knows some basic facts about Lie groups and Lie algebras, which can be found in the first chapter of any textbook (see e.g. [1, 15, 34, 56]).


(Note that for $su(2)$, $[R_a, R_b] = R_{2ab}$; this follows from the standard representation of the bracket in $su(2) \cong so(3)$ as the vector product in $\mathbb{R}^3$.)

[2!] Problem 8.16. Prove that for any $X, Y \in su(M)$

\[ \| \exp(X) - I - X \| \le O\bigl(\|X\|^2\bigr), \tag{8.19} \]

\[ \| \exp(X)\exp(Y) - \exp(X+Y) \| \le O\bigl(\|X\|\,\|Y\|\bigr), \tag{8.20} \]

\[ \bigl\| [\![\exp(X), \exp(Y)]\!] - \exp([X,Y]) \bigr\| \le O\bigl(\|X\|\,\|Y\|\,(\|X\| + \|Y\|)\bigr). \tag{8.21} \]

(The implicit constants in $O(\dots)$ should not depend on $M$.)

How does one use the above four properties in an effective procedure? We are going to define three important operations with nets in $SU(M)$ (in addition to removing redundant points; see Problem 8.14a). Two of them, called "shrinking" and "telescoping", are used to build increasingly tight nets $\Gamma_0, \Gamma_1, \dots, \Gamma_n$, $n = O(\log(1/\delta))$, in increasingly small neighborhoods of the identity. Specifically, each $\Gamma_j$ is an $(r_j, r_j/q)$-net, where $r_j = r_0 \lambda^{-j}$, $q > \lambda > 1$ ($q$ and $\lambda$ are some constants). Elements of $\Gamma_j$ are products of $L_j = O(j^{2+\nu})$ generators. Then we use the constructed nets to approximate an arbitrary element $V \in SU(M)$ in a procedure called "zooming in" (think of the nets as magnifying glasses of different strength).

"Shrinking" is the operation that employs the group commutator. It does what its name suggests, namely makes smaller nets from bigger ones. An $(r, r/q)$-net shrinks to an $(r^2/4, 5r^2/q)$-net. Suppose that elements of the original net were products of $l$ generators. Taking the commutator multiplies $l$ by 4, whereas the radius $r$ gets approximately squared, and so does the precision $\delta = r/q$ (we assume that $q$ is bounded by a constant). Repetition of this procedure could yield the desired rate of compression, and even better, $\delta \sim r \sim \exp(-l^{1/2})$. However, the quality of the net degrades at each step, $q \mapsto q/20$. The "telescoping" comes to the rescue, but at some cost. Also, we need to select a sparse subnet after each shrinking to keep the number of points in each net bounded by $\exp(O(M^2))$. The resulting rate of compression is slightly lower, $\delta \sim r \sim \exp(-l^{1/(2+\nu)})$, where $\nu$ can be arbitrarily small. Therefore $l = O\bigl((\log(1/\delta))^{2+\nu}\bigr)$.
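The rate $\exp(-l^{1/2})$ for pure shrinking can be read off from the recursion. The following sketch ignores the loss of quality and the sparsification step; $l_m$ and $r_m$ (our notation) denote the word length and the radius after $m$ shrinking steps, starting from $l_0$ and $r_0 < 4$.

```latex
\[
  l_{m+1} = 4\,l_m \;\Rightarrow\; l_m = 4^m l_0, \qquad
  r_{m+1} = \frac{r_m^2}{4} \;\Rightarrow\; \frac{r_m}{4} = \Bigl(\frac{r_0}{4}\Bigr)^{2^m},
\]
\[
  \log\frac{4}{r_m} \;=\; 2^m \log\frac{4}{r_0}
  \;=\; \Bigl(\frac{l_m}{l_0}\Bigr)^{1/2} \log\frac{4}{r_0},
  \qquad\text{i.e.}\qquad r_m = \exp\bigl(-\Theta(l_m^{1/2})\bigr).
\]
```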

[2!] Problem 8.17. In items (b) and (c) below, assume that $G$ is an arbitrary group with a biinvariant distance function $d$. The result of (a) is specific to $SU(M)$.

(a) ("shrinking"). Let $\Gamma_1 \subseteq SU(M)$ be an $(r_1, r_1/q)$-net, and $\Gamma_2 \subseteq SU(M)$ an $(r_2, r_2/q)$-net. Denote by $[\![\Gamma_1, \Gamma_2]\!]_\alpha$ an $\alpha$-sparse subnet selected from $[\![\Gamma_1, \Gamma_2]\!] = \{ [\![U_1, U_2]\!] : U_1 \in \Gamma_1,\ U_2 \in \Gamma_2 \}$ (see Problem 8.14a). Prove that $[\![\Gamma_1, \Gamma_2]\!]_{1/6}$ is an $(r_1 r_2/4,\ 5 r_1 r_2/q)$-net provided $q > 20$ and $r_1, r_2 \le O(q^{-1})$.


(b) ("telescoping": combining two nets into one of higher quality). Let $\Gamma_1 \subseteq G$ be an $(r_1, \delta_1)$-net, and $\Gamma_2 \subseteq G$ an $(r_2, \delta_2)$-net, where $\delta_1 \le r_2$. Prove that the set $\Gamma_1 \Gamma_2 = \{ U_1 U_2 : U_1 \in \Gamma_1,\ U_2 \in \Gamma_2 \}$ is an $(r_1, \delta_2)$-net.

(c) ("zooming in": iterative approximation). Let $\Gamma_0, \Gamma_1, \dots, \Gamma_n \subseteq G$ be a sequence of nets: $\Gamma_0$ is a $\delta_0$-net for the entire $G$, whereas for $j \ge 1$ each $\Gamma_j$ is an $(r_j, \delta_j)$-net. Suppose that $r_j = r_0 \lambda^{-j}$, $\delta_j = \delta_0 \lambda^{-j}$, where $r_0/\delta_0 = q > \lambda > 1$. Prove that any element $V \in G$ can be approximated by $Z = Z_0 Z_1 \cdots Z_n$ ($Z_j \in \Gamma_j$) so that $d(V, Z) \le \delta_n$.

8.3.4. The algorithm. Without loss of generality, we may assume that $\nu = 1/p$, where $p$ is an integer. The algorithm consists of three stages.

Preprocessing. Computation at this stage does not depend on $V$. We build increasingly tight nets in increasingly small neighborhoods of the identity. This is done by shrinking an initial net $p$ times and "telescoping" it with one of the previous nets to regain the original quality; then the cycle repeats. More precisely, we construct a set of nets $\Gamma_{j,k}$, $j = 0, \dots, n = O(\log(1/\delta))$, $k = 0, \dots, p$, according to the recursion rules

\[ \Gamma_{j,k} = \bigl[\!\bigl[ \Gamma_{\lceil j/2 \rceil,\,k-1},\ \Gamma_{\lfloor j/2 \rfloor,\,k-1} \bigr]\!\bigr]_{1/6} \quad \text{for } k = 1, \dots, p; \qquad \Gamma_{j,0} = \Gamma_{j-1,p}\, \Gamma_{j,p}. \]

Each $\Gamma_{j,k}$ is an $(r_{j,k},\ r_{j,k}/q_k)$-net, where

\[ r_{j,k} = r_{0,k}\,\lambda^{-j}, \qquad r_{0,k} = 4\,C^{-p\,2^k/(2^p - 1)}, \qquad q_k = C^{2p-k}, \qquad \lambda = C^p, \qquad C = 20. \]

The recursion relations work only for sufficiently large $j$ (namely, $r_{j,k}$ should be small enough to satisfy the condition of Problem 8.17a). The first few nets have to be obtained by picking points from the initial net $\mathcal{A}$; hence we need to set the constant $\varepsilon$ small enough. (According to this rule, the constant $\varepsilon$ depends on $p$. To avoid such dependence, we need to run the first few steps using $p = 1$, and then switch to the desired $p$.)

Each element of $\Gamma_{j,k}$ is a product of $L_{j,k}$ generators. The numbers $L_{j,k}$ satisfy the relations

\[ L_{j,k} = 2 L_{\lceil j/2 \rceil,\,k-1} + 2 L_{\lfloor j/2 \rfloor,\,k-1} \quad (k = 1, \dots, p); \qquad L_{j,0} = L_{j-1,p} + L_{j,p}. \]

An upper bound $L_{j,k} \le j^{2+1/p}\, 2^{-k/p}\, (u_0 - u_1/j)$ (with constant $u_0$ and $u_1$) can be obtained by induction; hence $L_{j,k} = O(j^{2+\nu})$.
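The growth of the word lengths can be explored by iterating the recursion directly. The sketch below makes an assumption that is not in the text: the nets with $j = 0, 1$ (which the text says must be picked from the initial net) are assigned length `base = 1` for every $k$. The function name `word_lengths` is hypothetical.

```python
def word_lengths(p, j_max, base=1):
    """Table L[j][k] of word lengths for 0 <= j <= j_max, 0 <= k <= p.

    Rows j = 0 and j = 1 are base cases (an assumption of this sketch);
    for j >= 2 the recursion from the text is applied.
    """
    L = [[base] * (p + 1) for _ in range(j_max + 1)]
    for j in range(2, j_max + 1):
        hi, lo = (j + 1) // 2, j // 2          # ceil(j/2) and floor(j/2)
        for k in range(1, p + 1):              # shrinking steps
            L[j][k] = 2 * L[hi][k - 1] + 2 * L[lo][k - 1]
        L[j][0] = L[j - 1][p] + L[j][p]        # telescoping step
    return L

for p in (1, 2, 4):
    L = word_lengths(p, 256)
    exponent = 2 + 1 / p
    ratios = [L[j][0] / j ** exponent for j in (16, 64, 256)]
    print("p =", p, "L[j][0] / j^(2+1/p) at j = 16, 64, 256:", ratios)
```

With these base cases and $p = 1$, one can check by induction that $L_{j,0} \le j^3$ for all $j \ge 1$, in line with the bound $O(j^{2+1/p})$; the printed ratios let one inspect other values of $p$.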

When constructing elements of the nets $\Gamma_{j,k}$, we do not actually write them as sequences of generators. Instead, we represent them as $M \times M$ matrices and keep record of the way each element was obtained. This stage involves $\exp(O(M^2)) \log(1/\delta)$ matrix multiplications, which amounts to $\exp(O(M^2)) (\log(1/\delta))^3$ bit operations.


Iterative approximation. We use the nets $\Gamma_{j,0}$ as indicated in Problem 8.17c. This yields an element $Z = Z_0 \cdots Z_n$ ($Z_j \in \Gamma_{j,0}$) such that $\|V - Z_0 \cdots Z_n\| \le \delta$. The complexity of this stage is also $\exp(O(M^2)) (\log(1/\delta))^3$.

Expansion. Now we need to represent each $Z_j$ as a word in the alphabet "$\mathcal{A}$". We have already computed the $Z_j$ as matrices, so we know the sequence of matrix multiplications and inversions that have led to $Z_j$. In other words, we have a classical circuit over the basis $\{$multiplication, inversion$\}$. This circuit will perform computation over the free group as well. Thus we plug symbols of the alphabet "$\mathcal{A}$" into the inputs of the circuit, and get some words $w_j$ as the output; then we concatenate them to obtain the word $w$ representing $Z$. When computing $w_j$, we operate with exponentially increasing words; therefore the complexity is dominated by the last step. So, the number of operations is $O(|w_j|) = O(L_{j,0}) = O(j^{2+\nu})$. Summing over $j$, we conclude that $w$ is computed in $O(L)$ steps, where $L = |w| = O(n^{3+\nu})$ (recall that $n = O(\log(1/\delta))$).

9. Definition of Quantum Computation. Examples

9.1. Computation by quantum circuits. Until now, we have been describing the work of a quantum computer. Now it is time to define when this work leads to the solution of problems that are interesting to us. The definition will resemble the definition of probabilistic computation.

Consider a function $F : \mathbb{B}^n \to \mathbb{B}^m$. We examine a quantum circuit operating with $N$ qubits, $U = U_L \cdots U_2 U_1 : \mathcal{B}^{\otimes N} \to \mathcal{B}^{\otimes N}$. Loosely speaking, this circuit computes $F$ if, after having applied $U$ to the initial state $|x, 0^{N-n}\rangle$ and "having looked" at the first $m$ bits, we "see" $F(x)$ with high probability. (The remaining qubits can contain arbitrary garbage.)

We only need to discuss the nature of that probability. The precise meaning of the words "having looked" and "see" is that a measurement of the values of the corresponding qubits is performed. Several different answers can be obtained as the result of this measurement, each with its own probability. Later (in Section 10) this question will be considered in more detail. To give a definition of quantum computation of a function $F$, it suffices (without injecting physical explanations of this fact) to accept the following: the probability of getting a basis state $x$ in the measurement of the state $|\psi\rangle = \sum_x c_x |x\rangle$ equals

\[ \mathbf{P}(|\psi\rangle, x) = |c_x|^2. \tag{9.1} \]

We are interested in the probability that the computer will finish its work in a state of the form $(F(x), z)$, where $z$ is arbitrary.


Definition 9.1. The circuit $U = U_L \cdots U_2 U_1$ computes $F$ if for any $x$ we have

\[ \sum_z \bigl| \langle F(x), z |\, U\, | x, 0^{N-n} \rangle \bigr|^2 \ge 1 - \varepsilon, \]

where $\varepsilon$ is some fixed number smaller than $1/2$. (Note that $F(x)$ and $x$ consist of different numbers of bits, although the total lengths of $(F(x), z)$ and $(x, 0^{N-n})$ must be equal to $N$.)

Just as for probabilistic computation, the choice of $\varepsilon$ is unimportant, inasmuch as it is possible to run several copies of the circuit independently and to choose the result that is most frequently obtained. From the estimate (4.1) on p. 37 it follows that in order to decrease the probability of failure by a factor of $a$, we need to take $k = \Theta(\log a)$ copies of the circuit $U$. The choice of the most frequent result is realized by a classical circuit, using the majority function $\mathrm{MAJ}(x_1, \dots, x_k)$ (which takes value 1 when more than half of its arguments equal 1 and value 0 otherwise). The function $\mathrm{MAJ}(x_1, \dots, x_k)$ can be realized over a complete basis by a circuit of size $O(k)$ (see Problem 2.15). Therefore the $a$-fold reduction of the error probability is achieved at the cost of increasing the circuit size by the factor $O(\log a)$.
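The effect of the majority vote can be computed exactly for small $k$. The sketch below is an illustration under a simplifying assumption of ours: each run is treated as simply correct (with probability $1 - \varepsilon$) or incorrect, independently of the others, and $k$ is odd. The function name `majority_error` is hypothetical.

```python
from math import comb

def majority_error(k, eps):
    """Probability that at most half of k independent runs are correct (k odd).

    Each run is correct with probability 1 - eps, so the number of correct
    runs is binomially distributed.
    """
    return sum(comb(k, i) * (1 - eps) ** i * eps ** (k - i)
               for i in range(k // 2 + 1))

for k in (1, 3, 11, 41, 101):
    print(k, majority_error(k, 0.3))
```

The error decays exponentially in $k$, which is why $k = \Theta(\log a)$ copies suffice for an $a$-fold reduction.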

[1] Problem 9.1. Prove that the above argument is actually correct if we incorporate the function MAJ into the quantum circuit. Specifically, show that we may use the function $\mathrm{MAJ}_\oplus$ realized by a reversible circuit, so that its input bits are the output qubits of $k$ copies of the circuit $U$.

[2!] Problem 9.2. Suppose that each gate $U_k$ of the circuit is approximated by $\tilde{U}_k$ with precision $\delta$. Prove that the resulting circuit $\tilde{U} = \tilde{U}_L \cdots \tilde{U}_2 \tilde{U}_1$ satisfies the inequality from Definition 9.1, with $\varepsilon$ replaced by $\tilde{\varepsilon} = \varepsilon + 2L\delta$.

(The suggested solution is based on the general notion of quantum probability; see Section 10, especially Remark 10.1.)

Now that we have the definition of quantum computation, we can make a comparison of the effectiveness of classical and quantum computing. In the Introduction we mentioned three fundamental examples where quantum computation appears to be more effective than classical. We begin with the example where the greater effectiveness of quantum computation has been proved (although the increase in speed is only polynomial).

9.2. Quantum search: Grover's algorithm. We will give a definition of a general search problem in classical and quantum formulations. It belongs to a class of computational problems with oracle, in which the input is given as a function (called a "black box", or an oracle) rather than a binary word.

[Diagram: a device with inputs $x$ and $y$ and output $A(x,y)$.]

Suppose we have a device (see the diagram) that receives inputs $x$ and $y$ and determines the value of some predicate $A(x,y)$. We are interested in the predicate $F(x) = \exists y\, A(x,y)$. This resembles the definition of the class NP, but now the internal structure of the device calculating the predicate $A$ is inaccessible to us. Under such conditions, it is not possible to complete the calculation faster than in $N = 2^n$ steps on a classical computer, where $n$ is the number of bits in the binary word $y$.

The problem can be formulated even without $x$: we need to compute the value of the "functional" $\mathcal{F}(A) = \exists y\, A(y)$. If $x$ is present, we can regard it as a part of the oracle, i.e., replace $A$ with the predicate $A_x$ such that $A_x(y) = A(x,y)$. Then $\mathcal{F}(A_x) = F(x) = \exists y\, A(x,y)$.

Remark 9.1. The version of the problem without $x$ has another interpretation, which is quite rigorous (unlike the analogy with NP). Let us think of $A$ as a bit string: $y$ is the index of a bit, whereas $A(y)$ is the value of that bit (cf. the paragraph preceding Definition 2.2 on page 26). Then $\mathcal{F}(A) = \bigvee_{y \in \mathbb{B}^n} A(y)$, the OR function with $N = 2^n$ inputs.

It turns out that a quantum computer can determine the value of $\mathcal{F}(A)$ and even find a $y$ for which $A(y)$ is satisfied, in time $O(\sqrt{N})$. The lower bound $\Omega(\sqrt{N})$ has also been obtained, showing that in this situation quantum computers give only a quadratic speed-up in comparison with classical ones.

[Diagram: the oracle as a gate $U$ with input $|y\rangle$ and output $U|y\rangle$.]

In the quantum formulation, the problem looks as follows. A quantum computer can query the oracle by sending $y$ so that different values of $y$ may form superpositions, and the oracle will return superpositions accordingly. Interestingly enough, the oracle can encode the answer into a phase factor. Specifically, our oracle (or "black box") is defined as an operator $U$ acting by the rule

\[ U|y\rangle = \begin{cases} |y\rangle & \text{if } A(y) = 0, \\ -|y\rangle & \text{if } A(y) = 1. \end{cases} \]

We assume that the computer can choose whether to query the oracle or not, which corresponds to applying the operator $\Lambda(U)$.

The goal is to compute the value $\mathcal{F}(A)$ and find an "answer" $y$ for which $A(y)$ is satisfied. This should be done by a quantum circuit, using $\Lambda(U)$ as a gate (in addition to the standard basis).

The results that we have already mentioned are formulated as follows (cf. [32, 75]): there exist two constants $C_1, C_2$ such that there is a circuit of size $\le C_1 \sqrt{N}$, deciding the problem for an arbitrary predicate $A$; and, for an arbitrary circuit of size $\le C_2 \sqrt{N}$, there exists a predicate $A$ for which the problem is not decided by this circuit (i.e., the circuit gives an incorrect answer with probability $> 1/3$).

We will construct a quantum circuit for a simplified version of the problem: we assume that the "answer" exists and is unique, and we denote it by $y_0$; we need to find $y_0$. The circuit will be described in terms of operator action on the basis vectors.

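Consider two operators

\[ U = I - 2|y_0\rangle\langle y_0|, \qquad V = I - 2|\xi\rangle\langle\xi|, \quad \text{where } |\xi\rangle = \frac{1}{\sqrt{N}} \sum_y |y\rangle. \]

The operator $U$ is given to us (it is the oracle). The operator $V$ is represented by the matrix

\[ V = \begin{pmatrix} 1 - \frac{2}{N} & \cdots & -\frac{2}{N} \\ \vdots & \ddots & \vdots \\ -\frac{2}{N} & \cdots & 1 - \frac{2}{N} \end{pmatrix} \]

(recall that $N = 2^n$).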

Let us realize $V$ by a quantum circuit. We will proceed as follows: we transform $|\xi\rangle$ to $|0^n\rangle$ by some operator $W$, then apply the operator $I - 2|0^n\rangle\langle 0^n|$, and finally apply $W^{-1}$.

It is easy to construct an operator $W$ that takes $|\xi\rangle$ to $|0^n\rangle$. The following will do: $W = H^{\otimes n}$, where $H$ is the Hadamard gate from the standard basis (see Definition 8.5). In fact, $|\xi\rangle = \frac{1}{\sqrt{2^n}} (|0\rangle + |1\rangle)^{\otimes n}$, and $H : \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \mapsto |0\rangle$.
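The identity $W^{-1}(I - 2|0^n\rangle\langle 0^n|)W = I - 2|\xi\rangle\langle\xi|$ can be verified by multiplying small matrices. This is a plain-Python sketch with hypothetical function names; it uses the fact that $H^{\otimes n}$ is real, symmetric, and equal to its own inverse.

```python
def hadamard_power(n):
    """Matrix of H tensored n times: entry (x, y) is (-1)^(x.y) / sqrt(2^n)."""
    N = 1 << n
    s = N ** -0.5
    return [[s * (-1) ** bin(x & y).count("1") for y in range(N)] for x in range(N)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def reflection_via_hadamard(n):
    """W (I - 2|0^n><0^n|) W with W = H^{(x)n}, which equals its own inverse."""
    N = 1 << n
    W = hadamard_power(n)
    D = [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(N)] for i in range(N)]
    return matmul(matmul(W, D), W)

def reflection_direct(n):
    """I - 2|xi><xi|: diagonal entries 1 - 2/N, off-diagonal entries -2/N."""
    N = 1 << n
    return [[(1.0 if i == j else 0.0) - 2.0 / N for j in range(N)] for i in range(N)]

V = reflection_via_hadamard(3)
print(V[0][0], V[0][1])
```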

Now we have to implement the operator $I - 2|0^n\rangle\langle 0^n|$. We will use a reversible classical circuit that realizes the operator $Z : \mathcal{B}^{\otimes(n+1)} \to \mathcal{B}^{\otimes(n+1)}$,

\[ Z|a_0, \dots, a_n\rangle = |a_0 \oplus f(a_1, a_2, \dots, a_n),\ a_1, \dots, a_n\rangle, \qquad f(a_1, \dots, a_n) = \begin{cases} 1 & \text{if } a_1 = \dots = a_n = 0, \\ 0 & \text{if } \exists\, j : a_j \ne 0. \end{cases} \]

(Up to a permutation of the arguments, $Z = \widehat{f_\oplus}$.) Since $f$ has Boolean circuit complexity $O(n)$, $Z$ can be realized by a reversible circuit of size $O(n)$ (see Lemma 7.2).

The circuit that realizes the operator $V$ is shown in Figure 9.1. The central portion, incorporating $Z$, $\sigma^z$ and $Z$, realizes the operator $I - 2|0^n\rangle\langle 0^n|$. In this circuit, the operator $\sigma^z = K^2$ ($K$ from the standard basis) is used.


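[Figure: the qubits $a_1, \dots, a_n$ and an auxiliary qubit $a_0 = 0$ pass through $W$, $Z$, $\sigma^z$ (on the auxiliary qubit), $Z$, $W$; the borrowed qubits start and end at 0.]

Fig. 9.1. Implementation of the operator $V$.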

We note that $W^2$ and $Z^2$ act trivially (as the identity operator) on vectors with zero-valued borrowed qubits. Therefore the decisive role is played by the operator $\sigma^z$ acting on an auxiliary qubit, which likewise returns to its initial value 0 in the end.

We must not be confused by the fact that although $\sigma^z$ acts only on an $f_\oplus$-controlled qubit, the whole vector changes as a result. In general, the distinction between "reading" and "writing" in the quantum case is not absolute and depends on the choice of basis. Let us give a relevant example.

[Figure: the gate $\Lambda(\sigma^x)$ with a Hadamard gate $H$ before and after it on each of the two qubits.]

Fig. 9.2. $\Lambda(\sigma^x)$ in a different basis.

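Let us find the matrix of $\Lambda(\sigma^x) : |a,b\rangle \mapsto |a, a \oplus b\rangle$ relative to the basis $\frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$ for each of the qubits. In other words, we need to write the matrix of the operator $X = (H \otimes H)\, \Lambda(\sigma^x)\, (H \otimes H)$ relative to the classical basis. The circuit for this operator is shown in Figure 9.2. Using the equality $\Lambda(\sigma^x)|c,d\rangle = |c, c \oplus d\rangle$, we find the action of $X$ on any basis vector:

\begin{align*}
X|a,b\rangle &= \frac{1}{2} (H \otimes H)\, \Lambda(\sigma^x) \sum_{c,d} (-1)^{ac+bd} |c,d\rangle \\
&= \frac{1}{2} (H \otimes H) \sum_{c,d} (-1)^{ac+bd} |c, c \oplus d\rangle \\
&= \frac{1}{4} \sum_{a',b',c,d} (-1)^{a'c + b'(c+d)} (-1)^{ac+bd} |a',b'\rangle \\
&= \frac{1}{4} \sum_{a',b'} 2\delta_{b,b'} \cdot 2\delta_{a,\, a' \oplus b'}\, |a',b'\rangle = |a \oplus b,\ b\rangle.
\end{align*}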
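The result of this calculation can be confirmed by multiplying $4 \times 4$ matrices. The sketch below is a plain-Python check with hypothetical function names; basis states $|a,b\rangle$ are indexed by $2a + b$, and matrices are stored as `M[output][input]`.

```python
def hadamard_pair():
    """Matrix of H (x) H: entry (x, y) is (-1)^(x.y) / 2 for two-bit indices x, y."""
    return [[0.5 * (-1) ** bin(x & y).count("1") for y in range(4)] for x in range(4)]

def controlled_not(control_first=True):
    """|a,b> -> |a, a xor b> if control_first, else |a,b> -> |a xor b, b>."""
    M = [[0] * 4 for _ in range(4)]
    for a in (0, 1):
        for b in (0, 1):
            src = 2 * a + b
            dst = 2 * a + (a ^ b) if control_first else 2 * (a ^ b) + b
            M[dst][src] = 1
    return M

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

HH = hadamard_pair()
X = matmul(matmul(HH, controlled_not(True)), HH)
for row in X:
    print([round(v, 12) for v in row])
```

Comparing `X` with `controlled_not(False)` shows that the roles of the two qubits are exchanged.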

Thus, in the basis $\frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$, the controlling and the controlled qubits have changed places. Which bit is "controlling" (is "read") and which is "controlled" (is "modified") depends on the choice of basis. Of course, such a situation goes against our classical intuition. It is hard to imagine that by passing to a different basis, a quantum printer suddenly becomes a quantum scanner.

[1] Problem 9.3. What will happen if we change the basis only in one of the qubits? For example, what will the matrix of the operator with the circuit shown in the diagram look like? Also try to change the basis in the other qubit.

[Diagram: the gate $\Lambda(\sigma^x)$ with a Hadamard gate $H$ before and after it on one qubit only.]

Let us return to the construction of a circuit for the general search problem. What follows is the main part of Grover's algorithm. The oracle $U = I - 2|y_0\rangle\langle y_0|$ was given to us, and we have realized the operator $V = I - 2|\xi\rangle\langle\xi|$. We start computation with the vector $|\xi\rangle$, which can be obtained from $|0^n\rangle$ by applying the operator $W$. Now, with the aid of the operators $U$ and $V$, we are going to transform $|\xi\rangle$ to the solution vector $|y_0\rangle$. For this, we will apply alternately the operators $U$ and $V$:

\[ \cdots VUVU|\xi\rangle = (VU)^s |\xi\rangle. \]

What do we get from this? Geometrically, both operators are reflections through hyperplanes. The subspace $\mathcal{L} = \mathbb{C}(|\xi\rangle, |y_0\rangle)$ is invariant under both operators, and thus, under $VU$. Since the initial vector $|\xi\rangle$ belongs to $\mathcal{L}$, it suffices to consider the action of $VU$ on this subspace.

[Diagram: the vectors $|\xi\rangle$ and $|y_0\rangle$ in the plane $\mathcal{L}$, with the angle $\varphi/2$ marked.]

The composition of two reflections with respect to two lines is a rotation by twice the angle between those lines. The angle is easy to calculate: $\langle\xi|y_0\rangle = \frac{1}{\sqrt{N}} = \sin\frac{\varphi}{2}$, i.e., the lines are almost perpendicular. Therefore we may write $VU = -R$, where $R$ is the rotation by the small angle $\varphi$. But then $(VU)^s = (-1)^s R^s$, where $R^s$ is the rotation through the angle $s\varphi$. The sign does not interest us (phase factors do not affect probabilities). For large $N$, we have $\varphi \approx 2/\sqrt{N}$. Then, after $s \approx (\pi/4)\sqrt{N}$ iterations, the initial vector is turned by an angle $s\varphi \approx \pi/2$ and becomes close to the solution vector. This also indicates that the system ends up in the state $|y_0\rangle$ with probability close to one.
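The rotation picture can be checked with a direct state-vector simulation. The sketch below is a classical toy model rather than a quantum circuit: it stores all $N = 2^n$ real amplitudes, applies $U$ as a sign flip on $y_0$ and $V = I - 2|\xi\rangle\langle\xi|$ as "subtract twice the mean", and compares the outcome with $\sin^2((2s+1)\varphi/2)$, the value the rotation argument predicts. The function names are hypothetical.

```python
import math

def grover_success_probability(n, y0, s):
    """Probability of measuring y0 after applying (VU)^s to the uniform state."""
    N = 1 << n
    amp = [1 / math.sqrt(N)] * N
    for _ in range(s):
        amp[y0] = -amp[y0]                   # oracle U = I - 2|y0><y0|
        mean = sum(amp) / N
        amp = [a - 2 * mean for a in amp]    # V = I - 2|xi><xi|
    return amp[y0] ** 2

def predicted_probability(n, s):
    """sin^2((2s+1) * phi/2), where sin(phi/2) = 1/sqrt(N)."""
    half_phi = math.asin(1 / math.sqrt(1 << n))
    return math.sin((2 * s + 1) * half_phi) ** 2

n = 10
s = int(math.pi / 4 * math.sqrt(1 << n))
print(s, grover_success_probability(n, 123, s), predicted_probability(n, s))
```

Running more than about $(\pi/4)\sqrt{N}$ iterations rotates the vector past $|y_0\rangle$, so the success probability starts to decrease again.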

To solve the search problem in the most general setting (when there may be several answers, or there may be none), additional technical devices are needed. Note that the number of steps for the rotation from the initial vector to some vector of the subspace spanned by the solution vectors is inversely proportional to the square root of the number of solutions.

Problem 9.4. For a given $n$, construct $\mathrm{poly}(n)$-size quantum circuits (over the basis of all two-qubit gates) which perform the following tasks.

[2] a) For a given number $q$, $1 \le q \le 2^n$, transform the state $|0^n\rangle$ into the state $|\xi_{n,q}\rangle = \frac{1}{\sqrt{q}} \sum_{j=0}^{q-1} |j\rangle$.

[2] b) Transform $|q-1, 0^n\rangle$ into $|q-1\rangle \otimes |\xi_{n,q}\rangle$ for all $q$, assuming that $q$ is expressed in $n$ binary digits.

[3!] c) Realize the Fourier transform operator $F_q$ over the group $\mathbb{Z}_q$:

\[ F_q|x\rangle = \frac{1}{\sqrt{q}} \sum_{y=0}^{q-1} \exp\Bigl(2\pi i\, \frac{xy}{q}\Bigr) |y\rangle, \]

where $x$ and $y$ are expressed in $n$ binary digits. Consider the case $q = 2^n$. (The case of arbitrary $q$ requires some extra tools, so we will consider it later; see Problem 13.4.)

9.3. A universal quantum circuit. The second of the examples mentioned in the Introduction was simulation of a quantum mechanical system. This is a vaguely posed problem since the choice of particular systems and distinguishing "essential" degrees of freedom play an important role. The problem has actually been solved in several settings. With high confidence, we may claim that every physical quantum system can be efficiently simulated on a quantum computer, but we can never prove this statement. The situation resembles that of Turing's thesis (see Section 1.3). Recall that the validity of Turing's thesis is partially justified by the existence of the universal Turing machine. In this vein, we may examine universality of our quantum computation model by purely mathematical means. Let us try to simulate many circuits by one.

We will not limit the type of gates we use to any particular basis. General quantum circuits have a manageable description if the gates are specified as matrices with entries given by binary fractions to a certain precision $\delta_1$. Then the inaccuracy of an $r$-qubit gate (in the operator norm) does not exceed $\delta = M\delta_1$, where $M = 2^r$ is the size of the matrix. Suppose we have a description $Z$ of a quantum circuit of size $\le L$ and precision $\delta$. Each gate of the circuit acts on at most $r$ qubits, so that the total length of the description does not exceed $\mathrm{poly}(L\, 2^r \log(1/\delta))$. The operator realized by the circuit will be denoted by $\mathrm{Op}(Z)$. We will try to simulate all circuits with the given parameters $L, r, \delta$.

Using the algorithm from the proof of Theorem 8.1, we reduce the problem to the case $r = 2$. Then we apply Theorem 8.3. Thus we can realize each operator in $Z$ by a circuit of size $\mathrm{poly}(2^r \log(1/\delta))$ over the standard basis using $O(r)$ ancillas. This yields (a description of) a circuit $R(Z)$ over the standard basis, which has size $S = \mathrm{poly}(L\, 2^r \log(1/\delta))$, operates on $N = L + O(r)$ qubits, and approximates $\mathrm{Op}(Z)$ with precision $O(L\delta)$. The transformation $Z \mapsto R(Z)$ is performed by a Boolean circuit of size $\mathrm{poly}(L\, 2^r \log(1/\delta))$. Hence simulating a general circuit is not much harder than simulating circuits over the standard basis.

The result is as follows. There is a universal quantum circuit $U$ of size $\mathrm{poly}(L\, 2^r \log(1/\delta))$ that simulates the work of an arbitrary quantum circuit in the following way: for any circuit description $Z$ and input vector $|\xi\rangle$, $U$ satisfies the condition

\[ \Bigl\| U \bigl( |Z\rangle \otimes |\xi\rangle \otimes |0^k\rangle \bigr) - |Z\rangle \otimes \bigl( \mathrm{Op}(Z)|\xi\rangle \bigr) \otimes |0^k\rangle \Bigr\| = O(L\delta). \]

That is, $U$ works as a "programmable quantum computer", with $Z$ being the "program".

The qubits of the circuit $U$ include $N$ "controlled" qubits that correspond to the qubits of $R(Z)$. Another subset of qubits holds $|Z\rangle$. There is also a number of auxiliary qubits, some of which are called "controlling". The key component of the circuit $U$ is a circuit $V$, the product of the operators $V_j = \Lambda(X)[j, k_j]$ (or $V_j = \Lambda(X)[j, k_j, l_j]$, or $V_j = \Lambda(X)[j, k_j, l_j, m_j]$), with $X$ from the standard basis, applied to each one (or pair, or triple) of the controlled qubits in an arbitrary order. The controlling qubits $j$ are all different. If we set one controlling qubit to 1 and all the others to 0, then the circuit $V$ realizes an operator of the form $X[k]$ (or $X[k,l]$, or $X[k,l,m]$) on the controlled qubits. Hence the composition of $S$ copies of $V$ with different controlling qubits can simulate an arbitrary circuit of size $S$ over the standard basis, provided that the controlling qubits are set appropriately. To set the controlling qubits, we need to compute $R(Z)$ by a reversible circuit (with garbage) and arrange the output in a certain way. This computation should be reversed at the end.

9.4. Quantum algorithms and the class BQP. Up until now we have been studying nonuniform quantum computation (i.e., computation of Boolean functions). Algorithms compute functions on words of arbitrary length. A definition of a quantum algorithm can be given using the quantum circuits that have already been introduced. Roughly speaking, a classical Turing machine builds a quantum circuit that computes the value of the function on one or many inputs. Actually, there are several equivalent definitions, the following being the standard one. Let $F : \mathbb{B}^* \to \mathbb{B}^*$ be a function such that the length of the output is polynomial in the length of the input. It is composed of a sequence of Boolean functions $F_n : \mathbb{B}^n \to \mathbb{B}^{m(n)}$ (restrictions of $F$ to inputs of length $n = 0, 1, 2, \dots$). A quantum algorithm for the computation of $F$ is a uniform sequence of quantum circuits that compute $F_n$. Uniform means that the description $Z_n$ of the corresponding circuit is constructed by a classical Turing machine which takes $n$ as the input. We will say that the quantum algorithm computes $F$ in time $T(n)$ if building the circuit takes at most $T(n)$ steps. The size of the circuit $L$ is obviously not greater than $T(n)$.

A subtle point in this definition is what basis to use. It is safe to stick to the standard basis. Alternatively, the basis may consist of all unitary operators. In this case, each $r$-qubit gate should be specified as a list of all its matrix elements with precision $c\, 2^{-r} L^{-1}$, so that the precision of the matrix (in the operator norm) is $c L^{-1}$, where $c$ is a small constant. If $\varepsilon + 2c < 1/2$ (see Definition 9.1 and Problem 9.2) then the approximate circuit works fine. Using the algorithm of Theorems 8.1 and 8.3, this circuit can be transformed to an equivalent circuit of size $\mathrm{poly}(T(n))$ over the standard basis (note that $T(n)$ includes the factor $2^r$). The converse is obvious.

Remark 9.2. The use of an arbitrary complete basis could lead to "pathologies". For example, let the basis contain the gate

$$X = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$$

where $\theta$ is a noncomputable number, e.g., the $n$-th digit of $\theta$ says whether the universal Turing machine terminates at input $n$ (cf. Problem 1.3). Then $p = \sin^2\theta$ is also noncomputable. If we apply $X$ to the state $|0\rangle$ and measure the qubit, we will get 1 with probability $p$ and 0 with probability $1 - p$. Repeating this procedure $\exp(\Theta(n))$ times and counting the number of 0s and 1s, we can find the $n$-th digit of $p$ with very small error probability. Thus the gate $X$ enables us to solve the halting problem! (This argument has nothing to do with quantum mechanics. A classical probabilistic computer could also get extra power if random numbers with arbitrary $p$ were allowed.) Of course, we want to avoid such things in our theory, so we must be careful about the choice of basis. However, in the real world "superpowerful" gates might exist. Experimentalists measure dimensionless physical constants (such as the fine structure constant) with increasingly high precision, getting new digits of the number theoreticians cannot compute. If we ever learn where the fundamental physical constants come from, we will probably know whether they are computable, and if they are not, whether they carry some mathematically meaningful information, e.g., allow one to solve the halting problem.
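The sampling argument of Remark 9.2 can be illustrated by a small classical simulation. This is only a sketch under stated assumptions: the function names are ours, the value of $p$ is of course a computable stand-in, and the number of repetitions is taken from the Hoeffding inequality (one convenient choice, not necessarily the bound the remark has in mind). Knowing $p$ to within $2^{-(n+1)}$ determines its $n$-th binary digit except when $p$ lies very close to a rounding boundary.

```python
import math
import random


def estimate_p(p, trials, rng):
    """Estimate p by repeating a 0/1 measurement that yields 1 with probability p."""
    ones = sum(1 for _ in range(trials) if rng.random() < p)
    return ones / trials


def trials_for_digits(n, failure=1e-6):
    """Number of repetitions sufficient (by Hoeffding's inequality) for the
    estimate to be within 2**-(n+1) of p with probability >= 1 - failure."""
    eps = 2.0 ** -(n + 1)
    return math.ceil(math.log(2.0 / failure) / (2.0 * eps * eps))


# Hypothetical stand-in for the noncomputable probability p = sin^2(theta).
p_true = 0.3
rng = random.Random(0)
estimate = estimate_p(p_true, 200_000, rng)
```

The required number of repetitions grows like $4^n$, i.e., exponentially in the number of digits, which agrees with the $\exp(\Theta(n))$ estimate in the remark.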

Remark 9.3. It is possible to define a quantum Turing machine directly, through superpositions of various states of a classical Turing machine (the original definition of D. Deutsch [20] was just like this). The standard definition turns out to be equivalent but more convenient.

Using the universal quantum circuit, we can simplify the standard definition even further. It suffices to have a classical Turing machine that generates a description of a quantum circuit $Z(x)$ which is only good to compute $F(x)$ for a single value of $x$. In this case, $x$ is the input word for the TM whereas the circuit does not have input data at all (i.e., it operates on supplementary qubits initialized by the state $|0^N\rangle$). Indeed, if we have such a machine $M$, then we can build a machine $M'$ which receives $n$ and constructs a Boolean circuit which computes $Z(x)$ for all values of $x$, $|x| = n$ (see Theorem 2.3). By a certain polynomial algorithm, the Boolean circuit can be transformed into a reversible circuit (with garbage) and combined with the universal quantum circuit, so that the output of the former (i.e., $Z(x)$) becomes the "program" for the latter. This yields a quantum circuit that computes $F_n$.

Thus we will adopt the following definition.

Definition 9.2. A quantum algorithm for the computation of a function $F : \mathbb{B}^* \to \mathbb{B}^*$ is a classical algorithm (i.e., a Turing machine) that computes a function of the form $x \mapsto Z(x)$, where $Z(x)$ is a description of a quantum circuit which computes $F(x)$ on empty input. The function $F$ is said to belong to class BQP if there is a quantum algorithm that computes $F$ in time $\mathrm{poly}(n)$.

How does the class BQP relate to the complexity classes introduced earlier?

[3!] Problem 9.5. Prove that

$$\mathrm{BPP} \subseteq \mathrm{BQP} \subseteq \mathrm{PP} \subseteq \mathrm{PSPACE}.$$

The class PP consists of predicates of the form

$$Q(x) = \Bigl( \bigl|\{y : R_0(x, y)\}\bigr| < \bigl|\{y : R_1(x, y)\}\bigr| \Bigr),$$

where $R_0, R_1 \in \mathrm{P}$, and $y$ runs through all words of length bounded by some polynomial $q(|x|)$.
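To make the definition of PP concrete, here is a brute-force evaluation of such a predicate. It is only an illustration under our own assumptions: the predicates `below` and `above` and the bound `same_length` are hypothetical toy choices, and the enumeration takes exponential time, so it says nothing about how PP problems are solved in practice.

```python
from itertools import product


def words_up_to(max_len):
    """Yield all binary words of length 0, 1, ..., max_len."""
    for length in range(max_len + 1):
        for bits in product("01", repeat=length):
            yield "".join(bits)


def pp_predicate(x, r0, r1, q):
    """Q(x) = ( |{y : R0(x, y)}| < |{y : R1(x, y)}| ), with |y| <= q(|x|)."""
    count0 = count1 = 0
    for y in words_up_to(q(len(x))):
        count0 += bool(r0(x, y))
        count1 += bool(r1(x, y))
    return count0 < count1


# Hypothetical polynomial-time predicates: y has the same length as x and
# encodes a smaller (resp. larger) binary number.
def below(x, y):
    return len(y) == len(x) and y < x


def above(x, y):
    return len(y) == len(x) and y > x


def same_length(n):
    return n
```

With these toy predicates, `pp_predicate("011", below, above, same_length)` compares 3 words below `011` with 4 words above it, so the predicate holds; for `"100"` the counts are reversed.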

This is almost all that is known about the correspondence between BQP and the other complexity classes. Indirect evidence in favor of the strict inclusion $\mathrm{BPP} \subset \mathrm{BQP}$ is given by the existence of effective quantum algorithms for some number-theoretic problems traditionally regarded as difficult (see Section 13).

We also remark that there have recently appeared interesting results concerning quantum analogs of some stronger complexity classes (not described in Part 1).

What's next? (A note to the impatient reader). We have spent four sections defining what quantum computation is, but have given only a few nontrivial examples so far. The reader may want to see more examples right now. If so, you may skip some material and read Section 13.1 (Simon's algorithm). There will be some references to "mixed states", but all calculations can be done with state vectors as well. However, most other results are based (not so much formally as conceptually) upon the general notion of quantum probability and measurement. We will now proceed to these topics.

10. Quantum probability

10.1. Probability for state vectors. Let us discuss several "physical" aspects of quantum computation. Let a system of $n$ qubits be in the state $|\psi\rangle = \sum_x c_x |x\rangle$. The coefficients $c_x$ of the expansion relative to the classical basis are called amplitudes. The square of the modulus of the amplitude, $|c_x|^2$, equals the probability of finding the system in a given state $x$ (compare with (9.1)). In other words, under a measurement of the state of this quantum system, a classical state will be obtained, according to the probability distribution $|c_x|^2$.

The quantity determined by formula (9.1) possesses the basic properties of ordinary probability. The fact that the square of the modulus of the amplitude is the probability of observing the system in state $x$ agrees with the fact that the physical states of quantum mechanics correspond to vectors of unit length, and transformations of these states do not change the length, i.e., they are unitary. Indeed, $\langle\psi|\psi\rangle = \sum_x |c_x|^2 = 1$ (the sum of probabilities equals 1), and the application of physically realizable operators must preserve this relation, i.e., the operator must be unitary.
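The relation between amplitudes, probabilities and unitarity can be checked numerically on a single qubit. The sketch below uses names of our own choosing and the Hadamard matrix as an example of a unitary operator; it is a numerical illustration, not a proof.

```python
import math


def apply(u, psi):
    """Apply the matrix u to the state vector psi."""
    return [sum(u[r][c] * psi[c] for c in range(len(psi))) for r in range(len(u))]


def probabilities(psi):
    """Measurement probabilities |c_x|^2 in the classical basis."""
    return [abs(c) ** 2 for c in psi]


s = 1 / math.sqrt(2)
hadamard = [[s, s], [s, -s]]      # an example of a unitary operator
psi = [0.6, 0.8j]                 # amplitudes c_0, c_1 with |c_0|^2 + |c_1|^2 = 1

before = probabilities(psi)                    # approximately [0.36, 0.64]
after = probabilities(apply(hadamard, psi))    # approximately [0.5, 0.5]
```

The individual probabilities change under the unitary operator, but their sum remains equal to 1.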


Formula (9.1) is sufficient for the definition of quantum computation and the class BQP. There are, however, situations for which this definition turns out to be inconvenient or inapplicable. Two fundamental examples are measurement operators and algorithms that are based on them, and the problem of constructing reliable quantum circuits from unreliable gates (error correction).

We therefore give a definition of quantum probability which generalizes both what we observe (the state of the system) and the result of the observation. We will arrive at this general definition by analysing a series of examples. To begin with, we rewrite the expression for the probability already obtained in the form

$$|c_x|^2 = |\langle\psi|x\rangle|^2 = \langle\psi|\overbrace{|x\rangle\langle x|}^{\Pi_x}|\psi\rangle,$$

where $\Pi_x$ denotes the projection to the subspace spanned by $|x\rangle$.

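To make the next step toward the general definition of quantum probability, we compute the probability that the first $m$ bits have a given value $y = (y_1, \dots, y_m)$. Let us represent basis states in the form of two blocks: $x = (y, z)$, where $y$ consists of $m$ bits and $z$ of $n - m$ bits. We obtain

$$\mathbf{P}(|\psi\rangle, y) = \sum_z \mathbf{P}\bigl(|\psi\rangle, (y, z)\bigr) = \sum_z \langle\psi|y, z\rangle\langle y, z|\psi\rangle = \langle\psi|\bigl(|y\rangle\langle y| \otimes I\bigr)|\psi\rangle = \langle\psi|\Pi_{\mathcal{M}}|\psi\rangle. \tag{10.1}$$

Here $\Pi_{\mathcal{M}}$ denotes the operator of orthogonal projection onto the subspace $\mathcal{M} = |y\rangle \otimes \mathcal{B}^{\otimes(n-m)}$. Formula (10.1) gives the definition of quantum probability also in the case where $\mathcal{M}$ is an arbitrary subspace. In this case the projection onto the subspace $\mathcal{M} \subseteq \mathcal{N}$ is given by the formula $\Pi_{\mathcal{M}} = \sum_j |e_j\rangle\langle e_j|$, where $e_j$ runs over an arbitrary orthonormal basis for $\mathcal{M}$.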
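Formula (10.1) can be verified numerically by computing the probability in two ways: as a sum over $z$ and as the expectation of the projector $|y\rangle\langle y| \otimes I$. The following is a sketch with our own naming; basis states are indexed by the integer whose binary digits are $(y, z)$, with $y$ in the high-order positions (an indexing convention we assume here).

```python
def prob_first_bits(psi, n, y_bits):
    """Sum of |c_(y,z)|^2 over all z, for the given first m bits y."""
    m = len(y_bits)
    y = int(y_bits, 2)
    return sum(abs(psi[(y << (n - m)) | z]) ** 2 for z in range(2 ** (n - m)))


def projector_first_bits(n, y_bits):
    """Matrix of |y><y| (x) I, which is diagonal in the classical basis."""
    m = len(y_bits)
    y = int(y_bits, 2)
    dim = 2 ** n
    return [[1.0 if (r == c and (r >> (n - m)) == y) else 0.0 for c in range(dim)]
            for r in range(dim)]


def expectation(psi, op):
    """The real number <psi| op |psi> for a Hermitian operator op."""
    dim = len(psi)
    total = sum(complex(psi[r]).conjugate() * op[r][c] * psi[c]
                for r in range(dim) for c in range(dim))
    return total.real


psi = [0.5, 0.5j, -0.5, 0.5]            # a two-qubit state of unit length
direct = prob_first_bits(psi, 2, "1")   # sum over z
via_projector = expectation(psi, projector_first_bits(2, "1"))
```

Both quantities equal $|c_{10}|^2 + |c_{11}|^2 = 0.5$ for this state.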

Remark 10.1. The quantity $\sum_z \bigl|\langle F(x), z|U|x, 0^{N-n}\rangle\bigr|^2$, which appears in the definition of the evaluation of a function $F : \mathbb{B}^n \to \mathbb{B}^m$ by a quantum circuit (Definition 9.1), equals $\mathbf{P}\bigl(U|x, 0^{N-n}\rangle, \mathcal{M}\bigr)$, where $\mathcal{M} = |F(x)\rangle \otimes \mathcal{B}^{\otimes(N-m)}$. Recall once again the meaning of this definition: the circuit $U = U_L \cdots U_2 U_1$ computes $F$ if for each $x$ the probability to observe the correct result for $F(x)$ after application of the circuit to the initial state $|x, 0^{N-n}\rangle$ is at least $1 - \varepsilon$.

Projections do not represent physically realizable operators; more precisely, they do not describe the evolution of one state of a system to another over a fixed time period. Such evolution is described by unitary operators. Nonetheless, taking some liberty, it is possible to bestow physical meaning on projection. A projection selects a portion of system states from among all possible states. Imagine a filter, i.e., a physical device which passes systems in states belonging to $\mathcal{M}$ but destroys the system if its state is orthogonal to $\mathcal{M}$. (For example, a polarizer does this to photons.) If we submit a system in state $|\psi\rangle$ to the input of such a filter, then the system at the output will be in the state $|\xi\rangle = \Pi_{\mathcal{M}}|\psi\rangle$. The probability associated to this state is generally smaller than one; it is $p = \langle\xi|\xi\rangle = \langle\psi|\Pi_{\mathcal{M}}|\psi\rangle$. The number $1 - p$ determines the probability that the system will not pass through the filter.

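Let us compare classical and quantum probability.

Definition

Classical probability: An event is a subset $M$ of a fixed finite set $N$.
Quantum probability: An event is a subspace $\mathcal{M}$ of some finite-dimensional Hilbert space $\mathcal{N}$.

Classical probability: A probability distribution is given by a function $w : N \to \mathbb{R}$ with the properties a) $\sum_j w_j = 1$; b) $w_j \ge 0$.
Quantum probability: A probability distribution is given by a state vector $|\psi\rangle$, $\langle\psi|\psi\rangle = 1$.

Classical probability: $\Pr(w, M) = \sum_{j \in M} w_j$.
Quantum probability: $\mathbf{P}(|\psi\rangle, \mathcal{M}) = \langle\psi|\Pi_{\mathcal{M}}|\psi\rangle$.

Properties

1. (classical) If $M_1 \cap M_2 = \emptyset$, then $\Pr(w, M_1 \cup M_2) = \Pr(w, M_1) + \Pr(w, M_2)$.
$1^q$. (quantum) If $\mathcal{M}_1 \perp \mathcal{M}_2$, then $\mathbf{P}(|\psi\rangle, \mathcal{M}_1 \oplus \mathcal{M}_2) = \mathbf{P}(|\psi\rangle, \mathcal{M}_1) + \mathbf{P}(|\psi\rangle, \mathcal{M}_2)$.

2. (classical, in the general case) $\Pr(w, M_1 \cup M_2) = \Pr(w, M_1) + \Pr(w, M_2) - \Pr(w, M_1 \cap M_2)$.
$2^q$. (quantum) If $\Pi_{\mathcal{M}_1}\Pi_{\mathcal{M}_2} = \Pi_{\mathcal{M}_2}\Pi_{\mathcal{M}_1}$, then $\mathbf{P}(|\psi\rangle, \mathcal{M}_1 + \mathcal{M}_2) = \mathbf{P}(|\psi\rangle, \mathcal{M}_1) + \mathbf{P}(|\psi\rangle, \mathcal{M}_2) - \mathbf{P}(|\psi\rangle, \mathcal{M}_1 \cap \mathcal{M}_2)$.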

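Note that the condition $\mathcal{M}_1 \perp \mathcal{M}_2$ (mutually exclusive events) is equivalent to the condition $\Pi_{\mathcal{M}_1}\Pi_{\mathcal{M}_2} = \Pi_{\mathcal{M}_2}\Pi_{\mathcal{M}_1} = 0$.

If we have two nonorthogonal subspaces with zero intersection, the quantum probability is not necessarily additive. We give a simple example where $\mathbf{P}(|\psi\rangle, \mathcal{M}_1 + \mathcal{M}_2) \ne \mathbf{P}(|\psi\rangle, \mathcal{M}_1) + \mathbf{P}(|\psi\rangle, \mathcal{M}_2)$. Let $|\xi\rangle = |0\rangle$, $\mathcal{M}_1 = \mathbb{C}(|0\rangle)$ (the linear subspace generated by the vector $|0\rangle$), $\mathcal{M}_2 = \mathbb{C}(|\eta\rangle)$, where $\langle\xi|\eta\rangle$ is close to 1. Then

$$1 = \mathbf{P}(|\xi\rangle, \mathcal{M}_1 + \mathcal{M}_2) \ne \mathbf{P}(|\xi\rangle, \mathcal{M}_1) + \mathbf{P}(|\xi\rangle, \mathcal{M}_2) \approx 1 + 1.$$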
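The failure of additivity in this example is easy to reproduce numerically. The sketch below, with hypothetical names, computes the probability of a subspace given by (possibly nonorthogonal) spanning vectors: it first builds an orthonormal basis by the Gram-Schmidt procedure and then sums $|\langle e_j|\psi\rangle|^2$. The angle 0.1 is an arbitrary choice making the overlap of the two vectors close to 1.

```python
import math


def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt orthonormalization of the span of the given vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            coeff = inner(e, w)
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(inner(w, w).real)
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def prob(psi, spanning_vectors):
    """P(|psi>, M) for the subspace M spanned by the given vectors."""
    return sum(abs(inner(e, psi)) ** 2 for e in orthonormal_basis(spanning_vectors))


angle = 0.1
xi = [1.0, 0.0]                             # the state |0>
eta = [math.cos(angle), math.sin(angle)]    # overlap with |0> close to 1

p1 = prob(xi, [xi])          # P(xi, M1) = 1
p2 = prob(xi, [eta])         # P(xi, M2) = cos^2(0.1), about 0.99
p12 = prob(xi, [xi, eta])    # P(xi, M1 + M2) = 1
```

Here $\mathcal{M}_1 + \mathcal{M}_2$ is the whole two-dimensional space, so its probability is 1, whereas the two individual probabilities add up to almost 2.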

10.2. Mixed states (density matrices). Thus, we have defined, in the most general way, what quantity we measure. Now we need to generalize what object we perform the measurement on. Such objects will be something more general than state vectors or probability distributions. This will give us a definition of probability that generalizes both classical and quantum probability.

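Consider a probability distribution on a finite set of quantum states $\bigl\{|\xi_1\rangle, \dots, |\xi_s\rangle\bigr\}$. The probability of the state $|\xi_j\rangle$ will be denoted by $p_j$; clearly $\sum_j p_j = 1$. We will calculate the probability of observing a state in the subspace $\mathcal{M}$:

$$\sum_k p_k\, \mathbf{P}(|\xi_k\rangle, \mathcal{M}) = \sum_k p_k \langle\xi_k|\Pi_{\mathcal{M}}|\xi_k\rangle = \sum_k p_k\, \mathrm{Tr}\bigl(|\xi_k\rangle\langle\xi_k|\Pi_{\mathcal{M}}\bigr) = \mathrm{Tr}(\rho\,\Pi_{\mathcal{M}}), \tag{10.2}$$

where $\rho$ denotes the density matrix (see footnote 8) $\rho = \sum_k p_k |\xi_k\rangle\langle\xi_k|$. The final expression in (10.2) is what we take as the general definition of probability.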
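The chain of equalities (10.2) can be checked on a small ensemble. The following sketch uses our own helper names and an arbitrarily chosen ensemble of two nonorthogonal qubit states; it compares the averaged probability with $\mathrm{Tr}(\rho\,\Pi_{\mathcal{M}})$ for the projector onto $|0\rangle$.

```python
import math


def outer(u, v):
    """The operator |u><v| as a matrix."""
    return [[a * complex(b).conjugate() for b in v] for a in u]


def density_matrix(ensemble):
    """rho = sum_k p_k |xi_k><xi_k| for an ensemble [(p_k, xi_k), ...]."""
    dim = len(ensemble[0][1])
    rho = [[0j] * dim for _ in range(dim)]
    for p, xi in ensemble:
        proj = outer(xi, xi)
        for r in range(dim):
            for c in range(dim):
                rho[r][c] += p * proj[r][c]
    return rho


def trace_of_product(a, b):
    """Tr(ab) for square matrices of equal size."""
    n = len(a)
    return sum(a[r][k] * b[k][r] for r in range(n) for k in range(n))


def expectation(xi, op):
    """<xi| op |xi>."""
    n = len(xi)
    return sum(complex(xi[r]).conjugate() * op[r][c] * xi[c]
               for r in range(n) for c in range(n))


s = 1 / math.sqrt(2)
ensemble = [(0.5, [1.0, 0.0]), (0.5, [s, s])]   # |0> and (|0> + |1>)/sqrt(2)
projector = [[1.0, 0.0], [0.0, 0.0]]            # projection onto |0>

rho = density_matrix(ensemble)
averaged = sum(p * expectation(xi, projector).real for p, xi in ensemble)
via_trace = trace_of_product(rho, projector).real
```

Both numbers equal $0.5 \cdot 1 + 0.5 \cdot 0.5 = 0.75$, and the matrix `rho` has trace 1.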

[1!] Problem 10.1. Prove that the operators of the form $\rho = \sum_k p_k |\xi_k\rangle\langle\xi_k|$ are precisely the Hermitian nonnegative operators with trace 1, i.e., operators that satisfy the conditions

1) $\rho = \rho^\dagger$; 2) $\forall |\eta\rangle\ \ \langle\eta|\rho|\eta\rangle \ge 0$; 3) $\mathrm{Tr}\,\rho = 1$.

From now on, by a density matrix we will mean an arbitrary operator with these properties.

The arguments about the "probability distribution on quantum states" were of an ancillary nature. The problem is how to generalize the notion of a quantum state to include classical probability distributions. The result we have obtained (the last expression in (10.2)) depends only on the density matrix, so that we may postulate that generalized quantum states and density matrices be the same. If a state is given by a density matrix of rank 1 (i.e., $\rho = |\xi\rangle\langle\xi|$), then it is said to be pure; if it is given by a general density matrix, it is called mixed.

Definition 10.1. For a quantum state given by a density matrix $\rho$ and a subspace $\mathcal{M}$, the probability of the "event" $\mathcal{M}$ equals $\mathbf{P}(\rho, \mathcal{M}) = \mathrm{Tr}(\rho\,\Pi_{\mathcal{M}})$.

Diagonal matrices correspond to classical probability distributions on the set of basis vectors. Indeed, consider the quantum probability associated with the diagonal matrix $\rho = \sum_j w_j |j\rangle\langle j|$ and the subspace $\mathcal{M}$ spanned by a subset of basis vectors $M$. This probability can also be obtained by the classical formula: $\mathbf{P}(\rho, \mathcal{M}) = \Pr(w, M)$. From the physical point of view, a classical system is a quantum system that supports only diagonal density matrices (see discussion of decoherence in the next section). A state of such a system may be denoted as

$$\rho = \sum_j w_j \cdot (j). \tag{10.3}$$

Footnote 8. Actually, this is an operator rather than a matrix, although the term "density matrix" is traditional. In the sequel, we will often have in mind a matrix, i.e., an operator expressed in a particular basis.


Mathematically, this is just a different notation of the probability distribution $w$. It is convenient when we need to simultaneously deal with classical and quantum systems.

Now we continue the comparison of the properties of classical and quantum probability; for the latter we shall now understand the general definition in terms of a density matrix. (Properties $1^q$ and $2^q$ remain valid.)

Properties (classical probability versus quantum probability, continued)

3. (classical) Suppose a probability distribution of the form $w_{jk} = w^{(1)}_j w^{(2)}_k$ is specified on the set $N = N_1 \times N_2$. Consider two sets of outcomes, $M_1 \subseteq N_1$, $M_2 \subseteq N_2$. Then the probabilities multiply: $\Pr(w, M_1 \times M_2) = \Pr(w^{(1)}, M_1)\,\Pr(w^{(2)}, M_2)$.
$3^q$. (quantum) Suppose a density matrix of the form $\rho_1 \otimes \rho_2$ is defined on the space $\mathcal{N} = \mathcal{N}_1 \otimes \mathcal{N}_2$. Consider two subspaces, $\mathcal{M}_1 \subseteq \mathcal{N}_1$, $\mathcal{M}_2 \subseteq \mathcal{N}_2$. Then the probabilities likewise multiply: $\mathbf{P}(\rho_1 \otimes \rho_2, \mathcal{M}_1 \otimes \mathcal{M}_2) = \mathbf{P}(\rho_1, \mathcal{M}_1)\,\mathbf{P}(\rho_2, \mathcal{M}_2)$.

4. (classical) Consider a joint probability distribution on the set $N_1 \times N_2$. The event we are interested in does not depend on the outcome in the second set, i.e., $M = M_1 \times N_2$. The probability of such an event is expressed by a "projection" of the distribution onto the first set: $\Pr(w, M_1 \times N_2) = \Pr(w', M_1)$, where $w'_j = \sum_k w_{jk}$.
$4^q$. (quantum) In the quantum case, the restriction to one of the subsystems is described by taking a partial trace (see below). Thus, even if the initial state was pure, the resulting state of the subsystem may turn out to be mixed: $\mathbf{P}(\rho, \mathcal{M}_1 \otimes \mathcal{N}_2) = \mathbf{P}(\mathrm{Tr}_{\mathcal{N}_2}\rho, \mathcal{M}_1)$.

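Definition 10.2. Let $X \in \mathbf{L}(\mathcal{N}_1 \otimes \mathcal{N}_2) = \mathbf{L}(\mathcal{N}_1) \otimes \mathbf{L}(\mathcal{N}_2)$. The partial trace of the operator $X$ over the space $\mathcal{N}_2$ is defined as follows: if $X = \sum_m A_m \otimes B_m$, then $\mathrm{Tr}_{\mathcal{N}_2} X = \sum_m A_m\,(\mathrm{Tr}\,B_m)$.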

Due to the universality property of the tensor product (see p. 55), the partial trace does not depend on the choice of summands in the representation $X = \sum_m A_m \otimes B_m$. This may seem somewhat obscure, so we will give a direct proof. Let us choose orthonormal bases in the spaces $\mathcal{N}_1$, $\mathcal{N}_2$ and express the partial trace in terms of the matrix elements $X_{jj'kk'} = \langle j, j'|X|k, k'\rangle$. Let

$$A_m = \sum_{j,k} a^m_{jk}\,|j\rangle\langle k| \quad\text{and}\quad B_m = \sum_{j',k'} b^m_{j'k'}\,|j'\rangle\langle k'|.$$

Then

$$X = \sum_{j,j',k,k'} X_{jj'kk'}\,|j, j'\rangle\langle k, k'| = \sum_m A_m \otimes B_m = \sum_{j,j',k,k',m} a^m_{jk}\, b^m_{j'k'}\,|j, j'\rangle\langle k, k'|,$$

so the partial trace equals

$$\mathrm{Tr}_{\mathcal{N}_2} X = \sum_m \sum_{j,k} a^m_{jk} \Bigl(\sum_l b^m_{ll}\Bigr) |j\rangle\langle k| = \sum_{j,k} \Bigl(\sum_l X_{jlkl}\Bigr) |j\rangle\langle k|.$$


Let us consider an example where taking the partial trace of the density matrix corresponding to a pure state leads to the density matrix corresponding to a mixed state.

Let $\mathcal{N}_1 = \mathcal{N}_2 = \mathcal{B}$ and $\rho = |\psi\rangle\langle\psi|$, where $|\psi\rangle = \frac{1}{\sqrt{2}}\bigl(|0, 0\rangle + |1, 1\rangle\bigr)$. In this case $\rho = \frac{1}{2}\sum_{a,b} |a, a\rangle\langle b, b|$, thus we obtain

$$\mathrm{Tr}_{\mathcal{N}_2}\rho = \frac{1}{2}\sum_a |a\rangle\langle a| = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}.$$

This matrix corresponds to a mixed state (pure states correspond to matrices of rank 1). Moreover, this mixed state is equivalent to a classical probability distribution: 0 and 1 have probabilities 1/2. Thus, discarding the second qubit yields a purely classical probability distribution on the first qubit.
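The coordinate formula for the partial trace, $(\mathrm{Tr}_{\mathcal{N}_2}X)_{jk} = \sum_l X_{jlkl}$, translates directly into code. The sketch below (our own naming) stores a basis vector $|j, j'\rangle$ at index $j \cdot d_2 + j'$, which is an indexing convention we assume, and reproduces the example of the state $\frac{1}{\sqrt{2}}(|0,0\rangle + |1,1\rangle)$.

```python
import math


def outer(u, v):
    """The operator |u><v| as a matrix."""
    return [[a * complex(b).conjugate() for b in v] for a in u]


def partial_trace_second(x, d1, d2):
    """Partial trace over the second factor of a (d1*d2) x (d1*d2) matrix.

    The basis vector |j, j'> is stored at index j * d2 + j'.
    """
    return [[sum(x[j * d2 + l][k * d2 + l] for l in range(d2)) for k in range(d1)]
            for j in range(d1)]


s = 1 / math.sqrt(2)
psi = [s, 0.0, 0.0, s]                       # (|0,0> + |1,1>) / sqrt(2)
reduced = partial_trace_second(outer(psi, psi), 2, 2)

product_state = [s, s, 0.0, 0.0]             # |0> (x) (|0> + |1>) / sqrt(2)
reduced_product = partial_trace_second(outer(product_state, product_state), 2, 2)
```

For the entangled state the result is the matrix with entries 1/2 on the diagonal, whereas for the product state it is the pure state $|0\rangle\langle 0|$.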

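Proposition 10.1. An arbitrary mixed state $\rho \in \mathbf{L}(\mathcal{N})$ can be represented as the partial trace $\mathrm{Tr}_{\mathcal{F}}(|\psi\rangle\langle\psi|)$ of a pure state of a larger system, $|\psi\rangle \in \mathcal{N} \otimes \mathcal{F}$. Such $|\psi\rangle$ is called a purification of $\rho$. (We may assume that $\dim\mathcal{F} = \dim\mathcal{N}$.)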

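Proof. Set $\mathcal{F} = \mathcal{N}^*$. Since $\rho$ is a nonnegative (= positive semidefinite) Hermitian operator, there exists $\sqrt{\rho} \in \mathbf{L}(\mathcal{N}) = \mathcal{N} \otimes \mathcal{N}^*$. More explicitly, let us choose an orthonormal basis $\{|\xi_j\rangle\}$ in which $\rho$ is diagonal, i.e., $\rho = \sum_j p_j |\xi_j\rangle\langle\xi_j|$. Then $\sqrt{\rho} = \sum_j \sqrt{p_j}\,|\xi_j\rangle\langle\xi_j|$. Let us regard $\sqrt{\rho}$ as a vector of the space $\mathcal{N} \otimes \mathcal{N}^*$:

$$|\sqrt{\rho}\rangle = |\psi\rangle = \sum_j \sqrt{p_j}\,|\xi_j\rangle \otimes |\eta_j\rangle, \quad\text{where } |\eta_j\rangle = \langle\xi_j| \in \mathcal{N}^*.$$

This vector satisfies the desired requirements, i.e., $\mathrm{Tr}_{\mathcal{F}}(|\psi\rangle\langle\psi|) = \rho$. Indeed,

$$|\psi\rangle\langle\psi| = \sum_{j,k} \sqrt{p_j p_k}\,\bigl(|\xi_j\rangle \otimes |\eta_j\rangle\bigr)\bigl(\langle\xi_k| \otimes \langle\eta_k|\bigr).$$

Only terms with $j = k$ contribute to the partial trace. Therefore

$$\mathrm{Tr}_{\mathcal{N}^*}(|\psi\rangle\langle\psi|) = \sum_j p_j |\xi_j\rangle\langle\xi_j| = \rho.$$

□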

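[2!] Problem 10.2. Consider a pure state $|\psi\rangle \in \mathcal{N} \otimes \mathcal{F}$. Show that the so-called Schmidt decomposition holds:

$$|\psi\rangle = \sum_j \lambda_j\,|\xi_j\rangle \otimes |\eta_j\rangle,$$

where $0 < \lambda_j \le 1$, and the sets of vectors $\{|\xi_j\rangle\} \subset \mathcal{N}$ and $\{|\eta_j\rangle\} \subset \mathcal{F}$ are orthonormal.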


Note that the numbers $\lambda_j^2$ are the nonzero eigenvalues of the partial traces $\rho = \mathrm{Tr}_{\mathcal{F}}(|\psi\rangle\langle\psi|)$ and $\rho' = \mathrm{Tr}_{\mathcal{N}}(|\psi\rangle\langle\psi|)$. (Hence the nonzero eigenvalues of $\rho$ and $\rho'$ coincide.) The number of such eigenvalues equals the rank of $\rho$ and $\rho'$. For example, if $\mathrm{rank}(\rho) = 1$, the Schmidt decomposition consists of one term, and vice versa. Thus the state $\rho = \mathrm{Tr}_{\mathcal{F}}(|\psi\rangle\langle\psi|)$ is pure if and only if $|\psi\rangle$ is a product state, i.e., $|\psi\rangle = |\xi\rangle \otimes |\eta\rangle$. In general, $\mathrm{rank}(\rho)$ is the smallest dimension of the auxiliary space $\mathcal{F}$ which allows a purification of $\rho$.

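[2!] Problem 10.3 ("Purification is unique up to unitary equivalence"). Let $|\psi_1\rangle, |\psi_2\rangle \in \mathcal{N} \otimes \mathcal{F}$ be two pure states such that $\mathrm{Tr}_{\mathcal{F}}(|\psi_1\rangle\langle\psi_1|) = \mathrm{Tr}_{\mathcal{F}}(|\psi_2\rangle\langle\psi_2|)$. Prove that $|\psi_2\rangle = (I_{\mathcal{N}} \otimes U)|\psi_1\rangle$ for some unitary operator $U$ on the space $\mathcal{F}$.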

10.3. Distance functions for density matrices. In practice, various mixed states are always specified with some precision, so we need to somehow measure "distance" between density matrices. What would the most natural definition of this distance be? To begin with, let us ask the same question for probability distributions.

Let $w$ be the probability distribution of an outcome produced by some device. Suppose that the device is faulty, i.e., with some probability $\varepsilon$ it goes completely wrong, but with probability $1 - \varepsilon$ it works as expected. What can one tell about the actual probability distribution $w'$ of the outcome of such a device? The answer is

$$\sum_j |w'_j - w_j| \le 2\varepsilon. \tag{10.4}$$

Conversely, if the inequality (10.4) is true, we can represent $w'$ as the probability distribution produced by a pipeline of two processes: the first generates $j$ according to the distribution $w$, whereas the second alters $j$ with total probability $\le \varepsilon$. We conclude that the natural distance between probability distributions is given by the $\ell_1$ norm, $\|w - w'\|_1 = \sum_j |w'_j - w_j|$. Now we will generalize this definition to arbitrary density matrices.
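Inequality (10.4) can be illustrated with a toy model of a faulty device. In this sketch (names and numbers are our own assumptions) the device outputs a sample from $w$ with probability $1 - \varepsilon$ and a sample from an arbitrary "garbage" distribution $v$ with probability $\varepsilon$, so that $w' = (1 - \varepsilon)w + \varepsilon v$.

```python
def l1_distance(w, w_prime):
    """The l1 distance sum_j |w'_j - w_j| between two distributions."""
    return sum(abs(a - b) for a, b in zip(w, w_prime))


def faulty_distribution(w, v, eps):
    """Output distribution of a device that follows w with probability
    1 - eps and an arbitrary distribution v with probability eps."""
    return [(1 - eps) * a + eps * b for a, b in zip(w, v)]


w = [0.5, 0.3, 0.2]     # intended distribution
v = [0.0, 0.0, 1.0]     # hypothetical behaviour when the device goes wrong
eps = 0.1

w_prime = faulty_distribution(w, v, eps)
distance = l1_distance(w, w_prime)
```

In this model $w' - w = \varepsilon(v - w)$, and $\|v - w\|_1 \le 2$ for any two distributions, which gives the bound $2\varepsilon$; for the numbers above the distance is $0.16 \le 0.2$.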

Definition 10.3. The trace norm of an operator $A \in \mathbf{L}(\mathcal{N})$ is

$$\|A\|_{\mathrm{tr}} = \mathrm{Tr}\bigl(\sqrt{A^\dagger A}\bigr). \tag{10.5}$$

For Hermitian operators, the trace norm is the sum of the moduli of the eigenvalues.


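[2] Problem 10.4. Verify that (10.5) actually defines a norm. Prove that

$$\|A\|_{\mathrm{tr}} = \sup_{B \ne 0} \frac{|\mathrm{Tr}\,AB|}{\|B\|} = \max_{U \in \mathbf{U}(\mathcal{N})} |\mathrm{Tr}\,AU|, \tag{10.6}$$

$$\|A\|_{\mathrm{tr}} = \inf\Bigl\{ \sum_k \bigl\||\xi_k\rangle\bigr\|\,\bigl\||\eta_k\rangle\bigr\| \;:\; \sum_k |\xi_k\rangle\langle\eta_k| = A \Bigr\} \tag{10.7}$$

($\|X\|$ denotes the operator norm; cf. Definition 8.2 on p. 71).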

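[1] Problem 10.5. Verify that the trace norm has the following properties:

a) $\|AB\|_{\mathrm{tr}},\ \|BA\|_{\mathrm{tr}} \le \|A\|_{\mathrm{tr}}\,\|B\|$,
b) $|\mathrm{Tr}\,A| \le \|A\|_{\mathrm{tr}}$,
c) $\|\mathrm{Tr}_{\mathcal{M}} A\|_{\mathrm{tr}} \le \|A\|_{\mathrm{tr}}$,
d) $\|A \otimes B\|_{\mathrm{tr}} = \|A\|_{\mathrm{tr}}\,\|B\|_{\mathrm{tr}}$.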

The following lemma shows why the trace norm for density matrices can be regarded as the analogue of the $\ell_1$-norm for probability distributions.

Lemma 10.2. Let $\mathcal{N} = \bigoplus_j \mathcal{N}_j$ be a decomposition of $\mathcal{N}$ into the direct sum of mutually orthogonal subspaces. Then for any pair of density matrices $\rho$ and $\gamma$,

$$\sum_j \bigl|\mathbf{P}(\rho, \mathcal{N}_j) - \mathbf{P}(\gamma, \mathcal{N}_j)\bigr| \le \|\rho - \gamma\|_{\mathrm{tr}}.$$

Proof. The left-hand side of the inequality can be represented in the form $\mathrm{Tr}\bigl((\rho - \gamma)U\bigr)$, where $U = \sum_j (\pm\Pi_{\mathcal{N}_j})$. It is clear that $U$ is unitary. We then apply the representation of the trace norm in the form (10.6). □
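For a single qubit the inequality of Lemma 10.2 can be checked by hand or by a few lines of code. The sketch below is restricted to $2 \times 2$ Hermitian matrices, for which the eigenvalues are given by an explicit formula; the function names are ours, and the decomposition of $\mathcal{N}$ is taken to be into the two lines spanned by $|0\rangle$ and $|1\rangle$.

```python
import math


def subtract(a, b):
    """Entrywise difference of two matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace_norm_hermitian_2x2(d):
    """Sum of the moduli of the eigenvalues of a Hermitian 2x2 matrix.

    For d = [[a, b], [conj(b), c]] the eigenvalues are
    (a + c)/2 +- sqrt(((a - c)/2)^2 + |b|^2).
    """
    a, b, c = complex(d[0][0]).real, d[0][1], complex(d[1][1]).real
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + abs(b) ** 2)
    return abs(mean + radius) + abs(mean - radius)


def outcome_gap(rho, gamma):
    """sum_j |P(rho, N_j) - P(gamma, N_j)| for the basis lines N_0, N_1."""
    return sum(abs(complex(rho[j][j] - gamma[j][j]).real) for j in range(2))


rho = [[0.5, 0.5], [0.5, 0.5]]      # the pure state (|0> + |1>)/sqrt(2)
gamma = [[1.0, 0.0], [0.0, 0.0]]    # the pure state |0>

gap = outcome_gap(rho, gamma)                             # equals 1
norm = trace_norm_hermitian_2x2(subtract(rho, gamma))     # equals sqrt(2)
```

Here the measurement in the classical basis distinguishes the two states less well than the trace norm allows: $1 < \sqrt{2}$. Equality would be attained by measuring in the eigenbasis of $\rho - \gamma$.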

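There is another commonly used distance function on density matrices, called the fidelity distance. Let $\rho, \gamma \in \mathbf{L}(\mathcal{N})$. Consider all possible purifications of $\rho$ and $\gamma$ over an auxiliary space $\mathcal{F}$ of dimension $\dim\mathcal{F} = \dim\mathcal{N}$; these are pure states $|\xi\rangle, |\eta\rangle \in \mathcal{N} \otimes \mathcal{F}$. Then the fidelity distance between $\rho$ and $\gamma$ is

$$d_F(\rho, \gamma) \stackrel{\mathrm{def}}{=} \min\Bigl\{ \bigl\||\xi\rangle - |\eta\rangle\bigr\| \;:\; \mathrm{Tr}_{\mathcal{F}}(|\xi\rangle\langle\xi|) = \rho,\ \mathrm{Tr}_{\mathcal{F}}(|\eta\rangle\langle\eta|) = \gamma \Bigr\}.$$

It is related to a quantity called fidelity:

$$F(\rho, \gamma) \stackrel{\mathrm{def}}{=} \max\Bigl\{ \bigl|\langle\xi|\eta\rangle\bigr|^2 \;:\; \mathrm{Tr}_{\mathcal{F}}(|\xi\rangle\langle\xi|) = \rho,\ \mathrm{Tr}_{\mathcal{F}}(|\eta\rangle\langle\eta|) = \gamma \Bigr\}. \tag{10.8}$$

(One can show that the condition $\dim\mathcal{F} = \dim\mathcal{N}$ in these definitions can be relaxed: it is sufficient to require that $\dim\mathcal{F} \ge \max\{\mathrm{rank}(\rho), \mathrm{rank}(\gamma)\}$. Thus, any auxiliary space $\mathcal{F}$ will do, as long as it allows purifications of $\rho$ and $\gamma$.)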

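Problem 10.6. Prove that

[1] a) $d_F(\rho, \gamma) = \sqrt{2\bigl(1 - \sqrt{F(\rho, \gamma)}\bigr)}$;

[2] b) $F(\rho, \gamma) = \bigl\|\sqrt{\rho}\,\sqrt{\gamma}\bigr\|_{\mathrm{tr}}^2$;

[3] c) $\Bigl(1 - \dfrac{\|\rho - \gamma\|_{\mathrm{tr}}}{2}\Bigr)^2 \le F(\rho, \gamma) \le 1 - \Bigl(\dfrac{\|\rho - \gamma\|_{\mathrm{tr}}}{2}\Bigr)^2$.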
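As a sanity check of part c) (not a proof), one may look at the special case of two pure states $\rho = |\xi\rangle\langle\xi|$ and $\gamma = |\eta\rangle\langle\eta|$. The computation below relies on two facts that we state without derivation: purifications of a pure state are product states, and $\rho - \gamma$ is a traceless Hermitian operator of rank at most 2. The abbreviation $t$ is our own.

```latex
\begin{align*}
F(\rho,\gamma) &= |\langle\xi|\eta\rangle|^2, \\
\text{eigenvalues of } \rho-\gamma &:\quad \pm t,\qquad t=\sqrt{1-|\langle\xi|\eta\rangle|^2}, \\
\|\rho-\gamma\|_{\mathrm{tr}} &= 2t, \\
1-\Bigl(\tfrac{\|\rho-\gamma\|_{\mathrm{tr}}}{2}\Bigr)^2 &= 1-t^2 = F(\rho,\gamma), \\
\Bigl(1-\tfrac{\|\rho-\gamma\|_{\mathrm{tr}}}{2}\Bigr)^2 &= (1-t)^2 \le 1-t^2 = F(\rho,\gamma)
\quad\text{since } 0\le t\le 1 .
\end{align*}
```

Thus for pure states the upper bound in c) holds with equality, and the lower bound reduces to the elementary inequality $t^2 \le t$.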

11. Physically realizable transformations of density matrices

In this section we introduce a formalism for the description of irreversible quantum processes. We will not use it in full generality (so some of the results are superfluous), but the basic concepts and examples will be helpful.

11.1. Physically realizable superoperators: characterization. All transformations of density matrices we will encounter can be represented by linear maps between operator spaces, $\mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$. A general linear map of this type is called a superoperator. We now describe those superoperators that are admissible from the physical point of view.

1. A unitary operator takes the density matrix of a pure state $\rho = |\xi\rangle\langle\xi|$ to the matrix $\rho' = U|\xi\rangle\langle\xi|U^\dagger$. It is natural to assume (by linearity) that such a formula also yields the action of a unitary operator on an arbitrary density matrix:

$$\rho \stackrel{U}{\longmapsto} U\rho\,U^\dagger.$$

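2. A second type of transformation is the operation of taking the partial trace. If $\rho \in \mathbf{L}(\mathcal{N} \otimes \mathcal{F})$, then the operation of discarding the second subsystem is described by the superoperator

$$\mathrm{Tr}_{\mathcal{F}} : \rho \mapsto \mathrm{Tr}_{\mathcal{F}}\,\rho.$$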

3. We recall that it has been useful to us to borrow qubits in the state $|0\rangle$. Let the state $\rho \in \mathbf{L}(\mathcal{B}^{\otimes n})$. We consider the isometric (preserving the inner product) embedding $V : \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes N}$ in a space of larger dimension, given by the formula $|\xi\rangle \stackrel{V}{\longmapsto} |\xi\rangle \otimes |0^{N-n}\rangle$. The density matrix $\rho$ is transformed thereby into $\rho \otimes |0^{N-n}\rangle\langle 0^{N-n}|$. For any isometric embedding $V$ we similarly obtain a superoperator $V \cdot V^\dagger$ that acts as follows:

$$V \cdot V^\dagger : \rho \mapsto V\rho\,V^\dagger.$$

We postulate that a physically realizable superoperator is a composition of an arbitrary number of transformations of types 2 and 3 (type 1 is a special case of 3).

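[3!] Problem 11.1. Prove that a superoperator $T$ is physically realizable if and only if it has the form

$$T = \mathrm{Tr}_{\mathcal{F}}(V \cdot V^\dagger) : \rho \mapsto \mathrm{Tr}_{\mathcal{F}}(V\rho\,V^\dagger), \tag{11.1}$$

where $V : \mathcal{N} \to \mathcal{N} \otimes \mathcal{F}$ is an isometric embedding.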

[2!] Problem 11.2 ("Operator sum decomposition"). Prove that a superoperator $T$ is physically realizable if and only if it can be represented in the form

$$T = \sum_m A_m \cdot A_m^\dagger : \rho \mapsto \sum_m A_m\,\rho\,A_m^\dagger, \quad\text{where } \sum_m A_m^\dagger A_m = I. \tag{11.2}$$

The operation of taking the partial trace means forgetting (discarding) one of the subsystems. We show that such an interpretation is reasonable, specifically that the subsequent fate of the discarded system in no way influences the quantities characterizing the remaining system. Let us take a system consisting of two subsystems, which is in some state $\rho \in \mathbf{L}(\mathcal{N} \otimes \mathcal{F})$. If we discard the second subsystem (to the trash), then it will be subjected to uncontrollable influences. Suppose we apply some operator $U$ to the first subsystem. We will then obtain a state $\gamma = (U \otimes Y)\rho\,(U \otimes Y)^\dagger$, where $Y$ is an arbitrary unitary operator (the action of the trash bin on the trash). If we wish to find the probability for some subspace $\mathcal{M} \subseteq \mathcal{N}$ pertaining to the first subsystem (the trash doesn't interest us), then the result does not depend on $Y$ and equals

$$\mathbf{P}(\gamma, \mathcal{M} \otimes \mathcal{F}) = \mathbf{P}(\mathrm{Tr}_{\mathcal{F}}\,\gamma, \mathcal{M}) = \mathbf{P}\bigl(U(\mathrm{Tr}_{\mathcal{F}}\,\rho)U^\dagger, \mathcal{M}\bigr).$$

Here the first equality is the property $4^q$ of quantum probability, whereas the second equality represents a new property:

$$\mathrm{Tr}_{\mathcal{F}}\bigl((U \otimes Y)\rho\,(U \otimes Y)^\dagger\bigr) = U(\mathrm{Tr}_{\mathcal{F}}\,\rho)U^\dagger. \tag{11.3}$$

[1] Problem 11.3. Prove the identity (11.3).

[2!] Problem 11.4. Let us write a superoperator $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ in the coordinate form:

$$T\bigl(|j\rangle\langle k|\bigr) = \sum_{j',k'} T_{(j'j)(k'k)}\,|j'\rangle\langle k'|.$$

Prove that the physical realizability of $T$ is equivalent to the set of three conditions:

a) $\sum_l T_{(lj)(lk)} = \delta_{jk}$ (Kronecker symbol);

b) $T^*_{(j'j)(k'k)} = T_{(k'k)(j'j)}$;

c) The matrix $\bigl(T_{(j'j)(k'k)}\bigr)$ is nonnegative (each of the index pairs is regarded as a single index).

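[3!] Problem 11.5. Prove that a superoperator $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ is physically realizable if and only if it satisfies the following three conditions:

a) $\mathrm{Tr}(TX) = \mathrm{Tr}\,X$ for any $X \in \mathbf{L}(\mathcal{N})$;

b) $(TX)^\dagger = TX^\dagger$ for any $X \in \mathbf{L}(\mathcal{N})$;

c) $T$ is completely positive. Namely, for any additional space $\mathcal{G}$ the superoperator $T \otimes I_{\mathbf{L}(\mathcal{G})} : \mathbf{L}(\mathcal{N} \otimes \mathcal{G}) \to \mathbf{L}(\mathcal{M} \otimes \mathcal{G})$ maps nonnegative operators to nonnegative operators.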

11.2. Calculation of the probability for quantum computation. Now, since we have the general definitions of quantum probability and of a physically realizable transformation of density matrices, there are two ways to calculate the probability that enters the definition of quantum computation. Suppose we use a supplementary subsystem. After we no longer need it, we can discard it to the trash and, in counting the probability, take the partial trace over the state space of the supplementary subsystem. Or else we may hold all the trash until the very end and consider the probability of an event of the form $\mathcal{M}_1 \otimes \mathcal{N}_2$ (once we have stopped using the second subsystem, no details of its existence are of any importance to us and we are not interested in what precisely happens to it in the trash bin). As already stated, these probabilities are equal: $\mathbf{P}(\rho, \mathcal{M}_1 \otimes \mathcal{N}_2) = \mathbf{P}(\mathrm{Tr}_{\mathcal{N}_2}\rho, \mathcal{M}_1)$.

Remark 11.1. It is not difficult to define a more general model of quantum computation in which suitable physically realizable superoperators (not necessarily corresponding to unitary operators) serve as the elementary gates. Such a model of computation with mixed states is more adequate in the physical situation where the quantum computer interacts with the surrounding environment. In particular, one can consider things like combination of classical and quantum computation. From the complexity point of view, the new model is polynomially equivalent to the standard one, if a complete basis is used in both cases. (Completeness in the new model is most comprehensively defined as the possibility to effect arbitrary unitary operators on "encoded qubits"; cf. Remark 8.2.) We also note that in the model of computation with mixed states a more natural definition of a probabilistic subroutine is possible. We will not give this definition here, but refer interested readers to [4].

11.3. Decoherence. The term "decoherence" is generally used to denote irreversible degradation of a quantum state caused by its interaction with the environment. This could be an arbitrary physically realizable superoperator that takes pure states to mixed states. For the purpose of our discussion, decoherence means the specific superoperator $D$ that "forgets" off-diagonal matrix elements:

$$\rho = \sum_{j,k} \rho_{jk}\,|j\rangle\langle k| \ \stackrel{D}{\longmapsto}\ \sum_k \rho_{kk}\,|k\rangle\langle k|.$$

This superoperator is also known as an extreme case of a "phase damping channel". We will show that it is physically realizable. For simplicity, let us assume that $D$ acts on a single qubit.


The action of $D$ on a density matrix $\rho$ can be performed in three steps. First, we append a null qubit:

$$\rho \mapsto \rho \otimes |0\rangle\langle 0|.$$

Then we "copy" the original qubit into the ancilla. This can be achieved by applying the operator $\Lambda(\sigma^x) : |a, b\rangle \mapsto |a, a \oplus b\rangle$. We get

$$\rho \otimes |0\rangle\langle 0| \ \stackrel{\Lambda(\sigma^x)}{\longmapsto}\ \sum_{j,k} \rho_{jk}\,|j, j\rangle\langle k, k|.$$

Finally, we take the partial trace over the ancilla, which yields the diagonal matrix

$$\sum_k \rho_{kk}\,|k\rangle\langle k|.$$
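The three steps can be carried out explicitly on matrices. The sketch below (with function names of our choosing) represents a two-qubit basis state $|a, b\rangle$ by the index $2a + b$, an indexing convention we assume, and uses the fact that $\Lambda(\sigma^x)$ merely permutes the basis states.

```python
def append_zero_ancilla(rho):
    """Step 1: rho -> rho (x) |0><0|, a 4x4 matrix indexed by 2a + b."""
    big = [[0j] * 4 for _ in range(4)]
    for j in range(2):
        for k in range(2):
            big[2 * j][2 * k] = rho[j][k]
    return big


def cnot_index(i):
    """Image of the basis state |a, b> (index 2a + b) under |a, b> -> |a, a xor b>."""
    a, b = i >> 1, i & 1
    return (a << 1) | (a ^ b)


def apply_cnot(big):
    """Step 2: conjugation by the permutation operator Lambda(sigma^x)."""
    out = [[0j] * 4 for _ in range(4)]
    for r in range(4):
        for c in range(4):
            out[cnot_index(r)][cnot_index(c)] = big[r][c]
    return out


def trace_out_ancilla(big):
    """Step 3: partial trace over the second qubit."""
    return [[big[2 * j][2 * k] + big[2 * j + 1][2 * k + 1] for k in range(2)]
            for j in range(2)]


def decohere(rho):
    """The superoperator D on one qubit, as a composition of the three steps."""
    return trace_out_ancilla(apply_cnot(append_zero_ancilla(rho)))


rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]
result = decohere(rho)
```

For the matrix above the result has the same diagonal entries 0.7 and 0.3, while the off-diagonal entries have been replaced by zeros.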

Warning. The "copying operation" we considered:

$$|j\rangle \mapsto |j, j\rangle, \qquad \sum_{j,k} \rho_{jk}\,|j\rangle\langle k| \mapsto \sum_{j,k} \rho_{jk}\,|j, j\rangle\langle k, k|$$

(the composition of the first two transformations) in fact copies only the basis states. We note that the copying of an arbitrary quantum state $|\xi\rangle \mapsto |\xi\rangle \otimes |\xi\rangle$ is a nonlinear operator and so cannot be physically realized. (This statement is called a "no-cloning theorem".) We will take the liberty of calling the operator of the form

$$\sum_j c_j\,|\eta_j\rangle \mapsto \sum_j c_j\,|\eta_j\rangle \otimes |\eta_j\rangle$$

copying relative to the orthonormal basis $\{|\eta_j\rangle\}$.

So, the decoherence superoperator $D$ translates any state into a classical one (with diagonal density matrix) by copying qubits. This can be interpreted as follows: if we constantly observe a quantum system (make copies), then the system will behave like a classical one. Thus the copying operation, together with "forgetting" about the copy (i.e., the partial trace), provides a conceptual link between quantum mechanics and the classical picture of the world.

Remark 11.2 (Decoherence in physics). In Nature, decoherence by "copying to the environment" is very common and, of course, does not require a human observer. Let us consider one famous example of a quantum phenomenon: an interference pattern formed by a single photon. It is known from classical optics that a light beam passing through two parallel slits forms a pattern of bright and dark stripes on a screen placed behind the slits. This pattern can be recorded if one uses a photographic film as the screen. When the light is dim, the photographic image consists of random dots produced by individual photons (i.e., quanta of light). The probability for a dot to appear at a given position $x$ (see footnote 9) is the probability that a photon hits the film at the point $x$. What will happen if the light is so dim that only one photon reaches the film? Quantum mechanics predicts that the photon arrives in a certain superposition $|\psi\rangle = \sum_x c_x |x\rangle$, so the above probability equals $|c_x|^2$. Thus, the quantum state of the photon is transformed into a classical object, namely a dot located at a particular place (although the appearance of the dot at a given position $x$ occurs with the probability related to the corresponding amplitude $c_x$). When and how does this transition happen?

When the photon hits the film, it breaks a chemical bond and generates a defect in a light-sensitive grain (usually, a small crystal of silver compound). The photon is delocalized in space, so a superposition of states with the defect located at different places is initially created. Basically, this is the same state $|\psi\rangle = \sum_x c_x |x\rangle$, but $x$ now indicates the position of the defect. The transition from the quantum state $|\psi\rangle$ to the classical state $\sum_x |c_x|^2\,|x\rangle\langle x|$ is decoherence. It occurs long before anyone sees the image, even before the film is developed. About every $10^{-12}$ seconds since the defect was created, it scatters a phonon (a quantum of sonic vibrations). This has the effect of "copying" the state of the defect to phonon degrees of freedom relative to the position basis.

The above explanation is based on the assumption that the phonon scattering (or whatever causes the decoherence) is irreversible. But what does this assumption mean if the scattering is just a unitary process which occurs in the film? In the preceding mathematical discussion, the irreversible step was the partial trace; it was justified by the fact that the copy was "discarded", i.e., never used again. On the contrary, the scattered phonon stays around and can, in principle, scatter back to "undo the copying". In reality, however, the scattering does not reverse by itself. One reason is that the phonons interact with other phonons, causing the "copies" to multiply. The quantum state quickly becomes so entangled that it cannot be disentangled. (Well, this argument is more empirical than logical; it basically says that things can be lost and never found, a true fact with no "proof" whatsoever. For some particular classes of Hamiltonians, some assertions about irreversibility, like "information escapes to infinity", can be formulated mathematically. Proving statements of this kind is a difficult and generally unsolved problem.)

⁹ We think of the position on the film as a discrete variable; specifically, it refers to a grain of light-sensitive substance. The whole grain either turns dark (if it caught a photon) or stays white when the film is developed. Speaking about photons, we have oversimplified the situation for illustrative purposes. In modern films, a single photon does not yet produce a sufficient change to be developed, but several (3 or more) photons per grain do. For single photon detection, physicists use other kinds of devices, e.g., ones based on semiconductors.


Some irreversibility postulate or assumption is necessary to give an interpretation of quantum mechanics, i.e., to introduce a classical observer. It seems that the exact nature of irreversibility is the real question behind many philosophical debates surrounding quantum mechanics. Another thing that is not fully understood is the meaning of probability in the physical world. Both problems exist in classical physics as well; quantum mechanics just makes them more evident.

Fortunately (especially for mathematicians), the theory of quantum computation deals with an ideal world where nothing gets lost. If we observe something, we can also "un-observe", unless we explicitly choose to discard the result or to keep it as the computation output. As far as probabilities are concerned, we deal with them formally rather than trying to explain them by something else. [End of remark]

In the case of a single qubit, the decoherence superoperator (the off-diagonal matrix elements set to zero) can also be obtained if we apply the operator $\sigma^z$ with probability $1/2$:
$$\rho \mapsto \tfrac{1}{2}\rho + \tfrac{1}{2}\sigma^z \rho\, \sigma^z.$$
Such a process is called random dephasing: the state $|1\rangle$ is multiplied by the phase factor $-1$ with probability $1/2$. Thus, dephasing likewise leads to the situation that the system behaves classically.
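The dephasing formula can be checked numerically. The following is a minimal sketch in plain Python, not part of the original text; the helper names `matmul` and `dephase` and the test state $|+\rangle\langle+|$ are assumptions chosen for illustration.

```python
# Sketch: random dephasing of one qubit, rho -> (rho + sz rho sz) / 2.
# Matrices are nested lists; no external libraries are assumed.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

SIGMA_Z = [[1, 0], [0, -1]]

def dephase(rho):
    """Average of rho and sigma^z rho sigma^z (sigma^z applied with probability 1/2)."""
    flipped = matmul(matmul(SIGMA_Z, rho), SIGMA_Z)
    return [[(rho[i][j] + flipped[i][j]) / 2 for j in range(2)]
            for i in range(2)]

# Illustrative input (an assumption): the pure state |+><+|.
rho_plus = [[0.5, 0.5], [0.5, 0.5]]
print(dephase(rho_plus))  # diagonal entries survive, off-diagonal entries vanish
```

Conjugation by $\sigma^z$ flips the sign of the off-diagonal entries only, so averaging with the original matrix removes them and leaves the diagonal untouched.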

[2!] Problem 11.6. Suppose we have a physically realizable superoperator $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N} \otimes \mathcal{F})$ with the following property: $\operatorname{Tr}_{\mathcal{F}}(T\rho) = \rho$ for any pure state $\rho$. Prove that $TX = X \otimes \gamma$ (for any operator $X$), where $\gamma$ is a fixed density matrix on the space $\mathcal{F}$.

Think of $\mathcal{N}$ as a system one wants to observe, and $\mathcal{F}$ as an "observation record". Then the condition $\operatorname{Tr}_{\mathcal{F}}(T\rho) = \rho$ indicates that the superoperator $T$ does not perturb the system, whereas $TX = X \otimes \gamma$ means that the obtained record does not carry any information about $\rho$. Thus, it is impossible to get any information about an unknown quantum state without perturbing the state.

11.4. Measurements. In describing quantum algorithms, it is often natural (though not mathematically necessary) to assume that, together with a quantum computational system, a classical one might also be used. An important type of interaction between quantum and classical parts is measurement of a quantum register. It yields a classical "record" (outcome), while the quantum register may remain in a modified state, or may be destroyed.


Consider a system consisting of two parts, a quantum part ($\mathcal{N}$) and a classical part ($\mathcal{K}$). The density matrix is diagonal with respect to the classical indices, i.e.,
$$\rho = \sum_{j,k,l} \rho_{jkll}\, \bigl(|j\rangle\langle k|\bigr) \otimes \bigl(|l\rangle\langle l|\bigr) = \sum_{l} w_l\, \gamma^{(l)} \otimes |l\rangle\langle l|,$$
where $w_l = \sum_j \rho_{jjll}$ is the probability of having the classical state $l$, and the operator $\gamma^{(l)} = w_l^{-1} \sum_{j,k} \rho_{jkll}\, |j\rangle\langle k|$ possesses all the properties of a density matrix. In this manner, quantum-classical states are always decomposed into "conditional" (by analogy with conditional probability) density matrices $\gamma^{(l)}$. In such a case we will use a special notation similar to that of equation (10.3): $\rho = \sum_l w_l \cdot (\gamma^{(l)}, l) = \sum_l (w_l \gamma^{(l)}, l)$. (Here the dot does not have a special meaning as, e.g., in (11.1); it just indicates that $w_l$ is a factor rather than a function.)

Suppose we have a set of mutually exclusive possibilities, which is expressed as a decomposition of the state space into a direct sum of pairwise orthogonal subspaces, $\mathcal{N} = \bigoplus_{j \in \Omega} \mathcal{L}_j$, where $\Omega = \{1, \dots, r\}$ is the set of corresponding classical outcomes. (We say "mutually exclusive" because, if a subspace $\mathcal{L}'$ is orthogonal to a subspace $\mathcal{L}''$, and $\rho \in \mathbf{L}(\mathcal{L}')$, then $\mathbf{P}(\rho, \mathcal{L}'') = 0$.)

A transformation of density matrices that we will call a projective measurement is such that, for states of the subspace $\mathcal{L}_j$, a "measuring device" puts the number $j$ of the state into the classical register:
$$\text{if } |\xi\rangle \in \mathcal{L}_j, \text{ then } |\xi\rangle\langle\xi| \mapsto \bigl(|\xi\rangle\langle\xi|,\, j\bigr). \tag{11.4}$$
Although the measurement maps the space $\mathbf{L}(\mathcal{N})$ to $\mathbf{L}(\mathcal{N} \otimes \mathcal{K})$, the result is always diagonal with respect to the classical basis in $\mathcal{K}$. Therefore we can assume that the measurement is a linear map $R : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N}) \times \{1, \dots, r\} = \bigoplus_{j=1}^{r} \mathbf{L}(\mathcal{N})$; such linear maps will also be called superoperators.

By linearity, equation (11.4) implies that $R\rho = (\rho, j)$ for any $\rho \in \mathbf{L}(\mathcal{L}_j)$. However, to define the action of $R$ on an arbitrary $\rho$, we have to use the condition that $R$ is physically realizable.

[3] Problem 11.7. Prove that the superoperator
$$R : \rho \mapsto \sum_j \bigl(\Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j},\, j\bigr) \tag{11.5}$$
is the only physically realizable superoperator of the type $\mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N}) \times \{1, \dots, r\}$ that is consistent with (11.4).

Thus we arrive at our final definition.


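Definition 11.1. A projective measurement is a superoperator of the form (11.5), which can also be written as follows:
$$\rho \mapsto \sum_j \mathbf{P}(\rho, \mathcal{L}_j) \cdot \bigl(\gamma^{(j)},\, j\bigr), \qquad \text{where } \gamma^{(j)} = \frac{\Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j}}{\mathbf{P}(\rho, \mathcal{L}_j)}. \tag{11.6}$$

We may say that $\mathbf{P}(\rho, \mathcal{L}_j)$ is the probability of getting a specified outcome $j$. If the outcome $j$ is actually obtained, the state of the quantum system after the measurement is $\gamma^{(j)}$. If we measure pure states, i.e., if $\rho = |\xi\rangle\langle\xi|$, then $\gamma^{(j)} = |\xi_j\rangle\langle\xi_j|$, where
$$|\xi_j\rangle = \frac{\Pi_{\mathcal{L}_j} |\xi\rangle}{\sqrt{\mathbf{P}(|\xi\rangle, \mathcal{L}_j)}}.$$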
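Formula (11.6) can be tried out on small matrices. The sketch below is illustrative only: the function name `projective_measurement`, the zero-based outcome labels, and the sample state with amplitudes 0.6 and 0.8 are assumptions, not part of the original text.

```python
# Sketch of a projective measurement, cf. (11.5) and (11.6).
# Matrices are nested lists; outcomes are labeled 0, 1, ... (the text uses 1..r).

def matmul(a, b):
    """Matrix product for nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def projective_measurement(rho, projectors):
    """Return a list of (probability, conditional state, outcome) triples."""
    outcomes = []
    for j, proj in enumerate(projectors):
        unnormalized = matmul(matmul(proj, rho), proj)   # Pi rho Pi
        prob = trace(unnormalized).real                  # P(rho, L_j)
        if prob > 1e-12:
            gamma = [[x / prob for x in row] for row in unnormalized]
        else:
            gamma = None  # outcome never occurs; conditional state undefined
        outcomes.append((prob, gamma, j))
    return outcomes

# Illustrative qubit state |xi> = 0.6|0> + 0.8|1>, measured in the basis {|0>, |1>}.
rho = [[0.36, 0.48], [0.48, 0.64]]
P0 = [[1, 0], [0, 0]]
P1 = [[0, 0], [0, 1]]
for prob, gamma, j in projective_measurement(rho, [P0, P1]):
    print(j, prob, gamma)
```

The probabilities returned here are the diagonal entries of `rho`, and each conditional state is the corresponding basis projector, in line with the pure-state formula for $|\xi_j\rangle$.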

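Let us give a simple example of a measurement. We copy a qubit (relative to the classical basis) and apply the decoherence superoperator to the copy. In this example, $\Pi_{\mathcal{L}_0} = |0\rangle\langle 0|$, $\Pi_{\mathcal{L}_1} = |1\rangle\langle 1|$, and the measurement superoperator is
$$\rho = \begin{pmatrix} \rho_{00} & \rho_{01} \\ \rho_{10} & \rho_{11} \end{pmatrix} \mapsto \rho_{00} \cdot \bigl(|0\rangle\langle 0|,\, 0\bigr) + \rho_{11} \cdot \bigl(|1\rangle\langle 1|,\, 1\bigr).$$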

We have considered nondestructive measurements. A destructive measurement can be described as a nondestructive one after which the measured system is discarded. This corresponds to the transition from a quantum state $\rho$ to the classical state $\sum_j \mathbf{P}(\rho, \mathcal{L}_j) \cdot (j)$ (where $(j)$ is a short notation for $|j\rangle\langle j|$). A general physically realizable transformation of a quantum state to a classical state is given by the formula
$$\rho \mapsto \sum_k \operatorname{Tr}(\rho X_k) \cdot (k), \tag{11.7}$$
where $X_k$ are nonnegative Hermitian operators satisfying $\sum_k X_k = I$. Such a set of operators $\{X_k\}$ is called a POVM (positive operator-valued measure), whereas the superoperator (11.7) is called a POVM measurement.
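As a small illustration of (11.7), the sketch below evaluates $\operatorname{Tr}(\rho X_k)$ for a three-outcome POVM on a qubit. The particular operators $X_0 = \tfrac12|0\rangle\langle0|$, $X_1 = \tfrac12|1\rangle\langle1|$, $X_2 = \tfrac12 I$ and the helper names are assumptions made for this example; they are not taken from the text.

```python
# Sketch: outcome distribution of a POVM measurement, rho -> sum_k Tr(rho X_k) (k).

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def povm_distribution(rho, povm):
    """Probabilities Tr(rho X_k) for each POVM element X_k."""
    return [trace(matmul(rho, x)).real for x in povm]

def sums_to_identity(povm, tol=1e-12):
    """Check the normalization condition sum_k X_k = I."""
    dim = len(povm[0])
    return all(abs(sum(x[a][b] for x in povm) - (1 if a == b else 0)) < tol
               for a in range(dim) for b in range(dim))

# Hypothetical three-outcome POVM on one qubit.
X0 = [[0.5, 0.0], [0.0, 0.0]]
X1 = [[0.0, 0.0], [0.0, 0.5]]
X2 = [[0.5, 0.0], [0.0, 0.5]]
rho = [[0.36, 0.48], [0.48, 0.64]]

print(sums_to_identity([X0, X1, X2]))
print(povm_distribution(rho, [X0, X1, X2]))
```

This POVM has more outcomes than the dimension of the space, which a projective measurement on the same space cannot have.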

Remark 11.3. Nondestructive POVM measurements could also be defined, but there is no such definition that would be natural enough. An exception is the case where the operators $X_k$ commute with each other. Then they can be represented as linear combinations of projections $\Pi_j$ with nonnegative coefficients, which can be interpreted as "conditional probabilities". Such measurements (in the nondestructive form) and their realizations will be studied in the next section.


[2!] Problem 11.8. Prove that any POVM measurement can be represented as an isometric embedding into a larger space, followed by a projective measurement.

[3!] Problem 11.9 ("Quantum teleportation"; cf. [4]). Suppose we have three qubits: the first is in an arbitrary state $\rho$ (not known in advance), whereas the second and third are in the state $|\xi_{00}\rangle = \frac{1}{\sqrt{2}}\bigl(|0,0\rangle + |1,1\rangle\bigr)$. On the first two qubits we perform the measurement corresponding to the orthogonal decomposition
$$\mathcal{B}^{\otimes 2} = \mathbb{C}(|\xi_{00}\rangle) \oplus \mathbb{C}(|\xi_{01}\rangle) \oplus \mathbb{C}(|\xi_{10}\rangle) \oplus \mathbb{C}(|\xi_{11}\rangle), \qquad \text{where } |\xi_{ab}\rangle = \frac{1}{\sqrt{2}} \sum_{c} (-1)^{bc}\, |c,\, c \oplus a\rangle.$$
Find a way to restore the initial state $\rho$ from the remaining third qubit and the measurement outcome $(a, b)$. Write the whole sequence of actions (the measurement and the recovery) in the form of a quantum circuit.

(Informally, this procedure can be described as follows. Suppose that Alice wants to transmit to Bob¹³ a quantum state $\rho$ by a classical communication channel, e.g., over the phone. It turns out that this is possible, provided Alice and Bob have prepared in advance the state $|\xi_{00}\rangle$ so that each of them keeps half of the state, i.e., a single qubit. Alice performs the measurement and tells the result to Bob. Then Bob translates his qubit to the state $\rho$. Thus the unknown state gets "teleported".)

11.5. The superoperator norm. At first sight, it seems natural to define a norm for superoperators of type $\mathbf{L}(\mathcal{N}, \mathcal{M})$ by analogy with the operator norm,
$$\|T\|_1 = \sup_{X \neq 0} \frac{\|TX\|_{\mathrm{tr}}}{\|X\|_{\mathrm{tr}}}, \tag{11.8}$$
and to use this norm to measure distance between physically realizable transformations of density matrices. (Of course, the norm is applied to the difference between two physically realizable superoperators, which is not physically realizable.) However, the use of the norm (11.8) turns out to be inconvenient because it is "unstable" with respect to the tensor product. Let us explain this in more detail.

Suppose we want to characterize the distance between physically realizable superoperators $P, R \in \mathbf{L}(\mathcal{N}, \mathcal{M})$. From the physical point of view, both superoperators pertain to some quantum system, which is a part of the Universe. Certainly, we do not expect that the answer to our problem would depend on what happens in some other galaxy, or even on the existence of that galaxy. In other words, we expect that the distance between $P$ and $R$ is the same as the distance between $P \otimes I_{\mathbf{L}(\mathcal{G})}$ and $R \otimes I_{\mathbf{L}(\mathcal{G})}$, whatever additional space $\mathcal{G}$ we choose. But this is not always the case if we use $\|P - R\|_1$ as the distance function.

¹³ These two characters are encountered in almost every paper on quantum information theory.

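Example 11.1. Consider the superoperator which transposes a matrix,
$$T : |j\rangle\langle k| \mapsto |k\rangle\langle j| \qquad (j, k = 0, 1).$$
It is obvious that $\|T\|_1 = 1$. However, $\|T \otimes I_{\mathbf{L}(\mathcal{B})}\|_1 = 2$. Indeed, let the superoperator $T \otimes I_{\mathbf{L}(\mathcal{B})}$ act on the operator $X = \sum_{j,k} |j,j\rangle\langle k,k|$; then $\|X\|_{\mathrm{tr}} = 2$ but $\|(T \otimes I_{\mathbf{L}(\mathcal{B})})X\|_{\mathrm{tr}} = 4$. Hence $\|T \otimes I_{\mathbf{L}(\mathcal{B})}\|_1 \geq 2$. The upper bound $\|T \otimes I_{\mathbf{L}(\mathcal{B})}\|_1 \leq 2$ is also easily obtained.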
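The two trace norms quoted in Example 11.1 can be verified by hand. The following worked computation is a sketch added for illustration; the symbols $|\psi\rangle$ and $\mathrm{SWAP}$ are names introduced here and do not appear in the original text.

```latex
% X is a multiple of a rank-one projection:
X = \sum_{j,k} |j,j\rangle\langle k,k| = 2\,|\psi\rangle\langle\psi|,
\qquad |\psi\rangle = \tfrac{1}{\sqrt{2}}\bigl(|0,0\rangle + |1,1\rangle\bigr),
\qquad \text{so } \|X\|_{\mathrm{tr}} = 2.

% Transposing the first factor turns X into the swap operator:
(T \otimes I_{\mathbf{L}(\mathcal{B})})X
  = \sum_{j,k} |k\rangle\langle j| \otimes |j\rangle\langle k|
  = \sum_{j,k} |k,j\rangle\langle j,k|
  = \mathrm{SWAP}.

% SWAP is Hermitian and unitary on a 4-dimensional space,
% with eigenvalues +1 (multiplicity 3) and -1 (multiplicity 1), hence
\|\mathrm{SWAP}\|_{\mathrm{tr}} = 3\cdot|{+1}| + 1\cdot|{-1}| = 4,
\qquad
\frac{\|(T \otimes I_{\mathbf{L}(\mathcal{B})})X\|_{\mathrm{tr}}}{\|X\|_{\mathrm{tr}}} = \frac{4}{2} = 2.
```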

One may argue that this is not quite a "bad" counterexample yet, since $T$ does not have the form $P - R$, where $P$ and $R$ are physically realizable (for example, because $\operatorname{Tr}(T I_{\mathcal{B}}) \neq 0$). Consider, however, another superoperator,
$$Q : \rho \mapsto (T\rho) \otimes \sigma^z.$$
It is easy to see that $\|Q\|_1 = \|T\|_1 \|\sigma^z\|_{\mathrm{tr}} = 2$; similarly, $\|Q \otimes I_{\mathbf{L}(\mathcal{B})}\|_1 = 4$.

The superoperator $Q$ satisfies the following two conditions:
a) $\operatorname{Tr}(QX) = 0$; and b) $(QX)^\dagger = Q X^\dagger$ (for any $X$).
It is possible to show (using the result of Problem 11.4) that any superoperator with these properties can be represented as $c\,(P - R)$, where $P$ and $R$ are physically realizable, and $c$ is a positive real number.

Fortunately, it turns out (as we will prove below) that the pathology of Example 11.1 is limited by dimension. Specifically, if $\dim \mathcal{G} \geq \dim \mathcal{N}$, then $\|T \otimes I_{\mathbf{L}(\mathcal{G})}\|_1 = \|T\|_\diamondsuit$, where the quantity $\|T\|_\diamondsuit$ does not depend on $\mathcal{G}$. Before proving this assertion, let us examine its consequences.

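First, it is clear that the quantity $\|T\|_\diamondsuit$ defined in this manner is a norm.

Second, let us notice that $\|T \otimes R\|_1 \geq \|T\|_1 \|R\|_1$, since the trace norm is multiplicative with respect to the tensor product. Substituting the identity operator for $R$, we obtain $\|T\|_\diamondsuit \geq \|T\|_1$. Similarly, if we replace $T$ by $T \otimes I_{\mathbf{L}(\mathcal{G})}$, and $R$ by $R \otimes I_{\mathbf{L}(\mathcal{G})}$ (where the dimension of $\mathcal{G}$ is large enough), we get $\|T \otimes R\|_\diamondsuit \geq \|T\|_\diamondsuit \|R\|_\diamondsuit$.

Third, it follows from the definition that $\|TR\|_1 \leq \|T\|_1 \|R\|_1$; therefore
$$\|T \otimes R\|_1 = \|(T \otimes I)(I \otimes R)\|_1 \leq \|T \otimes I\|_1\, \|I \otimes R\|_1.$$
Replacing $T$ and $R$ by $T \otimes I_{\mathbf{L}(\mathcal{G})}$ and $R \otimes I_{\mathbf{L}(\mathcal{G})}$ (resp.), we get $\|T \otimes R\|_\diamondsuit \leq \|T\|_\diamondsuit \|R\|_\diamondsuit$, which is the opposite of the previous inequality. Hence the norm $\|\cdot\|_\diamondsuit$ is multiplicative with respect to the tensor product,
$$\|T \otimes R\|_\diamondsuit = \|T\|_\diamondsuit\, \|R\|_\diamondsuit. \tag{11.9}$$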


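In order to prove that the norms $\|T \otimes I_{\mathbf{L}(\mathcal{G})}\|_1$ stabilize when $\dim \mathcal{G} \geq \dim \mathcal{N}$, we give another definition of the stable superoperator norm $\|T\|_\diamondsuit$.

First, we note that an arbitrary superoperator $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ can be represented in the form $T = \operatorname{Tr}_{\mathcal{F}}(A \cdot B^\dagger)$, where $A, B \in \mathbf{L}(\mathcal{N}, \mathcal{M} \otimes \mathcal{F})$ (recall that $A \cdot B^\dagger$ denotes the superoperator $X \mapsto A X B^\dagger$; cf. Problem 11.1). Without loss of generality we may assume that $\dim \mathcal{F} = (\dim \mathcal{N})(\dim \mathcal{M})$. In fact, the dimension of $\mathcal{F}$ can be made as low as the rank of the matrix $(T_{(j'j)(k'k)})$ defined in Problem 11.4.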

Definition 11.2. Consider all representations of $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ in the form $T = \operatorname{Tr}_{\mathcal{F}}(A \cdot B^\dagger)$. Then
$$\|T\|_\diamondsuit = \inf \bigl\{ \|A\|\, \|B\| : \operatorname{Tr}_{\mathcal{F}}(A \cdot B^\dagger) = T \bigr\}.$$

It follows from Theorem 11.1 below that this quantity does not depend on the choice of the auxiliary space $\mathcal{F}$, provided at least one representation $T = \operatorname{Tr}_{\mathcal{F}}(A_0 \cdot B_0^\dagger)$ exists. For the minimization of $\|A\|\, \|B\|$, it suffices to consider operators with norms $\|A\| \leq \|A_0\|$ and $\|B\| \leq \|B_0\|$. The set of such pairs $(A, B)$ is compact, hence the infimum is achieved.

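Theorem 11.1. If $\dim \mathcal{G} \geq \dim \mathcal{N}$, then $\|T \otimes I_{\mathbf{L}(\mathcal{G})}\|_1 = \|T\|_\diamondsuit$.

Proof. Let $T = \operatorname{Tr}_{\mathcal{F}}(A \cdot B^\dagger)$. Using the properties of the trace norm from Problem 10.5, we obtain
$$\bigl\|(T \otimes I_{\mathbf{L}(\mathcal{G})})X\bigr\|_{\mathrm{tr}} = \bigl\|\operatorname{Tr}_{\mathcal{F}}\bigl((A \otimes I_{\mathcal{G}})\, X\, (B^\dagger \otimes I_{\mathcal{G}})\bigr)\bigr\|_{\mathrm{tr}} \leq \bigl\|(A \otimes I_{\mathcal{G}})\, X\, (B^\dagger \otimes I_{\mathcal{G}})\bigr\|_{\mathrm{tr}} \leq \|A\|\, \|B\|\, \|X\|_{\mathrm{tr}}.$$
Hence $\|T\|_\diamondsuit \geq \|T \otimes I_{\mathbf{L}(\mathcal{G})}\|_1$.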

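Proving the opposite inequality is somewhat more complicated. Without loss of generality we may assume that $\|T\|_\diamondsuit = 1$, and the infimum in Definition 11.2 is achieved when $\|A\| = \|B\| = 1$.

We show at first that there exist three density matrices $\rho, \gamma \in \mathbf{L}(\mathcal{N})$ and $\tau \in \mathbf{L}(\mathcal{F})$ such that $\operatorname{Tr}_{\mathcal{M}}(A \rho A^\dagger) = \operatorname{Tr}_{\mathcal{M}}(B \gamma B^\dagger) = \tau$. Let
$$\mathcal{K} = \operatorname{Ker}(A^\dagger A - I_{\mathcal{N}}), \qquad \mathcal{L} = \operatorname{Ker}(B^\dagger B - I_{\mathcal{N}}),$$
$$E = \bigl\{ \operatorname{Tr}_{\mathcal{M}}(A \rho A^\dagger) : \rho \in \mathbf{D}(\mathcal{K}) \bigr\}, \qquad F = \bigl\{ \operatorname{Tr}_{\mathcal{M}}(B \gamma B^\dagger) : \gamma \in \mathbf{D}(\mathcal{L}) \bigr\},$$
where $\mathbf{D}(\mathcal{L})$ denotes the set of density matrices on the subspace $\mathcal{L}$. Then $E, F \subseteq \mathbf{D}(\mathcal{F})$, so that in place of $\tau$ we can put any element of $E \cap F$.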

We prove that $E \cap F \neq \emptyset$. Since $E$ and $F$ are compact convex sets, it suffices to prove that there is no hyperplane that would separate $E$ from $F$. In other words, there is no Hermitian operator $Z \in \mathbf{L}(\mathcal{F})$ such that $\operatorname{Tr}(XZ) > \operatorname{Tr}(YZ)$ for all pairs of $X \in E$ and $Y \in F$. This follows from the condition that the value of $\|A\|\, \|B\|$ is minimal; in particular, it cannot decrease under the transformation
$$A \mapsto (I_{\mathcal{M}} \otimes e^{-tZ})A, \qquad B \mapsto (I_{\mathcal{M}} \otimes e^{tZ})B,$$
where $t$ is a small positive number.

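Thus, let $\operatorname{Tr}_{\mathcal{M}}(A \rho A^\dagger) = \operatorname{Tr}_{\mathcal{M}}(B \gamma B^\dagger) = \tau \in \mathbf{D}(\mathcal{F})$, where $\rho, \gamma \in \mathbf{D}(\mathcal{N})$. We can use the additional space $\mathcal{G}$ to construct purifications of $\rho$ and $\gamma$, i.e., to represent them in the form $\rho = \operatorname{Tr}_{\mathcal{G}}(|\xi\rangle\langle\xi|)$, $\gamma = \operatorname{Tr}_{\mathcal{G}}(|\eta\rangle\langle\eta|)$, where $|\xi\rangle, |\eta\rangle \in \mathcal{N} \otimes \mathcal{G}$ are unit vectors (see Proposition 10.1). This is possible due to the condition $\dim \mathcal{G} \geq \dim \mathcal{N}$. We set $X = |\xi\rangle\langle\eta|$. It is obvious that $\|X\|_{\mathrm{tr}} = 1$.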

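We prove that $\|(T \otimes I_{\mathbf{L}(\mathcal{G})})X\|_{\mathrm{tr}} \geq 1$. If we set
$$X' = (T \otimes I_{\mathbf{L}(\mathcal{G})})X, \qquad |\xi'\rangle = (A \otimes I_{\mathcal{G}})|\xi\rangle, \qquad |\eta'\rangle = (B \otimes I_{\mathcal{G}})|\eta\rangle,$$
then
$$X' = \operatorname{Tr}_{\mathcal{F}}(|\xi'\rangle\langle\eta'|), \qquad \operatorname{Tr}_{\mathcal{M} \otimes \mathcal{G}}(|\xi'\rangle\langle\xi'|) = \operatorname{Tr}_{\mathcal{M} \otimes \mathcal{G}}(|\eta'\rangle\langle\eta'|) = \tau.$$
From this it follows, first, that the vectors $|\xi'\rangle$ and $|\eta'\rangle$ have unit length. Second, there is a unitary operator $U$ acting on the space $\mathcal{M} \otimes \mathcal{G}$ such that $(U \otimes I_{\mathcal{F}})|\xi'\rangle = |\eta'\rangle$ (see Problem 10.3). Consequently,
$$\|X'\|_{\mathrm{tr}} \geq |\operatorname{Tr}\, U X'| = \bigl|\operatorname{Tr}\bigl((U \otimes I_{\mathcal{F}})|\xi'\rangle\langle\eta'|\bigr)\bigr| = \bigl|\operatorname{Tr}(|\eta'\rangle\langle\eta'|)\bigr| = 1. \qquad \square$$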

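Surprisingly enough, the superoperator norm is connected not only to the trace norm, but also to the fidelity (see (10.8)).

[3] Problem 11.10. Let $T = \operatorname{Tr}_{\mathcal{F}}(A \cdot B^\dagger)$, where $A, B : \mathcal{N} \to \mathcal{F} \otimes \mathcal{M}$. Prove that
$$\|T\|_\diamondsuit^2 = \max \bigl\{ F\bigl(\operatorname{Tr}_{\mathcal{M}}(A \rho A^\dagger),\, \operatorname{Tr}_{\mathcal{M}}(B \gamma B^\dagger)\bigr) : \rho, \gamma \in \mathbf{D}(\mathcal{N}) \bigr\},$$
where $\mathbf{D}(\mathcal{N})$ denotes the set of density matrices on $\mathcal{N}$.

Note that the operators $\rho' = \operatorname{Tr}_{\mathcal{M}}(A \rho A^\dagger)$ and $\gamma' = \operatorname{Tr}_{\mathcal{M}}(B \gamma B^\dagger)$ are not density matrices: the condition $\operatorname{Tr} \rho' = \operatorname{Tr} \gamma' = 1$ is not satisfied. However, the definition of fidelity and the result of Problem 10.6b (but not 10.6a or 10.6c) are valid for arbitrary nonnegative Hermitian operators.

The result of Problem 11.10 has been used in the study of the complexity class QIP [38].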


12. Measuring operators

A measuring operator is a generalization of an operator with quantum control. Such operators are very useful in constructing quantum algorithms. After mastering this tool, we will be ready to tackle difficult computational problems.

12.1. Definition and examples. Consider a state space $\mathcal{N} \otimes \mathcal{K}$ and fix a decomposition of the first factor into a direct sum of pairwise orthogonal subspaces: $\mathcal{N} = \bigoplus_{j \in \Omega} \mathcal{L}_j$ ($\Omega = \{1, \dots, r\}$). Each operator of the form
$$W = \sum_j \Pi_{\mathcal{L}_j} \otimes U_j$$
is called a measuring operator, where $\Pi_{\mathcal{L}_j}$ is the projection onto the subspace $\mathcal{L}_j$, and $U_j \in \mathbf{L}(\mathcal{K})$ is unitary.

In order to justify the name "measuring", we consider the following process. Let $\rho \in \mathbf{L}(\mathcal{N})$ be a quantum state we want to measure. We connect it to an instrument in the initial state $|0\rangle$ (we assume that in the state space of the instrument, $\mathcal{K}$, some fixed basis is chosen, e.g., $\mathcal{K} = \mathcal{B}^{\otimes m}$ and $|0\rangle = |0^m\rangle$). Then the joint state of the system is described by the density matrix $\rho \otimes |0^m\rangle\langle 0^m|$.

Now we apply the measuring operator $W$. We obtain the state
$$W \bigl(\rho \otimes |0^m\rangle\langle 0^m|\bigr) W^\dagger = \sum_j \bigl(\Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j}\bigr) \otimes \bigl(U_j |0\rangle\langle 0| U_j^\dagger\bigr)$$
(we have used the defining properties of a projection: $\Pi^\dagger = \Pi$, $\Pi^2 = \Pi$).

Finally, we make the instrument classical by applying the decoherence transformation. This means that the matrix is diagonalized with respect to the second tensor factor. Let us see how the factor $U_j |0\rangle\langle 0| U_j^\dagger$ in the above sum changes:
$$U_j |0\rangle\langle 0| U_j^\dagger \mapsto \sum_k \bigl|\langle k| U_j |0\rangle\bigr|^2\, |k\rangle\langle k|.$$

Thus we obtain a bipartite mixed state, which is classical in the second component,
$$\sum_j \sum_k \Bigl(\Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j}\, \bigl|\langle k| U_j |0\rangle\bigr|^2,\; k\Bigr) = \sum_j \sum_k \mathbf{P}(k|j) \cdot \bigl(\Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j},\; k\bigr).$$
Here we have introduced the conditional probabilities $\mathbf{P}(k|j) = \bigl|\langle k| U_j |0\rangle\bigr|^2$. We will see that they obey the usual rules of probability theory, provided all measuring operators we use are defined with respect to the same orthogonal decomposition of the space $\mathcal{N}$.


Summing up, the whole procedure corresponds to the transformation of density matrices
$$T : \rho \mapsto \sum_{k,j} \mathbf{P}(k|j) \cdot \bigl(\Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j},\; k\bigr).$$
We note that a projective measurement (as defined in the preceding section) is a special case of this, $\mathbf{P}(k|j) = \delta_{kj}$. The more general process just described can be called a "probabilistic projective measurement." It can also be viewed as a nondestructive version of the POVM measurement
$$\rho \mapsto \sum_k \operatorname{Tr}(\rho X_k) \cdot (k), \qquad \text{where } X_k = \sum_j \mathbf{P}(k|j)\, \Pi_{\mathcal{L}_j}.$$
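The conditional probabilities and the associated POVM elements can be computed directly from the unitaries $U_j$. The sketch below is illustrative: the choice of a qubit object with $\mathcal{L}_0 = \mathbb{C}(|0\rangle)$, $\mathcal{L}_1 = \mathbb{C}(|1\rangle)$, the instrument unitaries $U_0 = I$, $U_1 = H$, and all function names are assumptions, not taken from the text.

```python
# Sketch: P(k|j) = |<k|U_j|0>|^2 and X_k = sum_j P(k|j) Pi_j for a measuring operator.
import math

def conditional_probabilities(unitaries):
    """cond[j][k] = |<k|U_j|0>|^2, i.e. squared moduli of the first column of U_j."""
    return [[abs(u[k][0]) ** 2 for k in range(len(u))] for u in unitaries]

def povm_elements(cond, projectors):
    """X_k = sum_j P(k|j) Pi_j, returned as a list of matrices indexed by k."""
    dim = len(projectors[0])
    return [[[sum(cond[j][k] * projectors[j][a][b] for j in range(len(projectors)))
              for b in range(dim)] for a in range(dim)]
            for k in range(len(cond[0]))]

# Hypothetical example: the instrument is one qubit.
h = 1 / math.sqrt(2)
U0 = [[1, 0], [0, 1]]          # identity: outcome 0 with certainty
U1 = [[h, h], [h, -h]]         # Hadamard: outcomes 0 and 1 equally likely
PI0 = [[1, 0], [0, 0]]
PI1 = [[0, 0], [0, 1]]

cond = conditional_probabilities([U0, U1])
X = povm_elements(cond, [PI0, PI1])
print(cond)
print(X)
```

Each row of `cond` sums to 1 because the first column of a unitary matrix is a unit vector; this is what makes the operators $X_k$ sum to the identity.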

Let us give a few examples of measuring operators.

1. The operator $\Lambda(U) = \Pi_0 \otimes I + \Pi_1 \otimes U$, acting on the space $\mathcal{B} \otimes \mathcal{N}$, is measuring.

1′. It is more interesting that $\Lambda(U)$ is measuring also with respect to the second subsystem, $\mathcal{N}$. Since $U$ is a unitary operator, it can be decomposed into the sum of projections onto the eigenspaces: $U = \sum_j \lambda_j \Pi_{\mathcal{L}_j}$, $|\lambda_j| = 1$. Then
$$\Lambda(U) = \sum_j (\Pi_0 + \lambda_j \Pi_1) \otimes \Pi_{\mathcal{L}_j} = \sum_j \begin{pmatrix} 1 & 0 \\ 0 & \lambda_j \end{pmatrix} \otimes \Pi_{\mathcal{L}_j}.$$
However, the conditional probabilities are trivial: $\mathbf{P}(0|j) = 1$, $\mathbf{P}(1|j) = 0$. Thus such an operator, even though measuring according to the definition, does not actually measure anything.

Now we will try to modify the operator $\Lambda(U)$ to make the conditional probabilities nontrivial.

[Diagram: circuit containing two $H$ gates.]

Physicist's approach. Let $U$ be the operator of phase shift for light as it passes through a glass plate. We can split the light beam into two parts by having it pass through a semitransparent mirror. Then one of the two beams passes through the glass plate, after which the beams merge at another semitransparent mirror (see the diagram). The resulting interference will allow us to determine the phase shift.

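A mathematical variant of the preceding example. The operator
$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$
serves as an analog of the semitransparent mirror. As is evident from the diagram above, we need to apply it at the beginning and at the end. The middle part is represented by the operator $\Lambda(U)$ (the controlling qubit corresponds to whether the photon passes through the plate or not). Thus we obtain the operator
$$\Xi(U) = (H \otimes I)\, \Lambda(U)\, (H \otimes I) : \mathcal{B} \otimes \mathcal{N} \to \mathcal{B} \otimes \mathcal{N}. \tag{12.1}$$
If the initial vector has the form $|\psi\rangle = |\eta\rangle \otimes |\xi\rangle$ ($|\xi\rangle \in \mathcal{L}_j$), then $\Xi(U)|\psi\rangle = |\eta'\rangle \otimes |\xi\rangle$, where
$$|\eta'\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \lambda_j \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} |\eta\rangle = \frac{1}{2} \begin{pmatrix} 1 + \lambda_j & 1 - \lambda_j \\ 1 - \lambda_j & 1 + \lambda_j \end{pmatrix} |\eta\rangle.$$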

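Therefore
$$\Xi(U) = \sum_j R_j \otimes \Pi_{\mathcal{L}_j}, \qquad R_j = \frac{1}{2} \begin{pmatrix} 1 + \lambda_j & 1 - \lambda_j \\ 1 - \lambda_j & 1 + \lambda_j \end{pmatrix}.$$
Now we calculate the conditional probabilities. The eigenvalues have modulus 1, so that we can write $\lambda_j = \exp(2\pi i \varphi_j)$. As a consequence, we have
$$\mathbf{P}(0|j) = \bigl|\langle 0| R_j |0\rangle\bigr|^2 = \left|\frac{1 + \lambda_j}{2}\right|^2 = \frac{1 + \cos(2\pi\varphi_j)}{2}. \tag{12.2}$$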
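Formula (12.2) is easy to confirm numerically by multiplying out the three matrices. The following sketch assumes nothing beyond the formulas above; the function names `p0_from_matrix` and `p0_closed_form` are introduced here for illustration.

```python
# Sketch: check P(0|j) = (1 + cos(2*pi*phi)) / 2 for R_j = H diag(1, lambda_j) H.
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def p0_from_matrix(phi):
    """|<0|R|0>|^2 computed from the explicit product H * diag(1, lambda) * H."""
    lam = cmath.exp(2j * math.pi * phi)
    h = 1 / math.sqrt(2)
    hadamard = [[h, h], [h, -h]]
    phase = [[1, 0], [0, lam]]
    r = matmul(matmul(hadamard, phase), hadamard)
    return abs(r[0][0]) ** 2

def p0_closed_form(phi):
    """Right-hand side of (12.2)."""
    return (1 + math.cos(2 * math.pi * phi)) / 2

for phi in (0.0, 0.125, 0.25, 0.5, 0.8):
    print(phi, p0_from_matrix(phi), p0_closed_form(phi))
```

For $\varphi_j = 0$ the outcome 0 is certain, and for $\varphi_j = 1/2$ it never occurs; intermediate phases give intermediate probabilities, which is what makes repeated application useful for estimating $\varphi_j$.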

In the next section, the measuring operator (12.1) will be used to estimate the eigenvalues of unitary operators. For this, we will need to apply the operator $\Xi(U)$ several times to the same "object" (the space $\mathcal{N}$), but with different "instruments" (copies of the space $\mathcal{B}$). But first we must make sure that this is correct, in the sense that the probabilities multiply as they should.

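12.2. General properties. We will consider measuring operators that correspond to a fixed orthogonal decomposition $\mathcal{N} = \bigoplus_j \mathcal{L}_j$.

1. The product of measuring operators is a measuring operator. Indeed, let two such operators be
$$W^{(1)} = \sum_j R_j^{(1)} \otimes \Pi_{\mathcal{L}_j} \qquad \text{and} \qquad W^{(2)} = \sum_j R_j^{(2)} \otimes \Pi_{\mathcal{L}_j}.$$
Inasmuch as $\Pi_{\mathcal{L}_j} \Pi_{\mathcal{L}_k} = \delta_{jk} \Pi_{\mathcal{L}_k}$, we have
$$W^{(2)} W^{(1)} = \sum_j R_j^{(2)} R_j^{(1)} \otimes \Pi_{\mathcal{L}_j}.$$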

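2. The conditional probabilities for products of measuring operators with "different instruments" are multiplicative. Specifically, if $R_j^{(1)} = \tilde{R}_j^{(1)} \otimes I_{\mathcal{K}_2}$, $R_j^{(2)} = I_{\mathcal{K}_1} \otimes \tilde{R}_j^{(2)}$ (both operators act on $\mathcal{K}_1 \otimes \mathcal{K}_2$), then $\mathbf{P}(k_1, k_2|j) = \mathbf{P}(k_1|j)\, \mathbf{P}(k_2|j)$. This equality follows immediately from the definition of conditional probabilities and the obvious identity
$$\bigl(\langle\xi_1| \otimes \langle\xi_2|\bigr) \bigl(U_1 \otimes U_2\bigr) \bigl(|\eta_1\rangle \otimes |\eta_2\rangle\bigr) = \langle\xi_1| U_1 |\eta_1\rangle\, \langle\xi_2| U_2 |\eta_2\rangle.$$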

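3. Formula of total probability. Let $W = \sum_j R_j \otimes \Pi_{\mathcal{L}_j}$ be a measuring operator. If we apply it to the state $|0\rangle\langle 0| \otimes \rho$, where $\rho \in \mathbf{L}(\mathcal{N})$, then the resulting probability of the state $k$ can be written in the form
$$\mathbf{P}\bigl(W \bigl(|0\rangle\langle 0| \otimes \rho\bigr) W^\dagger,\; \mathbb{C}(|k\rangle) \otimes \mathcal{N}\bigr) = \sum_j \mathbf{P}(k|j)\, \mathbf{P}(\rho, \mathcal{L}_j).$$

Proof. We have
$$W \bigl(|0\rangle\langle 0| \otimes \rho\bigr) W^\dagger = \gamma = \sum_j \bigl(R_j |0\rangle\langle 0| R_j^\dagger\bigr) \otimes \Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j}.$$
It was proved earlier that $\mathbf{P}\bigl(\gamma,\, \mathbb{C}(|k\rangle) \otimes \mathcal{N}\bigr) = \mathbf{P}\bigl(\operatorname{Tr}_{\mathcal{N}}(\gamma),\, \mathbb{C}(|k\rangle)\bigr)$. Further,
$$\operatorname{Tr}_{\mathcal{N}}(\gamma) = \sum_j \bigl(R_j |0\rangle\langle 0| R_j^\dagger\bigr) \operatorname{Tr}\bigl(\Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j}\bigr).$$
Since
$$\operatorname{Tr}(\Pi_{\mathcal{L}_j} \rho\, \Pi_{\mathcal{L}_j}) = \operatorname{Tr}(\Pi_{\mathcal{L}_j}^2 \rho) = \operatorname{Tr}(\Pi_{\mathcal{L}_j} \rho) \stackrel{\mathrm{def}}{=} \mathbf{P}(\rho, \mathcal{L}_j),$$
we obtain the desired expression $\mathbf{P}\bigl(\gamma,\, \mathbb{C}(|k\rangle) \otimes \mathcal{N}\bigr) = \sum_j \mathbf{P}(k|j)\, \mathbf{P}(\rho, \mathcal{L}_j)$. $\square$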

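[1!] Problem 12.1. Let $\mathcal{N} = \bigoplus_j \mathcal{L}_j$ be an orthogonal decomposition, and let $U_j \in \mathbf{L}(\mathcal{K})$ and $\tilde{U}_j \in \mathbf{L}(\mathcal{K} \otimes \mathcal{B}^{\otimes N})$ be some unitary operators. Suppose that for each $j$ the operator $\tilde{U}_j$ approximates $U_j$ with precision $\delta$ using ancillas (namely, the space $\mathcal{B}^{\otimes N}$). Then the measuring operator $\tilde{W} = \sum_j \Pi_{\mathcal{L}_j} \otimes \tilde{U}_j$ approximates $W = \sum_j \Pi_{\mathcal{L}_j} \otimes U_j$ with the same precision $\delta$.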

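12.3. Garbage removal and composition of measurements. Measuring operators are used to obtain some information about the value of the index $j$ in the decomposition $\mathcal{N} = \bigoplus_{j \in \Omega} \mathcal{L}_j$. From a computational perspective, only a part of this information may be useful. In this situation, the measuring operator can be written in the form
$$W = \sum_{j \in \Omega} \Pi_{\mathcal{L}_j} \otimes R_j, \qquad R_j : \mathcal{B}^{\otimes N} \to \mathcal{B}^{\otimes N}, \qquad R_j |0\rangle = \sum_{y,z} c_{y,z}(j)\, |y, z\rangle, \tag{12.3}$$
where $y \in \mathbb{B}^m$ represents the "useful result" and $z \in \mathbb{B}^{N-m}$ is "garbage". Ignoring the garbage, we get the conditional probabilities
$$\mathbf{P}(y|j) \stackrel{\mathrm{def}}{=} \langle 0| R_j^\dagger\, \Pi_{\mathcal{M}_y} R_j |0\rangle, \qquad \mathcal{M}_y = \mathbb{C}(|y\rangle) \otimes \mathcal{B}^{\otimes (N-m)}. \tag{12.4}$$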

How can one construct another measuring operator $U$ that would produce $y$ with the same conditional probabilities, but without garbage? It seems that there is no general solution to this problem, except for the case where the result $y$ is deterministic, namely, $\mathbf{P}(y|j) = \delta_{y, f(j)}$ for some function $f : \Omega \to \mathbb{B}^m$ (so that we can say that $W$ actually measures the value of $f(j)$). Then we can use the same trick as in the proof of Lemma 7.2: we measure $f(j)$, copy the result, and "un-measure".

We are going to extend this simple result in three ways. First, we will assume that $W$ measures $f$ with some error probability $\varepsilon$; we will find out with what precision the above procedure corresponds to a garbage-free measurement (the answer is $\sqrt{2\varepsilon}$). Second, the formula for the conditional probabilities (12.4) makes sense for an arbitrary orthogonal decomposition $\mathcal{B}^{\otimes N} = \bigoplus_{y \in \Delta} \mathcal{M}_y$. Third, instead of copying the result, we can apply any operator $V$ that is measuring with respect to the indicated decomposition. The copying corresponds to $V : |y, z, v\rangle \mapsto |y, z, y \oplus v\rangle$.
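The "measure, copy, un-measure" pattern can be illustrated on classical basis states, where the deterministic case reduces to reversible bit manipulation. The sketch below is a classical toy model only: the functions `f` and `g`, the register layout `(j, y, z, v)`, and the XOR form of `W` are hypothetical choices made for this example.

```python
# Classical toy model of garbage removal: compute f(j) with garbage, copy, uncompute.
# All registers hold nonnegative integers interpreted as bit strings.

def f(j):
    """Hypothetical useful result (2 bits)."""
    return j % 4

def g(j):
    """Hypothetical garbage produced alongside f(j) (8 bits)."""
    return (j * 37) % 256

def W(state):
    """'Measure': XOR f(j) into y and the garbage g(j) into z. W is its own inverse."""
    j, y, z, v = state
    return (j, y ^ f(j), z ^ g(j), v)

def V(state):
    """'Copy': |y, z, v> -> |y, z, y XOR v>."""
    j, y, z, v = state
    return (j, y, z, v ^ y)

def garbage_free(j, v=0):
    """Apply W, then V, then W^{-1} (= W here) to the state (j, 0, 0, v)."""
    state = (j, 0, 0, v)
    state = W(state)   # y = f(j), z = g(j)
    state = V(state)   # v = v XOR f(j)
    state = W(state)   # y and z are reset to 0
    return state

print(garbage_free(7))
```

After the last step the instrument registers `y` and `z` return to zero, so only the copy `v` depends on `j`; this is the property that lets superpositions over `j` survive without becoming entangled with garbage.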

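[2!] Problem 12.2. Let $\mathcal{N} = \bigoplus_{j \in \Omega} \mathcal{L}_j$ and $\mathcal{B}^{\otimes N} = \bigoplus_{y \in \Delta} \mathcal{M}_y$ be orthogonal decompositions. Consider two measuring operators on $\mathcal{N} \otimes \mathcal{K} \otimes \mathcal{B}^{\otimes N}$, such that $\mathcal{B}^{\otimes N}$ serves as the "instrument" for one and the "object" for the other:
$$W = \sum_{j \in \Omega} \Pi_{\mathcal{L}_j} \otimes I_{\mathcal{K}} \otimes R_j, \qquad V = \sum_{y \in \Delta} I_{\mathcal{N}} \otimes Q_y \otimes \Pi_{\mathcal{M}_y}.$$
Suppose that $W$ measures a function $f : \Omega \to \Delta$ with error probability $\leq \varepsilon$, i.e., the conditional probabilities $\mathbf{P}(y|j) = \langle 0^N| R_j^\dagger\, \Pi_{\mathcal{M}_y} R_j |0^N\rangle$ satisfy $\mathbf{P}(f(j)|j) \geq 1 - \varepsilon$. Then the operator $\tilde{U} = W^{-1} V W$ approximates the operator
$$U = \sum_{j \in \Omega} \Pi_{\mathcal{L}_j} \otimes Q_{f(j)} : \mathcal{N} \otimes \mathcal{K} \to \mathcal{N} \otimes \mathcal{K}$$
with precision $2\sqrt{\varepsilon}$, using $\mathcal{B}^{\otimes N}$ as the ancillary space. If $V$ is the copy operator, then the precision is $\sqrt{2\varepsilon}$.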

13. Quantum algorithms for Abelian groups

The only nontrivial quantum algorithm we have considered so far is Grover's algorithm for the solution of the universal search problem (see Section 9.2). Unfortunately, there we achieved only a polynomial increase in speed. For this reason Grover's algorithm does not yield any serious consequences (of type BQP $\supset$ BPP) for complexity theory. At the present time, there is no proof that quantum computation is super-polynomially faster than classical probabilistic computation. But there are several pieces of indirect evidence in favor of such an assertion. The first of these is an example of a problem with oracle (cf. Definition 2.2 on page 26 and the beginning of Section 9.2), for which there exists a polynomial quantum algorithm, while any classical probabilistic algorithm is exponential.¹⁰ This example, constructed by D. Simon [66], is called the hidden subgroup problem for $(\mathbb{Z}_2)^k$. The famous factoring algorithm by P. Shor [62] is based on a similar idea. After discussing these examples, we will solve the hidden subgroup problem for the group $\mathbb{Z}^k$, which generalizes both results.

The hidden subgroup problem. Let $G$ be a group with a specified representation of its elements by binary words. There is a device (an oracle) that computes some function $f : G \to \mathbb{B}^n$ with the following property:
$$f(x) = f(y) \iff x - y \in D, \tag{13.1}$$
where $D \subseteq G$ is an initially unknown subgroup. It is required to find that subgroup. (The result should be presented in a specified way.)

13.1. The problem of hidden subgroup in $(\mathbb{Z}_2)^k$; Simon's algorithm. We consider the problem formulated above for the group $G = (\mathbb{Z}_2)^k$. The elements of this group can be represented by length $k$ words of zeros and ones; the group operation is bitwise addition modulo 2. We may regard $G$ as the $k$-dimensional linear space over the field $\mathbb{F}_2$. Any subgroup of $G$ is a linear subspace, so it can be represented by a basis.
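To experiment with this setting, one can build a classical function with property (13.1) from a given basis of $D$. The sketch below is one possible construction, not the one used in the text: group elements are encoded as Python integers (bit masks), and `f(x)` returns the smallest element of the coset $x + D$. The names `span` and `make_oracle` are hypothetical.

```python
# Sketch: a function f on (Z_2)^k with f(x) = f(y)  <=>  x XOR y in D.
# Elements of (Z_2)^k are integers 0 .. 2^k - 1; addition is bitwise XOR.

def span(basis):
    """All elements of the subgroup generated by the given basis vectors."""
    elems = {0}
    for b in basis:
        elems |= {e ^ b for e in elems}
    return elems

def make_oracle(basis):
    """Return f mapping x to the canonical (smallest) representative of x + D."""
    d_elems = span(basis)

    def f(x):
        return min(x ^ d for d in d_elems)

    return f

# Illustrative instance: k = 3, D = {000, 011}.
k = 3
D = span([0b011])
f = make_oracle([0b011])
print([f(x) for x in range(2 ** k)])
```

This construction enumerates $D$ explicitly, so it is only practical for small subgroups; it is meant for testing, since a genuine oracle would hide $D$ from the caller.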

It is easy to show that a "hidden subgroup" cannot be found quickly using a classical probabilistic machine. (A classical machine sends words $x_1, \dots, x_l$ to the input of the "black box" and receives answers $y_1, \dots, y_l$. Each subsequent query $x_j$ depends on the previous answers $y_1, \dots, y_{j-1}$ and some random number $r$ that is generated in advance.)

Proposition 13.1. Let $n \geq k$. For any classical probabilistic algorithm making no more than $2^{k/2}$ queries to the oracle, there exist a subgroup $D \subseteq (\mathbb{Z}_2)^k$ and a corresponding function $f : (\mathbb{Z}_2)^k \to \mathbb{B}^n$ for which the algorithm is wrong with probability $> 1/3$.

Proof. For the same subgroup $D$ there exist several different oracles $f$. We assume that one of them is chosen randomly and uniformly. (If the algorithm is wrong with probability $> 1/3$ for the randomized oracle, then it will be wrong with probability $> 1/3$ for some particular oracle.) The randomized oracle works as follows. If the present query is $x_j$, and $x_j - x_s \in D$ for some $s < j$, the answer $y_j$ coincides with the answer $y_s$ that was given before. Otherwise, $y_j$ is uniformly distributed over the set $\mathbb{B}^n \setminus \{y_1, \dots, y_{j-1}\}$. The randomized oracle is not an oracle in the proper sense, meaning that its answer may depend on the previous queries rather than only on the current one. In this manner, the randomized oracle is equivalent to a device with memory, which, being asked the question $x_j$, responds with the smallest number $s_j \leq j$ such that $x_j - x_{s_j} \in D$. Indeed, if we have a classical probabilistic machine making queries to the randomized oracle, we can adapt it for use with the device just described. For this, the machine should be altered in such a way that it will transform each answer $s_j$ to $y_j$ and proceed as before. (That is, $y_j = y_{s_j}$ if $s_j < j$, or $y_j$ is uniformly distributed over $\mathbb{B}^n \setminus \{y_1, \dots, y_{j-1}\}$ if $s_j = j$.)

¹⁰ One should keep in mind that the complexity of problems with oracle frequently differs from the complexity of ordinary computational problems. A classical example is the theorem stating that IP = PSPACE [58, 59]. The oracle analogue of this assertion is not true [28]!

Let the total number of queries be $l \leq 2^{k/2}$. Without loss of generality all queries can be assumed different. In the case $D = \{0\}$, all answers are also different, i.e., $s_j = j$ for all $j$. Now we consider the case $D = \{0, z\}$, where $z$ is chosen randomly, with equal probabilities, from the set of all nonzero elements of the group $(\mathbb{Z}_2)^k$. Then, regardless of the algorithm that is used, $z \notin \{x_j - x_1, \dots, x_j - x_{j-1}\}$ with probability $\geq 1 - (j-1)/(2^k - 1)$. The condition for $z$ implies that $s_j = j$. This is true for all $j = 1, \dots, l$ with probability $\geq 1 - l(l-1)/\bigl(2(2^k - 1)\bigr) > 1/2$. Recall that we have two random parameters: $z$ and $r$. We can fix $z$ in such a way that the probability of obtaining the answers $s_j = j$ (for all $j$) will still be greater than $1/2$. Let us see what a classical machine would do in such a circumstance. If it gives the answer "$D = \{0\}$" with probability $\geq 2/3$, we set $D = \{0, z\}$, and then the resulting answer will be wrong with probability $> (2/3) \cdot (1/2) = 1/3$. If, however, the probability of the answer "$D = \{0\}$" is smaller than $2/3$, we set $D = \{0\}$. $\square$

Let us define a quantum analog of the oracle $f$. The corresponding quantum oracle is a unitary operator

(13.2) $U : |x, y\rangle \mapsto |x, y \oplus f(x)\rangle$

($\oplus$ denotes bitwise addition). We note that the quantum oracle allows linear combinations of the various queries; therefore it is possible to use it more efficiently than the classical oracle.

Now we describe Simon's algorithm for finding the hidden subgroup in $\mathbb{Z}_2^k$. Let $E = G/D$ and let $E^*$ be the group of characters on $E$. (By definition, a character is a homomorphism $E \to U(1)$.) In the case $G = (\mathbb{Z}_2)^k$ we can characterize the group $E^*$ in the following manner:

$$E^* = \{\, h \in (\mathbb{Z}_2)^k \ \text{such that}\ h \cdot z = 0 \ \text{for all}\ z \in D \,\},$$

where $h \cdot z$ denotes the inner product modulo 2. (The character corresponding to $h$ has the form $z \mapsto (-1)^{h \cdot z}$.) Let us show how one can generate a random element $h \in E^*$ using the operator $U$. After generating sufficiently many random elements, we will find the group $E^*$, hence the desired subgroup $D$.


We begin by preparing the state

$$|\xi\rangle = 2^{-k/2} \sum_{x \in G} |x\rangle = H^{\otimes k} |0^k\rangle$$

in the first quantum register. In the second register we place the state $|0^n\rangle$ and apply the operator $U$. Then we discard the second register, i.e., we will no longer make use of its contents. Thus we obtain the mixed state

$$\rho = \mathrm{Tr}_2\bigl( U \bigl( |\xi\rangle\langle\xi| \otimes |0\rangle\langle 0| \bigr) U^\dagger \bigr) = 2^{-k} \sum_{x, y:\; x - y \in D} |x\rangle\langle y|.$$

Now we apply the operator $H^{\otimes k}$ to the remaining first register. This yields a new mixed state

$$\gamma = H^{\otimes k} \rho\, H^{\otimes k} = 2^{-2k} \sum_{a, b} \; \sum_{x, y:\; x - y \in D} (-1)^{a \cdot x - b \cdot y}\, |a\rangle\langle b|.$$

It is easy to see that $\sum_{x, y:\; x - y \in D} (-1)^{a \cdot x - b \cdot y}$ is different from zero only in the case where $a = b \in E^*$. Hence

$$\gamma = \frac{1}{|E^*|} \sum_{a \in E^*} |a\rangle\langle a|.$$

This is precisely the density matrix corresponding to the uniform probability distribution on the group $E^*$. It remains to apply the following lemma, which we formulate as a problem.

[2!] Problem 13.1. Let $h_1, \dots, h_l$ be independent uniformly distributed random elements of an Abelian group $X$. Prove that they generate the entire group $X$ with probability $\ge 1 - |X|/2^l$.

Therefore, $2k$ random elements suffice to generate the entire group $E^*$ with probability of error $\le 2^{-k}$, where "error" refers to the case where $h_1, \dots, h_{2k}$ actually generate a proper subgroup of $E^*$. (Note that such a small probability of error, compared to $1/3$, is obtained without much additional expense. To make it still smaller, it is most efficient to use the standard procedure: repeat all calculations several times and choose the most frequent outcome.)

Summing up, to find the "hidden subgroup" $D$, we need $O(k)$ queries to the quantum oracle. The overall complexity of the algorithm is $O(k^3)$.
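The classical part of Simon's algorithm can be illustrated with a small sketch. It rests on assumptions that are not in the text: elements of $(\mathbb{Z}_2)^k$ are encoded as Python integers, the quantum sampling step is replaced by a classical stand-in (`sample_annihilator`, a hypothetical name) that draws uniformly from $E^*$, and the subgroup is recovered by brute force rather than by Gaussian elimination over $\mathbb{Z}_2$, so this sketch does not achieve the $O(k^3)$ cost.

```python
import random

def dot2(h, z):
    """Inner product modulo 2 of two bit vectors encoded as integers."""
    return bin(h & z).count("1") % 2

def annihilator(D, k):
    """E* = {h in (Z_2)^k : h.z = 0 for all z in D}, by brute force."""
    return [h for h in range(2 ** k) if all(dot2(h, z) == 0 for z in D)]

def sample_annihilator(D, k, rng):
    """Classical stand-in for one run of the quantum circuit (uniform h in E*)."""
    return rng.choice(annihilator(D, k))

def recover_subgroup(samples, k):
    """All z orthogonal to every sampled h; equals D once the samples generate E*."""
    return sorted(z for z in range(2 ** k)
                  if all(dot2(h, z) == 0 for h in samples))

def simon_classical_part(D, k, seed=0):
    """Draw 2k samples and return the candidate subgroup."""
    rng = random.Random(seed)
    samples = [sample_annihilator(D, k, rng) for _ in range(2 * k)]
    return recover_subgroup(samples, k)
```

The returned set always contains $D$; it equals $D$ exactly when the samples generate all of $E^*$, which by Problem 13.1 fails with probability at most $2^{-k}$ for $2k$ samples.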

13.2. Factoring and finding the period for raising to a power. A second piece of evidence in favor of the hypothesis $\mathrm{BQP} \supset \mathrm{BPP}$ is the fast quantum algorithm for factoring integers into primes and for another number-theoretic problem, finding the discrete logarithm. They were found by P. Shor [62]. Let us discuss the first of these two problems.


Factoring (an integer into primes). Suppose we are given a positive integer $y$. It is required to find its decomposition into prime factors

$$y = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}.$$

This problem is thought to be so complex that practical cryptographic algorithms are based on the hypothetical difficulty of its solution. From the theoretical viewpoint, the situation is somewhat worse: there is neither a reduction of problems of class NP to the factoring problem, nor any other "direct" evidence in favor of its complexity. (The word "direct" is put in quotation marks because at present the answer to the question $\mathrm{P} \stackrel{?}{=} \mathrm{NP}$ is unknown.) Therefore, the conjecture about the complexity of the factoring problem complements the abundant collection of unproved conjectures in computational complexity theory. It is desirable to decrease the number of such conjectures. Shor's result is a significant step in this direction: if we commit an "act of faith" and believe in the complexity of the factoring problem, then the need for yet another act of faith (regarding the greater computational power of the quantum computer) disappears.

We will construct a fast quantum algorithm for solving not the factoring problem, but another problem, called Period finding, to which the factoring problem is reduced with the aid of a classical probabilistic algorithm.

Period finding. Suppose we are given an integer $q > 1$ that can be written using at most $n$ binary digits (i.e., $q < 2^n$) and another integer $a$ such that $1 \le a < q$ and $\gcd(a, q) = 1$ (where $\gcd(a, q)$ denotes the greatest common divisor). It is required to find the period of $a$ with respect to $q$, i.e., the smallest positive integer $t$ such that $a^t \equiv 1 \pmod{q}$.

In other words, the period is the order of the number $a$ in the multiplicative group of residues $(\mathbb{Z}/q\mathbb{Z})^*$. We will denote the period of $a$ with respect to $q$ by $\mathrm{per}_q(a)$.
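To make the definition concrete, here is a brute-force classical computation of $\mathrm{per}_q(a)$. It is only an illustration of the definition: it takes up to $q$ multiplications, i.e., time exponential in $n$, and the function name `period` is an assumption of this sketch.

```python
from math import gcd

def period(a, q):
    """Smallest t >= 1 with a**t == 1 (mod q); requires q > 1 and gcd(a, q) == 1."""
    if q <= 1 or gcd(a, q) != 1:
        raise ValueError("need q > 1 and gcd(a, q) = 1")
    t, x = 1, a % q
    while x != 1:          # walk along the orbit a, a^2, a^3, ... until it returns to 1
        x = (x * a) % q
        t += 1
    return t
```

For example, the powers of 2 modulo 15 are 2, 4, 8, 1, so `period(2, 15)` is 4.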

Below we will examine a quantum algorithm for the solution of the period finding problem. But we will begin by describing the classical probabilistic reduction of the factoring problem to the problem of finding the period. We suggest that the reader review the probabilistic test for primality presented in Part 1 (see Section 4.2).

13.3. Reduction of factoring to period finding. Thus, let us assume that we know how to find the period. It is clear that we can factor the number $y$ by running $O(\log y)$ times a subprogram which, for any composite number, finds a nontrivial divisor with probability at least $1/2$. (Of course, it is also necessary to use the standard procedure for amplification of success probability; see formula (4.1) on p. 37 and the paragraph preceding it.)

Procedure for finding a nontrivial divisor.

Input. An integer $y$ ($y > 1$).

Step 1. Check $y$ for parity. If $y$ is even, then give the answer "2"; otherwise proceed to Step 2.

Step 2. Check whether $y$ is the $k$-th power of an integer for $k = 2, \dots, \log_2 y$. If $y = m^k$, then give the answer "$m$"; otherwise proceed to Step 3.

Step 3. Choose an integer $a$ randomly and uniformly between $1$ and $y - 1$. Compute $b = \gcd(a, y)$ (say, by Euclid's algorithm). If $b > 1$, then give the answer "$b$"; otherwise proceed to Step 4.

Step 4. Compute $r = \mathrm{per}_y(a)$ (using the period finding algorithm that we assume we have). If $r$ is odd, then the answer is "$y$ is prime" (which means that we give up finding a nontrivial divisor). Otherwise proceed to Step 5.

Step 5. Compute $d = \gcd(a^{r/2} - 1, y)$. If $d > 1$, then the answer is "$d$"; otherwise the answer is "$y$ is prime".
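The five steps can be sketched in code as follows. This is an illustration under stated assumptions: the period finding subroutine is replaced by a brute-force classical stand-in (so the sketch is not efficient), the answer "$y$ is prime" is represented by `None`, and the names `find_divisor` and `integer_root` are hypothetical.

```python
import random
from math import gcd

def period(a, y):
    """Brute-force stand-in for the (quantum) period finding subroutine."""
    t, x = 1, a % y
    while x != 1:
        x = (x * a) % y
        t += 1
    return t

def integer_root(y, k):
    """Largest integer m with m**k <= y, found by binary search."""
    lo, hi = 1, y
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** k <= y:
            lo = mid
        else:
            hi = mid - 1
    return lo

def find_divisor(y, rng=random):
    """Return a divisor of y as in Steps 1-5, or None for the answer 'y is prime'."""
    if y <= 1:
        raise ValueError("need y > 1")
    if y % 2 == 0:                                  # Step 1: parity check
        return 2
    for k in range(2, y.bit_length() + 1):          # Step 2: is y a perfect k-th power?
        m = integer_root(y, k)
        if m > 1 and m ** k == y:
            return m
    a = rng.randrange(1, y)                         # Step 3: uniform a in 1..y-1
    b = gcd(a, y)
    if b > 1:
        return b
    r = period(a, y)                                # Step 4: period of a modulo y
    if r % 2 == 1:
        return None
    d = gcd(pow(a, r // 2, y) - 1, y)               # Step 5
    return d if d > 1 else None
```

A single call may return `None` even for composite $y$; repeating the call with fresh randomness amplifies the success probability.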

Analysis of the divisor finding procedure. If the above procedure yields a number, it is a nontrivial divisor of $y$. The procedure fails and gives the answer "$y$ is prime" in two cases: 1) when $r = \mathrm{per}_y(a)$ is odd, or 2) when $r$ is even but $\gcd(a^{r/2} - 1, y) = 1$, i.e., $a^{r/2} - 1$ is invertible modulo $y$. However, $(a^{r/2} + 1)(a^{r/2} - 1) \equiv a^r - 1 \equiv 0 \pmod{y}$, hence $a^{r/2} + 1 \equiv 0 \pmod{y}$ in this case. The converse is also true: if $r$ is even and $a^{r/2} + 1 \equiv 0 \pmod{y}$, then the answer is "$y$ is prime".

Let us prove that our procedure succeeds with probability at least $1 - 1/2^{k-1}$, where $k$ is the number of distinct prime divisors of $y$. (Note that this probability vanishes for prime $y$, so that the procedure also works as a primality test.) In the proof we will need the Chinese Remainder Theorem (Theorem A.5 on page 241) and the fact that the multiplicative group of residues modulo $p^\alpha$, $p$ prime, is cyclic (see Theorem A.11). (Only odd primes $p$ occur here, since $y$ is odd after Step 1.)

Let $y = \prod_{j=1}^{k} p_j^{\alpha_j}$ be the decomposition of $y$ into prime factors. We introduce the notation

$$a_j \equiv a \pmod{p_j^{\alpha_j}}, \qquad r_j = \mathrm{per}_{(p_j^{\alpha_j})}\, a_j = 2^{s_j} r'_j, \quad \text{where } r'_j \text{ is odd}.$$

By the Chinese Remainder Theorem, $r$ is the least common multiple of all the $r_j$. Hence $r = 2^s r'$, where $s = \max\{s_1, \dots, s_k\}$ and $r'$ is odd.

We now prove that the procedure yields the answer "$y$ is prime" if and only if $s_1 = s_2 = \cdots = s_k$. Indeed, if $s_1 = \cdots = s_k = 0$, then $r$ is odd. If $s_1 = \cdots = s_k \ge 1$, then $r$ is even, but $a_j^{r_j/2} \equiv -1 \pmod{p_j^{\alpha_j}}$ (using the cyclicity of the group $(\mathbb{Z}/p_j^{\alpha_j}\mathbb{Z})^*$), hence $a^{r/2} \equiv -1 \pmod{y}$ (using the Chinese Remainder Theorem). Thus the procedure yields the answer "$y$ is prime" in both cases. Conversely, if not all the $s_j$ are equal, then $r$ is even and $s_m < s$ for some $m$, so that $a_m^{r/2} \equiv 1 \pmod{p_m^{\alpha_m}}$. Hence $a^{r/2} \not\equiv -1 \pmod{y}$, i.e., the procedure yields a nontrivial divisor.

To give a lower bound on the success probability, we may assume that the procedure has reached Step 4. Thus $a$ is chosen according to the uniform distribution over the group $(\mathbb{Z}/y\mathbb{Z})^*$. By the Chinese Remainder Theorem, the uniform random choice of $a$ is the same as the independent uniform random choice of $a_j \in (\mathbb{Z}/p_j^{\alpha_j}\mathbb{Z})^*$ for each $j$. Let us fix $j$, choose some $s \ge 0$ and estimate the probability of the event $s_j = s$ for the uniform distribution of $a_j$. Let $g_j$ be a generator of the cyclic group $(\mathbb{Z}/p_j^{\alpha_j}\mathbb{Z})^*$. The order of this group may be represented as $p_j^{\alpha_j} - p_j^{\alpha_j - 1} = 2^{t_j} q_j$, where $q_j$ is odd. Then

$$\bigl|\{a_j : s_j = s\}\bigr| = \bigl|\{ g_j^{\,l} : l = 2^{t_j - s} m,\ \text{where } m \text{ is odd} \}\bigr| = \begin{cases} q_j & \text{if } s = 0, \\ (2^s - 2^{s-1})\, q_j & \text{if } s = 1, \dots, t_j. \end{cases}$$

For any given $s$, the probability of the event $s_j = s$ does not exceed $1/2$. Now let $s = s_1$ be a random number (depending on $a_1$); then $\Pr[s_j = s] \le 1/2$ for $j = 2, \dots, k$. It follows that

$$\Pr[s_1 = s_2 = \cdots = s_k] \le (1/2)^{k-1}.$$

This yields the desired estimate of the success probability for the entire procedure: with probability at least $1 - 1/2^{k-1}$ the procedure finds a nontrivial divisor of $y$.
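The bound can be checked exhaustively for small odd $y$. The sketch below (function names are assumptions of this illustration) counts the residues $a \in (\mathbb{Z}/y\mathbb{Z})^*$ for which Steps 4 and 5 fail, i.e., for which $r$ is odd or $a^{r/2} \equiv -1 \pmod{y}$, using a brute-force period computation.

```python
from math import gcd

def period(a, y):
    """Brute-force multiplicative order of a modulo y (gcd(a, y) must be 1)."""
    t, x = 1, a % y
    while x != 1:
        x = (x * a) % y
        t += 1
    return t

def failure_count(y):
    """Return (number of failing a, size of (Z/yZ)^*) for Steps 4-5 of the procedure."""
    units = [a for a in range(1, y) if gcd(a, y) == 1]
    fails = 0
    for a in units:
        r = period(a, y)
        # failure: r odd, or r even with a^(r/2) = -1 (mod y)
        if r % 2 == 1 or pow(a, r // 2, y) == y - 1:
            fails += 1
    return fails, len(units)
```

For $y = 15$ (two distinct prime divisors) only $a = 1$ and $a = 14$ fail, so the failure fraction is $2/8$, within the bound $1/2^{k-1} = 1/2$.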

13.4. Quantum algorithm for finding the period: the basic idea. Thus, the problem is this: given the numbers $q$ and $a$, construct a polynomial size quantum circuit that computes $\mathrm{per}_q(a)$ with error probability $\varepsilon \le 1/3$. The circuit will operate on a single $n$-qubit register, as well as on many other qubits, some of which may be considered classical. The $n$-qubit register is meant to represent residues modulo $q$ (recall that $q < 2^n$).

Let us examine the operator that multiplies the residues by $a$, acting by the rule

$$U_a : |x\rangle \mapsto |ax \bmod q\rangle.$$

(A more accurate notation would be $U_{q,a}$, indicating the dependence on $q$. However, $q$ is fixed throughout the computation, so we suppress it from the subscript. We keep $a$ because we will also use the operators $U_b$ for arbitrary $b$.) This operator permutes the basis vectors for $0 \le x < q$ (recall that $\gcd(a, q) = 1$). However, we represent $|x\rangle$ by $n$ qubits, so $x$ may take any value between $0$ and $2^n - 1$. We will assume that the operator $U_a$ acts trivially on the remaining basis vectors, i.e., $U_a |x\rangle = |x\rangle$ for $q \le x < 2^n$.

Since for the multiplication of the residues there is a Boolean circuit of polynomial size, namely $O(n^2)$, there is a quantum circuit (with ancillas) of about the same size.

The permutation given by the operator $U_a$ can be decomposed into cycles. The cycle containing 1 is $(1, a, a^2, \dots, a^{\mathrm{per}_q(a) - 1})$; it has length $\mathrm{per}_q(a)$. The algorithm we are discussing begins at the state $|1\rangle$, to which the operator $U_a$ gets applied many times. But such transformations do not take us beyond the orbit of 1 (the set of elements which constitute the cycle described above). Therefore we consider the restriction of the operator $U_a$ to the subspace generated by the orbit of 1.

Eigenvalues of $U_a$: $\lambda_k = e^{2\pi i \cdot k/t}$, where $t$ is the period;

Eigenvectors of $U_a$: $|\xi_k\rangle = \dfrac{1}{\sqrt{t}} \sum_{m=0}^{t-1} e^{-2\pi i \cdot km/t}\, |a^m\rangle$.

It is easy to verify that the vectors $|\xi_k\rangle$ are indeed eigenvectors. It suffices to note that the multiplication by $a$ leads to a shift of the indices in the sum. If we change the variable of summation in order to remove this shift, we get the factor $e^{2\pi i \cdot k/t}$.
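The eigenvector property can also be verified numerically on a small example. The sketch below is an illustration only: states are stored as dictionaries from basis labels to amplitudes, which is a representation chosen for this example, and the function names are hypothetical.

```python
import cmath

def orbit(a, q):
    """The cycle (1, a, a^2, ...) of multiplication by a modulo q; needs gcd(a, q) = 1."""
    xs, x = [], 1
    while True:
        xs.append(x)
        x = (x * a) % q
        if x == 1:
            return xs

def eigenvector(a, q, k):
    """Amplitudes of |xi_k> = t^(-1/2) sum_m exp(-2 pi i k m / t) |a^m>."""
    orb = orbit(a, q)
    t = len(orb)
    return {x: cmath.exp(-2j * cmath.pi * k * m / t) / t ** 0.5
            for m, x in enumerate(orb)}

def apply_U(a, q, state):
    """U_a |x> = |a x mod q>, extended linearly to a superposition."""
    return {(a * x) % q: amp for x, amp in state.items()}
```

Applying `apply_U` to `eigenvector(a, q, k)` should reproduce the same amplitudes multiplied by $e^{2\pi i \cdot k/t}$, up to floating-point rounding.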

If we are able to measure the eigenvalues of the operator $U_a$, then we can obtain the numbers $k/t$. First let us analyze how this will help us in determining the period.

Suppose we have a machine which in each run gives us the number $k/t$, where $t$ is the sought-for period and $k$ is a random number uniformly distributed over the set $\{0, \dots, t-1\}$. We suppose that $k/t$ is represented as an irreducible fraction $k'/t'$ (if the machine were able to give the number in the form $k/t$, there would be no problem at all).

Having obtained several fractions of the form $k'_1/t'_1,\ k'_2/t'_2,\ \dots,\ k'_l/t'_l$, we can, with high probability, find the number $t$ by reducing these fractions to a common denominator.

Lemma 13.2. If $l \ge 2$ fractions are obtained, then the probability that their least common denominator is different from $t$ is less than $3 \cdot 2^{-l}$.

Proof. The fractions $k'_1/t'_1,\ k'_2/t'_2,\ \dots,\ k'_l/t'_l$ can be obtained as reductions of fractions $k_1/t, \dots, k_l/t$ (i.e., $k'_j/t'_j = k_j/t$), where $k_1, \dots, k_l$ are independently distributed random numbers. The least common multiple of $t'_1, \dots, t'_l$ equals $t$ if and only if the greatest common divisor of $k_1, \dots, k_l$ and $t$ is equal to 1.

The probability that $k_1, \dots, k_l$ have a common prime divisor $p$ does not exceed $1/p^l$. Therefore the probability of not getting $t$ after reducing to a common denominator does not exceed $\sum_{k=2}^{\infty} \frac{1}{k^l} < 3 \cdot 2^{-l}$ (the range of the index $k$ in this sum obviously includes all prime divisors of $t$). $\square$
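For small $t$ and $l$ the probability in Lemma 13.2 can be computed exactly by enumerating all tuples $(k_1, \dots, k_l)$. The sketch below assumes each $k_j$ is uniform on $\{0, \dots, t-1\}$, as in the description of the machine above; `failure_probability` is a hypothetical name.

```python
from fractions import Fraction
from functools import reduce
from itertools import product
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def failure_probability(t, l):
    """Exact probability that the least common denominator of the reduced
    fractions k_1/t, ..., k_l/t differs from t, for uniform k_j in 0..t-1."""
    bad = 0
    for ks in product(range(t), repeat=l):
        # Fraction reduces k/t automatically; .denominator is t' in the text
        denominators = [Fraction(k, t).denominator for k in ks]
        if reduce(lcm, denominators) != t:
            bad += 1
    return Fraction(bad, t ** l)
```

For $t = 4$, $l = 2$ the failing pairs are exactly those with both $k_j \in \{0, 2\}$, giving probability $1/4$, below the bound $3 \cdot 2^{-2} = 3/4$.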

Now we construct the machine $M$ that generates the number $k/t$ (in the form of an irreducible fraction) for random uniformly distributed $k$. This will be a quantum circuit which realizes the measuring operator $W = \sum_{k=0}^{t-1} V_k \otimes \Pi_{\mathcal{L}_k}$, where $\mathcal{L}_k = \mathbb{C}(|\xi_k\rangle)$, the subspace generated by $|\xi_k\rangle$. The operators $V_k$ are of the form $|0\rangle \mapsto \sum_{y,z} |y, z\rangle$ (amplitudes omitted), where $y$ is an irreducible fraction and $z$ is garbage. Here the conditional probabilities should satisfy the inequality

(13.3) $\displaystyle P\Bigl( \overline{k/t} \,\Big|\, k \Bigr) \stackrel{\mathrm{def}}{=} \sum_z \Bigl| \Bigl\langle \overline{k/t},\, z \Bigr| V_k \Bigl| 0 \Bigr\rangle \Bigr|^2 \ge 1 - \varepsilon,$

where $\overline{k/t}$ denotes the irreducible fraction equal to the rational number $k/t$.

The construction of such a measuring circuit is rather complex, so we first explain how it is used to generate the outcome $y$ with the desired probability $w_y = \sum_{k \in M_y} \frac{1}{t}$, where $M_y = \bigl\{ k : \overline{k/t} = y \bigr\}$. Let us take the state $|1\rangle$ as the initial state. A direct computation (the reader is advised to carry it through) shows that

$$|1\rangle = \frac{1}{\sqrt{t}} \sum_{k=0}^{t-1} |\xi_k\rangle.$$

If we perform the measurement on this state, then by the formula for total probability we obtain

$$\Pr[\text{outcome} = y] = P\bigl( W(|0\rangle \otimes |1\rangle),\, y \bigr) = \sum_k P(y|k)\, P(|1\rangle, \mathcal{L}_k).$$

The probabilities of all $|\xi_k\rangle$ are equal: $P(|1\rangle, \mathcal{L}_k) = |\langle \xi_k | 1 \rangle|^2 = 1/t$, which corresponds to the uniform distribution of $k$. The property (13.3) guarantees that we obtain the outcome $\overline{k/t}$ with probability $\ge 1 - \varepsilon$. The reader may find this statement not rigorous because $k$ does not have a definite value. To be completely pedantic, we need to derive an inequality similar to (10.4), namely,

$$\sum_y \bigl| \Pr[\text{outcome} = y] - w_y \bigr| \le 2\varepsilon.$$

Schematically, the machine $M$ functions as follows:

$$|1\rangle \ \xrightarrow{\ \text{random choice of } k \text{ (God playing dice)}\ }\ |\xi_k\rangle \ \xrightarrow{\ \text{measuring of } W\ }\ \begin{cases} y \ne \overline{k/t} & \text{with probability} \le \varepsilon, \\ y = \overline{k/t} & \text{with probability} \ge 1 - \varepsilon. \end{cases}$$

The random choice of $k$ happens automatically, without applying any operator whatsoever. Indeed, the formula of total probability is arranged as if a random $k$ were generated before the measurement begins and then remained constant. (Of course, the formula is only true when the operator $W$ is measuring with respect to the given subspaces $\mathcal{L}_k$.)

13.5. The phase estimation procedure. Now we will construct the operator that measures the eigenvalues of $U_a$. The eigenvalues have the form

$$\lambda_k = e^{2\pi i \varphi_k}, \quad \text{where } \varphi_k = \frac{k}{t} \bmod 1.$$

The phase $\varphi_k$ is a real number modulo 1, i.e., $\varphi_k \in \mathbb{R}/\mathbb{Z}$. (The set $\mathbb{R}/\mathbb{Z}$ can be conveniently represented as a circle of unit length.) The procedure for determining $\varphi_k$ is called phase estimation.

As we already mentioned, we can limit ourselves to the study of the action of the operator $U_a$ on the input vector $|\xi_k\rangle$. The construction is divided into four stages.

1. We construct a measuring operator such that the conditional probabilities depend on the value of $\varphi = \varphi_k$. Thus a single use of this operator will give us some information about $\varphi$ (like flipping a biased coin tells something about the bias, though inconclusively) (see 13.5.1).

2. We localize the value of $\varphi$ with modest precision. This is the moment to emphasize that, in all the arguments, there are two parameters: the probability of error $\varepsilon$ and the precision $\delta$. As the result of a measurement, we obtain some number $y$, for which the condition $|y - \varphi|_{\bmod 1} < \delta$ must hold with probability at least $1 - \varepsilon$. (Here $|\cdot|_{\bmod 1}$ denotes the distance on the unit length circle, e.g., $|0.1 - 0.9|_{\bmod 1} = 0.2$.) For the time being, a modest precision will do, say $\delta = 1/16$ (see 13.5.2).

3. Now we must increase the precision. Specifically, we determine $\varphi$ with precision $1/2^{2n+2}$ (see 13.5.3).

4. We need to pass from the approximate value of $\varphi$ to the exact one, represented in the form of an irreducible fraction. It is essential to be able to distinguish between numbers of the form $\varphi = k/t$, where $0 \le k < t < 2^n$. Notice that if $k_1/t_1 \ne k_2/t_2$, then $|k_1/t_1 - k_2/t_2|_{\bmod 1} \ge 1/(t_1 t_2) > 1/2^{2n}$. Therefore, knowing $\varphi = k/t$ with precision $1/2^{2n+1}$, one can, in principle, determine its exact value. Moreover, this can be done efficiently by the use of continued fractions (see 13.5.4).

At stage 3, we will use the operator $U_b$ for arbitrary $b$ (not just for $b = a$, the number for which we seek the period). To this end, we introduce an operator $U$ that sends $|b, x\rangle$ to $|b, bx \bmod q\rangle$ whenever $\gcd(b, q) = 1$. How the operator $U$ acts in the remaining cases is not important; this action can be defined in an arbitrary computationally trivial way, so that $U$ is represented by a quantum circuit of size $O(n^2)$. In fact, all the earlier arguments about the simulation of Boolean circuits by quantum circuits hold true for the simulation of circuits that compute partially defined functions.

[1] Problem 13.2. Using the operator $U$, realize the operator $\Lambda(U_b)$ for arbitrary $b$ relatively prime to $q$.

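13.5.1. How to get some information about the phase. In Section 12 we introduced the operator $\Xi(U_a) = (H \otimes I)\,\Lambda(U_a)\,(H \otimes I)$, which measures the eigenvalues. In our case $\lambda_k = e^{2\pi i \varphi_k}$ and we can write the operator in the form

$$\Xi(U_a) = \sum_k V_k \otimes \Pi_{\mathcal{L}_k}, \qquad V_k = \frac{1}{2} \begin{pmatrix} 1 + e^{2\pi i \varphi_k} & 1 - e^{2\pi i \varphi_k} \\ 1 - e^{2\pi i \varphi_k} & 1 + e^{2\pi i \varphi_k} \end{pmatrix},$$

and its action in the form

$$|0\rangle \otimes |\xi_k\rangle \ \xrightarrow{\ \Xi(U_a)\ }\ \left( \frac{1 + e^{2\pi i \varphi_k}}{2}\, |0\rangle + \frac{1 - e^{2\pi i \varphi_k}}{2}\, |1\rangle \right) \otimes |\xi_k\rangle,$$

so that for the conditional probabilities we get the expressions

$$P(0|k) = \left| \frac{1 + e^{2\pi i \varphi_k}}{2} \right|^2 = \frac{1 + \cos(2\pi \varphi_k)}{2}, \qquad P(1|k) = \frac{1 - \cos(2\pi \varphi_k)}{2}.$$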

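Although the conditional probabilities depend on $\varphi_k$, they do not allow one to distinguish between $\varphi_k = \varphi$ and $\varphi_k = -\varphi$. That is why another type of measurement is needed. We will use the operator $\Xi(iU_a)$. Its realization is shown in the diagram below. It uses the operator $K = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$ from the standard basis.

[Circuit diagram: the control qubit passes through the gates $H$, $K$, $H$; the controlled operator $U_a$ acts on the main register. The part consisting of $K$ and the controlled $U_a$ is encircled and labeled $\Lambda(iU_a)$.]

The encircled part of the diagram realizes the operator $\Lambda(iU_a)$. Indeed, $K$ multiplies only $|1\rangle$ by $i$, but this is just the case where the operator $U_a$ is applied (by the definition of $\Lambda(U_a)$). For the operator $\Xi(iU_a)$ the conditional probabilities are

$$P(0|k) = \frac{1 - \sin(2\pi \varphi_k)}{2}, \qquad P(1|k) = \frac{1 + \sin(2\pi \varphi_k)}{2}.$$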
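Both pairs of conditional probabilities can be reproduced by tracking the two amplitudes of the control qubit when the main register holds an eigenvector with eigenvalue $e^{2\pi i \varphi}$. The sketch below is a numerical illustration; the function name and the flag `with_K` are assumptions of this example.

```python
import cmath
import math

def outcome_probabilities(phi, with_K=False):
    """Return (P(0), P(1)) for the control qubit after H, controlled phase, H.

    With with_K=False the eigenvalue is exp(2 pi i phi) (operator Xi(U_a));
    with with_K=True it is multiplied by i (operator Xi(iU_a)).
    """
    lam = cmath.exp(2j * math.pi * phi)
    if with_K:
        lam *= 1j
    c0 = c1 = 1 / math.sqrt(2)          # H applied to |0>
    c1 *= lam                           # controlled operator: phase on the |1> branch
    d0 = (c0 + c1) / math.sqrt(2)       # second H
    d1 = (c0 - c1) / math.sqrt(2)
    return abs(d0) ** 2, abs(d1) ** 2
```

Calling the function with `phi` and `-phi` gives identical results for `with_K=False` but different results for `with_K=True`, which is why the second type of measurement is needed.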

The complexity of the realization of the operators $\Xi(U_a)$ and $\Xi(iU_a)$ depends on the complexity of the operator $\Lambda(U_a)$, which is not much higher than the complexity of the operator $U$ (cf. Problem 13.2). Thus, $\Xi(U_a)$ and $\Xi(iU_a)$ can be realized by quantum circuits of size $O(n^2)$ in the standard basis.


Input: $a$ and $q$.

- Computation of the powers $a^{2^j} \bmod q$ for $j = 0, \dots, 2n-1$ (classical).

- Setting up $l$ quantum registers, containing the base state $|1\rangle$: $\ |1\rangle\ |1\rangle\ \dots\ |1\rangle$. For each register, $4ns$ auxiliary qubits ("measurement devices") are prepared in the state $0$: $2n$ groups of $2s$ qubits each.

  * Quantum measurements with the aid of the operators $\Xi\bigl(U_{a^{2^j}}\bigr)$ and $\Xi\bigl(iU_{a^{2^j}}\bigr)$. Measurement outcomes (0s and 1s): $v^{(1)}_1, \dots, v^{(1)}_s,\ \tilde v^{(1)}_1, \dots, \tilde v^{(1)}_s,\ \dots,\ v^{(2n)}_1, \dots, v^{(2n)}_s,\ \tilde v^{(2n)}_1, \dots, \tilde v^{(2n)}_s$.

  * Counting the numbers of 0s and 1s (classical). Result: $\cos\varphi \pm \delta,\ \sin\varphi \pm \delta,\ \dots,\ \cos(2^{2n-1}\varphi) \pm \delta,\ \sin(2^{2n-1}\varphi) \pm \delta$.

  * Trigonometric calculations (classical). Result: $\varphi \pm \frac{1}{16},\ \dots,\ 2^{2n-1}\varphi \pm \frac{1}{16}$.

  * Sharpening the value of $\varphi$ using the set of numbers $\varphi, \dots, 2^{2n-1}\varphi$ (classical). Result: $\varphi \pm 2^{-(2n+2)}$.

  * Determining the exact value of $\varphi$ with the aid of continued fractions (classical). Result: $k'/t'$, some fraction equal to $\varphi$.

  Similar computations are performed for the other registers, giving $\dfrac{k'_1}{t'_1},\ \dfrac{k'_2}{t'_2},\ \dots,\ \dfrac{k'_l}{t'_l}$.

- Calculation of the least common denominator (classical).

Answer: $t$ (with probability of error $< 3 \cdot 2^{-l} + nl\, e^{-\Omega(s)}$).

Table 13.1. General scheme of the period finding algorithm. The phase estimation part (shown in a box in the original layout) consists of the steps marked with *.

13.5.2. Determining the phase with constant precision. We want to localize the value of $\varphi = \varphi_k$, i.e., to infer the inequality $|\varphi - y|_{\bmod 1} < \delta$ for some (initially unknown) $y$ and a given precision $\delta$. To get such an estimate, we apply the operators $\Xi(U_a)$ and $\Xi(iU_a)$ to the same "object of measurement" but different "instruments" (auxiliary qubits). The reasoning is the same for both operators, so we limit ourselves to the case $\Xi(U_a)$.


We have the quantum register $A$ that contains $|\xi_k\rangle$. Actually, this register initially contains $|1\rangle = \frac{1}{\sqrt{t}} \sum_{k=0}^{t-1} |\xi_k\rangle$, but we consider each $|\xi_k\rangle$ separately. (We can do this because we apply only operators that are measuring with respect to the orthogonal decomposition $\bigoplus_k \mathbb{C}(|\xi_k\rangle)$, so that different eigenvectors do not mix.) Let us introduce a large number $s$ of auxiliary qubits. Each of them will be used in applying the operator $\Xi(U_a)$.

As was proved in Section 12 (see p. 114), the conditional probabilities in such a case multiply. For the operators $\prod_{r=1}^{s} \Xi(U_a)[r, A]$, the conditional probabilities are equal to $P(v_1, \dots, v_s | k) = \prod_{r=1}^{s} P(v_r | k)$ (here $v_r$ denotes the value of the $r$-th auxiliary qubit).

From this point on, the qubits containing the results of the "experiments" will only be operated upon classically. Since the conditional probabilities multiply, we can assume that we are estimating the probability $p_* = P(1|k)$ of the outcome 1 ("head" in a coin flip) by performing a series of Bernoulli trials.

If the coin is tossed $s$ times (where $s$ is large), then the observed frequency $(\sum v_r)/s$ of the outcome 1 is close to its probability $p_*$. What is the accuracy of this estimate? The exact question: with what probability does the number $(\sum v_r)/s$ fail to approximate $p_*$ with a given precision $\delta$? The answer is given by Chernoff's bound:

(13.4) $\displaystyle \Pr\Bigl[ \Bigl| s^{-1} \sum_{r=1}^{s} v_r - p_* \Bigr| \ge \delta \Bigr] \le 2 e^{-2\delta^2 s}.$

(This inequality is a generalization of the inequality (4.1) which was used to prove the amplification of success probability in the definitions of BPP and BQP.) Thus, for a fixed $\delta$ we can find a suitable constant $c = c(\delta)$ such that the error is smaller than $\varepsilon$ when $s = \lceil c \log(1/\varepsilon) \rceil = \Theta(\log(1/\varepsilon))$ trials are made.

So, we have learned how to find $\cos(2\pi\varphi)$ and $\sin(2\pi\varphi)$ with any given precision $\delta$. Now we choose $\delta$ so that the value $\varphi$ can be determined from the values of the sine and the cosine with precision $1/16$. This still takes $\Theta(\log(1/\varepsilon))$ trials. The second stage is completed.

[3] Problem 13.3. Prove the inequality (13.4).
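The second stage can be imitated classically by simulating the coin tosses. In the sketch below, which is an illustration and not part of the algorithm's quantum implementation, `trials_needed` solves $2e^{-2\delta^2 s} \le \varepsilon$ for $s$, and `estimate_phase` draws $s$ outcomes with the probabilities $P(1|k)$ of $\Xi(U_a)$ and $\Xi(iU_a)$ and recovers $\varphi$ with `atan2`. All names are assumptions of this example.

```python
import math
import random

def trials_needed(delta, eps):
    """Smallest s with 2*exp(-2*delta^2*s) <= eps, from Chernoff's bound (13.4)."""
    return math.ceil(math.log(2 / eps) / (2 * delta ** 2))

def estimate_phase(phi, s, rng):
    """Estimate phi (mod 1) from s simulated outcomes of each type of measurement."""
    p_cos = (1 - math.cos(2 * math.pi * phi)) / 2    # P(1|k) for Xi(U_a)
    p_sin = (1 + math.sin(2 * math.pi * phi)) / 2    # P(1|k) for Xi(iU_a)
    f_cos = sum(rng.random() < p_cos for _ in range(s)) / s
    f_sin = sum(rng.random() < p_sin for _ in range(s)) / s
    cos_est = 1 - 2 * f_cos                          # invert the two formulas
    sin_est = 2 * f_sin - 1
    return (math.atan2(sin_est, cos_est) / (2 * math.pi)) % 1.0

def circle_distance(x, y):
    """Distance |x - y| mod 1 on the circle of unit length."""
    d = abs(x - y) % 1.0
    return min(d, 1 - d)
```

For instance, `trials_needed(0.1, 0.01)` evaluates to 265, illustrating the logarithmic dependence on $1/\varepsilon$ for fixed $\delta$.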

13.5.3. Determining the phase with exponential precision. To increase the precision, we will use, along with $\Lambda(U_a)$, the operators $\Lambda\bigl((U_a)^{2^j}\bigr)$ for all $j \le 2n-1$. We can quickly raise numbers to a power, but, in general, computing a power of an operator is difficult. However, the operation $U_a$ of $(\bmod\ q)$-multiplication by $a$ possesses the following remarkable property:

$$(U_a)^p = U_{a^p} = U_{(a^p \bmod q)}.$$

Consequently, $\Lambda\bigl((U_a)^{2^j}\bigr) = \Lambda(U_b)$, where $b \equiv a^{2^j} \pmod{q}$. The required values for the parameter $b$ can be calculated using a circuit of polynomial size; then we can apply the result of Problem 13.2.
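The classical pre-computation of the values $b = a^{2^j} \bmod q$ amounts to repeated squaring. The sketch below (names are assumptions of this example) also models $U_b$ as a permutation of $\{0, \dots, 2^n - 1\}$ so that the identity $(U_a)^p = U_{(a^p \bmod q)}$ can be checked on small inputs.

```python
def squared_powers(a, q, count):
    """Return [a^(2^0) mod q, a^(2^1) mod q, ..., a^(2^(count-1)) mod q]."""
    out, b = [], a % q
    for _ in range(count):
        out.append(b)
        b = (b * b) % q        # squaring doubles the exponent
    return out

def apply_U(b, q, x):
    """Classical action of U_b on a basis label x: multiply residues, fix x >= q."""
    return (b * x) % q if x < q else x

def apply_U_power(a, q, x, p):
    """Apply U_a to x exactly p times."""
    for _ in range(p):
        x = apply_U(a, q, x)
    return x
```

For $a = 7$, $q = 15$ the list of squared powers starts 7, 4, 1, 1, so for example $(U_7)^4$ acts as the identity.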

Let us return to the circuit described in 13.5.1. We found the eigenvalue $\lambda_k = \lambda = e^{2\pi i \varphi}$ for some eigenvector $|\xi_k\rangle$. This same vector is an eigenvector for any power of the operator $U_a$, so that in the same quantum register we can look for an eigenvalue of $U_a^2 = U_{a^2}$ (it equals $\lambda^2 = e^{2\pi i \cdot 2\varphi}$), of $U_a^4 = U_{a^4}$ (it equals $\lambda^4 = e^{2\pi i \cdot 4\varphi}$), etc.

In other words, we can determine with precision $1/16$ the values of $\varphi, 2\varphi, \dots, 2^{2n-1}\varphi$ modulo 1. But this allows us to determine $\varphi$ with precision $1/2^{2n+2}$ efficiently (in linear time with constant memory). The idea is based on the following obvious fact: if $|y - 2\varphi|_{\bmod 1} < \delta < 1/2$, then

$$\text{either } |y'_0 - \varphi|_{\bmod 1} < \delta/2 \quad \text{or} \quad |y'_1 - \varphi|_{\bmod 1} < \delta/2,$$

where $y'_0, y'_1$ are the solutions to the equation $2y' \equiv y \pmod{1}$. Thus we can start from $2^{2n-1}\varphi$ and increase the precision as we proceed toward $\varphi$. The approximate values of $2^{j-1}\varphi$ ($j = 2n, 2n-1, \dots, 1$) will allow us to make the correct choices.

Let $m = 2n$. For $j = 1, \dots, m$ we replace the known approximate value of $2^{j-1}\varphi$ by $\beta_j$, the closest number from the set $\bigl\{ \frac{0}{8}, \frac{1}{8}, \frac{2}{8}, \frac{3}{8}, \frac{4}{8}, \frac{5}{8}, \frac{6}{8}, \frac{7}{8} \bigr\}$. This guarantees that

$$|2^{j-1}\varphi - \beta_j|_{\bmod 1} < 1/16 + 1/16 = 1/8.$$

Let us introduce a notation for binary fractions: $.\alpha_1 \cdots \alpha_p = \sum_{j=1}^{p} 2^{-j} \alpha_j$ ($\alpha_j \in \{0, 1\}$). Our algorithm is as follows.

Algorithm for sharpening the value of $\varphi$. Set $.\alpha_m \alpha_{m+1} \alpha_{m+2} = \beta_m$ and proceed by iteration:

$$\alpha_j = \begin{cases} 0 & \text{if } \bigl| .0\,\alpha_{j+1}\alpha_{j+2} - \beta_j \bigr|_{\bmod 1} < 1/4, \\ 1 & \text{if } \bigl| .1\,\alpha_{j+1}\alpha_{j+2} - \beta_j \bigr|_{\bmod 1} < 1/4 \end{cases} \qquad \text{for } j = m-1, \dots, 1$$

(each time, exactly one of the two cases holds). The result satisfies the inequality

$$|.\alpha_1 \alpha_2 \cdots \alpha_{m+2} - \varphi|_{\bmod 1} < 2^{-(m+2)}.$$

The proof is a simple induction: $\bigl| .\alpha_j \cdots \alpha_{m+2} - 2^{j-1}\varphi \bigr|_{\bmod 1} < 2^{-(m+3-j)}$ at each step.

This procedure is an example of computation by a finite-state automaton (see Problem 2.11). The state of the automaton is the pair $(\alpha_{j+1}, \alpha_{j+2})$, whereas the input symbols are $\beta_j$. It follows that the computation can be represented by a Boolean circuit of size $O(m)$ and depth $O(\log m)$.
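The sharpening algorithm can be sketched with integer arithmetic in units of $1/8$. In this illustration, which uses names and an input encoding that are assumptions of the example, `betas[j-1]` holds $8\beta_j \in \{0, \dots, 7\}$, and the variable `window` stores the three bits $\alpha_j \alpha_{j+1} \alpha_{j+2}$ as an integer.

```python
def sharpen(betas):
    """Return the bits [alpha_1, ..., alpha_{m+2}] from betas[j-1] = 8 * beta_j."""
    m = len(betas)
    window = betas[-1]                     # .alpha_m alpha_{m+1} alpha_{m+2} = beta_m
    bits = [(window >> 2) & 1, (window >> 1) & 1, window & 1]
    for j in range(m - 1, 0, -1):
        low = window >> 1                  # .0 alpha_{j+1} alpha_{j+2}, in eighths
        d = (low - betas[j - 1]) % 8
        dist0 = min(d, 8 - d)              # circular distance to beta_j, in eighths
        if dist0 < 2:                      # |.0.. - beta_j| < 1/4
            bit = 0
        elif dist0 > 2:                    # then |.1.. - beta_j| < 1/4
            bit = 1
        else:
            raise ValueError("inconsistent input: neither case holds")
        window = 4 * bit + low
        bits.insert(0, bit)
    return bits

def binary_fraction(bits):
    """Value of .alpha_1 alpha_2 ... as a float."""
    return sum(b / 2 ** (i + 1) for i, b in enumerate(bits))
```

For $\varphi = 3/7$ and $m = 6$, rounding $2^{j-1}\varphi \bmod 1$ to the nearest eighth gives `betas = [3, 7, 6, 3, 7, 6]`, and the result $.01101110 = 0.4296875$ differs from $3/7$ by less than $2^{-8}$.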


13.5.4. Determining the exact value of the phase. We have found a number $y$ satisfying $|y - k/t| < 1/2^{2n+1}$. We represent it as a continued fraction (see Section A.7) and try all convergents of $y$ until we find a fraction $k'/t'$ such that $|y - k'/t'| < 1/2^{2n+1}$. The second part of Theorem A.13 guarantees that the number $k/t$ is contained among the convergents, and therefore will be found unless the algorithm stops earlier. But it cannot stop earlier because there is at most one fraction with denominator $\le 2^n$ that approximates a given number with precision $1/2^{2n+1}$. The running time of this algorithm is $O(n^3)$.
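The search over convergents can be sketched with exact rational arithmetic. This is a minimal illustration assuming $y$ is given as an exact rational in $[0, 1)$; the function names are hypothetical, and the sketch does not attempt to match the $O(n^3)$ bound.

```python
from fractions import Fraction

def convergents(y):
    """Yield the continued-fraction convergents of a nonnegative rational y."""
    h0, k0, h1, k1 = 0, 1, 1, 0            # standard initial values for the recurrence
    x = Fraction(y)
    while True:
        a = x.numerator // x.denominator   # next partial quotient
        h0, k0, h1, k1 = h1, k1, a * h1 + h0, a * k1 + k0
        yield Fraction(h1, k1)
        rest = x - a
        if rest == 0:
            return
        x = 1 / rest

def recover_fraction(y, n):
    """First convergent k'/t' of y with |y - k'/t'| < 1/2^(2n+1), or None."""
    bound = Fraction(1, 2 ** (2 * n + 1))
    for c in convergents(y):
        if abs(Fraction(y) - c) < bound:
            return c
    return None
```

For $n = 4$ and $y = 439/1024$, which approximates $3/7$ within $1/2^9$, the convergents are $0, 1/2, 3/7, \dots$, and the search stops at $3/7$.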

Important observations. 1. It is essential that the vector $|\xi_k\rangle$ does not deteriorate during the computation.

2. The entire period finding procedure depends on the parameters $l$ and $s$; they should be adjusted so that the error probability is small enough. The error can occur in determining the period $t$ as the least common denominator (see Lemma 13.2) or in estimating the cosine and the sine of $\varphi_k$ with constant precision $\delta$ (see inequality (13.4)). The total probability of error does not exceed $3 \cdot 2^{-l} + nl\, e^{-\Omega(s)}$. If it is required to get the result with probability of error $\le 1/3$, then we must set $l = 4$, $s = \Theta(\log n)$. In this way we get a quantum circuit of size $O(n^3 \log n)$. (In fact, there is some room for optimization; see Corollary 13.3.1 below.)

13.6. Discussion of the algorithm. We discuss two questions that arise naturally with regard to the algorithm that has been set forth.

- Which eigenvalues do we find? We find a randomly chosen eigenvalue. The distribution over the set of all eigenvalues can be controlled by appropriately choosing the initial state. In our period finding algorithm, it was $|1\rangle = \frac{1}{\sqrt{t}} \sum_{k=0}^{t-1} |\xi_k\rangle$, which corresponded to the uniform distribution on the set of eigenvalues associated with the orbit of 1.

- Is it possible to find eigenvalues of other operators in the same way as in the algorithm for determining the period? Let us be accurate: by finding an eigenvalue with precision $\delta$ and error probability $\le \varepsilon$ we mean constructing a measuring operator with garbage, as in equation (12.3), where $j \in \Omega$ is an index of an eigenvalue $\lambda_j = e^{2\pi i \varphi_j}$, $\mathcal{L}_j$ is the corresponding eigenspace, and $y = .\alpha_1 \cdots \alpha_n$ ($n = \lceil \log_2(1/\delta) \rceil$). The conditional probabilities (12.4) should satisfy

(13.5) $\displaystyle \Pr\bigl[\, |y - \varphi_j|_{\bmod 1} < \delta \,\bigr] = \sum_{y:\; |y - \varphi_j|_{\bmod 1} < \delta} P(y|j) \ \ge\ 1 - \varepsilon.$

The answer to the question is "yes": it is only necessary to implement $\Lambda(U)$, which is usually easy. (For example, if $U|0\rangle = |0\rangle$, we can use the result of Problem 8.4.) However, in general, the attainable precision is not great and depends polynomially on the number of times the operator $\Lambda(U)$ is used. If one can efficiently compute the powers of $U$, e.g., if one can implement the operator

(13.6) $\Lambda_m(U) : |p\rangle \otimes |\xi\rangle \mapsto |p\rangle \otimes U^p |\xi\rangle \quad (0 \le p < 2^m),$

then the precision can be made exponential, $\delta = \exp(-\Omega(m))$.

13.7. Parallelized version of phase estimation. Applications. Remarkably, the phase estimation procedure (except for its last part, the continued fraction algorithm) can be realized by a quantum circuit of small depth. This result is due to R. Cleve and J. Watrous [18], but our proof is different from theirs.

Theorem 13.3. Eigenvalues of a unitary operator $U$ can be determined with precision $\delta = 2^{-n}$ and error probability $\le \varepsilon = 2^{-l}$ by an $O(n(l + \log n))$-size, $O(\log n + \log l)$-depth quantum circuit over the standard basis, with the additional gate $\Lambda_m(U)$, $m = n + \log(l + \log n) + O(1)$. This gate is used in the circuit only once.

Proof. At the core of the usual phase estimation procedure is a sequence of operators $\Lambda(U^{2^k})$, $k = 1, \dots, n-1$, applied to the same main register $A$ with distinct control qubits $1, \dots, t$. (Here $t = 2ns$, which corresponds to $2n$ series of Bernoulli trials, each consisting of $s = \Theta(l + \log n)$ coin tosses. Each series is made to determine a single number $\cos(2^k \varphi_j)$ or $\sin(2^k \varphi_j)$ with the suitable constant precision and error probability $2^{-l}/(2n)$.) We need to parallelize these sequences. This can be done as follows: instead of applying the circuit

$$\Lambda(U^{p_t})[t, A] \cdots \Lambda(U^{p_1})[1, A],$$

we compute $p = p(u_1, \dots, u_t) = u_1 p_1 + \cdots + u_t p_t$ (where $u_1, \dots, u_t \in \mathbb{B}$ are the values of the control qubits), use $p$ as the control parameter for the operator $\Lambda_m(U)$, and un-compute $p$.

To optimize the computation of $p$, we notice that each $p_r$ is of the form $p_r = 2^{k_r}$. The terms in the sum $\sum_{r=1}^{t} u_rp_r$ can be divided into $2s$ groups in such a way that the numbers $k_r$ are distinct within each group. Therefore, each group corresponds to an $n$-digit integer, and there are $2s = O(l+\log n)$ such integers. The sum can be computed by a circuit of size $O(n(l+\log n))$ and depth $O(\log n+\log l)$ (see Problem 2.13a).
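The grouping argument can be illustrated by a short classical sketch. It is only an illustration under stated assumptions: the function names and the layout `u[g][k]` (the control bit of group `g` attached to the exponent `2**k`) are hypothetical, and the sequential Python sum stands in for the small-depth adder circuit of Problem 2.13a.

```python
from typing import List


def exponent_from_groups(u: List[List[int]]) -> int:
    """Compute p = sum of u[g][k] * 2**k by first packing each group.

    Within one group all exponents k are distinct, so the group is just
    the binary expansion of a single n-digit integer; p is the sum of
    these integers (one per group).
    """
    group_values = [sum(bit << k for k, bit in enumerate(bits)) for bits in u]
    return sum(group_values)


def exponent_term_by_term(u: List[List[int]]) -> int:
    """Reference computation: add every term u_r * p_r separately."""
    return sum(bit * 2 ** k for bits in u for k, bit in enumerate(bits))
```

For instance, the three groups `[1, 0, 1]`, `[1, 1, 0]`, `[0, 0, 1]` encode the integers 5, 3 and 4, so both functions return 12.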

Let us estimate the complexity of the remaining part of the procedure. Each gate $\Lambda(U^{2^k})$ is accompanied by two $H$ gates and possibly by one $K$ gate, which contribute $O(t)$ to the size and $O(1)$ to the depth. Further, one needs to count the number of "heads" in each of the $2n$ series of coin flips. This is done by circuits of size $O(s)$ and depth $O(\log s)$. The subsequent trigonometric calculations are performed with constant precision, so each instance of such a calculation is done by a circuit of constant size. Finally, sharpening of the value of $\varphi_j$ is carried out by a circuit of size $O(n)$ and depth $O(\log n)$. All these numbers stay within the required bounds. $\square$

Unfortunately, Theorem 13.3 does not imply that the algorithms for period finding and factoring can be fully parallelized. However, one can derive the following corollary.

Corollary 13.3.1. Period finding and factoring can be performed by a uniform sequence of $O(n^3)$-size, $O((\log n)^2)$-depth quantum circuits, with some classical pre-processing and post-processing. The pre-processing and post-processing are realized by uniform sequences of $O(n^3)$-size Boolean circuits.

Note that if we use Definition 9.2, classical pre-processing does not count, since it can be included into the machine that generates the circuit. (However, the post-processing does count.) The division into three stages is clear from Table 13.1. The pre-processing amounts to modular exponentiation, $(q,a,p)\mapsto a^p \bmod q$. No small depth circuit is known for the solution of this problem. Thus we must compute the numbers $a^{2^j} \bmod q$ ($j = 0,\dots,2n-1$) in advance. The post-processing includes finding the exact value of $\varphi$ (by the continued fraction algorithm) and the calculation of the least common denominator (by Euclid's algorithm).

Proof of Corollary 13.3.1. The operator

$$\Upsilon(U_a) : |p,x\rangle \mapsto |p,\ (a^px \bmod q)\rangle$$

is realized using the construction from the proof of Theorem 7.3. We need to compute $(a^px \bmod q)$ and $(a^{-p}x \bmod q)$ by circuits of small depth. With pre-computed values of $(a^{2^j} \bmod q)$ and $(a^{-2^j} \bmod q)$, the computation of $(a^px \bmod q)$ or $(a^{-p}x \bmod q)$ amounts to multiplying $O(n)$ numbers and calculating the residue mod $q$, which is done by a circuit of size $O(n^3)$ and depth $O((\log n)^2)$. $\square$
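A classical sketch of the pre-computation idea may help. The helper names below are hypothetical, and the loop multiplies the selected factors one after another, whereas a small-depth circuit would combine them in a balanced tree. The sketch assumes $\gcd(a,q)=1$, so that $a$ is invertible modulo $q$, and Python 3.8 or later for `pow(a, -1, q)`.

```python
from typing import List


def precompute_powers(a: int, q: int, num_bits: int) -> List[int]:
    """Pre-processing: the residues a^(2^j) mod q for j = 0..num_bits-1."""
    powers = [a % q]
    for _ in range(num_bits - 1):
        powers.append(powers[-1] * powers[-1] % q)
    return powers


def multiply_by_power(x: int, p: int, powers: List[int], q: int) -> int:
    """Return a^p * x mod q as a product of the factors selected by the bits of p."""
    result = x % q
    for j, factor in enumerate(powers):
        if (p >> j) & 1:
            result = result * factor % q
    return result


def demo(a: int = 2, q: int = 21, p: int = 5, x: int = 3) -> int:
    """Apply x -> a^p x and then x -> a^(-p) x; the input x should come back."""
    num_bits = max(1, p.bit_length())
    forward = precompute_powers(a, q, num_bits)
    backward = precompute_powers(pow(a, -1, q), q, num_bits)
    y = multiply_by_power(x, p, forward, q)
    return multiply_by_power(y, p, backward, q)
```

The second table of residues (for $a^{-1}$) is what makes the reversible construction of Theorem 7.3 possible: the same routine computes both the map and its inverse.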

Remark 13.1. R. Cleve and J. Watrous [18] also noticed that the depth can be decreased at the cost of an increase in size. Indeed, the multiplication of $O(n)$ $n$-digit numbers can be performed with depth $O(\log n)$ and size $O(n^5(\log n)^2)$ (see [9]); therefore the same bound applies to period finding and factoring.

Now we can also prove Theorem 8.3. We will actually consider a more general situation: instead of realizing a single operator, we will try to simulate a circuit (see Theorem 13.5 below). Let us begin with a lemma.

Lemma 13.4. The operator $\Upsilon_n(e^{2\pi i/2^n}) : |l\rangle \mapsto e^{2\pi il/2^n}|l\rangle$ ($0\le l<2^n$) can be realized with precision $\delta = 2^{-n}$ by an $O(n^2\log n)$-size, $O((\log n)^2)$-depth circuit $C_n$ over the standard basis, using ancillas. The circuit $C_n$ can be constructed algorithmically in time $\mathrm{poly}(n)$.

Proof. Let us assume that we have at our disposal an $n$-qubit register in the state

$$|\psi_{n,k}\rangle = \frac{1}{\sqrt{2^n}}\sum_{j=0}^{2^n-1}\exp\Bigl(-2\pi i\,\frac{kj}{2^n}\Bigr)|j\rangle, \tag{13.7}$$

where $k$ is odd. We will now see how it helps to achieve our goal of realizing the operator $\Upsilon_n(e^{2\pi i/2^n})$.

The vector $|\psi_{n,k}\rangle$ is an eigenvector of the permutation operator $X : |j\rangle \mapsto |(j+1) \bmod 2^n\rangle$,

$$X|\psi_{n,k}\rangle = e^{2\pi i\varphi_k}|\psi_{n,k}\rangle, \qquad \varphi_k = k/2^n.$$

Application of a power of $X$ to the target state $|\psi_{n,k}\rangle$ results in multiplication by a phase factor,

$$X^p|\psi_{n,k}\rangle = e^{2\pi i(kp/2^n)}|\psi_{n,k}\rangle.$$

If $k$ is odd, we can choose $p$ to satisfy $kp\equiv l \pmod{2^n}$, which will provide the required phase factor $e^{2\pi il/2^n}$. Thus, for the realization of the operator $|l\rangle\mapsto e^{2\pi il/2^n}|l\rangle$ we use the value of $l$ to compute $p$, apply the operator

$$\Upsilon_n(X) : |p,j\rangle \mapsto |p,\ (j+p) \bmod 2^n\rangle, \qquad p,j\in\{0,\dots,2^n-1\}, \tag{13.8}$$

controlled by this $p$, and "un-compute" $p$. The operator $\Upsilon_n(X)$ can be realized by a circuit of size $O(n)$ and depth $O(\log n)$ over the standard basis.
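The phase kickback described above is easy to confirm numerically for small $n$. The following NumPy sketch is only a check, not part of the construction; the function names are hypothetical, and `pow(k, -1, N)` (Python 3.8 or later) is used to solve $kp\equiv l \pmod{2^n}$ directly.

```python
import numpy as np


def psi(n: int, k: int) -> np.ndarray:
    """State |psi_{n,k}> of (13.7) as a vector of length 2^n."""
    N = 2 ** n
    j = np.arange(N)
    return np.exp(-2j * np.pi * k * j / N) / np.sqrt(N)


def shift_operator(n: int) -> np.ndarray:
    """Permutation matrix of X : |j> -> |(j + 1) mod 2^n>."""
    N = 2 ** n
    X = np.zeros((N, N))
    for j in range(N):
        X[(j + 1) % N, j] = 1.0
    return X


def phase_kickback_holds(n: int, k: int, l: int) -> bool:
    """Check X^p |psi_{n,k}> = e^{2 pi i l / 2^n} |psi_{n,k}> when k p = l (mod 2^n)."""
    N = 2 ** n
    p = (l * pow(k, -1, N)) % N  # k is odd, hence invertible modulo 2^n
    lhs = np.linalg.matrix_power(shift_operator(n), p) @ psi(n, k)
    rhs = np.exp(2j * np.pi * l / N) * psi(n, k)
    return bool(np.allclose(lhs, rhs))
```

Calling `phase_kickback_holds(4, 3, 5)` uses $p = 7$, since $3\cdot 7 = 21 \equiv 5 \pmod{16}$.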

Ideally, we would want to use the vector $|\psi_{n,k}\rangle$ for some particular $k$, say, $k = 1$. But constructing such a vector is not easy, so we will start from a superposition of all odd values of $k$, namely,

$$|\eta\rangle = \frac{1}{\sqrt2}|0\rangle - \frac{1}{\sqrt2}|2^{n-1}\rangle = \frac{1}{\sqrt{2^{n-1}}}\sum_{s=1}^{2^{n-1}}|\psi_{n,2s-1}\rangle.$$

Then we will measure $k = 2s-1$ and solve the equation for $p$. We now describe the required actions.

1. Create the vector $|\eta\rangle = \sigma^z[1]\,H[1]\,|0^n\rangle$.

2. Measure $k$ with error probability $\le\varepsilon = \delta^2/4$. To find $k$, it suffices to determine the phase $\varphi_k = k/2^n$ with precision $\delta = 2^{-n}$. By Theorem 13.3, such phase estimation is realized by a circuit of size $O(n^2)$ and depth $O(\log n)$. The measured value should be odd, $k = 2s-1$. (If it happens to be even, set $k = 1$.)

3. Find $p = p(s,l)$ satisfying the equation $(2s-1)p\equiv l \pmod{2^n}$ (see below).

4. Apply $X^p$ to the $n$-bit register (which presumably contains $|\psi_{n,2s-1}\rangle$). This will effect the desired phase shift.

5. Reverse the computation done at Steps 1-3.

Apart from Step 1 and its reverse, the above procedure can be described symbolically as $W^{-1}VW$, where $W$ represents Steps 2 and 3, and $V$ represents Step 4. Hence the result of Problem 12.2 applies: the procedure realizes the operator $U = \Upsilon_n(e^{2\pi i/2^n})$ with precision $2\sqrt{\varepsilon} = \delta$.

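Step 3 is the most demanding for resources. The solution to the equation $(2s-1)p\equiv l \pmod{2^n}$ can be obtained as follows:

$$p \equiv -l\sum_{j=0}^{m-1}(2s)^j \equiv -l\prod_{r=0}^{t-1}\bigl(1+(2s)^{2^r}\bigr) \pmod{2^n}, \qquad m = 2^t,\quad t = \lceil\log_2 n\rceil.$$

(The product runs over $r = 0,\dots,t-1$, so that it expands to the full geometric sum; then $(2s-1)p \equiv l\,(1-(2s)^m) \equiv l$, because $(2s)^m$ is divisible by $2^m$ and $m\ge n$.) This calculation is done by a circuit of size $O(n^2\log n)$ and depth $O((\log n)^2)$ (cf. solution to Problem 2.14a). $\square$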
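The product formula for $p$ translates directly into a few lines of integer arithmetic. This is a hedged sketch with a hypothetical function name; it follows the formula with the product taken over $r = 0,\dots,t-1$, $t=\lceil\log_2 n\rceil$, using repeated squaring of $2s$.

```python
def solve_odd_congruence(s: int, l: int, n: int) -> int:
    """Return p with (2s - 1) * p = l (mod 2^n).

    Uses p = -l * prod_{r=0}^{t-1} (1 + (2s)^(2^r)) mod 2^n with
    t = ceil(log2 n); the product equals sum_{j < 2^t} (2s)^j.
    """
    modulus = 1 << n
    t = (n - 1).bit_length()  # equals ceil(log2 n) for n >= 1
    x = (2 * s) % modulus     # holds (2s)^(2^r) at the start of round r
    product = 1
    for _ in range(t):
        product = product * (1 + x) % modulus
        x = x * x % modulus
    return (-l * product) % modulus
```

Only $t$ squarings and $t$ multiplications of $n$-digit numbers are needed, which is where the $O((\log n)^2)$ depth estimate comes from when each multiplication has depth $O(\log n)$.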

Theorem 13.5. Any circuit $C$ of size $L$ and depth $d$ over a fixed finite basis $\mathcal{C}$ can be simulated with precision $\delta$ by an $O(Ln+n^2\log n)$-size, $O(d\log n+(\log n)^2)$-depth circuit $\tilde{C}$ over the standard basis (using ancillas), where $n = O(\log(L/\delta))$.

Proof. Due to the results of Problems 8.1 and 8.2, each gate of the original basis $\mathcal{C}$ can be replaced by a constant size circuit over the basis $\mathcal{Q}\cup\{\Lambda(e^{i\varphi}) : \varphi\in\mathbb{R}\}$. Thus the circuit $C$ is transformed into a circuit $C'$ of size $L' = O(L)$ and depth $d' = O(d)$ over the new basis. Each gate $\Lambda(e^{i\varphi})$ can be approximated with precision $\delta' = \delta/(3L')$ by a gate of the form $\Lambda(e^{2\pi il/2^n})$, where $n = \lceil\log_2(1/\delta')\rceil$, and $l$ is an integer. The operator $\Lambda(e^{2\pi il/2^n})$ is a special case of $\Upsilon_n(e^{2\pi i/2^n})$, hence we can use Lemma 13.4. However, the resulting circuit is somewhat larger than required, although it will suffice for the proof of Theorem 8.3 (which corresponds to the case $L = d = 1$).

To optimize the above simulation procedure, let us examine the proof of Lemma 13.4. Most of the resource usage can be attributed to solving the equation $kp\equiv l \pmod{2^n}$. But this step is redundant if $k = 1$. In fact, the operator $\Upsilon_n(e^{2\pi i/2^n})$ can be realized by applying $\Upsilon_n(X)$ (see (13.8)) to the target state $|\psi_{n,1}\rangle$; this is done by a circuit of size $O(n)$ and depth $O(\log n)$. Thus we need to create $L'$ copies of the state $|\psi_{n,1}\rangle$ and use one copy per gate in the simulation of the circuit $C'$. The exact sequence of actions is as follows.

1. Create the state $|\psi_{n,0}\rangle = H^{\otimes n}|0^n\rangle$.

2. Turn it into $|\psi_{n,1}\rangle = \Upsilon_n(e^{-2\pi i/2^n})|\psi_{n,0}\rangle$ by the procedure of Lemma 13.4. This is done with precision $\delta' = 2^{-n}\le\delta/3$. The corresponding circuit has size $O(n^2\log n)$ and depth $O((\log n)^2)$.

3. Make $L'$ copies of the state $|\psi_{n,1}\rangle$ out of one copy (see below).

4. Simulate the circuit $C'$ with precision $\delta/3$, using one copy of $|\psi_{n,1}\rangle$ per gate.

5. Reverse Steps 1-3.

To produce multiple copies of the state $|\psi_{n,1}\rangle$, we can use the equation

$$|\psi_{n,k}\rangle^{\otimes m} = W^{-1}\bigl(|\psi_{n,0}\rangle^{\otimes(m-1)}\otimes|\psi_{n,k}\rangle\bigr),$$

$$W : |x_1,\dots,x_{m-1},x_m\rangle \mapsto |x_1,\dots,x_{m-1},\ x_1+\cdots+x_m\rangle.$$

The operators $W$ and $W^{-1}$ are realized using the construction from the proof of Theorem 7.3. This involves the addition of $m$ $n$-digit numbers, which is done by a Boolean circuit of size $O(nm)$ and depth $O(\log n+\log m)$ (see Problem 2.13a). In our case $m = L' = O(L)$. The overall size and depth of the resulting quantum circuit are as required. $\square$
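The copying identity can be verified numerically in the smallest nontrivial case $m = 2$. The sketch below is an illustration only; the function names are hypothetical, and it assumes that the sum in the definition of $W$ is taken modulo $2^n$, so that $W$ is a permutation of the basis states.

```python
import numpy as np


def psi(n: int, k: int) -> np.ndarray:
    """State |psi_{n,k}> of (13.7) as a vector of length 2^n."""
    N = 2 ** n
    j = np.arange(N)
    return np.exp(-2j * np.pi * k * j / N) / np.sqrt(N)


def adder(n: int) -> np.ndarray:
    """W for m = 2: |x1, x2> -> |x1, (x1 + x2) mod 2^n> as a permutation matrix."""
    N = 2 ** n
    W = np.zeros((N * N, N * N))
    for x1 in range(N):
        for x2 in range(N):
            W[x1 * N + (x1 + x2) % N, x1 * N + x2] = 1.0
    return W


def copies_from_one(n: int, k: int) -> bool:
    """Check W^{-1} (|psi_{n,0}> (x) |psi_{n,k}>) = |psi_{n,k}> (x) |psi_{n,k}>."""
    W = adder(n)
    source = np.kron(psi(n, 0), psi(n, k))
    target = np.kron(psi(n, k), psi(n, k))
    return bool(np.allclose(W.T @ source, target))  # W is real orthogonal
```

The check works because applying $W$ to $|\psi_{n,k}\rangle^{\otimes 2}$ moves the whole phase $e^{-2\pi ik(x_1+x_2)/2^n}$ onto the last register, leaving the uniform superposition $|\psi_{n,0}\rangle$ in the first.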

[3!] Problem 13.4. Let $q\ge1$, $n = \lceil\log_2 q\rceil$, $\delta = 2^{-l}$. Realize the Fourier transform on the group $\mathbb{Z}_q$ with precision $\delta$ by a quantum circuit of size $\mathrm{poly}(n,l)$ and depth $\mathrm{poly}(\log n,\log l)$ over the standard basis. Estimate the size and the depth of the circuit more accurately. (For the definition of the quantum Fourier transform, see Problem 9.4c.)

13.8. The hidden subgroup problem for $\mathbb{Z}^k$. The algorithms discovered by Simon and Shor can be generalized to a rather broad class of problems connected with Abelian groups. The most general of these is the hidden subgroup problem for $\mathbb{Z}^k$ [12], to which the hidden subgroup problem in an arbitrary finitely generated Abelian group $G$ can be reduced. (Indeed, $G$ can be represented as a quotient group of $\mathbb{Z}^k$ for some $k$.)

A "hidden subgroup" $D\subseteq\mathbb{Z}^k$ (as defined on page 117) has finite index: the order of the group $E = \mathbb{Z}^k/D$ does not exceed $2^n$. Therefore $D\cong\mathbb{Z}^k$. From the computational viewpoint, $D$ is given by a basis $(g_1,\dots,g_k)$ whose binary representation has length $\mathrm{poly}(k,n)$. Any such basis gives a solution to the problem. (The equivalence of two bases can be verified by a polynomial algorithm.)

The problem of computing the period is a special case of the hidden subgroup problem in $\mathbb{Z}$. Recall that $\mathrm{per}_q(a) = \min\{t\ge1 : a^t\equiv1 \pmod q\}$. The function $f : x\mapsto a^x \pmod q$ satisfies condition (13.1), where $D = \{m\,\mathrm{per}_q(a) : m\in\mathbb{Z}\}$. This function is polynomially computable, hence an arbitrary polynomial algorithm for finding a hidden subgroup can be transformed into a polynomial algorithm for calculating the period.

The well-known problem of calculating the discrete logarithm can be reduced to the hidden subgroup problem for $\mathbb{Z}^2$. The smallest positive integer $s$ such that $\zeta^s = a$, where $\zeta$ is a generator of the group $(\mathbb{Z}/q\mathbb{Z})^*$, is called the discrete logarithm of a number $a$ at base $\zeta$. Consider the function $f : (x_1,x_2)\mapsto\zeta^{x_1}a^{x_2} \bmod q$. This function also satisfies condition (13.1), where $D = \{(x_1,x_2)\in\mathbb{Z}^2 : \zeta^{x_1}a^{x_2}\equiv1 \pmod q\}$. If we know a basis of the subgroup $D\subseteq\mathbb{Z}^2$, it is easy to find an element of the form $(s,-1)\in D$. Then $\zeta^s = a$, i.e., $s$ is the discrete logarithm of $a$ at base $\zeta$.
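A toy classical example shows how an element $(s,-1)\in D$ yields the discrete logarithm. The sketch enumerates $D$ by brute force, which is of course exponential and only meant to make the definition concrete. The function names are hypothetical, and it is assumed that $q$ is prime and that `zeta` generates $(\mathbb{Z}/q\mathbb{Z})^*$, so exponents can be reduced modulo $q-1$.

```python
from typing import List, Tuple


def hidden_subgroup_elements(zeta: int, a: int, q: int, bound: int) -> List[Tuple[int, int]]:
    """All (x1, x2) in [0, bound)^2 with zeta^x1 * a^x2 = 1 (mod q)."""
    return [(x1, x2)
            for x1 in range(bound)
            for x2 in range(bound)
            if pow(zeta, x1, q) * pow(a, x2, q) % q == 1]


def discrete_log_from_subgroup(zeta: int, a: int, q: int) -> int:
    """Read off s from an element of D congruent to (s, -1) modulo the group order."""
    order = q - 1  # assumption: q is prime and zeta is a generator
    for x1, x2 in hidden_subgroup_elements(zeta, a, q, order):
        if x2 == order - 1:  # x2 = -1 modulo the group order
            return x1 if x1 > 0 else order  # smallest positive s
    raise ValueError("no element of the form (s, -1) found")
```

With $q = 7$, $\zeta = 3$ and $a = 4$ the pair $(4, 5)$ lies in $D$ (note $5\equiv-1 \pmod 6$), and indeed $3^4\equiv4 \pmod 7$.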

Let us describe a quantum algorithm that solves the hidden subgroup problem for $G = \mathbb{Z}^k$. It is analogous to the algorithm for the case $G = (\mathbb{Z}_2)^k$, but instead of the operator $H^{\otimes k}$ we use the procedure for measuring the eigenvalues. Instead of a basis for the group $D$ we will look for a system of generators of the character group $E^* = \mathrm{Hom}(E,\mathbf{U}(1))$ (the transition from $E^*$ to $D$ is realized by a polynomial algorithm; see, for example, [54, Volume 1]). The character

$$(g_1,\dots,g_k)\mapsto\exp\Bigl(2\pi i\sum_j\varphi_jg_j\Bigr)$$

is determined by the set $\varphi_1,\dots,\varphi_k$ of numbers modulo 1. These are rational numbers with denominators not exceeding $|E^*|\le2^n$.

If we produce $l = n+3$ uniformly distributed random characters $(\varphi_1^{(1)},\dots,\varphi_k^{(1)}),\dots,(\varphi_1^{(l)},\dots,\varphi_k^{(l)})$, then they will generate the entire group $E^*$ with probability $\ge1-1/2^{l-n} = 1-1/8$ (see Problem 13.1). It suffices to know each $\varphi_j^{(r)}$ with precision $\delta$ and probability of error $\le\varepsilon$, where

$$\delta\le\frac{1}{2^{2n+1}}, \qquad \varepsilon\le\frac{1}{5kl}. \tag{13.9}$$

The last condition guarantees that the total probability of error does not exceed $1/8+1/5<1/3$.

Let us choose a sufficiently large number $M = 2^m$ (a concrete estimate can be obtained by analyzing the algorithm). We will work with integers between $0$ and $M-1$.

Let us prepare, in one quantum register of length $km$, the state

$$|\xi\rangle = M^{-k/2}\sum_{g\in\Delta}|g\rangle, \qquad\text{where } \Delta = \{0,\dots,M-1\}^k.$$

In another register we put $|0^n\rangle$. We apply the quantum oracle (13.2) and then discard the second register. We obtain the mixed state

$$\rho = \mathrm{Tr}_{[km+1,\dots,km+n]}\Bigl(U\bigl(|\xi\rangle\langle\xi|\otimes|0^n\rangle\langle0^n|\bigr)U^\dagger\Bigr) = M^{-k}\sum_{g,h\in\Delta:\ g-h\in D}|g\rangle\langle h|.$$

Now we are going to measure the eigenvalues of the shift (mod $M$) operators

$$V_j : |g_1,\dots,g_j,\dots,g_k\rangle\mapsto|g_1,\dots,(g_j+1) \bmod M,\dots,g_k\rangle$$

(only the $j$-th component changes). These operators commute, so that they have a common basis of eigenvectors, and therefore we can determine their eigenvalues simultaneously. The eigenvalues have the form $e^{2\pi is_j/M}$. The corresponding eigenvectors are

$$|\xi_{s_1,\dots,s_k}\rangle = M^{-k/2}\sum_{(g_1,\dots,g_k)\in\Delta}\exp\Bigl(-2\pi i\sum_{j=1}^{k}\frac{g_js_j}{M}\Bigr)|g_1,\dots,g_k\rangle.$$

The probability that a given set $(s_1,\dots,s_k)$ will be realized equals

$$\mathbf{P}(\rho,\mathcal{L}_{s_1,\dots,s_k}) = \langle\xi_{s_1,\dots,s_k}|\,\rho\,|\xi_{s_1,\dots,s_k}\rangle = M^{-2k}\sum_{g,h\in\mathbb{Z}^k}\chi_D(g-h)\,\chi_\Delta(g)\,\chi_\Delta(h)\exp\Bigl(2\pi i\sum_{j=1}^{k}\frac{(g_j-h_j)s_j}{M}\Bigr),$$

where $\chi_A(\cdot)$ denotes the characteristic function of the set $A$. The Fourier transform of the product is the convolution of the Fourier transforms of the factors. Therefore,

$$\mathbf{P}(\rho,\mathcal{L}_{s_1,\dots,s_k}) = \frac{1}{|E^*|}\sum_{(\varphi_1,\dots,\varphi_k)\in E^*}p_{\varphi_1,\dots,\varphi_k}(s_1,\dots,s_k), \tag{13.10}$$

where

$$p_{\varphi_1,\dots,\varphi_k}(s_1,\dots,s_k) = \prod_{j=1}^{k}\left(\frac{\sin\bigl(M\pi(s_j/M-\varphi_j)\bigr)}{M\sin\bigl(\pi(s_j/M-\varphi_j)\bigr)}\right)^2.$$

For a given element $(\varphi_1,\dots,\varphi_k)\in E^*$ the function $p_{\varphi_1,\dots,\varphi_k}$ is a probability distribution. Therefore equation (13.10) can be modeled by the following process: first, a random uniformly distributed element $(\varphi_1,\dots,\varphi_k)\in E^*$ is generated; second, the parameters $s_1,\dots,s_k$ are set according to the conditional probabilities $p_{\varphi_1,\dots,\varphi_k}(s_1,\dots,s_k)$. The conditional probabilities have the following property:

$$\Pr\bigl[\,|s_j/M-\varphi_j|>\beta\,\bigr]\le\frac{1}{M\beta}$$

for any $\beta>0$. If we estimate the quantities $s_j/M$ with precision $\beta$ and error probability $\le1/(M\beta)$, we obtain the values of $\varphi_1,\dots,\varphi_k$ with precision $\delta = 2\beta$ and error probability $\le\varepsilon = 2/(M\beta)$. It remains to choose the numbers $M$ and $\beta$ so that inequality (13.9) is satisfied.
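Both claims about the conditional probabilities (normalization and the tail bound) can be checked numerically for a single component. This is a sketch with hypothetical function names; it evaluates the factor of $p_\varphi$ as the squared modulus of a finite geometric sum, which equals the ratio of sines and avoids dividing by zero, and it assumes that $|s/M-\varphi|$ is understood as a distance modulo 1.

```python
import numpy as np


def component_distribution(M: int, phi: float) -> np.ndarray:
    """p_phi(s) for s = 0..M-1, i.e. (sin(M pi x) / (M sin(pi x)))^2 with x = s/M - phi."""
    s = np.arange(M)
    g = np.arange(M)
    # |(1/M) * sum_g exp(2 pi i g (phi - s/M))|^2 equals the squared sine ratio.
    amplitudes = np.exp(2j * np.pi * np.outer(phi - s / M, g)).mean(axis=1)
    return np.abs(amplitudes) ** 2


def tail_probability(M: int, phi: float, beta: float) -> float:
    """Probability that s/M is farther than beta from phi (distance taken modulo 1)."""
    s = np.arange(M)
    diff = np.abs(s / M - phi) % 1.0
    distance = np.minimum(diff, 1.0 - diff)
    return float(component_distribution(M, phi)[distance > beta].sum())
```

For example, with $M = 64$ and $\varphi = 1/3$ the tail beyond $\beta = 0.05$ stays well below the bound $1/(M\beta)\approx0.31$.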

Complexity of the algorithm. We need $O(n)$ queries to the oracle, each query being of length $O(k(n+\log k))$. The size of the quantum circuit is estimated as $O(kn^3)\,\mathrm{poly}(\log k,\log n)$.


14. The quantum analogue of NP: the class BQNP

It is possible to construct quantum analogues not only for the class P, but also for other classical complexity classes. This is not a routine process, but suitable generalizations often come up naturally. We will consider the class NP as an example. (For another example, the class IP and its quantum analogue QIP, see [72, 38]. We also mention that the quantum analogue of PSPACE equals PSPACE [71].)

14.1. Modification of classical definitions. Quantum computation, as well as probabilistic computation, is more naturally described using partially defined functions. Earlier we made do without this concept so as not to complicate matters by the inclusion of extraneous detail, but now we need it.

A partially defined Boolean function is a function

$$F : \mathbb{B}^n\to\{0,1,\text{"undefined"}\}.$$

In this section it will be tacitly understood that by Boolean function we mean partially defined Boolean function.

One more comment regarding notation: we have used the symbol P both for the class of polynomially computable functions and for the class of polynomially decidable predicates; now we act analogously, using the notations P, NP, etc. for classes of partially defined functions.

P, of course, denotes the class of polynomially computable partially defined functions. We introduce a modified definition of the class NP.

Definition 14.1. A function $F : \mathbb{B}^n\to\{0,1,\text{"undefined"}\}$ belongs to the class NP if there is a partially defined function $R\in\mathrm{P}$ in two variables such that

$$F(x) = 1 \implies \exists\,y\ \bigl((|y|<q(|x|))\wedge(R(x,y) = 1)\bigr),$$
$$F(x) = 0 \implies \forall\,y\ \bigl((|y|<q(|x|))\Rightarrow(R(x,y) = 0)\bigr).$$

As before, $q(\cdot)$ is a polynomial.

What would change if in Definition 14.1 we replaced the condition $R\in\mathrm{P}$ by the condition $R\in\mathrm{BPP}$? First of all, we would get a different, broader class, which we could denote by BNP. However, for this class there is another, standard notation, MA, indicating that it falls into a hierarchy of classes defined by Arthur-Merlin games. We have mentioned Arthur and Merlin in connection with the definition of NP. We have also discussed games corresponding to other complexity classes (see Section 5.1). Traditionally, the term "Arthur-Merlin games" is used for probabilistic games in which Arthur is a polynomial Turing machine whereas Merlin is all-powerful; before each move Arthur flips coins so that both players see them. The order of the letters in the symbol MA indicates the order of the moves: at first Merlin communicates $y$, then Arthur checks the truth of the predicate $R(x,y)$ by a polynomial probabilistic computation. The message $y$ is sometimes called a "proof"; it may be hard to find but easy to check.

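14.2. Quantum definition by analogy.

Definition 14.2. A function $F : \mathbb{B}^n\to\{0,1,\text{"undefined"}\}$ belongs to the class BQNP if there exists a polynomial classical algorithm that computes a function $x\mapsto Z(x)$, where $Z(x)$ is a description of a quantum circuit realizing an operator $U_x : \mathcal{B}^{\otimes N_x}\to\mathcal{B}^{\otimes N_x}$ such that

$$F(x) = 1 \implies \exists\,|\xi\rangle\in\mathcal{B}^{\otimes m_x}\quad \mathbf{P}\bigl(U_x|\xi\rangle\otimes|0^{N_x-m_x}\rangle,\ \mathcal{M}\bigr)\ge p_1,$$
$$F(x) = 0 \implies \forall\,|\xi\rangle\in\mathcal{B}^{\otimes m_x}\quad \mathbf{P}\bigl(U_x|\xi\rangle\otimes|0^{N_x-m_x}\rangle,\ \mathcal{M}\bigr)\le p_0.$$

Here $\mathcal{M} = \mathbb{C}(|1\rangle)\otimes\mathcal{B}^{\otimes(N_x-1)}$, and $p_0$, $p_1$ satisfy the condition $p_1-p_0\ge\Omega(n^{-\alpha})$ for some constant $\alpha\ge0$. The quantifiers of $|\xi\rangle$ include only vectors of unit length. (We will use an analogous convention further on in this section, pushing numeric factors outside the $|\cdot\rangle$ sign.)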

The vector $|\xi\rangle$ plays the role of $y$ in the previous definition. Note that $m_x\le N_x\le|Z(x)| = \mathrm{poly}(|x|)$ since the algorithm is polynomial.

In effect, the very same game of Merlin with Arthur is taking place, but now it is governed by the laws of quantum mechanics. Merlin sends a quantum message (the state $|\xi\rangle$) to Arthur, who checks it by applying the operator $U_x$. A suitable message will convince Arthur that $F(x) = 1$ (if this is actually so) with probability $\ge p_1$. But if $F(x) = 0$, Merlin cannot succeed in convincing Arthur to the contrary with probability higher than $p_0$, whatever message he sends. Instead of a pure state $|\xi\rangle$, we can allow Merlin to send an arbitrary density matrix: the maximum of the probability is achieved on a pure state anyway.

In Definition 14.2, we have the same flexibility in choosing the threshold probabilities $p_0$ and $p_1$ as in the definitions of BPP and BQP.

Lemma 14.1 (amplification of probabilities). If $F\in\mathrm{BQNP}$, then it likewise satisfies a variant of Definition 14.2 in which the numbers $p_0$, $p_1$ ($p_1-p_0 = \Omega(n^{-\alpha})$) are replaced by

$$p_1' = 1-\varepsilon, \qquad p_0' = \varepsilon, \qquad \varepsilon = \exp(-\Omega(n^\beta)),$$

where $\beta$ is an arbitrary positive constant.

Proof. The general idea of amplifying the probabilities remains as before: we consider $k = \mathrm{poly}(n)$ copies of the circuit realizing the operator $U = U_x$. To the results of their work we apply a variant of the majority function, with the threshold value adjusted so as to separate $p_0$ from $p_1$:

$$G(z_1,\dots,z_k) = \begin{cases}1 & \text{if } \sum_{j=1}^{k}z_j\ge pk,\\[2pt] 0 & \text{if } \sum_{j=1}^{k}z_j<pk,\end{cases} \tag{14.1}$$

where $p = (p_0+p_1)/2$. But now there appears an additional difficulty: Merlin may attempt to deceive Arthur by sending him a message that is not factorable into a tensor product.

Let us grant Merlin greater freedom, allowing him to submit any density matrix $\rho\in\mathbf{L}(\mathcal{B}^{\otimes km})$. The probability of obtaining outcomes $z_1,\dots,z_k$ by applying $k$ copies of $U$ to the message $\rho$ is as follows:

$$\mathbf{P}(z_1,\dots,z_k\,|\,\rho) = \mathrm{Tr}\bigl((X^{(z_1)}\otimes\cdots\otimes X^{(z_k)})\,\rho\bigr), \tag{14.2}$$

where

$$X^{(a)} = \mathrm{Tr}_{[m+1,\dots,N]}\Bigl(U^\dagger\,\Pi_1^{(a)}\,U\,\bigl(I_{\mathcal{B}^{\otimes m}}\otimes|0^{N-m}\rangle\langle0^{N-m}|\bigr)\Bigr). \tag{14.3}$$

Here $\Pi_1^{(a)}$ is the projection onto the subspace of states having $a$ in the first qubit (i.e., $\mathbb{C}(|a\rangle)\otimes\mathcal{B}^{\otimes(N-1)}$).

If $F(x) = 1$, Merlin can simply send the state $\rho = \rho_x^{\otimes k}$, where $\rho_x = |\xi_x\rangle\langle\xi_x|$ is the message that would convince Arthur with probability $\ge p_1$ in the original version of the game (with a single copy of $U$). By the general properties of quantum probability, formula (14.2) takes the form

$$\mathbf{P}(z_1,\dots,z_k\,|\,\rho) = \prod_{j=1}^{k}\mathrm{Tr}\bigl(X^{(z_j)}\rho_x\bigr) = \prod_{j=1}^{k}\mathbf{P}(z_j\,|\,\rho_x).$$

We will derive a lower bound for this quantity from a more general analysis given below.

In the opposite case, $F(x) = 0$, we will obtain an upper bound for the probability $\mathbf{P}(z_1,\dots,z_k\,|\,\rho)$ over all density matrices $\rho$.

Let us select an orthonormal basis in the space $\mathcal{B}^{\otimes m}$ in which the operator $X^{(1)}$ is diagonalized (this operator is clearly Hermitian). The operator $X^{(0)} = I-X^{(1)}$ is diagonal in the same basis. We define a set of "conditional probabilities" $p(z|d) = \mathbf{P}\bigl(z\,\big|\,|d\rangle\langle d|\bigr) = \langle d|X^{(z)}|d\rangle$, where $|d\rangle$ is one of the basis vectors. It is obvious that $p(z|d)\ge0$ and $p(0|d)+p(1|d) = 1$. In this notation, the quantity $\mathbf{P}(z_1,\dots,z_k\,|\,\rho)$ becomes

$$\mathbf{P}(z_1,\dots,z_k\,|\,\rho) = \sum_{d_1,\dots,d_k}p_{d_1,\dots,d_k}\,p(z_1|d_1)\cdots p(z_k|d_k), \qquad \sum_{d_1,\dots,d_k}p_{d_1,\dots,d_k} = 1,$$

where $p_{d_1,\dots,d_k} = \langle d_1,\dots,d_k|\,\rho\,|d_1,\dots,d_k\rangle$.

This formula has the following interpretation. Consider the set of probabilities $\mathbf{P}(z_1,\dots,z_k\,|\,\rho)$ for all sequences $(z_1,\dots,z_k)\in\mathbb{B}^k$ as a vector in a $2^k$-dimensional real space; we denote this vector by $\vec{\mathbf{P}}(\rho)\in\mathbb{R}^{\mathbb{B}^k}$. We have shown that for a general density matrix $\rho$ the vector $\vec{\mathbf{P}}(\rho)$ belongs to the convex hull of such vectors corresponding to product states, namely, $|d_1\rangle\langle d_1|\otimes\cdots\otimes|d_k\rangle\langle d_k|$. Therefore the probability of the event $G(z_1,\dots,z_k) = 1$,

$$\Pr\bigl[G(z_1,\dots,z_k) = 1\,\big|\,\rho\bigr] = \sum_{z\in\mathbb{B}^k}G(z)\,\mathbf{P}(z|\rho) = \bigl(\vec{G},\vec{\mathbf{P}}(\rho)\bigr),$$

achieves its maximum at a density matrix of this special type.

In the case where $G$ is the threshold function (14.1),

$$p_{\max}\ \stackrel{\mathrm{def}}{=}\ \max_\rho\Pr\bigl[G(z_1,\dots,z_k) = 1\,\big|\,\rho\bigr] = \sum_{j\ge pk}\binom{k}{j}p_*^{\,j}(1-p_*)^{k-j},$$

where $p_* = \max_{|\xi\rangle}\langle\xi|X^{(1)}|\xi\rangle$. The number $p_{\max}$ equals the probability of getting $\ge pk$ "heads" for $k$ coins tossed, $\Pr\bigl[k^{-1}\sum_{j=1}^{k}v_j\ge p\bigr]$, where $\Pr[v_j = 1] = p_*$. This probability can be estimated using Chernoff's inequality (13.4). Thus we obtain (see footnote 11)

$$p_{\max}\le\exp\bigl(-2(p-p_*)^2k\bigr)\quad\text{if } p\ge p_*,$$
$$p_{\max}\ge1-\exp\bigl(-2(p-p_*)^2k\bigr)\quad\text{if } p\le p_*.$$

According to the assumptions of the lemma, $p_*\le p_0$ if $F(x) = 0$, and $p_*\ge p_1$ if $F(x) = 1$. We have chosen $p$ so that $p-p_0\ge\Omega(n^{-\alpha})$ and $p_1-p\ge\Omega(n^{-\alpha})$. Choosing $k = cn^{2\alpha+\beta}$ (for a suitable constant $c$), we get exactly the estimate which is required,

$$p_{\max}\le\exp(-\Omega(n^\beta))\quad\text{if } F(x) = 0,$$
$$p_{\max}\ge1-\exp(-\Omega(n^\beta))\quad\text{if } F(x) = 1.$$

$\square$

Remark 14.1. An important point in the proof was the fact that $X^{(0)}$ and $X^{(1)}$ are diagonalized over the same basis. In general, the amplification of probability for nontrivial complexity classes (both classical and quantum) is a rather subtle thing.
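The Chernoff-type estimate used in the proof of Lemma 14.1 can be compared with the exact binomial tail for concrete numbers. The sketch below uses a hypothetical function name and evaluates the acceptance probability of the threshold test for a product message in which every copy is accepted independently with probability $p_*$; it is a numerical illustration, not a proof.

```python
from math import comb, exp


def threshold_accept_probability(p_star: float, p: float, k: int) -> float:
    """Probability that at least p*k of k independent coins with bias p_star show heads."""
    return sum(comb(k, j) * p_star ** j * (1.0 - p_star) ** (k - j)
               for j in range(k + 1) if j >= p * k)


def chernoff_bound(p_star: float, p: float, k: int) -> float:
    """The quantity exp(-2 (p - p_star)^2 k) appearing in the proof."""
    return exp(-2.0 * (p - p_star) ** 2 * k)
```

With $p_0 = 0.4$, $p_1 = 0.6$, threshold $p = 0.5$ and $k = 200$ copies, the bound is $e^{-4}\approx0.018$, and the exact probabilities lie on the correct side of it in both cases.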

14.3. Complete problems. Similarly to the class NP, the class BQNP has complete problems. Completeness is understood with respect to the same polynomial reduction that we considered earlier (i.e., Karp reduction; see Definition 3.4). Here is the simplest example.

Footnote 11. We have omitted the factor 2 from (13.4) because it includes both cases that are now considered separately (see the proof of Chernoff's inequality in the solution to Problem 13.3). This is an unimportant factor anyway.


Problem 0. Consider a function $F$ which is defined on a subset of the words of this form:

$$z = \bigl(\text{description of a quantum circuit } U,\ p_0,\ p_1\bigr),$$

where by description of a circuit we mean its approximate realization in the standard basis, and $p_0$, $p_1$ are such that $p_1-p_0\ge\Omega(n^{-\alpha})$ ($n$ is the size of the circuit, $\alpha>0$ is a constant). The function $F$ is defined as follows:

$F(z) = 1 \iff$ there exists a vector $|\xi\rangle$ on which we get 1 in the first bit with probability greater than $p_1$;

$F(z) = 0 \iff$ for all $|\xi\rangle$ the probability of getting 1 in the first bit is smaller than $p_0$.

The completeness of Problem 0 is obvious: by saying that the problem is complete we just rephrase Definition 14.2.

We now consider more interesting examples. To start, we define a quantum analogue of 3-CNF, the local Hamiltonian (locality is the analogue of the condition that the number of variables in each clause is bounded).

Definition 14.3. An operator $H : \mathcal{B}^{\otimes n}\to\mathcal{B}^{\otimes n}$ is called a $k$-local Hamiltonian if it is expressible in the form

$$H = \sum_jH_j[S_j],$$

where each term $H_j\in\mathbf{L}(\mathcal{B}^{\otimes|S_j|})$ is a Hermitian operator acting on a set of qubits $S_j$, $|S_j|\le k$.

In addition, we put a normalization condition, namely, $0\le H_j\le1$, meaning that both $H_j$ and $I-H_j$ are nonnegative.

Problem 1: the local Hamiltonian. Let

$$z = \bigl(\text{description of a } k\text{-local Hamiltonian } H,\ a,\ b\bigr),$$

where $k = O(1)$, $0\le a<b$, $b-a = \Omega(n^{-\alpha})$ ($\alpha>0$ is a constant). Then

$F(z) = 1 \iff H$ has an eigenvalue not exceeding $a$;

$F(z) = 0 \iff$ all eigenvalues of $H$ are greater than $b$.

Proposition 14.2. The problem local Hamiltonian belongs to BQNP.

Proof. At the outset we describe the general idea. We construct a circuit $W$ that can be applied to a state $|\eta\rangle\in\mathcal{B}^{\otimes n}$ so as to produce a result 1 or 0 ("yes" or "no"): it says whether Arthur accepts the submitted state or not. The answer "yes" will occur with probability $p = 1-r^{-1}\langle\eta|H|\eta\rangle$, where $r$ is the number of terms in the Hamiltonian $H$. If $|\eta\rangle$ is an eigenvector corresponding to an eigenvalue $\lambda\le a$, then the probability of the answer "yes" is

$$p = 1-r^{-1}\langle\eta|H|\eta\rangle = 1-r^{-1}\lambda\ge1-r^{-1}a,$$

and if every eigenvalue of $H$ exceeds $b$, then

$$p = 1-r^{-1}\langle\eta|H|\eta\rangle\le1-r^{-1}b.$$

At first we construct such a circuit for a single term. This will be just a realization of the POVM measurement corresponding to the decomposition $I = H_j+(I-H_j)$. We could use the general result about POVM measurements (see Problem 11.8), but let us give an explicit construction from scratch.

Let $H_j = \sum_s\lambda_s|\psi_s\rangle\langle\psi_s|$. This operator acts on a bounded number of qubits, $|S_j|\le k$. Therefore we can realize the operator

$$W_j : |\psi_s,0\rangle\mapsto|\psi_s\rangle\otimes\bigl(\sqrt{\lambda_s}\,|0\rangle+\sqrt{1-\lambda_s}\,|1\rangle\bigr)$$

by a circuit of constant size. It acts on the set of qubits $S_j\cup\{\text{"answer"}\}$, where "answer" denotes the qubit that will contain the measurement outcome.

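We compute the probability that the outcome is 1. Let $|\eta\rangle = \sum_sy_s|\psi_s\rangle$ be the expansion of $|\eta\rangle$ in the orthogonal system of eigenvectors of $H_j$. We have, by definition of the probability,

$$\begin{aligned}
\mathbf{P}_j(1) &= \langle\eta,0|\,W_j^\dagger\bigl(I\otimes\underbrace{|1\rangle\langle1|}_{\text{answer}}\bigr)W_j\,|\eta,0\rangle\\
&= \Bigl(\sum_sy_s^*\langle\psi_s,0|\Bigr)W_j^\dagger\bigl(I\otimes\underbrace{|1\rangle\langle1|}_{\text{answer}}\bigr)W_j\Bigl(\sum_ty_t|\psi_t,0\rangle\Bigr)\\
&= \sum_{s,t}\sqrt{1-\lambda_s}\,y_s^*\,\sqrt{1-\lambda_t}\,y_t\,\langle\psi_s|\psi_t\rangle = \sum_s(1-\lambda_s)\,y_s^*y_s = 1-\langle\eta|H_j|\eta\rangle.
\end{aligned}$$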

The general circuit $W$ chooses the integer $j$ randomly and uniformly, after which it applies the corresponding operator $W_j$. This procedure can be realized by the measuring operator $\sum_j|j\rangle\langle j|\otimes W_j$, applied to the initial vector $\bigl(\tfrac{1}{\sqrt r}\sum_j|j\rangle\bigr)\otimes|\eta,0\rangle$. (Here $|j\rangle$ denotes a basis vector in an auxiliary $r$-dimensional space.) The probability of getting the outcome 1 is

$$\mathbf{P}(1) = \sum_j\frac1r\,\mathbf{P}_j(1) = \sum_j\frac1r\bigl(1-\langle\eta|H_j|\eta\rangle\bigr) = 1-r^{-1}\langle\eta|H|\eta\rangle.$$

$\square$
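The acceptance probability computed in this proof can be reproduced numerically. The sketch below is a simplification under stated assumptions: each term is given as a full matrix on a small space (rather than as an operator on a few qubits tensored with the identity), the function names are hypothetical, and the terms are random positive semidefinite matrices rescaled so that $0\le H_j\le1$.

```python
import numpy as np


def accept_probability_term(H_j: np.ndarray, eta: np.ndarray) -> float:
    """P_j(1) for W_j : |psi_s, 0> -> |psi_s> (sqrt(lam_s)|0> + sqrt(1 - lam_s)|1>)."""
    lam, vecs = np.linalg.eigh(H_j)
    lam = np.clip(lam, 0.0, 1.0)           # guard against rounding outside [0, 1]
    y = vecs.conj().T @ eta                # coefficients y_s of |eta>
    branch_one = np.sqrt(1.0 - lam) * y    # amplitudes attached to answer = 1
    return float(np.sum(np.abs(branch_one) ** 2))


def accept_probability(terms, eta: np.ndarray) -> float:
    """P(1): the term index j is chosen uniformly, then W_j is applied."""
    return float(np.mean([accept_probability_term(h, eta) for h in terms]))


def random_term(dim: int, rng: np.random.Generator) -> np.ndarray:
    """A random Hermitian matrix with spectrum inside [0, 1]."""
    b = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    a = b @ b.conj().T
    return a / np.linalg.eigvalsh(a)[-1]
```

For any unit vector `eta` the value `accept_probability(terms, eta)` should agree with $1-r^{-1}\langle\eta|H|\eta\rangle$, where $H$ is the sum of the terms and $r$ is their number.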


14.4. Local Hamiltonian is BQNP-complete.

Theorem 14.3. The problem local Hamiltonian is BQNP-complete with respect to the Karp reduction.

The rest of this section constitutes a proof of this theorem. The main idea goes back to Feynman [24]: replacing a unitary evolution by a time independent Hamiltonian (i.e., transition from the circuit to a local Hamiltonian).

Thus, suppose we have a circuit $U = U_L\cdots U_1$ of size $L$. We will assume that $U$ acts on $N$ qubits, the first $m$ of which initially contain Merlin's message $|\xi\rangle$, the rest being initialized by 0. The gates $U_j$ act on pairs of qubits.

14.4.1. The Hamiltonian associated with the circuit. It acts on the space

$$\mathcal{L} = \mathcal{B}^{\otimes N}\otimes\mathbb{C}^{L+1},$$

where the first factor is the space on which the circuit acts, whereas the second factor is the space of a step counter (clock). The Hamiltonian consists of three terms which will be defined later,

$$H = H_{\mathrm{in}}+H_{\mathrm{prop}}+H_{\mathrm{out}}.$$

We are interested in the minimum eigenvalue of this Hamiltonian, or the minimum of the cost function $f(|\eta\rangle) = \langle\eta|H|\eta\rangle$ over all vectors $|\eta\rangle$ of unit length. We will try to arrange that the Hamiltonian has a small eigenvalue if and only if there exists a quantum state $|\xi\rangle\in\mathcal{B}^{\otimes m}$ causing $U$ to output 1 with high probability. In such a case, the minimizing vector $|\eta\rangle$ will be related to that $|\xi\rangle$ in the following way:

$$|\eta\rangle = \frac{1}{\sqrt{L+1}}\sum_{j=0}^{L}U_j\cdots U_1|\xi,0\rangle\otimes|j\rangle.$$

In constructing the terms of the Hamiltonian, we will try to "enforce" this structure of the vector $|\eta\rangle$ by imposing "penalties" that increase the cost function whenever $|\eta\rangle$ deviates from the indicated form.

The term $H_{\mathrm{in}}$ corresponds to the condition that, at step 0, all the qubits but $m$ are in state $|0\rangle$. Specifically,

$$H_{\mathrm{in}} = \Bigl(\sum_{s=m+1}^{N}\Pi_s^{(1)}\Bigr)\otimes|0\rangle\langle0|, \tag{14.4}$$

where $\Pi_s^{(\alpha)}$ is the projection onto the subspace of vectors for which the $s$-th qubit equals $\alpha$. The second factor in this formula acts on the space of the counter. (Informally speaking, the term $\Pi_s^{(1)}\otimes|0\rangle\langle0|$ "collects a penalty" by adding 1 to the cost function whenever the $s$-th qubit is in state $|1\rangle$ while the counter is in state $|0\rangle$.)

The term $H_{\mathrm{out}}$ corresponds to the final state and equals

$$H_{\mathrm{out}} = \Pi_1^{(0)}\otimes|L\rangle\langle L|. \tag{14.5}$$

Here we assume that the bit of the result has number 1. (That is, at step $L$ the first qubit should be in state $|1\rangle$, or a penalty will be imposed.)

And, finally, the term $H_{\mathrm{prop}}$ describes the propagation of a quantum state through the circuit. It consists of $L$ terms, each of which corresponds to the transition from $j-1$ to $j$:

$$H_{\mathrm{prop}} = \sum_{j=1}^{L}H_j, \tag{14.6}$$

$$H_j = -\tfrac12\,U_j\otimes|j\rangle\langle j-1|\;-\;\tfrac12\,U_j^\dagger\otimes|j-1\rangle\langle j|\;+\;\tfrac12\,I\otimes\bigl(|j\rangle\langle j|+|j-1\rangle\langle j-1|\bigr).$$

Each term $H_j$ acts on two qubits of the space $\mathcal{B}^{\otimes N}$, as well as on the space of the counter (the latter is not represented by qubits yet).

14.4.2. Change of basis. We effect the change of basis given by the operator

$$W = \sum_{j=0}^{L}U_j\cdots U_1\otimes|j\rangle\langle j|.$$

(It is worth mentioning that $W$ is a measuring operator with respect to the value of the counter $j$.) The change of basis means that we represent the vector $|\eta\rangle$ in the form $|\eta\rangle = W|\tilde\eta\rangle$; from now on, we are dealing with $|\tilde\eta\rangle$ instead of $|\eta\rangle$. Under such a change, the Hamiltonian is transformed into its conjugate, $\tilde H = W^\dagger HW$. We consider how the conjugation by the operator $W$ acts on the terms of $H$.

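On the term $H_{\mathrm{in}}$ the conjugation has no effect:

\[ \tilde H_{\mathrm{in}} = W^\dagger H_{\mathrm{in}} W = H_{\mathrm{in}}. \tag{14.7} \]

The action on the term $H_{\mathrm{out}}$ is:

\[ \tilde H_{\mathrm{out}} = W^\dagger H_{\mathrm{out}} W = \bigl( U^\dagger\, \Pi_1^{(0)}\, U \bigr) \otimes |L\rangle\langle L|. \tag{14.8} \]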


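Each operator $H_j$ in (14.6) is the sum of three terms. Let us write the action of the conjugation on the first of them:

\[
\begin{aligned}
W^\dagger \bigl( U_j \otimes |j\rangle\langle j-1| \bigr) W
&= \sum_{p,t} \bigl( U_p \cdots U_1 \otimes |p\rangle\langle p| \bigr)^\dagger \bigl( U_j \otimes |j\rangle\langle j-1| \bigr) \bigl( U_t \cdots U_1 \otimes |t\rangle\langle t| \bigr) \\
&= \bigl( (U_j \cdots U_1)^\dagger\, U_j\, (U_{j-1} \cdots U_1) \bigr) \otimes \bigl( (|j\rangle\langle j|)^\dagger\, |j\rangle\langle j-1|\, (|j-1\rangle\langle j-1|) \bigr) \\
&= I \otimes |j\rangle\langle j-1|.
\end{aligned}
\]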

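Conjugation of the two other terms proceeds analogously, so that we obtain

\[
\begin{aligned}
\tilde H_j &= W^\dagger H_j W = I \otimes \tfrac12 \bigl( |j-1\rangle\langle j-1| - |j-1\rangle\langle j| - |j\rangle\langle j-1| + |j\rangle\langle j| \bigr) = I \otimes E_j, \\
\tilde H_{\mathrm{prop}} &= W^\dagger H_{\mathrm{prop}} W = I \otimes E,
\end{aligned}
\tag{14.9}
\]

where

\[
E = \sum_{j=1}^{L} E_j =
\begin{pmatrix}
\tfrac12 & -\tfrac12 & & & 0 \\
-\tfrac12 & 1 & -\tfrac12 & & \\
 & -\tfrac12 & 1 & -\tfrac12 & \\
 & & \ddots & \ddots & \ddots \\
0 & & & -\tfrac12 & \tfrac12
\end{pmatrix}.
\]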

14.4.3. Existence of a small eigenvalue if the answer is "yes". Suppose that the circuit $U$ gives the outcome 1 ("yes") with probability $\ge 1 - \varepsilon$ on some input vector $|\xi\rangle$. This, by definition, means that

\[ P(0) = \langle \xi, 0 |\, U^\dagger\, \Pi_1^{(0)}\, U \,| \xi, 0 \rangle \le \varepsilon. \]

We want to prove that in this case $\tilde H$ (and so also $H$) has a small eigenvalue. For this, it is sufficient to find a vector $|\tilde\eta\rangle$ of unit length such that $\langle\tilde\eta|\tilde H|\tilde\eta\rangle$ is small enough (the minimum of this expression as a function of $|\tilde\eta\rangle$ is attained at an eigenvector).

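In the space of the counter we choose the vector

\[ |\psi\rangle = \frac{1}{\sqrt{L+1}} \sum_{j=0}^{L} |j\rangle. \tag{14.10} \]

We set $|\tilde\eta\rangle = |\xi, 0\rangle \otimes |\psi\rangle$ and estimate $\langle\tilde\eta|\tilde H|\tilde\eta\rangle$.

It is clear that $E|\psi\rangle = 0$ (indeed, $E_j|\psi\rangle = 0$ for each $j$). Therefore

\[ \langle\tilde\eta|\tilde H_{\mathrm{prop}}|\tilde\eta\rangle = 0 = \langle\tilde\eta|\tilde H_j|\tilde\eta\rangle. \]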


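Since all auxiliary qubits are initially set to 0, we immediately obtain from the defining formula (14.4) that

\[ \langle\tilde\eta|\tilde H_{\mathrm{in}}|\tilde\eta\rangle = 0. \]

It remains to estimate the last term:

\[ \langle\tilde\eta|\tilde H_{\mathrm{out}}|\tilde\eta\rangle = \langle\tilde\eta| \bigl( U^\dagger\, \Pi_1^{(0)}\, U \otimes |L\rangle\langle L| \bigr) |\tilde\eta\rangle = P(0)\, \frac{1}{L+1} \le \frac{\varepsilon}{L+1}. \]

Thus we have proved that

\[ \langle\tilde\eta|\tilde H|\tilde\eta\rangle \le \frac{\varepsilon}{L+1}, \]

so that $H$ itself has an eigenvalue with the very same upper bound.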

14.4.4. Lower bound for the eigenvalues if the answer is "no". Suppose that for any vector $|\xi\rangle$ the probability of the outcome 1 does not exceed $\varepsilon$, i.e.,

\[ \langle \xi, 0 |\, U^\dagger\, \Pi_1^{(0)}\, U \,| \xi, 0 \rangle \ge 1 - \varepsilon. \]

We will prove that, in this case, all eigenvalues of $H$ are $\ge c\,(1 - \sqrt{\varepsilon})\,L^{-3}$, where $c$ is some constant.

The proof is rather long, so we will outline it first. We represent the Hamiltonian in the form $\tilde H = A_1 + A_2$, where $A_1 = \tilde H_{\mathrm{in}} + \tilde H_{\mathrm{out}}$, and $A_2 = \tilde H_{\mathrm{prop}}$. Both terms are nonnegative. It is easy to show that the null subspaces $\mathcal{L}_1$, $\mathcal{L}_2$ of $A_1$ and $A_2$ have trivial intersection (i.e., $\mathcal{L}_1 \cap \mathcal{L}_2 = \{0\}$), hence the operator $A_1 + A_2$ is positive definite. But this is not enough for our purpose, so we obtain lower bounds for nonzero eigenvalues of $A_1$ and $A_2$, namely, 1 for $A_1$, and $c'L^{-2}$ for $A_2$. In order to estimate the smallest eigenvalue of $A_1 + A_2$, we also need to know the angle between the null subspaces. The angle $\vartheta(\mathcal{L}_1, \mathcal{L}_2)$ between subspaces $\mathcal{L}_1$ and $\mathcal{L}_2$ with trivial intersection is given by

\[ \cos\vartheta(\mathcal{L}_1, \mathcal{L}_2) = \max_{\substack{|\eta_1\rangle \in \mathcal{L}_1 \\ |\eta_2\rangle \in \mathcal{L}_2}} \bigl| \langle\eta_1|\eta_2\rangle \bigr|, \qquad 0 < \vartheta(\mathcal{L}_1, \mathcal{L}_2) < \frac{\pi}{2}, \tag{14.11} \]

where the maximum is taken over unit vectors.

Lemma 14.4. Let $A_1$, $A_2$ be nonnegative operators, and $\mathcal{L}_1$, $\mathcal{L}_2$ their null subspaces, where $\mathcal{L}_1 \cap \mathcal{L}_2 = \{0\}$. Suppose further that no nonzero eigenvalue of $A_1$ or $A_2$ is smaller than $v$. Then

\[ A_1 + A_2 \ge v \cdot 2\sin^2\frac{\vartheta}{2}, \]

where $\vartheta = \vartheta(\mathcal{L}_1, \mathcal{L}_2)$ is the angle between $\mathcal{L}_1$ and $\mathcal{L}_2$.

The notation $A \ge a$ ($A$ an operator, $a$ a number) must be understood as an abbreviation for $A - aI \ge 0$. In other words, if $A \ge a$, then all eigenvalues of $A$ are greater than, or equal to, $a$.


In our case we will get the estimates 1 and $c'L^{-2}$ for the nonzero eigenvalues of $A_1$ and $A_2$ (as already mentioned), and $\sin^2\vartheta \ge (1 - \sqrt{\varepsilon})/(L+1)$ for the angle. From this we derive the desired inequality

\[ H \ge c\,\bigl(1 - \sqrt{\varepsilon}\bigr)\,L^{-3}. \]
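The last step can be spelled out as follows. This is only a sketch of one way to combine the three estimates; the explicit constant $c'/4$ is illustrative and is not claimed by the text. Take $v = \min(1, c'L^{-2}) = c'L^{-2}$, assuming $c' \le 1$. Since $2\sin^2(\vartheta/2) = 1 - \cos\vartheta \ge \tfrac12(1 - \cos^2\vartheta) = \tfrac12\sin^2\vartheta$, Lemma 14.4 gives

\[ A_1 + A_2 \;\ge\; c'L^{-2} \cdot \frac{1 - \sqrt{\varepsilon}}{2(L+1)} \;\ge\; \frac{c'}{4}\,\bigl(1 - \sqrt{\varepsilon}\bigr)\,L^{-3}, \]

where the second inequality uses $L + 1 \le 2L$ for $L \ge 1$. Because $W$ is unitary, $H$ and $\tilde H = W^\dagger H W = A_1 + A_2$ have the same eigenvalues, so the bound carries over to $H$.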

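Proof of Lemma 14.4. It is obvious that $A_1 \ge v(I - \Pi_{\mathcal{L}_1})$ and $A_2 \ge v(I - \Pi_{\mathcal{L}_2})$, so it is sufficient to prove the inequality $(I - \Pi_{\mathcal{L}_1}) + (I - \Pi_{\mathcal{L}_2}) \ge 2\sin^2(\vartheta/2)$. This, in turn, is equivalent to

\[ \Pi_{\mathcal{L}_1} + \Pi_{\mathcal{L}_2} \le 1 + \cos\vartheta. \tag{14.12} \]

Let $|\xi\rangle$ be a unit eigenvector of the operator $\Pi_{\mathcal{L}_1} + \Pi_{\mathcal{L}_2}$ corresponding to an eigenvalue $\lambda > 0$. Then

\[ \Pi_{\mathcal{L}_1}|\xi\rangle = u_1|\eta_1\rangle, \qquad \Pi_{\mathcal{L}_2}|\xi\rangle = u_2|\eta_2\rangle, \qquad u_1|\eta_1\rangle + u_2|\eta_2\rangle = \lambda|\xi\rangle, \]

where $|\eta_1\rangle \in \mathcal{L}_1$ and $|\eta_2\rangle \in \mathcal{L}_2$ are unit vectors, and $u_1$, $u_2$ are nonnegative real numbers. From this we find

\[
\begin{aligned}
\lambda &= \langle\xi| \bigl( \Pi_{\mathcal{L}_1} + \Pi_{\mathcal{L}_2} \bigr) |\xi\rangle = u_1^2 + u_2^2, \\
\lambda^2 &= \bigl( u_1\langle\eta_1| + u_2\langle\eta_2| \bigr) \bigl( u_1|\eta_1\rangle + u_2|\eta_2\rangle \bigr) = u_1^2 + u_2^2 + 2u_1u_2\,\mathrm{Re}\,\langle\eta_1|\eta_2\rangle.
\end{aligned}
\]

Consequently,

\[ (1 + x)\lambda - \lambda^2 = x\,(u_1 \mp u_2)^2 \ge 0, \qquad \text{where } x = \bigl| \mathrm{Re}\,\langle\eta_1|\eta_2\rangle \bigr| \]

(the upper sign applies when $\mathrm{Re}\,\langle\eta_1|\eta_2\rangle \ge 0$, the lower sign otherwise). Thus $\lambda \le 1 + x \le 1 + \cos\vartheta$. $\square$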

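We will now obtain the above-mentioned estimates. The null subspaces of $A_1$ and $A_2$ can be represented in the form

\[ \mathcal{L}_1 = \bigl( \mathcal{B}^{\otimes m} \otimes |0^{N-m}\rangle \otimes |0\rangle \bigr) \oplus \bigl( \mathcal{B}^{\otimes N} \otimes \mathbb{C}(|1\rangle, \dots, |L-1\rangle) \bigr) \oplus \bigl( U^\dagger\bigl( |1\rangle \otimes \mathcal{B}^{\otimes(N-1)} \bigr) \otimes |L\rangle \bigr) \tag{14.13} \]

(the last factor in all three terms pertains to the counter), and

\[ \mathcal{L}_2 = \mathcal{B}^{\otimes N} \otimes |\psi\rangle, \tag{14.14} \]

where the vector $|\psi\rangle$ was defined by formula (14.10).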

For the estimate

\[ A_1\big|_{\mathcal{L}_1^\perp} \ge 1 \tag{14.15} \]

it suffices to note that $A_1$ is the sum of commuting projections, so that all eigenvalues of this operator are nonnegative integers.

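For the estimate of $A_2\big|_{\mathcal{L}_2^\perp}$ we need to find the smallest positive eigenvalue of the matrix $E$. The eigenvectors and eigenvalues of $E$ are

\[ |\psi_k\rangle = \alpha_k \sum_{j=0}^{L} \cos\Bigl( q_k \bigl( j + \tfrac12 \bigr) \Bigr) |j\rangle, \qquad \lambda_k = 1 - \cos q_k, \]

where $q_k = \pi k/(L+1)$ ($k = 0, \dots, L$) and $\alpha_k$ is a normalization factor. From this it follows that

\[ A_2\big|_{\mathcal{L}_2^\perp} \ge 1 - \cos\Bigl( \frac{\pi}{L+1} \Bigr) \ge c'L^{-2}. \tag{14.16} \]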
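The eigenvalue formula for $E$ is easy to check numerically. The following is a minimal sketch in plain Python; the function names (`clock_matrix`, `eigen_residual`, `spectral_gap`) are hypothetical and chosen only for this illustration. It builds $E = \sum_j E_j$ from the rank-one pieces $E_j = \tfrac12(|j-1\rangle - |j\rangle)(\langle j-1| - \langle j|)$ and measures how far $E|\psi_k\rangle$ is from $\lambda_k|\psi_k\rangle$ for the unnormalized cosine vectors.

```python
import math

def clock_matrix(L):
    """Return E = sum_{j=1}^{L} E_j as an (L+1) x (L+1) nested list."""
    n = L + 1
    E = [[0.0] * n for _ in range(n)]
    for j in range(1, n):
        # E_j = 1/2 (|j-1> - |j>)(<j-1| - <j|)
        E[j - 1][j - 1] += 0.5
        E[j][j] += 0.5
        E[j - 1][j] -= 0.5
        E[j][j - 1] -= 0.5
    return E

def eigen_residual(L, k):
    """Max-norm of E v - lambda_k v for v_j = cos(q_k (j + 1/2))."""
    q = math.pi * k / (L + 1)
    v = [math.cos(q * (j + 0.5)) for j in range(L + 1)]
    lam = 1.0 - math.cos(q)
    Ev = [sum(a * b for a, b in zip(row, v)) for row in clock_matrix(L)]
    return max(abs(x - lam * y) for x, y in zip(Ev, v))

def spectral_gap(L):
    """Smallest positive eigenvalue of E according to the formula above."""
    return 1.0 - math.cos(math.pi / (L + 1))
```

Since $1 - \cos x = 2\sin^2(x/2) \ge 2x^2/\pi^2$ for $0 \le x \le \pi$, `spectral_gap(L)` is at least $2/(L+1)^2$, which is one concrete instance of the $c'L^{-2}$ bound.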

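Finally, we need to estimate the angle between the subspaces $\mathcal{L}_1$ and $\mathcal{L}_2$. We will estimate the square of the cosine of this angle,

\[ \cos^2\vartheta = \max_{\substack{|\eta_1\rangle \in \mathcal{L}_1 \\ |\eta_2\rangle \in \mathcal{L}_2}} \bigl| \langle\eta_1|\eta_2\rangle \bigr|^2 = \max_{|\eta_2\rangle \in \mathcal{L}_2} \langle\eta_2|\Pi_{\mathcal{L}_1}|\eta_2\rangle. \tag{14.17} \]

Since the vector $|\eta_2\rangle$ belongs to $\mathcal{L}_2$, it can be represented in the form $|\eta_2\rangle = |\xi\rangle \otimes |\psi\rangle$ (cf. (14.14)). According to formula (14.13), the projection onto $\mathcal{L}_1$ breaks into the sum of three projections. It is easy to calculate the contribution of the second term; it equals $(L-1)/(L+1)$. The first and third terms add up to

\[ \frac{1}{L+1}\, \langle\xi| \bigl( \Pi_{\mathcal{K}_1} + \Pi_{\mathcal{K}_2} \bigr) |\xi\rangle \le \frac{1 + \cos\varphi}{L+1}, \]

where $\mathcal{K}_1 = \mathcal{B}^{\otimes m} \otimes |0^{N-m}\rangle$, $\mathcal{K}_2 = U^\dagger\bigl( |1\rangle \otimes \mathcal{B}^{\otimes(N-1)} \bigr)$ and $\varphi$ is the angle between these two subspaces. (Here we have used inequality (14.12), obtained in the course of the proof of Lemma 14.4.)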

The quantity $\cos^2\varphi$ equals the maximum probability for the initial circuit to produce the outcome 1. By hypothesis this probability is not greater than $\varepsilon$. So we can continue the estimate (14.17):

\[ \langle\eta_2|\Pi_{\mathcal{L}_1}|\eta_2\rangle \le \frac{L-1}{L+1} + \frac{1 + \sqrt{\varepsilon}}{L+1} = 1 - \frac{1 - \sqrt{\varepsilon}}{L+1}. \]

Consequently, $\sin^2\vartheta = 1 - \cos^2\vartheta \ge (1 - \sqrt{\varepsilon})/(L+1)$, as asserted above.

14.4.5. Realization of the counter. We wrote a nice Hamiltonian almost satisfying the required properties. It has but one shortcoming: the counter is not a qubit. We could, of course, represent it by $O(\log L)$ qubits, but then the Hamiltonian would be only $O(\log L)$-local, not $O(1)$-local.

This shortcoming can be removed if we embed the counter space in a larger space. We take $L$ qubits, enumerated from 1 to $L$. The suitable embedding $\mathbb{C}^{L+1} \to \mathcal{B}^{\otimes L}$ is

\[ |j\rangle \mapsto |\underbrace{1, \dots, 1}_{j}, \underbrace{0, \dots, 0}_{L-j}\rangle. \]


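The operators on the space $\mathbb{C}^{L+1}$ used in the construction of the Hamiltonian $H$ are replaced in accordance with the following scheme:

\[
\begin{aligned}
|0\rangle\langle 0| &\longrightarrow \Pi_1^{(0)}, & |0\rangle\langle 1| &\longrightarrow \bigl( |0\rangle\langle 1| \bigr)_1 \Pi_2^{(0)}, \\
|j\rangle\langle j| &\longrightarrow \Pi_j^{(1)} \Pi_{j+1}^{(0)}, & |j-1\rangle\langle j| &\longrightarrow \Pi_{j-1}^{(1)} \bigl( |0\rangle\langle 1| \bigr)_j \Pi_{j+1}^{(0)}, \\
|L\rangle\langle L| &\longrightarrow \Pi_L^{(1)}, & |L-1\rangle\langle L| &\longrightarrow \Pi_{L-1}^{(1)} \bigl( |0\rangle\langle 1| \bigr)_L.
\end{aligned}
\tag{13.23}
\]

Now they are 3-local (and the Hamiltonian itself, acting also on the qubits of the initial circuit, is 5-local).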

To be more precise, we have replaced the Hamiltonian $H$, acting on the space $\mathcal{L} = \mathcal{B}^{\otimes N} \otimes \mathbb{C}^{L+1}$, by a new Hamiltonian $H_{\mathrm{ext}}$, defined on the larger space $\mathcal{L}_{\mathrm{ext}} = \mathcal{B}^{\otimes N} \otimes \mathcal{B}^{\otimes L}$. The operator $H_{\mathrm{ext}}$ maps the subspace $\mathcal{L} \subseteq \mathcal{L}_{\mathrm{ext}}$ into itself and acts on it just as $H$ does.

Now a new problem arises: what to do with the extra states in the extended space of the counter? We will cope with this problem by adding still another term to the Hamiltonian $H_{\mathrm{ext}}$:

\[ H_{\mathrm{stab}} = I_{\mathcal{B}^{\otimes N}} \otimes \sum_{j=1}^{L-1} \Pi_j^{(0)} \Pi_{j+1}^{(1)}. \]

The null subspace of the operator $H_{\mathrm{stab}}$ coincides with the old working space $\mathcal{L}$, so that the supplementary term does not change the upper bound for the minimum eigenvalue for the answer "yes".
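On computational basis states of the $L$ counter qubits, the stabilizing term simply counts the positions where a 0 is immediately followed by a 1. The sketch below (plain Python; `embed`, `stab_penalty` and `zero_penalty_states` are hypothetical names used only here) enumerates all $2^L$ bit strings and confirms that the strings with zero penalty are exactly the images of $|0\rangle, \dots, |L\rangle$ under the unary embedding. It checks the statement on basis states only; it does not construct the operators themselves.

```python
from itertools import product

def embed(j, L):
    """Unary encoding |j> -> |1^j 0^(L-j)>, returned as a tuple of bits."""
    return tuple([1] * j + [0] * (L - j))

def stab_penalty(bits):
    """Value of sum_j Pi_j^(0) Pi_{j+1}^(1) on a basis state:
    the number of adjacent pairs (0, 1)."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a == 0 and b == 1)

def zero_penalty_states(L):
    """All L-bit strings on which the stabilizing term vanishes."""
    return {b for b in product((0, 1), repeat=L) if stab_penalty(b) == 0}
```

For example, with $L = 5$ there are 32 basis states, of which only the 6 legal counter states carry no penalty; every other state has penalty at least 1, which matches the bound $H_{\mathrm{stab}} \ge 1$ on the orthogonal complement.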

For the answer "no" the required lower bound for the eigenvalues of the operator $H_{\mathrm{ext}} + H_{\mathrm{stab}}$ can be recovered in the following way. Both terms leave the subspace $\mathcal{L}$ invariant, so that we can examine the action of $H_{\mathrm{ext}} + H_{\mathrm{stab}}$ on $\mathcal{L}$ and on its orthogonal complement $\mathcal{L}^\perp$ independently. On $\mathcal{L}$ we have $H_{\mathrm{ext}} \ge c\,(1 - \sqrt{\varepsilon})\,L^{-3}$ and $H_{\mathrm{stab}} = 0$, and on $\mathcal{L}^\perp$ we have $H_{\mathrm{ext}} \ge 0$ and $H_{\mathrm{stab}} \ge 1$. (Here we use the fact that each of the terms of the Hamiltonian, (14.4), (14.5) and (14.6), remains nonnegative for the change (13.23).) In both cases

\[ H_{\mathrm{ext}} + H_{\mathrm{stab}} \ge c\,\bigl(1 - \sqrt{\varepsilon}\bigr)\,L^{-3}. \]

This completes the proof of Theorem 14.3.

14.5. The place of BQNP among other complexity classes. It follows directly from the definition that the class BQNP contains the class MA (and so also BPP and NP). Nothing more definitive can be said at present about the strength of "nondeterministic quantum algorithms".^12 Nor can much more be said about its "weakness".

^12 Caveat: in the literature there is also a different definition of "quantum NP", for which a complete characterization in terms of classical complexity classes can be obtained (cf. [2, 73]).


Proposition 14.5. $\mathrm{BQNP} \subseteq \mathrm{PSPACE}$.

Proof. The maximum probability that Merlin's message will be accepted by Arthur is equal to the maximum eigenvalue of the operator $X = X^{(1)}$ (cf. formula (14.3)). We will need to compute this quantity with precision $O(n^{-\alpha})$, $\alpha > 0$.

We note that $0 \le X \le 1$. For the estimate of the maximum eigenvalue we will use the following asymptotic equality:

\[ \ln\lambda_{\max} = \lim_{d\to\infty} \frac{\ln \mathrm{Tr}\,X^d}{d}. \]

Let $\lambda_{\max} = \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{2^m}$ be the eigenvalues of the operator $X$ (here $m = \mathrm{poly}(n)$ is the length of the message). We have the inequality

\[ \ln\lambda_{\max} \le \frac{\ln \mathrm{Tr}\,X^d}{d} = \frac{\ln \sum_{j=1}^{2^m} \lambda_j^d}{d} \le \ln\lambda_{\max} + \frac{m}{d}\ln 2. \]

Therefore, in order to estimate $\lambda_{\max}$ with polynomial precision, it suffices to compute the trace of the $d$-th power of $X$, with $d$ polynomial in $m$.

The computation of the quantity $\mathrm{Tr}\,X^d$ is achieved with polynomial memory by the same means that was used to simulate a quantum circuit. $\square$
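The two-sided inequality for $\ln \mathrm{Tr}\,X^d / d$ can be illustrated on a toy spectrum. In the sketch below (plain Python) the operator $X$ is replaced by a made-up list of $2^m$ eigenvalues with $m = 4$; both the list and the function name `log_trace_power_over_d` are assumptions made for the illustration, and nothing here simulates an actual verifier circuit.

```python
import math

def log_trace_power_over_d(eigenvalues, d):
    """Return ln(Tr X^d) / d for an operator X with the given eigenvalues."""
    return math.log(sum(lam ** d for lam in eigenvalues)) / d

# Toy spectrum: one large eigenvalue and 2^m - 1 smaller ones (m = 4).
m = 4
eigenvalues = [0.9] + [0.5] * (2 ** m - 1)
ln_lambda_max = math.log(max(eigenvalues))

for d in (10, 100, 500):
    estimate = log_trace_power_over_d(eigenvalues, d)
    upper = ln_lambda_max + m * math.log(2) / d
    print(d, ln_lambda_max, estimate, upper)
```

The printed estimate always lies between $\ln\lambda_{\max}$ and $\ln\lambda_{\max} + (m/d)\ln 2$, and the gap shrinks like $1/d$, which is why a power $d$ polynomial in $m$ is enough for polynomial precision.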

Remark 14.2. The result obtained can be strengthened: $\mathrm{BQNP} \subseteq \mathrm{PP}$. The proof is completely analogous to the solution of Problem 9.5.

Remark 14.3. We have limited ourselves to the case of Arthur-Merlin games in which only one message is communicated. In general, Arthur and Merlin can play several rounds, sending messages back and forth. Recently it has been shown [72, 38] that such a quantum game with three messages (i.e., 1.5 rounds) has the same complexity as the game with polynomially many rounds. The corresponding complexity class is called QIP; it contains PSPACE. This contrasts with the properties of classical Arthur-Merlin games. In the classical case, the game with polynomially many rounds yields PSPACE [58, 59]. But in wide circles of narrow specialists the opinion prevails that no fixed number of rounds would suffice.^13

^13 The game with an arbitrary constant number of rounds corresponds to a complexity class $\mathrm{AM} \subseteq \Pi_2$ [6]. It is widely believed that "the polynomial hierarchy does not collapse", i.e., $\text{co-NP} = \Pi_1 \subset \Pi_2 \subset \Pi_3 \subset \cdots \subset \mathrm{PSPACE}$ (the inclusions are strict).

15. Classical and quantum codes

In this section we explain the concept of error-correcting code, in its classical and quantum formulations. Our exposition does not go beyond definitions and basic examples; we do not address the problem of finding codes with optimal parameters. The interested reader is referred to [47] (classical codes) and [16] (quantum codes).


First, a bit of motivation. As discussed earlier, quantum computation is "not too" sensitive to errors in the realization of unitary operators: errors accumulate linearly (see, e.g., the result of Problem 9.2). Therefore, a physical implementation of elementary gates with precision $\delta$ will allow one to use circuits of size $L \ll \delta^{-1}$. But this is not enough to make quantum computation practical. Therefore a question arises: is it possible to avoid accumulation of errors by using circuits of some special type? More specifically, is it possible to replace an arbitrary quantum circuit by another circuit that would realize the same unitary operator (or compute the same Boolean function), but in an error-resistant fashion?

The answer to this question is affirmative. In fact, the new circuit will resist not only inaccurate realization of unitary gates but also some interaction with the environment and stochastic errors (provided that they occur with small probability). The rough idea is to encode (replace) each qubit used in the computation (logical qubit) by several physical qubits; see Remark 8.3 on page 74. The essential fact is that errors usually affect only a few qubits at a time, so that encoding increases the stability of a quantum state.

Organization of computation in a way that prevents accumulation of errors is called fault-tolerant computation. In the classical case, fault-tolerance can be achieved by the use of the repetition code: 0 is encoded by $(0, \dots, 0)$ ($n$ times), and 1 is encoded by $(1, \dots, 1)$. Such a simple code does not work in the quantum case, but more complicated codes do. The first method of fault-tolerant quantum computation was invented by P. Shor [65] and improved independently by several authors [42, 36]. Alternative approaches were suggested by D. Aharonov and M. Ben-Or [3] and A. Kitaev [35].

Fault-tolerance is a rather difficult subject, but our goal here is more modest. Suppose we have a quantum state of $n$ qubits that is subjected to an error. Under what condition is it possible to recover the original state, assuming that the execution of the recovery procedure is error-free? (The fault-tolerant computation deals with the more realistic situation where errors occur constantly, though at a small rate.) Of course, error recovery is not possible for a general state $|\xi\rangle \in \mathcal{B}^{\otimes n}$. However, it can be possible for states $|\xi\rangle \in \mathcal{M}$, where $\mathcal{M} \subseteq \mathcal{B}^{\otimes n}$ is a suitable fixed subspace. Likewise, in the classical case we should consider states that belong to a fixed subset $M \subseteq \mathbb{B}^n$.

Definition 15.1. A classical code of type $(n, m)$ is a subset $M \subseteq \mathbb{B}^n$ which consists of $2^m$ elements (where $m$, the number of encoded bits, is not necessarily an integer). Elements of $M$ are called codewords.

A quantum code of type $(n, m)$ is a subspace $\mathcal{M} \subseteq \mathcal{B}^{\otimes n}$ of dimension $2^m$. Elements of $\mathcal{M}$ are called codevectors.


Remark 15.1. In the theory of fault-tolerant quantum computation, a slightly different kind of codes is used. Firstly, an encoding must be specified, i.e., the subspace $\mathcal{M}$ must be identified with a fixed space $\mathcal{L}$; usually, $\mathcal{L} = \mathcal{B}$. In other words, an encoding is an isometric embedding $V : \mathcal{L} \to \mathcal{B}^{\otimes n}$ such that $\mathcal{M} = \mathrm{Im}\,V$. Secondly, sometimes one needs to consider one-to-many encodings (because errors happen and get corrected constantly, so at any moment there are some errors that have not been corrected yet). A one-to-many encoding is an isometric embedding $V : \mathcal{L} \otimes \mathcal{F} \to \mathcal{B}^{\otimes n}$, where $\mathcal{F}$ is some auxiliary space.

Besides the code, we need to define an error model. It is also called a communication channel: one may think that errors occur when a state (classical or quantum) is transferred from one location to another. Intuitively, this should be something like a multivalued map $\mathbb{B}^n \to \mathbb{B}^{n'}$ or $\mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n'}$ (where $n'$ is the number of bits at the output of the channel; usually, $n' = n$). We begin with the classical case and then construct the quantum definition by analogy.

15.1. Classical codes. There are two models of errors: a more realistic probabilistic model and a simplified set-theoretic version. According to the probabilistic model, a communication channel is given by a set of conditional probabilities $p(y|x)$ for receiving the word $y$ upon transmission of the word $x$. We will consider the case of independently distributed errors, where $n' = n$, and the conditional probabilities are determined by the probability $p_1$ of an error (bit flip) in the transmission of a single bit:

\[ p(y|x) = p_1^{d(x,y)} (1 - p_1)^{n - d(x,y)}. \tag{15.1} \]

Here $d(x, y)$ is the Hamming distance: the number of positions in which $x$ and $y$ differ.
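A short sketch of formula (15.1) in plain Python may help fix the notation. The function names `hamming_distance` and `transition_probability` are hypothetical; words are represented as tuples of 0/1, which is an assumption of this example rather than anything prescribed by the text.

```python
from itertools import product

def hamming_distance(x, y):
    """Number of positions in which the words x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def transition_probability(y, x, p1):
    """p(y|x) for independent bit flips with flip probability p1, as in (15.1)."""
    d = hamming_distance(x, y)
    return p1 ** d * (1 - p1) ** (len(x) - d)

# For a fixed transmitted word x, the probabilities over all received
# words y should add up to 1.
x = (0, 1, 1, 0)
total = sum(transition_probability(y, x, 0.1) for y in product((0, 1), repeat=len(x)))
print(total)
```

The sum over all $2^n$ received words equals 1 up to rounding, since it is the binomial expansion of $(p_1 + (1 - p_1))^n$.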

There is a standard method for simplifying a probabilistic error model by classifying errors as "likely" and "unlikely". Let us estimate the probability that in the model defined above, more than $k$ bit flips occur (as is clear from formula (15.1), this probability does not depend on $x$). Suppose that $n$ and $k$ are fixed, whereas $p_1 \to 0$. Then

\[ \Pr\bigl[ \text{number of bit flips} > k \bigr] = \sum_{j > k} \binom{n}{j} p_1^{\,j} (1 - p_1)^{n-j} = O\bigl( p_1^{\,k+1} \bigr). \tag{15.2} \]

Thus the probability that more than $k$ bits flip is small. So we say that this event is unlikely; we can greatly simplify the model by assuming that such an event never happens. We will suppose that, upon transmission of the word $x$, some word $y$ is received such that $d(x, y) \le k$. This simplified model only defines a set of possible (or "likely") outcomes but says nothing about their probabilities.
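The order of magnitude in (15.2) can be checked directly. In the following sketch (plain Python; `tail_probability` is a hypothetical helper name) the tail is computed exactly from the binomial formula and compared with the leading term $\binom{n}{k+1} p_1^{k+1}$ for a small $p_1$. The specific values $n = 5$, $k = 1$, $p_1 = 10^{-4}$ are arbitrary choices for the illustration.

```python
import math

def tail_probability(n, k, p1):
    """Probability that more than k of n independent bits flip."""
    return sum(math.comb(n, j) * p1 ** j * (1 - p1) ** (n - j)
               for j in range(k + 1, n + 1))

n, k, p1 = 5, 1, 1e-4
ratio = tail_probability(n, k, p1) / p1 ** (k + 1)
print(ratio, math.comb(n, k + 1))
```

As $p_1 \to 0$ the ratio approaches $\binom{n}{k+1}$, so the hidden constant in $O(p_1^{k+1})$ depends on $n$ and $k$ but not on $p_1$.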

We introduce the notation:

$N = \mathbb{B}^n$: set of inputs,
$N' = \mathbb{B}^{n'}$: set of outputs,
$E \subseteq N \times N'$: set of transitions (i.e., set of errors),
$E(n, k) = \bigl\{ (x, y) : d(x, y) \le k \bigr\}$.

Definition 15.2. A code $M$ corrects errors from a set $E \subseteq N \times N'$ if for any $x_1, x_2 \in M$, $(x_1, y_1) \in E$, $(x_2, y_2) \in E$, the condition $x_1 \ne x_2$ implies that $y_1 \ne y_2$.

In the particular case $E = E(n, k)$, we say that the code corrects $k$ errors.

Remark 15.2. The term "error-correcting code" is imprecise. It would be more accurate to say that the code offers the possibility for correcting errors. An error-correcting transformation is a map $P : N' \to N$ such that, if $(x, y) \in E$ and $x \in M$, then $P(y) = x$.

Example 15.1. The repetition code of type $(3, 1)$:

\[ M_3 = \bigl\{ (0, 0, 0),\ (1, 1, 1) \bigr\} \subseteq \mathbb{B}^3. \]

Such a code will correct a single error.

An obvious generalization of this example leads to classical codes $M_n$ of type $(n, 1)$ which correct $k = \lfloor (n-1)/2 \rfloor$ errors (see below). We will also construct more interesting examples of classical codes. To start, we give yet another standard definition.

Definition 15.3. The code distance is

\[ d(M) = \min\bigl\{ d(x_1, x_2) : x_1, x_2 \in M,\ x_1 \ne x_2 \bigr\}. \]

For the code $M_3$ of Example 15.1 the code distance is 3.

Proposition 15.1. A code $M$ corrects $k$ errors if and only if $d(M) > 2k$.

Proof. According to Definition 15.2, the code does not correct $k$ errors if and only if there exist $x_1, x_2 \in M$ ($x_1 \ne x_2$) and $y \in \mathbb{B}^n$ such that $d(x_1, y) \le k$ and $d(x_2, y) \le k$. For fixed $x_1, x_2$, such a $y$ exists if and only if $d(x_1, x_2) \le 2k$. $\square$
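Proposition 15.1 can be confirmed by brute force on small codes. The sketch below (plain Python; `code_distance` and `corrects` are hypothetical names) implements Definition 15.2 for $E = E(n, k)$ literally, by searching for a received word $y$ within distance $k$ of two different codewords, and compares the outcome with the code distance for two toy codes.

```python
from itertools import product

def hamming_distance(x, y):
    """Number of positions in which the words x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def code_distance(code):
    """Minimum Hamming distance between two distinct codewords."""
    words = list(code)
    return min(hamming_distance(a, b)
               for i, a in enumerate(words) for b in words[i + 1:])

def corrects(code, k):
    """True if no received word y lies within distance k of two codewords."""
    words = list(code)
    n = len(words[0])
    for y in product((0, 1), repeat=n):
        near = [x for x in words if hamming_distance(x, y) <= k]
        if len(near) > 1:
            return False
    return True

repetition3 = {(0, 0, 0), (1, 1, 1)}
parity3 = {w for w in product((0, 1), repeat=3) if sum(w) % 2 == 0}
```

The repetition code has distance 3 and corrects one error but not two; the code of even words of length 3 has distance 2 and corrects no errors at all. Both outcomes agree with the criterion $d(M) > 2k$.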

15.2. Examples of classical codes.

1. The repetition code $M_n$ of type $(n, 1)$ and distance $n$:

\[ M_n = \bigl\{ (\underbrace{0, \dots, 0}_{n}),\ (\underbrace{1, \dots, 1}_{n}) \bigr\}. \]

This code can be used with the obvious encoding: we repeat a single bit $n$ times. To restore the codeword after an error, we replace the value of the bits with the value that occurs most frequently. This series of codes, as will be shown later, does not generalize to the quantum case.

2. Parity check: this is a code of type $(n, n-1)$ and distance 2. It consists of all even words, i.e., of words containing an even number of 1s.

3. The Hamming code $H_r$. This code is of type $(n, n-r)$, where $n = 2^r - 1$. It is defined as follows.

Elements of $\mathbb{B}^n$ are sequences of bits $x = (x_\alpha : \alpha = 1, \dots, n)$. In turn, the index of each bit can be represented in binary as $\alpha = (\alpha_1, \dots, \alpha_r)$. We introduce a set of check sums $\mu_j : \mathbb{B}^n \to \mathbb{B}$ ($j = 1, \dots, r$) and define the Hamming code by the condition that all the check sums are equal to 0:

\[ \mu_j(x) = \sum_{\alpha :\, \alpha_j = 1} x_\alpha \bmod 2, \qquad H_r = \bigl\{ x \in \mathbb{B}^{2^r - 1} : \mu_1(x) = \cdots = \mu_r(x) = 0 \bigr\}. \]

For example, the Hamming code $H_3$ is defined by the system of equations

\[
\begin{aligned}
x_{100} + x_{101} + x_{110} + x_{111} &= \mu_1(x) = 0, \\
x_{010} + x_{011} + x_{110} + x_{111} &= \mu_2(x) = 0, \\
x_{001} + x_{011} + x_{101} + x_{111} &= \mu_3(x) = 0,
\end{aligned}
\]

where (mod 2) arithmetic is assumed.

We will see that the Hamming code has distance $d(H_r) = 3$ for any $r \ge 2$.
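For small $r$ the Hamming code can be enumerated exhaustively. The sketch below (plain Python; `check_sums` and `hamming_code` are hypothetical names) stores the bit $x_\alpha$ at list position $\alpha - 1$ and takes $\alpha_1$ to be the most significant binary digit of $\alpha$, matching the system of equations written out for $H_3$ above. Brute-force enumeration is only feasible for small $r$ and is meant as a check, not as an encoder.

```python
from itertools import product

def check_sums(x, r):
    """Return (mu_1(x), ..., mu_r(x)) for a word x of length 2^r - 1.
    The bit x_alpha is stored at x[alpha - 1]; alpha_1 is the leading digit."""
    sums = []
    for j in range(r):
        mask = 1 << (r - 1 - j)  # selects the binary digit alpha_{j+1}
        sums.append(sum(bit for alpha, bit in enumerate(x, start=1)
                        if alpha & mask) % 2)
    return tuple(sums)

def hamming_code(r):
    """All words of length 2^r - 1 on which every check sum vanishes."""
    n = 2 ** r - 1
    return [x for x in product((0, 1), repeat=n) if not any(check_sums(x, r))]

H3 = hamming_code(3)
min_weight = min(sum(x) for x in H3 if any(x))
print(len(H3), min_weight)
```

For $r = 3$ this yields $2^{7-3} = 16$ codewords. Because the code is closed under bitwise addition mod 2, its distance equals the smallest weight of a nonzero codeword, which comes out as 3.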

15.3. Linear codes. The set $N = \mathbb{B}^n$ can be regarded as the $n$-dimensional linear space over the two-element field $\mathbb{F}_2$. A linear code is a linear subspace $M \subseteq \mathbb{F}_2^n$. All the examples given above are of this kind. A linear code of type $(n, m)$ can be defined by a dual basis, i.e., a set of $n - m$ linearly independent linear forms (called check sums) which vanish on $M$. The coefficients of the check sums constitute rows of the check matrix. For example, the check matrix of the Hamming code $H_3$ is

\[
\begin{bmatrix}
0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1
\end{bmatrix}.
\]

Proposition 15.2. The code distance of a linear code equals the minimum number of distinct columns of the check matrix that are linearly dependent.

Proof. A linear dependency between the columns of the check matrix corresponds to a nonzero codeword. If a subset $S \subseteq \{1, \dots, n\}$ of columns is dependent, the corresponding word $x \in M$ has nonzero symbols only at positions $\alpha \in S$ (and vice versa). Thus, if $k$ columns are dependent, then there is $x \in M$, $x \ne 0$, such that $d(x, 0) \le k$. Therefore $d(M) \le k$. Conversely, if $d(x_1, x_2) \le k$ for some $x_1, x_2 \in M$ ($x_1 \ne x_2$), then $x = x_2 - x_1 \in M$, $x \ne 0$, $d(x, 0) \le k$, hence $k$ (or fewer) columns are linearly dependent. $\square$

The columns of the check matrix of the Hamming code $H_r$ correspond to the nonzero elements of $\mathbb{F}_2^r$. Any two columns are different, hence they are linearly independent. On the other hand, the sum of the first three columns is 0 (for $r \ge 2$). Therefore the code distance is 3.

15.4. Error models for quantum codes. Firstly, we will define a quantum analogue of the transition set $E \subseteq N \times N'$. This is an arbitrary linear subspace $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$, called an error space. There is also an analogue of the set $E(n, k)$. Let us assume that $\mathcal{N} = \mathcal{N}' = \mathcal{B}^{\otimes n}$. For each subset of qubits $A \subseteq \{1, \dots, n\}$, let $\mathcal{E}[A]$ be the set of linear operators that act only on those qubits and do not affect the remaining qubits (such an operator has the form $X[A]$). Then we define the space

\[ \mathcal{E}(n, k) = \sum_{A :\, |A| \le k} \mathcal{E}[A], \]

where we take the sum of linear subspaces: $\sum_j \mathcal{L}_j = \bigl\{ \sum_j X_j : X_j \in \mathcal{L}_j \bigr\}$. In the sequel we will be interested in the possibility of correcting errors from the space $\mathcal{E}(n, k)$.

Next, we will introduce a physical model of quantum errors, which has no classical analogue. Consider a system of $n$ qubits which interact with the environment (characterized by a Hilbert space $\mathcal{F}$). We assume that the interaction is described by the Hamiltonian

\[ H = H_0 + V, \qquad H_0 = I_{\mathcal{B}^{\otimes n}} \otimes Z, \qquad V = \sum_{j=1}^{n} \sum_{\alpha \in \{x, y, z\}} \sigma_j^\alpha \otimes B_{j\alpha}. \tag{15.3} \]

Here $\sigma_j^\alpha = \sigma^\alpha[j]$ denotes the application of the Pauli matrix $\sigma^\alpha$ (see (8.4)) to the $j$-th qubit; $Z, B_{j\alpha} \in \mathbf{L}(\mathcal{F})$ are Hermitian operators acting on the environment: $B_{j\alpha}$ describes interaction of the environment with the $j$-th qubit, whereas $Z$ is the Hamiltonian of the environment itself. It is an important assumption that qubits interact with the environment by small groups. The Hamiltonian (15.3) contains only one-qubit terms ($\sigma_j^\alpha \otimes B_{j\alpha}$), but we could also include two-qubit ($\sigma_j^\alpha \sigma_l^\beta \otimes B^{(2)}_{j\alpha l\beta}$) and higher-order terms, up to some constant order.


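If the interaction lasts for some time $\tau$, it results in the evolution of the quantum state by the unitary operator

\[
\begin{aligned}
U = \exp(-i\tau H) = e^{-i\tau(H_0 + V)}
&= \lim_{N\to\infty} \Bigl( e^{-i\frac{\tau}{N} H_0} \bigl( 1 - i\tfrac{\tau}{N} V \bigr) \Bigr)^{N} \\
&= \lim_{N\to\infty} e^{-i\tau H_0} \Bigl( 1 - i\tfrac{\tau}{N} V\bigl( \tfrac{N-1}{N}\tau \bigr) \Bigr) \cdots \Bigl( 1 - i\tfrac{\tau}{N} V\bigl( \tfrac{1}{N}\tau \bigr) \Bigr) \Bigl( 1 - i\tfrac{\tau}{N} V(0) \Bigr),
\end{aligned}
\]

where $V(t) = e^{itH_0}\, V\, e^{-itH_0}$. We can expand this expression in powers of $V$, replacing sums by integrals in the $N \to \infty$ limit. Thus we obtain the following result:

\[
\begin{aligned}
U &= \exp(-i\tau H) = \sum_{k=0}^{\infty} X_k, \qquad X_k \in \mathcal{E}(n, k) \otimes \mathbf{L}(\mathcal{F}), \\
X_k &= e^{-i\tau H_0} \Biggl( (-i)^k \int \!\cdots\! \int_{0 < t_1 < \cdots < t_k < \tau} V(t_k) \cdots V(t_1)\, dt_1 \cdots dt_k \Biggr),
\end{aligned}
\tag{15.4}
\]

where

\[ V(t) = e^{itH_0}\, V\, e^{-itH_0} = \sum_{j, \alpha} \sigma_j^\alpha \otimes B_{j\alpha}(t), \qquad B_{j\alpha}(t) = e^{itZ}\, B_{j\alpha}\, e^{-itZ}. \]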

Suppose that the interaction of the qubits with the environment is small,

\[ \|B_{j\alpha}\| \le \frac{\delta}{3\tau}. \tag{15.5} \]

Then we can obtain an upper bound for the norm of each term $X_k$ in (15.4), $\|X_k\| \le \frac{n^k}{k!}\,\delta^k$. Therefore $U$ is approximated by an operator $U^{(k)}$ such that

\[ \|U - U^{(k)}\| \le O(\delta^{k+1}), \qquad U^{(k)} \in \mathcal{E}(n, k) \otimes \mathbf{L}(\mathcal{F}). \tag{15.6} \]

Namely, $U^{(k)} = \sum_{l=0}^{k} X_l$ (note that $U^{(k)}$ is not unitary). If errors from the space $\mathcal{E}(n, k)$ are recoverable (assuming that the initial state belongs to $\mathcal{M} \otimes \mathcal{F}$, where $\mathcal{M}$ is a suitable code), then the error-correcting procedure will cancel the effect of $U$ with precision $O(\delta^{k+1})$.
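The bound on $\|X_k\|$ can be verified in a few lines. This is a sketch that uses only (15.4) and (15.5), together with two standard facts: each Pauli matrix has norm 1, and conjugation by a unitary operator preserves the norm. First,

\[ \|V(t)\| \le \sum_{j=1}^{n} \sum_{\alpha \in \{x,y,z\}} \|B_{j\alpha}(t)\| = \sum_{j, \alpha} \|B_{j\alpha}\| \le 3n \cdot \frac{\delta}{3\tau} = \frac{n\delta}{\tau}. \]

The integration domain $0 < t_1 < \cdots < t_k < \tau$ has volume $\tau^k/k!$, and $e^{-i\tau H_0}$ is unitary, hence

\[ \|X_k\| \le \Bigl( \frac{n\delta}{\tau} \Bigr)^{k} \frac{\tau^k}{k!} = \frac{n^k}{k!}\,\delta^k, \qquad \|U - U^{(k)}\| \le \sum_{l > k} \frac{(n\delta)^l}{l!} = O(\delta^{k+1}) \]

for fixed $n$ and $k$ as $\delta \to 0$.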

Finally, we will define a quantum version of the model of independent errors. Let us assume that the quantum state of $n$ qubits undergoes the transformation that is described by the physically realizable superoperator $T = T_1^{\otimes n}$, where $\|T_1 - I\|_\diamond \le \delta$ (see Section 11.5 for the definition of the superoperator norm $\|\cdot\|_\diamond$). Let $T_1 - I = R$; then $T = (I + R)^{\otimes n}$. This is essentially a special case of the model described by formulas (15.3)-(15.4): each qubit interacts with its own piece of environment, which is initially not entangled with the rest of the system and is discarded after the action of the operator $U$. However, the condition $\|T_1 - I\|_\diamond \le \delta$ is different from condition (15.5), so we need to consider this model separately.


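One can obtain an estimate that is analogous to (15.2) or (15.6). Let us write the decomposition

\[ T = (I + R)^{\otimes n} = \underbrace{\sum_{A :\, |A| \le k} R^{\otimes A} \otimes I^{\otimes(\{1, \dots, n\} \setminus A)}}_{T^{(k)}} \;+\; \underbrace{\sum_{A :\, |A| > k} R^{\otimes A} \otimes I^{\otimes(\{1, \dots, n\} \setminus A)}}_{P}. \]

The first term $T^{(k)}$ can be represented as $\sum_p X_p \cdot Y_p^\dagger$, where $X_p, Y_p \in \mathcal{E}(n, k)$. So we may write symbolically $T^{(k)} \in \mathcal{E}(n, k) \cdot \mathcal{E}(n, k)^\dagger$. Using the properties of the superoperator norm, we estimate the norm of the remaining term,

\[ \|P\|_\diamond \le \sum_{j > k} \binom{n}{j} \|R\|_\diamond^{\,j} = O(\delta^{k+1}). \]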

The model of independent errors includes two extreme cases. If $T = U \cdot U^\dagger$ (where $U$ is unitary, $\|U - I\| \le \delta/2$), the errors are called coherent. In the case where

\[ T = (1 - p)\, I \cdot I + \sum_j p_j\, U_j \cdot U_j^\dagger, \qquad U_j^\dagger U_j = I, \qquad p = \sum_j p_j \le \delta/2, \]

the errors are called stochastic, indicating that they can be described in terms of probability rather than operator or superoperator norms.

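15.5. Definition of quantum error correction. Following the classical analogy, we would like to say that quantum errors are recoverable if they take distinct codevectors to distinct codevectors. However, the general philosophy of quantum mechanics suggests that we replace "distinct" by "orthogonal".

Definition 15.4. A quantum code (a subspace $\mathcal{M} \subseteq \mathcal{N}$) corrects errors from $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$ if

\[ \forall\, |\xi_1\rangle, |\xi_2\rangle \in \mathcal{M}\ \ \forall\, X, Y \in \mathcal{E} \qquad \bigl( \langle\xi_2|\xi_1\rangle = 0 \bigr) \Rightarrow \bigl( \langle\xi_2|Y^\dagger X|\xi_1\rangle = 0 \bigr). \tag{15.7} \]

In the case where $\mathcal{E} = \mathcal{E}(n, k)$, one says that the code corrects $k$ errors.

Definition 15.5. Let $\mathcal{M} \subseteq \mathcal{N}$ and $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$. A physically realizable superoperator $P : \mathbf{L}(\mathcal{N}') \to \mathbf{L}(\mathcal{M})$ is called an error-correcting transformation for the code $\mathcal{M}$ and the error space $\mathcal{E}$ if

\[ \forall\, T \in \mathcal{E} \cdot \mathcal{E}^\dagger\ \ \exists\, c = c(T)\ \ \forall\, \rho \in \mathbf{L}(\mathcal{M}) \qquad P T \rho = c(T)\,\rho. \]

Note that if $T$ is trace-preserving, then $c(T) = 1$.

Theorem 15.3. If the code $\mathcal{M}$ corrects errors from $\mathcal{E}$, then an error-correcting transformation exists.

A proof will be given below. The converse assertion is proved in [36].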


Example 15.2. Trivial code of type $(n, m)$: let $\mathcal{M} = \mathcal{B}^{\otimes m} \otimes |0^{n-m}\rangle$ and $\mathcal{E} = \mathcal{E}[m+1, \dots, n]$, i.e., the first $m$ qubits are used for the coding whereas the errors act on the other qubits. Condition (15.7) is clearly satisfied. For the role of error-correcting transformation we can take $P = I_{\mathbf{L}(\mathcal{B}^{\otimes m})} \otimes R$, where $R : X \mapsto (\mathrm{Tr}\,X)\, |0^{n-m}\rangle\langle 0^{n-m}|$. The transformation $P$ is realized very simply: we discard the last $n - m$ qubits and replace them by new qubits in the state $|0\rangle$. There is, of course, little practical use for such a code. It is interesting, however, that any error-correcting quantum code has, in a certain sense, the same structure as the trivial one (cf. Lemma 15.5 below).

Example 15.3. We examine a quantum analog of the repetition code: $\mathcal{M}_n^z = \mathbb{C}\bigl( |0, \dots, 0\rangle,\ |1, \dots, 1\rangle \bigr)$. It comes with the standard encoding $V_n^z : |a\rangle \mapsto |a, \dots, a\rangle$. (The index $z$ in the notation indicates that we copy a qubit relative to the basis which consists of the eigenvectors of $\sigma^z$, i.e., the standard basis.) Consider two vectors, $|\xi_1\rangle = |0, \dots, 0\rangle + |1, \dots, 1\rangle$ and $|\xi_2\rangle = |0, \dots, 0\rangle - |1, \dots, 1\rangle$, and two errors $X, Y \in \mathcal{E}(n, 1)$, namely, $X = I$, $Y = \sigma^z[j]$ (where $j$ is arbitrary). It is obvious that $Y|\xi_2\rangle = X|\xi_1\rangle = |\xi_1\rangle \ne 0$, so that $\langle\xi_2|Y^\dagger X|\xi_1\rangle = \langle\xi_1|\xi_1\rangle \ne 0$ although $\langle\xi_2|\xi_1\rangle = 0$, which contradicts condition (15.7). We see that the repetition code of any size does not protect against a one-qubit error.

Thus the existence of nontrivial quantum codes is far from obvious. See formula (15.11) for the simplest example.
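The failure of the quantum repetition code can be reproduced with explicit amplitude vectors. The sketch below (plain Python, real amplitudes only) uses hypothetical helper names `ghz_vectors`, `apply_sigma_z` and `inner`, and assumes that qubit 1 corresponds to the most significant bit of the basis-state index. The vectors are left unnormalized, as in the example.

```python
def ghz_vectors(n):
    """Return (xi1, xi2) = |0...0> + |1...1>, |0...0> - |1...1> as amplitude lists."""
    dim = 2 ** n
    xi1 = [0.0] * dim
    xi2 = [0.0] * dim
    xi1[0] = xi1[dim - 1] = 1.0
    xi2[0] = 1.0
    xi2[dim - 1] = -1.0
    return xi1, xi2

def apply_sigma_z(vec, j, n):
    """Apply sigma^z to qubit j (1-based; qubit 1 is the most significant bit)."""
    shift = n - j
    return [(-a if (idx >> shift) & 1 else a) for idx, a in enumerate(vec)]

def inner(u, v):
    """Inner product of two real amplitude vectors."""
    return sum(a * b for a, b in zip(u, v))

n = 3
xi1, xi2 = ghz_vectors(n)
for j in range(1, n + 1):
    y_xi2 = apply_sigma_z(xi2, j, n)          # Y|xi2> with Y = sigma^z[j]
    print(j, inner(xi2, xi1), inner(y_xi2, xi1))
```

For every $j$ the two codevectors are orthogonal, yet $\langle\xi_2|Y^\dagger X|\xi_1\rangle$ with $X = I$ equals $\langle\xi_1|\xi_1\rangle = 2$, so condition (15.7) is violated by a single-qubit phase error.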

In Definition 15.4 the statement was only about pairs of orthogonal states. However, we can infer a consequence regarding an arbitrary pair of states. Let us fix $X, Y \in \mathcal{E}$ and set $Z = Y^\dagger X$. Then

\[ \forall\, |\xi_1\rangle, |\xi_2\rangle \in \mathcal{M} \qquad \langle\xi_2|Z|\xi_1\rangle = c(Z)\,\langle\xi_2|\xi_1\rangle, \tag{15.8} \]

where $c(Z)$ is some complex number, independent of $|\xi_1\rangle$, $|\xi_2\rangle$. Indeed, let $|\xi_1\rangle, \dots, |\xi_m\rangle$ be an orthonormal basis for the subspace $\mathcal{M}$. By Definition 15.4, $\langle\xi_j|Z|\xi_k\rangle = 0$ when $j \ne k$. It also follows that $\langle\xi_j|Z|\xi_j\rangle$ does not depend on $j$, since

\[ \langle\xi_j|Z|\xi_j\rangle - \langle\xi_k|Z|\xi_k\rangle = \langle\xi_j - \xi_k|Z|\xi_j + \xi_k\rangle + \langle\xi_k|Z|\xi_j\rangle - \langle\xi_j|Z|\xi_k\rangle = 0. \]

(All three terms on the right-hand side of the equality are equal to zero, since the vectors that enter into them are orthogonal.)

Remark 15.3. To understand the meaning of condition (15.8), let us put it into this form:

\[ \forall\, |\xi\rangle \in \mathcal{M} \qquad \Pi_{\mathcal{M}} Z |\xi\rangle = c(Z)\,|\xi\rangle. \tag{15.9} \]

We may replace the projector $\Pi_{\mathcal{M}}$ by a measurement that distinguishes "good" states ($|\psi\rangle \in \mathcal{M}$) from "bad" states ($|\psi\rangle \perp \mathcal{M}$). Loosely speaking, formula (15.9) says that if the codevector $|\xi\rangle$ is acted upon by $Z$ and still accepted as "good" (which happens with probability $|c(Z)|^2$), it remains intact. Thus any possible damage to the codevector caused by the error $Z$ is being detected.

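We summarize the above discussion as follows.

Definition 15.6. A quantum code $\mathcal{M} \subseteq \mathcal{N}$ detects an error^14 $Z \in \mathbf{L}(\mathcal{N})$ if there exists some $c = c(Z) \in \mathbb{C}$ such that

\[ \forall\, |\xi_1\rangle, |\xi_2\rangle \in \mathcal{M} \qquad \langle\xi_2|Z|\xi_1\rangle = c(Z)\,\langle\xi_2|\xi_1\rangle. \]

The code distance is the smallest number $d = d(\mathcal{M})$ for which the code does not detect errors from the space $\mathcal{E}(n, d)$.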

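Proposition 15.4. A code $\mathcal{M} \subseteq \mathcal{N}$ corrects errors from $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$ if and only if it detects errors from the space

$$\mathcal{E}^\dagger\mathcal{E} = \Bigl\{ \sum_p Y_p^\dagger X_p \;:\; X_p, Y_p \in \mathcal{E} \Bigr\}.$$

In particular, a code $\mathcal{M} \subseteq \mathcal{B}^{\otimes n}$ corrects $k$ errors if and only if $d(\mathcal{M}) > 2k$.

The first part of the proposition has already been proved. The second part follows from the fact that $\mathcal{E}(n,k)^\dagger \mathcal{E}(n,k) = \mathcal{E}(n,2k)$.

We now proceed to the proof of Theorem 15.3.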

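Lemma 15.5. Let a quantum code $\mathcal{M} \subseteq \mathcal{N}$ correct errors from a subspace $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$. Then there exist a Hilbert space $\mathcal{F}$, an isometric embedding $V : \mathcal{M} \otimes \mathcal{F} \to \mathcal{N}'$ and a linear map $f : \mathcal{E} \to \mathcal{F}$ such that

$$\forall\, X \in \mathcal{E}\ \ \forall\, |\xi\rangle \in \mathcal{M}:\quad X|\xi\rangle = V\bigl(|\xi\rangle \otimes |f(X)\rangle\bigr). \tag{15.10}$$

Proof. Let $\mathcal{E}_0 = \{ X \in \mathcal{E} : \forall\, |\xi\rangle \in \mathcal{M}\ \ X|\xi\rangle = 0 \}$. Consider the quotient space $\mathcal{F} = \mathcal{E}/\mathcal{E}_0$ together with the natural map $f : \mathcal{E} \to \mathcal{F}$. The very definition of $\mathcal{F}$ and $f$ implies the existence of a linear map $V : \mathcal{M} \otimes \mathcal{F} \to \mathcal{N}'$ that satisfies (15.10). It is only necessary to check that $V$ is an isometry.

An inner product on the space $\mathcal{F}$ can be defined with the aid of the function $c$ from property (15.8) of the code: if $|\eta_1\rangle = |f(X)\rangle$ and $|\eta_2\rangle = |f(Y)\rangle$, then $\langle\eta_2|\eta_1\rangle = c(Y^\dagger X)$. It is clear that this quantity depends only on $|\eta_1\rangle$ and $|\eta_2\rangle$ rather than on the particular choice of $X$ and $Y$. It is also clear that $\langle\eta|\eta\rangle > 0$ if $|\eta\rangle \neq 0$. Formula (15.8) shows at once that the map $V$ is an isometry. $\square$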

Proof of Theorem 15.3. We decompose the space $\mathcal{N}'$ into the sum of two orthogonal subspaces: $\mathcal{N}' = (\operatorname{Im} V) \oplus \mathcal{K}$, where $V$ is the map of the preceding lemma. Let $W : \mathcal{K} \to \mathcal{N}'$ be the inclusion map, and $R : \mathbf{L}(\mathcal{K}) \to \mathbf{L}(\mathcal{M})$ an arbitrary physically realizable superoperator. Then we define

$$P :\ \rho \mapsto \operatorname{Tr}_{\mathcal{F}}(V^\dagger \rho V) + R(W^\dagger \rho W), \qquad c :\ X \cdot Y^\dagger \mapsto \langle f(Y)|f(X)\rangle.$$

The function $c$ extends to the space $\mathcal{E} \cdot \mathcal{E}^\dagger$ by linearity. $\square$

Footnote 14 (to Definition 15.6). Admittedly, this terminology may be confusing. For example, the identity operator is "detected" by any code. As explained above, it is more adequate to say that the code detects the error-induced damage, if any. But this sounds too awkward.

Lemma 15.5 and the proof of Theorem 15.3 can be explained in the following way. An error-correcting code is characterized by the property that the error does not mix with the encoded state, i.e., it remains in the form of a separate tensor factor $|\eta\rangle = f(X) \in \mathcal{F}$. (Using the terminology of Remark 15.1, we may say that the original state $|\xi\rangle \in \mathcal{M}$ gets encoded with the one-to-many encoding $V$.) The correcting transformation extracts the "built-in" error $|\eta\rangle$ and deposits it in the trash bin.

[1!] Problem 15.1. Let the code $\mathcal{M} \subseteq \mathcal{B}^{\otimes n}$ detect errors from $\mathcal{E}(A)$. Prove that the state $\rho \in \mathbf{L}(\mathcal{M})$ can be restored without using qubits from the set $A$.

15.6. Shor's code. Following Shor [64], we define a series of quantum codes with arbitrarily large distance. The $r$-th member of this series encodes one logical qubit into $r^2$ physical qubits; the distance of this code equals $r$.

The idea is to fix the repetition code $\mathcal{M}^z_n$ defined above. We have seen that it fails to correct a one-qubit error of the form $\sigma^z_j = \sigma^z[j]$. Operators generated by $\sigma^z_1, \dots, \sigma^z_n$ (i.e., ones that preserve the basis vectors up to a phase) are called phase errors. Still, the repetition code protects against classical errors, i.e., operators of the form $\sigma^x_j$ and their products (as well as linear combinations of such products). Specifically, the $n$-qubit repetition code $\mathcal{M}^z_n$ corrects classical errors that affect at most $\lfloor (n-1)/2 \rfloor$ qubits. However, phase errors and classical errors are conjugate to each other, since $\sigma^x = H\sigma^z H$. Therefore the "dual repetition code" $\mathcal{M}^x_n$, defined by the encoding

$$V^x_n :\ |\eta_a\rangle \mapsto |\eta_a\rangle \otimes \cdots \otimes |\eta_a\rangle, \qquad |\eta_a\rangle = H|a\rangle,$$

$$|a\rangle \mapsto 2^{-(n-1)/2} \sum_{\substack{y_1,\dots,y_n \\ y_1+\cdots+y_n \equiv a \ (\mathrm{mod}\ 2)}} |y_1,\dots,y_n\rangle \qquad (a = 0, 1),$$

will protect against phase errors (but not classical errors).
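As an illustration that is not part of the original text, the second form of the encoding can be written out for $n = 3$ (so the prefactor is $2^{-1} = 1/2$):

$$|0\rangle \mapsto \tfrac12\bigl(|000\rangle + |011\rangle + |101\rangle + |110\rangle\bigr), \qquad |1\rangle \mapsto \tfrac12\bigl(|001\rangle + |010\rangle + |100\rangle + |111\rangle\bigr).$$

A single classical error $\sigma^x_j$ changes the parity of every basis vector and therefore exchanges these two codevectors, which is why this code alone gives no protection against classical errors.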

We may try to combine the two codes: first we encode the logical qubit with $V^x_r$; then we encode each of the resulting qubits with $V^z_r$. (Such a composition of two codes is called a concatenated code.) Thus we obtain the encoding $V = (V^z_r)^{\otimes r}\, V^x_r : \mathcal{B} \to \mathcal{B}^{\otimes r^2}$. Inasmuch as the number of physical qubits is an exact square, it is convenient to write basis vectors of the corresponding space as matrices, e.g.,

$$\left| \begin{matrix} x_{11} & \cdots & x_{1r} \\ \vdots & & \vdots \\ x_{r1} & \cdots & x_{rr} \end{matrix} \right\rangle.$$

In this notation the encoding $V$ assumes the form $|a\rangle \mapsto |\xi_a\rangle$, where

$$|\xi_a\rangle = 2^{-(r-1)/2} \sum_{\substack{y_1,\dots,y_r \in \mathbb{F}_2 \\ y_1+\cdots+y_r = a}} \left| \begin{matrix} y_1 & \cdots & y_1 \\ y_2 & \cdots & y_2 \\ \vdots & & \vdots \\ y_r & \cdots & y_r \end{matrix} \right\rangle \qquad (a = 0, 1). \tag{15.11}$$

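To analyze the Shor code, we will decompose errors over a basis of operators built on Pauli matrices. More precisely, the basis of the space $\mathbf{L}(\mathcal{B})$ consists of the identity operator $I$ and the three Pauli matrices. We introduce nonstandard notation for these matrices:

$$\sigma_{00} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I, \qquad \sigma_{01} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \sigma^z,$$

$$\sigma_{10} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \sigma^x, \qquad \sigma_{11} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \sigma^y.$$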

These operators are remarkable in that they are unitary and Hermitian at the same time. The indexing we have introduced allows one to conveniently express the multiplication rules and commutation relations between the basis operators:

$$\sigma_{\alpha\beta}\,\sigma_{\alpha'\beta'} = i^{\tilde\omega(\alpha,\beta;\alpha',\beta')}\, \sigma_{\alpha\oplus\alpha',\,\beta\oplus\beta'}, \qquad \sigma_{\alpha\beta}\,\sigma_{\alpha'\beta'} = (-1)^{\alpha\beta' - \alpha'\beta}\, \sigma_{\alpha'\beta'}\,\sigma_{\alpha\beta},$$

where $\tilde\omega(\alpha,\beta;\alpha',\beta') \in \mathbb{Z}_4$ (see (15.16) below for an explicit formula). The set of indices forms the Abelian group $G = \mathbb{Z}_2 \times \mathbb{Z}_2$, which can also be considered as a 2-dimensional linear space over the field $\mathbb{F}_2$.

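The basis for $\mathbf{L}(\mathcal{B}^{\otimes n})$ consists of $4^n$ operators,

$$\sigma(f) = \sigma(\alpha_1,\beta_1,\alpha_2,\beta_2,\dots,\alpha_n,\beta_n) \stackrel{\mathrm{def}}{=} \sigma_{\alpha_1\beta_1} \otimes \sigma_{\alpha_2\beta_2} \otimes \cdots \otimes \sigma_{\alpha_n\beta_n}. \tag{15.12}$$

Here $f \in G^n = \mathbb{F}_2^{2n}$.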

We now examine the error-correcting properties of the Shor code. Our goal is to prove that its distance is at least $r$, i.e., the code detects $r-1$ errors (in fact, the distance is precisely $r$). By the linearity of the definition it suffices to study errors of the form $\sigma(f)$. Such an error can be decomposed into a classical component and a phase component,

$$\sigma(f) = c\,\sigma(f^{(x)})\,\sigma(f^{(z)}), \qquad f^{(x)} = (\alpha_1, 0, \alpha_2, 0, \dots), \qquad f^{(z)} = (0, \beta_1, 0, \beta_2, \dots),$$

where $c$ is a phase factor. Since we assume that $|f| < r$ (where $|f|$ denotes the number of nonzero pairs $(\alpha_j, \beta_j)$), we have $|f^{(x)}|, |f^{(z)}| < r$. It suffices to show that for $Z = \sigma(f)$

$$\langle\xi_1|Z|\xi_0\rangle = 0, \qquad \langle\xi_1|Z|\xi_1\rangle = \langle\xi_0|Z|\xi_0\rangle. \tag{15.13}$$

Let us consider two cases.


1. $f^{(x)} \neq 0$. The error $Z = c\,\sigma(f^{(x)})\,\sigma(f^{(z)})$ takes a basis vector to a basis vector (up to a phase). It flips $s = |f^{(x)}|$ bits; in our case $0 < s < r$. The codevectors of the Shor code are linear combinations of special basis vectors: all bits in each row are equal. Flipping $s$ bits breaks this special form; therefore $\langle\xi_a|Z|\xi_b\rangle = 0$.

2. $f^{(x)} = 0$. The error $Z = \sigma(f^{(z)}) = \prod_{j,k} (\sigma^z_{jk})^{\beta_{jk}}$ multiplies each basis vector by $\pm 1$. The special basis vectors in (15.11) are transformed as follows:

$$Z \left| \begin{matrix} y_1 & \cdots & y_1 \\ y_2 & \cdots & y_2 \\ \vdots & & \vdots \\ y_r & \cdots & y_r \end{matrix} \right\rangle = (-1)^{\sum_j \lambda_j y_j} \left| \begin{matrix} y_1 & \cdots & y_1 \\ y_2 & \cdots & y_2 \\ \vdots & & \vdots \\ y_r & \cdots & y_r \end{matrix} \right\rangle,$$

where $\lambda_j = \sum_k \beta_{jk} \in \mathbb{F}_2$. Let $|\lambda|$ denote the number of nonzero components in the vector $\lambda = (\lambda_1, \dots, \lambda_r) \in \mathbb{F}_2^r$. Then $|\lambda| \le |f^{(z)}| < r$. There are three possibilities:

a) $\lambda = (0, \dots, 0)$. In this case $Z|\xi_b\rangle = |\xi_b\rangle$, i.e., the error does not affect codevectors. Therefore $\langle\xi_a|Z|\xi_b\rangle = \delta_{ab}$.

b) $\lambda = (1, \dots, 1)$. This is actually impossible since $|\lambda| < r$.

c) $\lambda \neq (0, \dots, 0), (1, \dots, 1)$. Then

$$\langle\xi_a|Z|\xi_b\rangle = \Bigl( 2^{-(r-1)} \sum_{\substack{y_1,\dots,y_r \in \mathbb{F}_2 \\ y_1+\cdots+y_r = a}} (-1)^{\sum_j \lambda_j y_j} \Bigr)\, \delta_{ab} = 0.$$
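The case analysis above can be checked numerically for a small instance. The following is a minimal sketch, assuming $r = 3$ and a dictionary representation of state vectors; the helper names (`codeword`, `apply_error`, `detected`) are illustrative and not from the text. It tests condition (15.13) for every error $\sigma(f^{(x)})\sigma(f^{(z)})$ with $|f| < r$ (the phase factor $c$ is dropped, which does not affect the condition).

```python
from itertools import combinations, product

r = 3          # the code uses r*r physical qubits
n = r * r

def codeword(a):
    """|xi_a> of (15.11) as {basis index: amplitude}; bit j*r+k holds matrix entry (j, k)."""
    amp = 2.0 ** (-(r - 1) / 2)
    state = {}
    for y in product((0, 1), repeat=r):
        if sum(y) % 2 == a:
            idx = 0
            for j in range(r):
                for k in range(r):
                    idx |= y[j] << (j * r + k)
            state[idx] = amp
    return state

def apply_error(xmask, zmask, state):
    """Apply sigma(f^(x)) sigma(f^(z)): the z part gives a sign, the x part flips bits."""
    out = {}
    for idx, amp in state.items():
        sign = -1 if bin(idx & zmask).count("1") % 2 else 1
        out[idx ^ xmask] = sign * amp
    return out

def inner(s1, s2):
    """Inner product of two real state vectors."""
    return sum(amp * s2.get(idx, 0.0) for idx, amp in s1.items())

xi = [codeword(0), codeword(1)]

def detected(xmask, zmask, tol=1e-12):
    """Condition (15.13): vanishing off-diagonal elements and equal diagonal ones."""
    m = [[inner(xi[a], apply_error(xmask, zmask, xi[b])) for b in (0, 1)] for a in (0, 1)]
    return abs(m[0][1]) < tol and abs(m[1][0]) < tol and abs(m[0][0] - m[1][1]) < tol

def errors_of_weight(w):
    """All (xmask, zmask) pairs acting nontrivially on exactly w qubits."""
    for qubits in combinations(range(n), w):
        for kinds in product((1, 2, 3), repeat=w):   # 1 = x, 2 = z, 3 = both
            xmask = sum(1 << q for q, t in zip(qubits, kinds) if t & 1)
            zmask = sum(1 << q for q, t in zip(qubits, kinds) if t & 2)
            yield xmask, zmask

all_detected = all(detected(x, z) for w in range(r) for x, z in errors_of_weight(w))
print(all_detected)
```

Calling `detected(0, 0b001001001)` (one $\sigma^z$ in each row, so $\lambda = (1,1,1)$) shows an error of weight $r$ that violates (15.13), in agreement with the statement that the distance is exactly $r$.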

15.7. The Pauli operators and symplectic transformations. The construction of the Shor code uses the symmetry between $\sigma^x$ and $\sigma^z$. We will now study symmetries between $\sigma$-operators in more detail.

As already mentioned, the Pauli matrices are conveniently indexed by elements of the group $G = (\mathbb{Z}_2)^2$. General $\sigma$-operators (see (15.12)) are indexed by $\gamma = (\alpha_1, \beta_1, \dots, \alpha_n, \beta_n) \in G^n$. The $\sigma$-operators form a basis in $\mathbf{L}(\mathcal{B}^{\otimes n})$. Moreover, $\mathbf{L}(\mathcal{B}^{\otimes n})$ becomes a $G^n$-graded algebra,

$$\mathbf{L}(\mathcal{B}^{\otimes n}) = \bigoplus_{\gamma \in G^n} \mathbb{C}\bigl(\sigma(\gamma)\bigr);$$

the operation $\dagger : X \mapsto X^\dagger$ preserves the grading.

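The commutation rules for the $\sigma$-operators are as follows:

$$\sigma(\gamma_1)\,\sigma(\gamma_2) = (-1)^{\omega(\gamma_1,\gamma_2)}\, \sigma(\gamma_2)\,\sigma(\gamma_1), \tag{15.14}$$

$$\omega(\alpha_1,\beta_1,\dots,\alpha_n,\beta_n;\ \alpha'_1,\beta'_1,\dots,\alpha'_n,\beta'_n) = \sum_{j=1}^{n} (\alpha_j\beta'_j - \alpha'_j\beta_j) \bmod 2.$$

The multiplication rules are similar:

$$\sigma(\gamma_1)\,\sigma(\gamma_2) = i^{\tilde\omega(\gamma_1,\gamma_2)}\, \sigma(\gamma_1 + \gamma_2), \qquad \tilde\omega : G^n \times G^n \to \mathbb{Z}_4. \tag{15.15}$$

Obviously, $\omega(\gamma_1,\gamma_2) = \tilde\omega(\gamma_1,\gamma_2) \bmod 2$.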

To obtain an explicit formula for the one-qubit version of the function $\tilde\omega$, we use the equation $\sigma_{\alpha\beta} = i^{\alpha\beta}\, \sigma_{\alpha 0}\,\sigma_{0\beta}$. We express $\sigma_{\alpha\beta}$ and $\sigma_{\alpha'\beta'}$ this way, and commute $\sigma_{0\beta}$ through $\sigma_{\alpha' 0}$. The result is as follows:

$$\tilde\omega(\alpha,\beta;\alpha',\beta') = \alpha\beta + \alpha'\beta' - (\alpha \oplus \alpha')(\beta \oplus \beta') + 2\alpha'\beta \bmod 4. \tag{15.16}$$

Note that the inner sums in (15.16) are taken modulo 2, whereas the outer sum is taken modulo 4. (For this reason, one cannot expand the product $(\alpha \oplus \alpha')(\beta \oplus \beta')$ and cancel the terms $\alpha\beta$ and $\alpha'\beta'$.) Such a mixture of $\mathbb{Z}_2$-operations and $\mathbb{Z}_4$-operations is rather confusing, so we prefer to write the formula (15.16) in a different form:

$$\tilde\omega(\alpha,\beta;\alpha',\beta') = \alpha^2\beta^2 + (\alpha')^2(\beta')^2 - (\alpha + \alpha')^2(\beta + \beta')^2 + 2\alpha'\beta.$$

In this case, the value $0 \in \mathbb{Z}_2$ can be represented by either $0 \in \mathbb{Z}_4$ or $2 \in \mathbb{Z}_4$, whereas 1 can be represented by either 1 or 3; the result will be the same.
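Formula (15.16) and its second form can be compared against direct $2 \times 2$ matrix multiplication. The sketch below is only an illustrative check; the names `SIGMA`, `omega_tilde` and `omega_tilde_squares` are chosen here and do not come from the text.

```python
from itertools import product

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

SIGMA = {
    (0, 0): [[1, 0], [0, 1]],       # I
    (0, 1): [[1, 0], [0, -1]],      # sigma^z
    (1, 0): [[0, 1], [1, 0]],       # sigma^x
    (1, 1): [[0, -1j], [1j, 0]],    # sigma^y
}

def omega_tilde(a, b, a2, b2):
    """Formula (15.16): XOR is the inner sum mod 2, the outer sum is taken mod 4."""
    return (a * b + a2 * b2 - (a ^ a2) * (b ^ b2) + 2 * a2 * b) % 4

def omega_tilde_squares(a, b, a2, b2):
    """Second form: plain Z_4 arithmetic on arbitrary representatives of the bits."""
    return (a * a * b * b + a2 * a2 * b2 * b2
            - (a + a2) ** 2 * (b + b2) ** 2 + 2 * a2 * b) % 4

def check_multiplication_rule():
    """Check sigma_{ab} sigma_{a'b'} = i^omega_tilde sigma_{a+a', b+b'} for all 16 pairs."""
    for (a, b), s1 in SIGMA.items():
        for (a2, b2), s2 in SIGMA.items():
            phase = 1j ** omega_tilde(a, b, a2, b2)
            target = SIGMA[(a ^ a2, b ^ b2)]
            prod = matmul(s1, s2)
            for i in range(2):
                for j in range(2):
                    if abs(prod[i][j] - phase * target[i][j]) > 1e-12:
                        return False
    return True

def check_representatives():
    """The second form must not depend on which representative in Z_4 is used."""
    for a, b, a2, b2 in product((0, 1), repeat=4):
        for shifts in product((0, 2), repeat=4):
            args = (a + shifts[0], b + shifts[1], a2 + shifts[2], b2 + shifts[3])
            if omega_tilde_squares(*args) != omega_tilde(a, b, a2, b2):
                return False
    return True

print(check_multiplication_rule(), check_representatives())
```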

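Finally, we write down the general formula for the function $\tilde\omega$:

$$\tilde\omega(\gamma, \gamma') = \tau(\gamma) + \tau(\gamma') - \tau(\gamma + \gamma') + 2\varkappa(\gamma, \gamma'), \tag{15.17}$$

$$\tau(\alpha_1,\beta_1,\dots,\alpha_n,\beta_n) = \sum_{j=1}^{n} \alpha_j^2\beta_j^2 \in \mathbb{Z}_4,$$

$$\varkappa(\alpha_1,\beta_1,\dots,\alpha_n,\beta_n;\ \alpha'_1,\beta'_1,\dots,\alpha'_n,\beta'_n) = \sum_{j=1}^{n} \alpha'_j\beta_j \in \mathbb{Z}_2.$$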

A unitary transformation is a superoperator of the form $T = U \cdot U^\dagger : X \mapsto UXU^\dagger$. It is clear that $U$ can be reconstructed from $T$ up to a phase factor, so the group of unitary transformations on $n$ qubits is $\mathbf{U}(\mathcal{B}^{\otimes n})/\mathbf{U}(1)$. We note that the unitary transformations are precisely the automorphisms of the $*$-algebra $\mathbf{L}(\mathcal{B}^{\otimes n})$ (i.e., the linear maps $\mathbf{L}(\mathcal{B}^{\otimes n}) \to \mathbf{L}(\mathcal{B}^{\otimes n})$ that commute with the operator multiplication and the operation $\dagger$).

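We are interested in those transformations which preserve the grading of $\mathbf{L}(\mathcal{B}^{\otimes n})$ by the $\sigma$-operators, i.e., $U\sigma(\gamma)U^\dagger = c(\gamma)\,\sigma(u(\gamma))$, where $u(\gamma) \in G^n$ and $c(\gamma)$ is a phase factor. Since both $U\sigma(\gamma)U^\dagger$ and $\sigma(u(\gamma))$ are Hermitian, $c(\gamma) = \pm 1$. Thus we may write

$$U\sigma(\gamma)U^\dagger = (-1)^{v(\gamma)}\, \sigma(u(\gamma)), \qquad u : G^n \to G^n, \quad v : G^n \to \mathbb{Z}_2. \tag{15.18}$$

The group of such transformations is called the extended symplectic group and is denoted by $\mathrm{ESp}_2(n)$. The operators in this group will be called symplectic. We give some examples.

1. $\sigma$-operators: $\sigma(f)\,\sigma(\gamma)\,\sigma(f)^\dagger = (-1)^{\omega(f,\gamma)}\, \sigma(\gamma)$. In the case at hand, $u(\gamma) = \gamma$.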


2. The operators $H$ and $K$ from the standard basis. We immediately verify that

$$H\sigma^x H^\dagger = \sigma^z, \qquad H\sigma^y H^\dagger = -\sigma^y, \qquad H\sigma^z H^\dagger = \sigma^x,$$

$$K\sigma^x K^\dagger = \sigma^y, \qquad K\sigma^y K^\dagger = -\sigma^x, \qquad K\sigma^z K^\dagger = \sigma^z.$$

Therefore the transformations $H \cdot H^\dagger$ and $K \cdot K^\dagger$ belong to $\mathrm{ESp}_2(1)$. It is easy to see that $\mathrm{ESp}_2(1) \subset \mathbf{U}(2)/\mathbf{U}(1) \cong \mathbf{SO}(3)$ is the group of rotational symmetries of a cube; $|\mathrm{ESp}_2(1)| = 24$. This group is generated by the above transformations. The operators $H$ and $K$ themselves generate the Clifford group of $24 \cdot 8 = 192$ elements. Thus $\mathrm{ESp}_2(1)$ is the quotient of the Clifford group by the subgroup $\{ i^{k/2} I_{\mathcal{B}} : k = 0, \dots, 7 \} \cong \mathbb{Z}_8$.

3. The controlled NOT, i.e., the operator $U = \Lambda(\sigma^x)[1,2]$. By definition we have $U|a,b\rangle = |a, a \oplus b\rangle$. The action of $U$ on generators of the algebra $\mathbf{L}(\mathcal{B}^{\otimes 2})$ is as follows:

$$U\sigma^z_1 U^\dagger = \sigma^z_1, \qquad U\sigma^x_1 U^\dagger = \sigma^x_1\sigma^x_2,$$

$$U\sigma^z_2 U^\dagger = \sigma^z_1\sigma^z_2, \qquad U\sigma^x_2 U^\dagger = \sigma^x_2.$$

These equations can be verified without difficulty by direct calculation. However, it is useful to bring an explanation to the fore. The operator $\sigma^z_j$ is a phase shift that depends on the value of the corresponding qubit. The equations in the left column show how these values change under the action of $U$. The first equation in the right column indicates that flipping the first qubit before applying $U$ has the same effect as flipping both qubits after the action of $U$. The last equation indicates that flipping the second qubit commutes with $U$.

Let $T = U \cdot U^\dagger$ be an arbitrary symplectic transformation. The associated function $u : G^n \to G^n$ (see (15.18)) has the following properties:

1. $u$ is linear.

2. $u$ preserves the form $\omega$, i.e., $\omega(u(f), u(g)) = \omega(f, g)$.

Maps with such properties are known as symplectic; they form the symplectic group $\mathrm{Sp}_2(n)$. It is clear that the correspondence $\theta : T \mapsto u$, $\theta : \mathrm{ESp}_2(n) \to \mathrm{Sp}_2(n)$ is a homomorphism of groups.
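The two properties can be checked by brute force for the maps $u$ read off from the conjugation rules of examples 2 and 3. The following sketch works on two qubits, with vectors written as $(\alpha_1, \beta_1, \alpha_2, \beta_2)$; the function names are illustrative assumptions, and `u_projection` is a deliberately non-symplectic map added for contrast.

```python
from itertools import product

def omega(f, g):
    """Symplectic form of (15.14) on vectors (alpha_1, beta_1, ..., alpha_n, beta_n)."""
    return sum(f[2 * j] * g[2 * j + 1] - g[2 * j] * f[2 * j + 1]
               for j in range(len(f) // 2)) % 2

def add(f, g):
    """Componentwise addition in F_2."""
    return tuple((x + y) % 2 for x, y in zip(f, g))

def u_h(g):
    """H on qubit 1: exchanges sigma^x and sigma^z."""
    a1, b1, a2, b2 = g
    return (b1, a1, a2, b2)

def u_k(g):
    """K on qubit 1: sigma^x -> sigma^y, sigma^z -> sigma^z (signs are ignored by u)."""
    a1, b1, a2, b2 = g
    return (a1, (a1 + b1) % 2, a2, b2)

def u_cnot(g):
    """Controlled NOT [1,2]: sigma^x_1 -> sigma^x_1 sigma^x_2, sigma^z_2 -> sigma^z_1 sigma^z_2."""
    a1, b1, a2, b2 = g
    return (a1, (b1 + b2) % 2, (a1 + a2) % 2, b2)

def u_projection(g):
    """A linear map that is NOT symplectic (it forgets beta_1); used as a negative example."""
    a1, b1, a2, b2 = g
    return (a1, 0, a2, b2)

def is_symplectic(u):
    """Check linearity and preservation of omega on all of G^2."""
    vecs = list(product((0, 1), repeat=4))
    linear = all(u(add(f, g)) == add(u(f), u(g)) for f in vecs for g in vecs)
    preserving = all(omega(u(f), u(g)) == omega(f, g) for f in vecs for g in vecs)
    return linear and preserving

print([is_symplectic(u) for u in (u_h, u_k, u_cnot, u_projection)])
```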

Theorem 15.6. $\operatorname{Im}\theta = \mathrm{Sp}_2(n)$, $\operatorname{Ker}\theta = G^n$ (the kernel is the set of $\sigma$-operators). Therefore, $\mathrm{ESp}_2(n)/G^n \cong \mathrm{Sp}_2(n)$.

For an understanding of the proof it is desirable to know something about cohomology and extensions of groups [57]. For the reader unacquainted with these concepts we have prepared a "roundabout way" (see below).

Proof. The transformation (15.18) must be an automorphism of the $*$-algebra $\mathbf{L}(\mathcal{B}^{\otimes n})$. This is the case if and only if the multiplication rules (15.15) are preserved by the action of $T = U \cdot U^\dagger$ (the operation of taking the Hermitian adjoint commutes with $T$ automatically). This means that the function $u$ has the properties indicated, and $v$ satisfies the equation

$$v(x + y) - v(x) - v(y) = w(x, y), \tag{15.19}$$

where $w(x, y) = \dfrac{\tilde\omega(u(x), u(y)) - \tilde\omega(x, y)}{2} \in \mathbb{Z}_2$.

In the case where $u$ is the identity map, the right-hand side of (15.19) equals zero. The solutions are all linear functions; this proves that $\operatorname{Ker}\theta = G^n$.

The assertion that $\operatorname{Im}\theta = \mathrm{Sp}_2(n)$ is equivalent to equation (15.19) having a solution for any $u \in \mathrm{Sp}_2(n)$. To prove the existence of the solution, we note that the function $w$ has the following properties:

$$w(y, z) - w(x + y, z) + w(x, y + z) - w(x, y) = 0, \tag{15.20}$$

$$w(x, y) = w(y, x), \tag{15.21}$$

$$w(x, x) = 0. \tag{15.22}$$

Formula (15.20) is the cocycle equation. It indicates that the function $w$ yields a group structure on the Cartesian product $G^n \times \mathbb{Z}_2$ with the multiplication rule $(x, p) \cdot (y, q) = (x + y,\ p + q + w(x, y))$. The group we obtain (we denote it by $E$) is an extension of $G^n$ with $\mathbb{Z}_2$, i.e., there is a homomorphism $\lambda : E \to G^n$ with kernel $\mathbb{Z}_2$; the homomorphism is defined by $\lambda : (x, p) \mapsto x$. Equation (15.21) indicates that the group $E$ is Abelian. Finally, (15.22) means that each element of the group $E$ has order 2 or 1. Consequently, $E \cong (\mathbb{Z}_2)^{2n+1}$.

It follows that the extension $E \to G^n$ is trivial: there exists a homomorphism $\mu : G^n \to E$ such that $\lambda\mu = \mathrm{id}_{G^n}$. Writing this homomorphism in the form $\mu : x \mapsto (x, v(x))$, we obtain a solution to equation (15.19). $\square$

There is another, somewhat ad hoc way of proving that $\operatorname{Im}\theta = \mathrm{Sp}_2(n)$. Consider the following symplectic transformations: $(H \cdot H^\dagger)[j]$, $(K \cdot K^\dagger)[j]$ and $\bigl(\Lambda(\sigma^x) \cdot \Lambda(\sigma^x)^\dagger\bigr)[j, k]$. Their images under the homomorphism $\theta$ generate the whole group $\mathrm{Sp}_2(n)$. Idea of the proof: it is possible, using these transformations, to take an arbitrary pair of vectors $\gamma_1, \gamma_2 \in G^n$ such that $\omega(\gamma_1, \gamma_2) = 1$ into $(1, 0, 0, 0, \dots)$ and $(0, 1, 0, 0, \dots)$. (See the proof of Lemma 15.9 for an implementation of a similar argument.)

In fact, in this way we can obtain another interesting result. The specified elements of the group $\mathrm{ESp}_2(n)$ generate all the $\sigma$-operators, i.e., the kernel of the homomorphism $\theta$. Consequently, the following statement is true:

Proposition 15.7. The group $\mathrm{ESp}_2(n)$ is generated by the elements

$$(H \cdot H^\dagger)[j], \qquad (K \cdot K^\dagger)[j], \qquad \bigl(\Lambda(\sigma^x) \cdot \Lambda(\sigma^x)^\dagger\bigr)[j, k].$$

15.8. Symplectic (stabilizer) codes. These are analogous to the classical linear codes. The role of check sums is played by the $\sigma$-operators $\sigma(\alpha_1, \beta_1, \dots, \alpha_n, \beta_n)$.

For example, the Shor code can be represented in this way. Recall that this is a two-dimensional subspace $\mathcal{M} \subseteq \mathcal{B}^{\otimes r^2}$ spanned by the vectors (15.11).

What equations do vectors of $\mathcal{M}$ satisfy?

1. For each $j = 1, \dots, r$ and $k = 1, \dots, r-1$ we have $\sigma^z_{jk}\,\sigma^z_{j(k+1)}|\xi\rangle = |\xi\rangle$. That is, $|\xi\rangle$ is a linear combination of special basis vectors: each row consists of the repetition of a single bit.

2. For each $j = 1, \dots, r-1$ we have $\Bigl( \prod_{k=1}^{r} \sigma^x_{jk}\,\sigma^x_{(j+1)k} \Bigr)|\xi\rangle = |\xi\rangle$. What does this rule mean? The operator $\prod_{k=1}^{r} \sigma^x_{jk}\,\sigma^x_{(j+1)k}$ flips all the bits in the $j$-th and $(j+1)$-th rows,

$$\left| \begin{matrix} \cdots & \cdots & \cdots \\ y_j & \cdots & y_j \\ y_{j+1} & \cdots & y_{j+1} \\ \cdots & \cdots & \cdots \end{matrix} \right\rangle \longmapsto \left| \begin{matrix} \cdots & \cdots & \cdots \\ y_j + 1 & \cdots & y_j + 1 \\ y_{j+1} + 1 & \cdots & y_{j+1} + 1 \\ \cdots & \cdots & \cdots \end{matrix} \right\rangle$$

(the bits in the other rows do not change). These two basis vectors must enter $|\xi\rangle$ with the same coefficients.

It is clear that if $|\xi\rangle$ satisfies conditions 1 and 2, then $|\xi\rangle = c_0|\xi_0\rangle + c_1|\xi_1\rangle$, where $|\xi_a\rangle$ ($a = 0, 1$) are defined by (15.11).
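Conditions 1 and 2 can be encoded as vectors in $G^n$ and counted. The sketch below assumes $r = 3$ and the qubit numbering $(j, k) \mapsto jr + k$ with zero-based indices; the helpers `pauli`, `omega` and `rank_gf2` are illustrative names. It checks that the check vectors pairwise satisfy $\omega = 0$ and that $r^2 - 1$ of them are linearly independent, which leaves a two-dimensional code subspace.

```python
r = 3
n = r * r

def pauli(x_qubits=(), z_qubits=()):
    """Vector (alpha_1, beta_1, ..., alpha_n, beta_n) with the given x and z supports."""
    g = [0] * (2 * n)
    for q in x_qubits:
        g[2 * q] = 1
    for q in z_qubits:
        g[2 * q + 1] = 1
    return tuple(g)

def omega(f, g):
    """Symplectic form of (15.14)."""
    return sum(f[2 * j] * g[2 * j + 1] - g[2 * j] * f[2 * j + 1] for j in range(n)) % 2

def rank_gf2(vectors):
    """Rank over F_2, computed with an XOR basis on integer bitmasks."""
    basis = []
    for v in vectors:
        x = sum(bit << i for i, bit in enumerate(v))
        for b in basis:
            x = min(x, x ^ b)      # clears the leading bit of b in x, if present
        if x:
            basis.append(x)
    return len(basis)

# Condition 1: sigma^z on two neighbouring qubits of the same row.
z_checks = [pauli(z_qubits=(j * r + k, j * r + k + 1))
            for j in range(r) for k in range(r - 1)]
# Condition 2: sigma^x on all qubits of rows j and j+1.
x_checks = [pauli(x_qubits=[row * r + k for row in (j, j + 1) for k in range(r)])
            for j in range(r - 1)]
checks = z_checks + x_checks

isotropic = all(omega(f, g) == 0 for f in checks for g in checks)
dim_F = rank_gf2(checks)
code_dimension = 2 ** (n - dim_F)
print(isotropic, dim_F, code_dimension)
```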

We now give the general definition of symplectic codes, also called stabilizer codes. They were introduced (without name) in [17]. We will first give a noninvariant definition, which is easier to understand.

A symplectic quantum code is a subspace of the form

$$\mathcal{M} = \bigl\{ |\xi\rangle \in \mathcal{B}^{\otimes n} : \forall j\ \ X_j|\xi\rangle = |\xi\rangle \bigr\}, \quad\text{where}$$

$$X_j = (-1)^{\mu_j}\, \sigma(f_j), \qquad f_j \in G^n, \quad \mu_j \in \mathbb{Z}_2. \tag{15.23}$$

The operators $X_j$ must commute with each other. They are called check operators.

The requirement that the check operators commute is equivalent to the condition $\omega(f_j, f_k) = 0$. Indeed, the general commutation relation for operators of the form (15.23) is $X_j X_k = (-1)^{\omega(f_j, f_k)} X_k X_j$. (Note that if $\omega(f_j, f_k) \neq 0$ for some $j$ and $k$, the subspace $\mathcal{M}$ contains only the zero vector; this is why we have excluded this case from consideration.) Without loss of generality we may assume that the $f_j$ are linearly independent.

Note that different choices of check operators may correspond to the same code. In fact, the code depends only on the subspace $F \subseteq G^n$ spanned by the vectors $f_j$, and on the function $\mu : F \to \mathbb{Z}_2$, interpolating the values $\mu(f_j) = \mu_j$ so that the operators $X_\mu(f) = (-1)^{\mu(f)}\sigma(f)$ satisfy the condition

$$X_\mu(f + g) = X_\mu(f)\,X_\mu(g) = X_\mu(g)\,X_\mu(f).$$

Therefore it is preferable to use the following invariant definition.

Definition 15.7. Let $F \subseteq G^n$ be an isotropic subspace, i.e., $\omega(f, g) = 0$ for any $f, g \in F$. Also let a function $\mu : F \to \mathbb{Z}_2$ satisfy the equation

$$\mu(f + g) - \mu(f) - \mu(g) = \nu(f, g), \quad\text{where}\quad \nu(f, g) = \frac{\tilde\omega(f, g)}{2} \in \mathbb{Z}_2. \tag{15.24}$$

(The function $\nu$ is defined on pairs $(f, g)$ for which $\omega(f, g) = 0$.) Then the corresponding symplectic code is

$$\mathrm{SympCode}(F, \mu) \stackrel{\mathrm{def}}{=} \bigl\{ |\xi\rangle \in \mathcal{B}^{\otimes n} : \forall f \in F\ \ \sigma(f)|\xi\rangle = (-1)^{\mu(f)}|\xi\rangle \bigr\}. \tag{15.25}$$

Note that the restriction of $\nu$ to the subspace $F$ satisfies equations analogous to (15.20)–(15.22); therefore equation (15.24) has a solution. In fact, there are $2^{\dim F}$ solutions; any two solutions differ by a linear function. We call the corresponding codes congruent.

Theorem 15.8. $\dim(\mathrm{SympCode}(F, \mu)) = 2^{\,n - \dim F}$. The congruent codes form an orthogonal decomposition of $\mathcal{B}^{\otimes n}$.

Lemma 15.9. By symplectic transformations, an arbitrary symplectic code $\mathrm{SympCode}(F, \mu)$ can be reduced to a trivial one, for which the check operators are $\sigma^z[1], \dots, \sigma^z[s]$ ($s = \dim F$).

Proof. Let $f_1 \in F$ be a nonzero vector. The group $\mathrm{Sp}_2(n)$ acts transitively on nonzero vectors, so there is an element $S_1 \in \mathrm{Sp}_2(n)$ such that $S_1 f_1 = (0, 1, 0, 0, \dots) = g_1$. The isotropic space $S_1 F$ consists of vectors of the form $(0, \beta_1, \alpha_2, \beta_2, \dots)$. Let $F_1 \subseteq S_1 F$ consist of those vectors for which $\beta_1 = 0$. Then $S_1 F = \mathbb{F}_2(g_1) \oplus F_1$ and $F_1 \subseteq G^{n-1}$.

Iterating this argument, we find an $S \in \mathrm{Sp}_2(n)$ such that

$$SF = \bigl\{ (0, \beta_1, 0, \beta_2, \dots, 0, \beta_s, 0, 0, \dots) : \beta_1, \dots, \beta_s \in \mathbb{F}_2 \bigr\}.$$

By Theorem 15.6, $S$ corresponds to some symplectic transformation $U \cdot U^\dagger$. It takes the code $\mathrm{SympCode}(F, \mu)$ to a symplectic code given by the check operators $\pm\sigma^z[j]$ ($j = 1, \dots, s$). All the signs can be changed to "+" by applying a supplementary transformation of the form $\sigma(f) \cdot \sigma(f)^\dagger$. $\square$


We now examine whether a symplectic code $\mathrm{SympCode}(F, \mu)$ is capable of detecting $k$-qubit errors. Recall that the property of a code to detect an error $Z$ is given by formula (15.9). By linearity, it is sufficient to consider errors of the form $Z = \sigma(g)$, $|g| \le k$.

Let $|\xi\rangle \in \mathrm{SympCode}(F, \mu)$, i.e., $\sigma(f)|\xi\rangle = (-1)^{\mu(f)}|\xi\rangle$ for any $f \in F$. We denote $|\psi\rangle = \sigma(g)|\xi\rangle$. As we will now show, the vector $|\psi\rangle$ belongs to one of the congruent codes $\mathrm{SympCode}(F, \mu')$. Indeed,

$$\sigma(f)|\psi\rangle = \sigma(f)\sigma(g)|\xi\rangle = (-1)^{\omega(f,g)}\sigma(g)\sigma(f)|\xi\rangle = (-1)^{\omega(f,g) + \mu(f)}\sigma(g)|\xi\rangle = (-1)^{\mu'(f)}|\psi\rangle,$$

where $\mu'(f) = \mu(f) + \omega(f, g)$. We note that $\mu' = \mu$ if and only if $g \in F_+$, where

$$F_+ = \bigl\{ g \in G^n : \forall f \in F\ \ \omega(f, g) = 0 \bigr\}.$$

Obviously, $F \subseteq F_+$.

For the error $Z = \sigma(g)$ there are three possibilities:

1. $g \notin F_+$. Then $\mu' \neq \mu$, hence $|\psi\rangle \perp \mathrm{SympCode}(F, \mu)$. The code detects such an error.

2. $g \in F$. In this case $|\psi\rangle = \sigma(g)|\xi\rangle = (-1)^{\mu(g)}|\xi\rangle$. Such an error is indistinguishable from the identity operator, since it does not alter the codevector (up to the constant phase factor $(-1)^{\mu(g)}$). Condition (15.9) is fulfilled.

3. $g \in F_+ \setminus F$. As in the previous case, $\mu' = \mu$, hence $\mathcal{M} = \mathrm{SympCode}(F, \mu)$ is an invariant subspace for the operator $Z = \sigma(g)$. However, the action of $Z$ on $\mathcal{M}$ is not multiplication by a constant. (Otherwise $Z$ or $-Z$ could be added to the set of check operators without reducing the code subspace, which is impossible.) The code does not detect such an error.

These considerations prove the following theorem.

Theorem 15.10. The code $\mathcal{M} = \mathrm{SympCode}(F, \mu)$ has distance

$$d(\mathcal{M}) = \min\bigl\{ |f| : f \in F_+ \setminus F \bigr\}.$$

We observe a difference from classical linear codes. There the minimum is taken over a subspace with 0 excluded, whereas for symplectic codes 0 is replaced by the nontrivial subspace $F$.

Example 15.4. A symplectic code of type $(5, 1)$ that corrects 1 error: the subspace $F$ is generated by the rows of the matrix

$$\left[ \begin{array}{cc|cc|cc|cc|cc} 1 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0 \end{array} \right].$$

It can be verified that $\omega(f_j, f_k) = 0$ for any two rows $f_j$, $f_k$. We note that the columns of the matrix come in pairs. If we take any two pairs, then the corresponding four columns are linearly independent. Consequently, for any $g \neq 0$ supported by these columns, there is a row $f_j$ such that $\omega(f_j, g) \neq 0$. Thus the code distance is greater than 2. (In fact, the distance is 3 since the first 6 columns are linearly dependent.)
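The claims of Example 15.4 are small enough to check by exhaustive search over $G^5$ (1024 vectors). The sketch below applies Theorem 15.10 directly; the names `ROWS`, `span`, `weight` and `F_plus` are illustrative and not from the text.

```python
from itertools import product

# Rows of the matrix in Example 15.4, written as (alpha_1, beta_1, ..., alpha_5, beta_5).
ROWS = [
    (1, 0, 1, 0, 0, 1, 0, 1, 0, 0),
    (1, 0, 0, 1, 1, 0, 0, 0, 0, 1),
    (0, 1, 0, 0, 0, 1, 1, 0, 0, 1),
    (0, 1, 0, 1, 0, 0, 0, 1, 1, 0),
]

def omega(f, g):
    """Symplectic form of (15.14)."""
    return sum(f[2 * j] * g[2 * j + 1] - g[2 * j] * f[2 * j + 1]
               for j in range(len(f) // 2)) % 2

def weight(g):
    """|g|: the number of nonzero pairs (alpha_j, beta_j)."""
    return sum(1 for j in range(len(g) // 2) if g[2 * j] or g[2 * j + 1])

def span(rows):
    """All F_2-linear combinations of the given rows."""
    vecs = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        vecs.add(tuple(sum(c * row[i] for c, row in zip(coeffs, rows)) % 2
                       for i in range(len(rows[0]))))
    return vecs

F = span(ROWS)
isotropic = all(omega(f, g) == 0 for f in ROWS for g in ROWS)

# F_+ consists of the vectors that have zero form with every generator of F.
F_plus = [g for g in product((0, 1), repeat=10)
          if all(omega(f, g) == 0 for f in ROWS)]

# Theorem 15.10: the distance is the minimum weight over F_+ \ F.
distance = min(weight(g) for g in F_plus if g not in F)
print(isotropic, len(F), len(F_plus), distance)
```

One element of $F_+ \setminus F$ of weight 3 is $\sigma^z_1\sigma^z_2\sigma^z_3$, which corresponds to the linear dependence among the first six columns mentioned in the example.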

[3!] Problem 15.2. Prove that no quantum code of type $(4, 1)$ is capable of correcting a single error.

15.9. Toric code. We introduce an important example of a symplectic code. It is constructed as follows. Consider an $r \times r$ lattice on the torus. We put a qubit on each of its edges. In this way we have $2r^2$ qubits. The check operators will be of two types.

[Fig. 15.1. Stabilizer operators for the toric code (the figure marks a vertex $s$ and a face $u$).]

Type I operators are given by vertices. We choose some vertex $s$ and associate with it the check operator

$$A^x_s = \sigma(f^x_s) = \prod_{j \in \mathrm{star}(s)} \sigma^x_j.$$

Type II operators are given by faces. We choose some face $u$ and associate with it the check operator

$$A^z_u = \sigma(f^z_u) = \prod_{j \in \mathrm{boundary}(u)} \sigma^z_j.$$

The operators $A^x_s$ and $A^z_u$ commute, since the number of common edges between a vertex star and a face boundary is either 0 or 2. (The interchangeability of operators within each type is obvious.)


Although we have indicated $r^2 + r^2 = 2r^2$ check operators (one per face or vertex), there are relations between them:

$$\prod_s A^x_s = \prod_u A^z_u = I.$$

(It can be shown that these are the only relations.) Therefore $\dim F = 2r^2 - 2$ and $\dim \mathcal{M} = 2^2$, so two logical qubits can be encoded.

We will now determine the code distance.

For the toric code we have the natural decompositions

$$F = F^{(x)} \oplus F^{(z)}, \qquad F_+ = F^{(x)}_+ \oplus F^{(z)}_+$$

into subspaces of operators that consist either only of $\sigma^x_j$ or only of $\sigma^z_j$. Such codes are called CSS codes (by the names of their inventors, Calderbank, Shor [16] and Steane [68]).

In the case of the toric code, the subspaces $F^{(z)}$, $F^{(x)}$, $F^{(z)}_+$, $F^{(x)}_+$ have a topological interpretation. Vectors of the form $(0, \beta_1, \dots, 0, \beta_n) \in G^n$ can be regarded as 1-chains, i.e., formal linear combinations of edges with coefficients $\beta_1, \dots, \beta_n \in \mathbb{Z}_2$. The basis elements of $F^{(z)}$ are the boundaries of 2-cells. Therefore $F^{(z)}$ is the space of 1-boundaries. Likewise, vectors of the form $(\alpha_1, 0, \dots, \alpha_n, 0)$ are regarded as 1-cochains; $F^{(x)}$ is the space of 1-coboundaries.

Let us take an arbitrary element $g \in F_+$, $g = g^{(x)} + g^{(z)}$. The commutativity between $g$ and $F$ can be written as follows:

$$\omega(f^x_s, g^{(z)}) = 0, \qquad \omega(f^z_u, g^{(x)}) = 0 \qquad \text{for each vertex } s \text{ and face } u.$$

To satisfy $\omega(f^x_s, g^{(z)}) = 0$ it is necessary that each star contain an even number of edges from $g^{(z)}$. In other words, $g^{(z)}$ is a 1-cycle. Analogously, $g^{(x)}$ must be a 1-cocycle.

Thus, the spaces $F^{(z)}_+$, $F^{(x)}_+$ consist of 1-cycles and 1-cocycles (with $\mathbb{Z}_2$ coefficients), and the spaces $F^{(z)}$, $F^{(x)}$ consist of 1-boundaries and 1-coboundaries. The sets $F^{(z)}_+ \setminus F^{(z)}$ and $F^{(x)}_+ \setminus F^{(x)}$ are formed by cycles and cocycles that are not homologous to 0. Consequently the code distance is the minimum size (the number of nonzero coefficients) of such a cycle or cocycle. It is easy to see that this minimum equals $r$. This shows that the toric code corrects $\lfloor (r-1)/2 \rfloor$ errors.
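The counting and the distance argument can be reproduced on a small torus. The sketch below assumes a particular edge numbering (horizontal edge $h(i,j)$ from vertex $(i,j)$ to $(i,j+1)$, vertical edge $v(i,j)$ from $(i,j)$ to $(i+1,j)$, indices modulo $r$, with $r \ge 2$); this numbering and all function names are choices made here, not taken from the text. It checks that stars and face boundaries commute, that $\dim F = 2r^2 - 2$, and that the shortest 1-cycle which is not a 1-boundary has length $r$.

```python
from itertools import combinations

def toric_masks(r):
    """Edge bitmasks of all vertex stars and all face boundaries on the r x r torus."""
    def h(i, j):
        return 2 * ((i % r) * r + (j % r))        # edge (i, j) -> (i, j+1)
    def v(i, j):
        return 2 * ((i % r) * r + (j % r)) + 1    # edge (i, j) -> (i+1, j)
    def mask(edges):
        return sum(1 << e for e in set(edges))
    stars, faces = [], []
    for i in range(r):
        for j in range(r):
            stars.append(mask([h(i, j), h(i, j - 1), v(i, j), v(i - 1, j)]))
            faces.append(mask([h(i, j), h(i + 1, j), v(i, j), v(i, j + 1)]))
    return stars, faces

def rank_gf2(masks):
    """Rank over F_2 of a list of bitmasks (XOR basis)."""
    basis = []
    for x in masks:
        for b in basis:
            x = min(x, x ^ b)
        if x:
            basis.append(x)
    return len(basis)

def parity(x):
    return bin(x).count("1") % 2

def toric_code_summary(r):
    """Return (stars commute with faces, dim F, length of the shortest nontrivial 1-cycle)."""
    stars, faces = toric_masks(r)
    n = 2 * r * r
    commute = all(parity(s & f) == 0 for s in stars for f in faces)
    face_rank = rank_gf2(faces)
    dim_F = rank_gf2(stars) + face_rank
    for w in range(1, n + 1):
        for edges in combinations(range(n), w):
            g = sum(1 << e for e in edges)
            is_cycle = all(parity(g & s) == 0 for s in stars)       # g lies in F_+^(z)
            if is_cycle and rank_gf2(faces + [g]) > face_rank:      # g is not in F^(z)
                return commute, dim_F, w
    return commute, dim_F, None

print(toric_code_summary(3))
```

The search is only over the $z$-part; by the symmetry between the lattice and its dual the same minimum is expected for cocycles.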

Remark 15.4. The family of toric codes (with $r = 1, 2, \dots$) provides an example of local check codes. Specifically, the following conditions are satisfied:

– each check operator acts on a uniformly bounded number of qubits;

– each qubit enters a uniformly bounded number of check operators;

– the family contains codes with arbitrarily large distance.

Such codes are interesting in that syndrome measurement (an important part of error correction; see below) can be realized by a constant depth circuit. Therefore an error in the execution of this circuit will affect only a bounded number of qubits, which is a useful property for fault-tolerant computation.

[Fig. 15.2. Error correction procedure for symplectic codes. The diagram shows the codevector $|\xi\rangle$ acted upon by the error $X = \sigma(g)$, giving $|\psi\rangle$; then the measurement of the syndrome; then the choice of some $g'$ with $\mathrm{syndrome}(g') = \mathrm{syndrome}(g)$, so that $g' = g + f$, $f \in F$; and finally the application of $\sigma(g')^\dagger$, which leaves $\sigma(f)|\xi\rangle$, i.e., $|\xi\rangle$ up to a phase factor.]

15.10. Error correction for symplectic codes. Definition 15.4 and Theorem 15.3 indicate only an abstract possibility for restoring the initial codevector. We will show how to realize an error correction procedure for a symplectic code $\mathcal{M} = \mathrm{SympCode}(F, \mu)$.

We examine a special case where the error is a $\sigma$-operator, $W = \sigma(g)$. Let $X_j = (-1)^{\mu_j}\sigma(f_j)$ be the check operators, and $F$ the corresponding isotropic subspace. The sequence of bits $\mathrm{syndrome}(g) = (\omega(f_1, g), \dots, \omega(f_s, g))$ is called the syndrome of $g$. Each of these bits can be measured by measuring the eigenvalue of $X_j$ on the quantum state $|\psi\rangle = W|\xi\rangle$ (in fact, $X_j|\psi\rangle = (-1)^{\omega(f_j, g)}|\psi\rangle$). The measurement of one bit does not change the values of the others, because the check operators commute.

Suppose that $|g| \le k$ and the code corrects $k$ errors (i.e., $d(\mathcal{M}) > 2k$). Is it possible to reconstruct the error from its syndrome? Two errors $g, g' \in G^n$ have the same syndrome if and only if $g' - g \in F_+$. The error correction property of the code implies that

$$\forall\, g, g' \quad \bigl( |g| \le k,\ |g'| \le k \bigr) \Longrightarrow \bigl( (g' - g \in F) \vee (g' - g \notin F_+) \bigr).$$

Therefore we may take a different error $g'$ for $g$, but only if $g' - g \in F$.

It is now clear how we should correct errors. After the syndrome is determined, we reconstruct the error (up to an element $f = g' - g \in F$) and apply the operator which is inverse to the operator of the supposed error. Thus we obtain a state that differs from the initial one by a phase factor, $\sigma(g')^{-1}\sigma(g)|\xi\rangle = c\,|\xi\rangle$.


We have considered the case of an error of type $\sigma(g)$. But, actually, it is required that an error-correcting transformation protect against all superoperators of the form

$$T = \sum_{|h| \le k,\ |h'| \le k} b_{h,h'}\, \sigma(h) \cdot \sigma(h')^\dagger.$$

As an exercise the reader is encouraged to verify how the above procedure works in this general case.
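For errors of the form $\sigma(g)$, the procedure of this subsection amounts to a table lookup. The following sketch uses the Shor code with $r = 3$ and $k = 1$ as the example, because several one-qubit errors there share a syndrome; the check operators are those of conditions 1 and 2 in Section 15.8, and every helper name is an illustrative assumption. The function `residual` returns the index of $\sigma(g')^{-1}\sigma(g)$ up to a phase, which has to lie in $F$.

```python
r = 3
n = r * r

def pauli(x_qubits=(), z_qubits=()):
    """Vector (alpha_1, beta_1, ..., alpha_n, beta_n) with the given x and z supports."""
    g = [0] * (2 * n)
    for q in x_qubits:
        g[2 * q] = 1
    for q in z_qubits:
        g[2 * q + 1] = 1
    return tuple(g)

def omega(f, g):
    """Symplectic form of (15.14)."""
    return sum(f[2 * j] * g[2 * j + 1] - g[2 * j] * f[2 * j + 1] for j in range(n)) % 2

def add(f, g):
    return tuple((a + b) % 2 for a, b in zip(f, g))

# Check vectors of the Shor code (qubit (j, k) has index j*r + k, zero-based).
CHECKS = [pauli(z_qubits=(j * r + k, j * r + k + 1))
          for j in range(r) for k in range(r - 1)]
CHECKS += [pauli(x_qubits=[row * r + k for row in (j, j + 1) for k in range(r)])
           for j in range(r - 1)]

def syndrome(g):
    """The bits omega(f_j, g), one per check operator."""
    return tuple(omega(f, g) for f in CHECKS)

# The isotropic subspace F spanned by the check vectors.
F = {pauli()}
for f in CHECKS:
    F |= {add(f, h) for h in F}

# All errors sigma(g) with |g| <= 1.
single = [pauli()]
for q in range(n):
    single += [pauli(x_qubits=(q,)), pauli(z_qubits=(q,)),
               pauli(x_qubits=(q,), z_qubits=(q,))]

# Lookup table: syndrome -> one representative error g' of weight <= 1.
table = {}
for g in single:
    table.setdefault(syndrome(g), g)

def residual(g):
    """g + g', where g' is the table entry for the syndrome of g."""
    return add(g, table[syndrome(g)])

all_corrected = all(residual(g) in F for g in single)
print(all_corrected, len(single), len(table))
```

The table has fewer entries than there are errors: for instance, $\sigma^z$ on two qubits of the same row gives the same syndrome, and the two errors differ by an element of $F$, so either guess restores the codevector.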

[3] Problem 15.3. Construct a polynomial algorithm for reconstructing an error from its syndrome for the toric code.

15.11. Anyons (an example based on the toric code). Using the construction of the toric code, we will try to give a better idea of Abelian anyons mentioned in the Introduction (non-Abelian anyons are considerably more complicated).

Once again, we consider a square lattice on the torus (or on the plane; now we are only interested in a region with trivial topology). As earlier, associated to each vertex $s$ and each face $u$ are the operators

$$A^x_s = \prod_{j \in \mathrm{star}(s)} \sigma^x_j, \qquad A^z_u = \prod_{j \in \mathrm{boundary}(u)} \sigma^z_j.$$

The codevectors are characterized by the conditions $A^x_s|\xi\rangle = |\xi\rangle$, $A^z_u|\xi\rangle = |\xi\rangle$. There is a different way to impose the same conditions. Consider the following Hamiltonian, i.e., the Hermitian operator

$$H = \sum_s (I - A^x_s) + \sum_u (I - A^z_u). \tag{15.26}$$

This operator is nonnegative, and its null space coincides with the code subspace of the toric code. Thus, the vectors of the code subspace are the ones that possess the minimal energy (i.e., they are the eigenvectors corresponding to the smallest eigenvalue of the Hamiltonian). In physics, such states are called ground states, and vectors in the orthogonal complement are called excited states.

Excited states can be classified by the set of conditions they violate. Specifically, the states violating a particular set of conditions form a subspace; such subspaces form an orthogonal decomposition of the total state space. Note that the number of violated conditions of each type is even since $\prod_s A^x_s = \prod_u A^z_u = I$.

Consider an excited state $|\eta\rangle$ with the smallest nonzero energy. Such a state violates precisely two conditions, for instance, at two vertices, $s$ and $p$. (In fact, the states violating different pairs of conditions may form linear combinations, but we will assume that $s$ and $p$ are fixed.) Then for these particular vertices

$$A^x_s|\eta\rangle = -|\eta\rangle, \qquad A^x_p|\eta\rangle = -|\eta\rangle,$$

whereas the conditions with the "+" sign for the other vertices remain in force. We say that in the state $|\eta\rangle$ there are two quasiparticles (elementary excitations) located at the vertices $s$ and $p$. Thus a quasiparticle is a mental device for classifying excited states. It is a special property of Hamiltonian (15.26) that states with certain quasiparticle positions are also eigenstates. However, the classification of low energy excited states by quasiparticle positions, though approximate, works amazingly well for most physical media [footnote 15].

How can we get the state $|\eta\rangle$ from the ground state $|\xi\rangle$? We join $p$ and $s$ by a lattice path $C_1$ (see Figure 15.3a) and act on $|\xi\rangle$ by the operator $W = \prod_{j \in C_1} \sigma^z_j$. This operator commutes with $A^x_k$ for all internal vertices of the path $C_1$, but at the ends they anti-commute: $W A^x_s = -A^x_s W$, $W A^x_p = -A^x_p W$. We set $|\eta\rangle = W|\xi\rangle$ and show that $|\eta\rangle$ satisfies the required properties. For the vertex $s$ (and analogously for $p$) we have

$$A^x_s|\eta\rangle = A^x_s W|\xi\rangle = -W A^x_s|\xi\rangle = -W|\xi\rangle = -|\eta\rangle.$$

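Fig. 15.3. Creating pairs of quasiparticles (a) and moving them around (b). [Panel a) shows vertices $s$, $p$, a face $u$, and the paths $C_1$, $C_2$; panel b) shows $s$, $p$, $u$ and the closed path $C_0$.]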

An arbitrary state of the system can be described as a set of quasiparticles of two types, one of which "lives" on vertices, the other on faces. Mathematically, a quasiparticle is simply a violated code condition, but now we think of it as a physical object. Particles-excitations can move, be created and annihilated. A pair of vertex quasiparticles is created by the action of the operator $W$; a pair of face quasiparticles is created by the operator $V = \prod_{j \in C_2} \sigma^x_j$, where $C_2$ is a path on the dual lattice.

Footnote 15: The position uncertainty is much larger than the lattice spacing, but much smaller than the distance between the particles. We cannot go into more details here. Thus, the reader's acquaintance with conventional quasiparticles (such as electrons and holes in a semiconductor, or spin waves) would be very helpful.

What will happen if we move a face quasiparticle around a vertex quasiparticle (see Figure 15.3b)? The initial state $|\psi\rangle$ contains two quasiparticles of each type. The movement of the face quasiparticle around a closed path $C_0$ is expressed by the operator $U = \prod_{j \in C_0} \sigma^x_j = \prod_s A^x_s$, where $s$ runs over all vertices inside $C_0$. It is obvious that $A^x_k|\psi\rangle = |\psi\rangle$ for all vertices $k \neq p$ inside $C_0$. As a result we get

$$U|\psi\rangle = \prod_{j \in C_0} \sigma^x_j |\psi\rangle = A^x_p|\psi\rangle = -|\psi\rangle.$$

Thus the state vector gets multiplied by $-1$. This indicates some sort of long range interaction between the particles: the moving particle somehow "knows" about the second particle without ever touching it! However, the interaction is purely topological: the state evolution depends only on the isotopy class of the braid the particle world lines form in space-time. In the case at hand, the evolution is just the multiplication by a phase factor; such particles are called Abelian anyons.

On the torus we can move particles over two different cycles that form a basis of the homology group. For instance, create a pair of particles from the ground state, move one of them around a cycle, and annihilate it with the second one. Now it becomes important that the ground state is not unique. Recall that there is a 4-dimensional space of ground states, the code subspace. The process we have just described effects an operator acting on this subspace. We can think of four different operators of this kind: $Z_1, Z_2$ are products of $\sigma^z_j$ over the basis cycles (they correspond to moving a vertex quasiparticle), whereas $X_1, X_2$ are products of $\sigma^x_j$ over the homologous cycles on the dual lattice. The commutation relations between these operators are as follows:

$$(15.27)\qquad \begin{aligned} Z_j Z_k &= Z_k Z_j, & X_j X_k &= X_k X_j \quad (j, k = 1, 2),\\ X_1 Z_1 &= Z_1 X_1, & X_2 Z_2 &= Z_2 X_2,\\ X_1 Z_2 &= -Z_2 X_1, & X_2 Z_1 &= -Z_1 X_2. \end{aligned}$$

Thus we can identify the operators $Z_1$ and $X_2$ with $\sigma^z$ and $\sigma^x$ acting on one encoded qubit. Correspondingly, $Z_2$ and $X_1$ act on the second encoded qubit.
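The commutation relations (15.27) and the string-operator argument can be checked mechanically on a small torus. The sketch below is only an illustration under stated assumptions: the lattice size `L = 3`, the edge labelling `h(i, j)` / `v(i, j)`, and the particular choice of cycles are ours, not the book's. It uses the standard fact that a product of $\sigma^x$ over an edge set and a product of $\sigma^z$ over another edge set anticommute exactly when the two sets share an odd number of edges.

```python
L = 3  # linear size of the torus (assumption; any L >= 3 behaves the same here)

def h(i, j):
    """Horizontal edge from vertex (i, j) to vertex (i, j+1), indices mod L."""
    return ("h", i % L, j % L)

def v(i, j):
    """Vertical edge from vertex (i, j) to vertex (i+1, j), indices mod L."""
    return ("v", i % L, j % L)

def star(i, j):
    """Edges meeting at vertex (i, j): support of the sigma^x product A^x_s."""
    return {h(i, j), h(i, j - 1), v(i, j), v(i - 1, j)}

def boundary(i, j):
    """Edges around face (i, j): support of the sigma^z product A^z_u."""
    return {h(i, j), h(i + 1, j), v(i, j), v(i, j + 1)}

def anticommute(x_edges, z_edges):
    """A sigma^x string and a sigma^z string anticommute iff the overlap is odd."""
    return len(x_edges & z_edges) % 2 == 1

# sigma^z products over two basis cycles of the lattice
Z1 = {h(0, j) for j in range(L)}
Z2 = {v(i, 0) for i in range(L)}
# sigma^x products over the homologous cycles of the dual lattice
X1 = {v(0, j) for j in range(L)}
X2 = {h(i, 0) for i in range(L)}

# String operator W: sigma^z along the lattice path (0,0) -> (0,1) -> (0,2)
W = {h(0, 0), h(0, 1)}
violated = {(i, j) for i in range(L) for j in range(L)
            if anticommute(star(i, j), W)}
```

Here `violated` comes out as the two endpoints of the path, matching the statement that $W$ creates a pair of vertex quasiparticles.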

Part 3

Solutions

In this part we offer either complete solutions to problems or hints which the interested reader can use to work out a rigorous solution.

S1. Problems of Section 1

1.1. The idea is simple: the machine moves symbols alternately from left to right and from right to left until it reaches the center of the input string, at which point it stops.

Now we give a formal description of this machine.

We assume that the external alphabet $A$ is $\{0, 1\}$. The alphabet $S = \{\sqcup, 0, 1, *, 0', 1'\}$ consists of the symbols of the external alphabet, the empty symbol $\sqcup$, and three auxiliary marks used to indicate the positions from which a symbol is taken and a new one should be dropped.

The set of states is

$$Q = \{q_0, q_f, r_0, r_1, l_0, l_1, l_0', l_1'\}.$$

The letters $r$ and $l$ indicate the direction of motion, and the subscripts at these letters refer to symbols being transferred.

Now we describe the transition function.

Beginning of work:

$$(q_0, 0) \mapsto (r_0, *, +1), \qquad (q_0, 1) \mapsto (r_1, *, +1),$$
$$(q_0, \sqcup) \mapsto (q_0, \sqcup, -1).$$

The first line indicates that the machine places a mark in the first position and moves the symbol that was there to the right. The second line indicates that the machine stops immediately at the empty symbol.


Transfer to the right:

$$(r_0, 0) \mapsto (r_0, 0, +1), \qquad (r_1, 0) \mapsto (r_1, 0, +1),$$
$$(r_0, 1) \mapsto (r_0, 1, +1), \qquad (r_1, 1) \mapsto (r_1, 1, +1).$$

The machine moves to the right until it encounters the end of the input string or a mark.

A change in the direction of motion from right to left consists of two actions: remove the mark (provided this is not the empty symbol)

$$(r_0, 0') \mapsto (l_0', 0, -1), \qquad (r_1, 0') \mapsto (l_1', 0, -1),$$
$$(r_0, 1') \mapsto (l_0', 1, -1), \qquad (r_1, 1') \mapsto (l_1', 1, -1),$$
$$(r_0, \sqcup) \mapsto (l_0', \sqcup, -1), \qquad (r_1, \sqcup) \mapsto (l_1', \sqcup, -1)$$

and place it in the left adjacent position

$$(l_0', 0) \mapsto (l_0, 0', -1), \qquad (l_1', 0) \mapsto (l_0, 1', -1),$$
$$(l_0', 1) \mapsto (l_1, 0', -1), \qquad (l_1', 1) \mapsto (l_1, 1', -1).$$

Transfer to the left:

$$(l_0, 0) \mapsto (l_0, 0, -1), \qquad (l_1, 0) \mapsto (l_1, 0, -1),$$
$$(l_0, 1) \mapsto (l_0, 1, -1), \qquad (l_1, 1) \mapsto (l_1, 1, -1).$$

Change of direction from left to right:

$$(l_0, *) \mapsto (q_0, 0, +1), \qquad (l_1, *) \mapsto (q_0, 1, +1).$$

The completion of work depends on the parity of the word length: for even length, the machine stops at the beginning of the motion to the right

$$(q_0, 0') \mapsto (q_f, 0, -1), \qquad (q_0, 1') \mapsto (q_f, 1, -1),$$

and for odd length, at the beginning of the motion to the left

$$(l_0', *) \mapsto (q_f, 0, -1), \qquad (l_1', *) \mapsto (q_f, 1, -1).$$

The transition function is undefined for the state $q_f$; therefore the machine stops after switching to this state.
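The transition table above is small enough to execute directly. The following sketch is our own illustration, not part of the book: the string names of states, the blank symbol `"_"`, and the convention that the machine halts when the head steps left of cell 0 are assumptions. Running it on short words suggests that the machine writes the input in reversed order.

```python
BLANK = "_"  # stands for the empty symbol (naming is an assumption)

def build_delta():
    """Transition function of Solution 1.1 as a dict: (state, symbol) -> (state, symbol, move)."""
    d = {("q0", BLANK): ("q0", BLANK, -1)}
    for x in "01":
        d[("q0", x)] = ("r" + x, "*", +1)               # beginning of work
        d[("r" + x, BLANK)] = ("l" + x + "'", BLANK, -1)  # right end reached
        d[("l" + x, "*")] = ("q0", x, +1)               # change of direction, left to right
        d[("q0", x + "'")] = ("qf", x, -1)              # completion, even length
        d[("l" + x + "'", "*")] = ("qf", x, -1)         # completion, odd length
        for y in "01":
            d[("r" + x, y)] = ("r" + x, y, +1)          # transfer to the right
            d[("r" + x, y + "'")] = ("l" + x + "'", y, -1)  # remove the mark
            d[("l" + x + "'", y)] = ("l" + y, x + "'", -1)  # drop x with a mark, pick up y
            d[("l" + x, y)] = ("l" + x, y, -1)          # transfer to the left
    return d

def run(word, max_steps=100_000):
    """Simulate the machine on `word`; return the final tape contents without blanks."""
    delta = build_delta()
    tape = list(word) + [BLANK]
    state, head = "q0", 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in delta:      # undefined transition (e.g. state qf): halt
            break
        state, tape[head], move = delta[key]
        head += move
        if head < 0:              # stepping off the left end: halt (assumed convention)
            break
        if head == len(tape):
            tape.append(BLANK)
    return "".join(tape).rstrip(BLANK)
```

For instance, `run("011")` leaves `"110"` on the tape.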

1.2. Schematically, this is done as follows: to the second summand we add one by one the bits of the first summand, the added bit being erased. Adding one bit takes time that does not exceed the binary length of the second summand, so that the total working time of the machine depends quadratically on the length of the input.

1.3. The proof is by contradiction. Suppose that there is such an algorithm, i.e., that there exists a machine $B$ which, for the input $([M], x)$, gives the answer "yes" if the machine $M$ stops at input $x$ and gives the answer "no" otherwise. (Recall that $[M]$ denotes a description of the machine $M$.)

Let us define another machine $B'$ that, given an input $y$, simulates the work of $B$ for the input $(y, y)$. If the answer of the machine $B$ is "yes", then $B'$ begins moving the head to the right and does not stop. If the answer of $B$ is "no", then $B'$ stops.

Does $B'$ stop for the input $[B']$?

If we suppose that it stops, then $B$ gives the answer "yes" for the input $([B'], [B'])$. Then, by definition of the machine $B'$, it does not stop for the input $[B']$. This is exactly the opposite of our assumption.

If $B'$ does not stop for the input $[B']$, then $B$ gives the answer "no" for the input $([B'], [B'])$. But this implies that $B'$ stops for the input $[B']$, a contradiction.

Remark S1.1. This kind of proof is fairly common in mathematical logic; it is often called diagonalization. The idea was first used by Cantor to show that the set of real numbers (or infinite 0-1 sequences) is uncountable. We remind the reader of Cantor's argument to explain the name "diagonalization". Suppose that all 0-1 sequences are counted (i.e., assigned numbers $0, 1, 2, \dots$), so that we can think of them as rows of an infinite table,

$$\begin{array}{cccc} x_{00} & x_{01} & x_{02} & \dots\\ x_{10} & x_{11} & x_{12} & \dots\\ x_{20} & x_{21} & x_{22} & \dots\\ \vdots & \vdots & \vdots & \ddots \end{array}$$

Let us look at the diagonal of this table and build the sequence

$$y = (1 - x_{00},\ 1 - x_{11},\ 1 - x_{22},\ \dots).$$

It is clear that this sequence cannot be a row of the original table, which proves that all 0-1 sequences cannot be counted.

Note, however, that the unsolvability of the halting problem is a bit more subtle: the proof is based on the existence of a universal Turing machine (we used it implicitly when constructing $B'$).
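A finite analogue of Cantor's construction can be written in a few lines. This is only an illustrative sketch (the function name `diagonal` and the restriction to a square finite table are our assumptions): the flipped diagonal differs from row $i$ in position $i$, so it cannot coincide with any row.

```python
def diagonal(table):
    """Given an n x n table of bits, return the sequence (1 - x_00, 1 - x_11, ...)."""
    return [1 - table[i][i] for i in range(len(table))]

# A small hypothetical table of 0-1 rows.
table = [
    [0, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
y = diagonal(table)
```

By construction `y[i] != table[i][i]` for every `i`, which is exactly the step that rules out `y` being one of the rows.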

1.4. First of all, we show that the elements of an enumerable set can actually be produced one by one by an algorithmic process. Suppose that $X$ is the set of all possible outputs of a Turing machine $E$. Let us try all pairs of the form $(x, n)$ (where $x$ is a string, and $n$ is a natural number) and simulate the first $n$ steps of $E$ for the input $x$. If $E$ terminates during the first $n$ steps, we include its output in the list; otherwise we proceed to the next pair $(x, n)$. This way, all elements of $X$ are included (possibly, with repetitions).


Theorem S1.1. A partial function $F : A^* \to \{0, 1\}$ is computable if and only if the sets $X_0 = \{x : F(x) = 0\}$ and $X_1 = \{x : F(x) = 1\}$ are both enumerable.

Proof. Suppose that $X_0$ and $X_1$ are enumerable. Given an input string $y \in X_0 \cup X_1$, we run the enumerating processes for $X_0$ and $X_1$ in parallel. Sooner or later, $y$ will be produced by one of the processes. If it is the process for $X_0$, we announce 0 as the result, otherwise the result is 1.

Conversely, if $F$ is computable, then there is a TM that presents its input $x$ as the output if $F(x) = 0$, and runs forever if $F(x)$ is 1 or undefined. Therefore $X_0$ is enumerable. (Similarly, $X_1$ is enumerable.) $\square$

Now, let us turn to the original problem. We are interested in the first set of these two:

$$X_0 = \{[M] : M \text{ does not halt for the empty input}\},$$
$$X_1 = \{[M] : M \text{ halts for the empty input}\}.$$

Note that the second set, $X_1$, is enumerable. Indeed, we can construct a machine $E$ that, given an input $x = [M]$, simulates $M$ and outputs the same $x$ when the simulation ends. (If $M$ does not stop, or if the input string is not a valid description of a TM, then $E$ runs forever.) Therefore, if $X_0$ were also enumerable, there would exist an algorithm for determining whether a given Turing machine $M$ halts for the empty input.

But then the halting problem (for a Turing machine $T$ and an arbitrary input $x$) would also be solvable. Indeed, for each pair $([T], x)$ a machine $M$ that first writes $x$ on the tape and then simulates the work of $T$ is easily constructed. We have arrived at a contradiction: according to Problem 1.3, there is no algorithm for the solution of the halting problem.

1.5. The idea of a solution is as follows. Let $b$ be an arbitrary computable function. For any $n$ there exists a machine $M_n$ that writes $n$ on the tape, then computes $n\,b(n)$ and counts down from $n\,b(n)$ to zero. This machine has $O(\log n)$ states and a constant number of symbols. (It is easy to see that $O(\log n)$ states are enough to write $n$ in binary form on the tape.)

1.6. We describe briefly a single-tape machine $M_1$ that simulates a two-tape machine $M_2$. The alphabet of $M_1$ is rather large: one symbol encodes four symbols of $M_2$, as well as four additional bits. The symbol in the $k$-th cell of $M_1$ represents the symbols in the $k$-th cells on the input tape, the output tape and both work tapes of $M_2$; the additional bits are used to mark the places where the heads of $M_2$ are located. The control device of $M_1$ keeps the state of the control device of $M_2$ and some additional information.

The machine $M_1$ works in cycles. Each cycle imitates a single step of $M_2$. At the beginning of each cycle the head of $M_1$ is located above the leftmost cell. Each cycle consists of two passes. First $M_1$ moves from left to right until it finds all of the four marks (i.e., the heads of $M_2$); the contents of the corresponding cells are stored in the control device. On the way back actions imitating one step of $M_2$ are carried out. Each such action requires $O(1)$ steps (and a finite amount of memory in the control device).

Each cycle takes $O(s)$ steps of the machine $M_1$, where $s$ is the length of the used portion of the tape. Since $s \le T(n) + 1$, the machine $M_1$ works in time $O(sT(n)) = O(T^2(n))$.

1.7. The main problem in efficient simulation of a multitape TM on an ordinary TM is that the heads of the simulated machine may be far from each other. Therefore the simulating head must move back and forth between them to imitate a single step of the multitape TM.

However, if we have a second work tape, it can be used to move blocks of size $n$ by distance $m$ along the first tape in $O(n + m)$ steps. Indeed, we can copy the block onto the second tape, then move the head on the first tape and then copy the block back. Therefore we can build a "cache" that contains neighborhoods of the heads aligned according to head positions:

[Diagram: two tapes with cells $a_1, \dots, a_9$ and $b_1, \dots, b_9$ (head positions marked) are mapped to a cache in which the block $a_1, \dots, a_5$ is aligned with the block $b_5, \dots, b_9$.]

After simulating several computation steps in the cache, we can copy the result (changed cache contents) back. Specifically, to simulate $t$ steps, we need to copy the $t$-neighborhoods of the heads (of size $2t + 1$). We call it a $t$-cache.

To get a bound $T(n) \log T(n)$, we need to use recursion and multilevel caching (by powers of 2). Suppose we have already simulated $T = 2^k$ steps of the 3-tape machine $M_3$ on a 2-tape machine $M_2$ and want to continue the simulation for another $T$ steps. The current state of $M_3$ is represented by the first $T + 1$ cells on the first tape of $M_2$. We extend this "main memory" to size $2T + 1$ and perform the simulation in two portions, using a $T/2$-cache. To carry out the computation in this cache, we use a $T/4$-cache, and so on. (All caches are allocated on the first tape.)

[Tape layout: main memory ($T$-cache) | $T/2$-cache | $T/4$-cache | $\cdots$ | 1-cache]

Thus, each level of recursion consists in simulating $t$ steps of $M_3$ in the $t$-cache. This is done by a procedure $F$ that consists of the following operations:

1. copy the $t/2$-neighborhoods of the heads into the $t/2$-cache;
2. simulate $t/2$ steps of $M_3$ recursively (by applying the same procedure $F$ to the $t/2$-cache);
3. copy the result back to the $t$-cache;
4. copy the $t/2$-neighborhoods of the new head positions into the $t/2$-cache;
5. simulate the remaining $t/2$ steps;
6. write the result back.

To implement the recursion, we augment each cache by a special cell that indicates the operation being done at the current level. Caches are allocated as they are needed, and freed (and filled with blanks) when returning to the previous recursion level. (This is a standard implementation of recursion using a stack.)

The recurrence relation $T(t) \le 2T(t/2) + O(t)$ implies $T(t) = O(t \log t)$.
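The growth rate claimed by the recurrence can be checked numerically. The sketch below makes two simplifying assumptions that are not in the text: the $O(t)$ term is taken to be exactly $t$, and the base case costs 1. Under these assumptions the recurrence solves exactly to $t(\log_2 t + 1)$ for powers of two.

```python
def cost(t):
    """Cost of simulating t steps when cost(t) = 2*cost(t/2) + t and cost(1) = 1 (assumed constants)."""
    if t == 1:
        return 1
    return 2 * cost(t // 2) + t

# For t = 2**k the solution is t * (k + 1), i.e. O(t log t).
values = {k: cost(2 ** k) for k in range(11)}
```

For example, `values[3]` is `32`, which equals $8 \cdot (3 + 1)$.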

1.8. Loosely speaking, to copy a string of length $n$ we need to move $n$ bits by distance $n$. Since a TM has a finite number of states, it carries only $O(1)$ bits by distance 1 at each step; therefore $\Omega(n^2)$ steps are needed.

Here is a more formal argument. For each $k$ we consider a crossing sequence at boundary $k$. It is the sequence of states of the TM during its moves from the $k$-th cell to the $(k + 1)$-th cell. Note that the behavior of the TM in the zone to the right of the boundary (cells $k + 1$, $k + 2$, etc.) is determined by the initial contents of that zone and the crossing sequence. (The only information that flows into the zone is carried by the head and is recorded in the crossing sequence. Note also that we do not worry how long the head stays outside the zone and do not include this information in the crossing sequence.)

Different input strings should generate different crossing sequences. Therefore, most crossing sequences are long (of size $\Omega(n)$). Since there are $\Omega(n)$ possible values of $k$, we have $\Omega(n)$ crossing sequences of length $\Omega(n)$. Thus the sum of their lengths is $\Omega(n^2)$, which is a lower bound for the computation time.

Here are the details. For simplicity we assume that $n$ is even ($n = 2m$) and consider inputs of the form $x = v0^m$, where $v$ is a binary string of length $m$. Let $k$ be a number between $m$ and $2m$, and let $Q(v, k)$ be the crossing sequence for the computation on $v0^m$ at boundary $k$. As we have said, different strings $v$ lead to different crossing sequences (otherwise different strings would have identical copies); therefore there are at least $2^m$ crossing sequences. Since the number of states is $O(1)$, some crossing sequences have length $\Omega(m)$. Moreover, it is easy to see that the average length of the crossing sequence $Q(v, k)$ (taken over all strings $v \in \{0, 1\}^m$ for a fixed $k$) is $\Omega(m)$. Therefore, the average value of

$$\sum_{m \le k \le 2m} \bigl(\text{length of } Q(v, k)\bigr)$$

is $\Omega(m^2)$. But for each $v$ this sum does not exceed the computation time; therefore the average is also a lower bound for the computation time.

On the other hand, it is easy to construct a Turing machine $M$ which will duplicate some strings (e.g., $0^n$) in time $T'(n) = O(n \log n)$. First $M$ checks whether the input consists only of zeros. If this is the case, $M$ counts them and then produces the same number of zeros. (To count zeros, we need a portable counter of size $O(\log n)$, which is carried along the tape.) If the input has nonzero symbols, then $M$ just copies it in the usual way (in $O(n^2)$ steps). But the minimal time is still $O(n \log n)$. One can check that this bound is tight (i.e., $\Omega(n \log n)$ is a lower bound).

1.9. Hint. We need to simulate an arbitrary TM. Since the values of the variables can be arbitrarily large, we can store the entire tape of the machine in one variable (the tape content is the $|S|$-ary representation of some number, where $S$ is the machine alphabet). The head position is another integer variable, and the state of the control device is yet another.

Changes in these variables after one computation step are described in terms of simple arithmetic operations (addition, multiplication, exponentiation, division, remainder) and comparison of numbers. All these operations can be reduced to the increment and decrement statements and the comparison with 0 (using several auxiliary variables).

The transition table becomes a nested if-then-else construct.


2.1. Let us find all functions in two variables that can be expressed by formulas in the basis $\mathcal{A}$. We begin with two projections, namely, the functions $p_1(x, y) = x$ and $p_2(x, y) = y$. Then the following procedure is applied to the set $F$ of already constructed functions. We add to the set $F$ all functions of the form $f\bigl(g_1(x_1, x_2), g_2(x_3, x_4), \dots, g_k(x_{2k-1}, x_{2k})\bigr)$, where $x_j \in \{x, y\}$, $g_j \in F$, $f \in \mathcal{A}$. If the set $F$ increases, we repeat the procedure. Otherwise there are two possibilities: either we have obtained all functions in two variables (then the basis is complete), or not (then the basis is not complete).

We estimate the working time of this algorithm. Only 16 Boolean functions in two variables exist; therefore the set $F$ can be enlarged at most 14 times. At each step we must check at most $16^m \cdot |\mathcal{A}|$ possibilities, where $m$ is the maximum number of arguments of a basis function. Indeed, for each basis function $f$ each of (at most) $m$ positions can be occupied by any function from $F$, and $|F| \le 16$. The length of the input (encoded basis) is at least $2^m$ (because the table for a function in $m$ Boolean variables has $2^m$ entries). So, the working time of the algorithm is polynomially bounded in the length of the input.
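The closure procedure described in this solution is easy to prototype. In the sketch below, which is an illustration rather than the book's encoding, a function of two variables is stored as its truth table on the four inputs $(0,0), (0,1), (1,0), (1,1)$, and a basis is given as a list of hypothetical `(arity, python_function)` pairs instead of encoded value tables.

```python
from itertools import product

INPUTS = list(product((0, 1), repeat=2))  # the four assignments (x, y)

def closure(basis):
    """Return the set of truth tables of all two-variable functions expressible in `basis`."""
    # Start with the two projections p1(x, y) = x and p2(x, y) = y.
    funcs = {tuple(x for x, _ in INPUTS), tuple(y for _, y in INPUTS)}
    changed = True
    while changed:                      # repeat while the set keeps growing
        changed = False
        for arity, f in basis:
            # Every argument position may be occupied by any function already in the set.
            for args in product(list(funcs), repeat=arity):
                table = tuple(f(*(g[i] for g in args)) for i in range(4))
                if table not in funcs:
                    funcs.add(table)
                    changed = True
    return funcs

def is_complete(basis):
    """The basis is complete iff all 16 functions of two variables are obtained."""
    return len(closure(basis)) == 16

NAND = (2, lambda a, b: 1 - (a & b))
AND = (2, lambda a, b: a & b)
OR = (2, lambda a, b: a | b)
NOT = (1, lambda a: 1 - a)
XOR = (2, lambda a, b: a ^ b)
```

With these definitions `is_complete([NAND])` holds, while `[AND, OR]` only produces the four monotone functions $x$, $y$, $x \wedge y$, $x \vee y$.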

2.2. An upper bound $O(n2^n) < 2.01^n$ (for large $n$) follows immediately from the representation of the function in disjunctive normal form (see formula (2.1) on page 19).

To obtain a lower bound we compare the number of Boolean functions in $n$ variables (i.e., $2^{2^n}$) and the number of all circuits of a given size. Assume that the standard complete basis is used. For the $k$-th assignment of the circuit there are at most $O((n + k)^2)$ possibilities (two arguments can be chosen among $n$ input and $k - 1$ auxiliary variables). Therefore, the number $N_s$ of different circuits of size $s$ does not exceed

$$\bigl(O\bigl((n + s)^2\bigr)\bigr)^s = 2^{2s(\log(n+s) + O(1))}.$$

But the number of Boolean functions in $n$ variables equals $2^{2^n}$. If

$$(\mathrm{S}2.1)\qquad 2^n > 2s\bigl(\log(n + s) + O(1)\bigr),$$

there are more functions than circuits, so that $c_n > s$. If $s = 1.99^n$, then inequality (S2.1) is satisfied for sufficiently large $n$.

2.3. We recall the construction of the undecidable predicate belonging to P/poly (see Remark 2.1 on page 22).

For any function $\varphi : \mathbb{N} \to \{0, 1\}$ the predicate $f_\varphi(x) = \varphi(\mathrm{length}(x))$ belongs to P/poly. Now let $\varphi$ be a computable function that is difficult to compute: no TM can produce output $\varphi(n)$ in polynomial (in $n$) time. More precisely, we use a computable function $\varphi$ such that for any TM $M$ and any polynomial $p$ with integer coefficients there exists $n$ such that $M(1^n)$ does not produce $\varphi(n)$ after $p(n)$ steps.

It remains to construct such a function $\varphi$. This can be done by "diagonalization" (cf. Remark S1.1): we consider pairs $(M, p)$ one by one; for each pair we select some $n$ for which $\varphi(n)$ is not defined yet and define $\varphi(n)$ to be different from the result of $p(n)$ computation steps of $M$ on input $1^n$. (If the computation does not halt after $p(n)$ steps, the value $\varphi(n)$ can be arbitrary.)

2.4. Each output depends on $O(1)$ wires; each of them depends on $O(1)$ other wires, etc. Therefore, the total number of used wires and gates is $O(1)^{\mathrm{depth}} = 2^{O(\log(m+n))} = \mathrm{poly}(m + n)$.

2.5. A circuit consists of assignments of the form $y_j := f_j(u_1, \dots, u_r)$. We can perform these assignments symbolically, by substituting formulas for $u_1, \dots, u_r$. If these formulas have depth $\le h$, then $y_j$ becomes a formula of depth $\le h + 1$.

2.6. A formula can be represented by a tree (input variables are leaves, internal vertices correspond to subformulas, whereas the root is the formula itself).

It is easy to see that any formula $X$ of size $L$ has a subformula $Z$ of size between $M$ and $2M$ (inclusive), where $M = \lfloor L/3 \rfloor$. Indeed, we start looking for such a subformula at the root of the tree and each time choose the largest branch (of one or two). When the size of the subformula becomes $\le 2M$, we stop.

Replacing the subformula $Z$ by a new variable $z$, we obtain a formula $Y(z)$ (of size from $L - 2M$ to $L - M$) such that $X = Y(Z)$. Note that both $Z$ and $Y$ have size $\le \lceil 2L/3 \rceil$.

Suppose that $Z$ and $Y$ can be converted into equivalent formulas $Z'$ and $Y'$ of depth $\le h$. Then the following formula of depth $\le h + 3$ will compute the same function as $X$ does:

$$X' = (Y'(0) \wedge \neg Z') \vee (Y'(1) \wedge Z').$$

Thus, we have the recurrence relation

$$h(L) \le h\bigl(\lceil 2L/3 \rceil\bigr) + 3,$$

where $h(L)$ is the maximum (over all formulas $X$ of size $L$) of the minimal depth of a formula $X'$ that is equivalent to $X$. It follows that $h(L) = O(\log L)$.

2.7. Let us use the disjunctive normal form. If we construct a circuit by formula (2.1) using AND gates and OR gates with arbitrary fan-in (the number of inputs), only three layers are needed: one layer for negations, one for conjunctions, and one for the disjunction.

2.8. Let $L$ be the size of a circuit computing the function $f = \mathrm{PARITY}$. First, we convert the circuit into a formula of size $L' \le L^3$ (see Problem 2.5). Using De Morgan's identities

$$\neg \bigvee x_j = \bigwedge \neg x_j, \qquad \neg \bigwedge x_j = \bigvee \neg x_j,$$

we can ensure that negations are applied only to the input variables.

Without loss of generality, we may assume that the output is produced by an OR gate. (Otherwise, we apply De Morgan's identities again and obtain a circuit for the function $\neg\mathrm{PARITY} = \mathrm{PARITY} \oplus 1$; the following arguments work for this function as well.) We may also assume that the inputs to the final OR gate are produced by AND gates. (If some input is actually produced by an OR gate, this gate can be merged with the final one. If it is produced by a NOT gate, we can insert a dummy AND gate and still have depth 3.)

Now we have

$$f(x_1, \dots, x_n) = t_1 \vee \dots \vee t_m,$$

where each $t_i$ has the form $t_i = u_1 \wedge \dots \wedge u_k$, and each $u_l$ is either a disjunction (a single variable is a special case of that) or the negation of a variable. Note that a variable cannot appear in a negation and a disjunction at the same time, or else the formula can be simplified. For example, if

$$t_i = (x_1 \vee \dots \vee x_k) \wedge \dots \wedge \neg x_1 \wedge \dots,$$

then $x_1$ can be deleted from the disjunction $(x_1 \vee \dots \vee x_k)$. Therefore, each $t_i$ has the form

$$t_i = \neg x_{j_1} \wedge \dots \wedge \neg x_{j_p} \wedge (\text{monotone function in the other variables}).$$

Now we use a special property of the function $f = \mathrm{PARITY}$: if $f(x) = 1$, and $x'$ differs from $x$ in just one bit (say, $x_j$), then $f(x') = 0$. This condition should be true for all subformulas $t_i$. It follows that each $t_i$ is the conjunction of $n$ literals (i.e., input variables or their negations). Therefore, $t_i(x) = 1$ for exactly one value of $x$. Hence the number of the subformulas $t_i$ is not less than the number of points $x$ where $f(x) = 1$, i.e., $2^{n-1}$.

Remark. It can be shown that circuits of fixed depth computing the function PARITY and made of gates NOT, OR and AND with arbitrary fan-in, always have exponential size. The proof (quite nontrivial!) is by induction, starting with circuits of depth 2 and 3 (the case discussed above).

The proof of this assertion can be found in [14]. We give a short exposition of the main idea. Let us note that OR-gates and AND-gates in a circuit of minimal size must alternate. One can try to switch two layers by replacing each disjunction of conjunctions by a conjunction of disjunctions; this will reduce the circuit depth by 2. However, this transformation increases the size of the circuit, and we do not get any satisfactory bound for the new size. Still, a reasonable bound can be obtained if we allow further transformation of the resulting circuit. We first assign random values to some of the input variables. Thus we obtain a function of a smaller number of variables, which is either PARITY or its negation. As far as the circuit is concerned, some of its auxiliary variables can be evaluated using only the values of the input variables we have just fixed. And with nonnegligible probability our circuit becomes simpler, so that the transposition of conjunctions and disjunctions does not lead to a large increase in the circuit size.


2.9. There are three possible results of the comparison of two numbers $x$ and $y$: $x > y$, $x = y$, or $x < y$. We construct a circuit which yields a two-bit answer encoding these three possibilities.

We may assume without loss of generality that $n$ is a power of two. (It is possible to add several zeros on the left so that the total number of bits becomes a power of two. This at most doubles the number of input bits.)

[Figure S2.1: one copy of the circuit $\mathrm{Cmp}_{n/2}$ receives the high bits $x_{n-1} \dots x_{n/2}$ and $y_{n-1} \dots y_{n/2}$ and outputs $e'$ ($e' = 1$ if the numbers are equal) and $g'$ ($g' = 1$ if $x > y$); a second copy receives the low bits $x_{n/2-1} \dots x_0$ and $y_{n/2-1} \dots y_0$ and outputs $e''$ and $g''$ with the same meaning. The results are combined as $e := e' \wedge e''$, $g := g' \vee (e' \wedge g'')$, so that $e = 1$ if the numbers are equal and $g = 1$ if $x > y$.]

Fig. S2.1. Circuit $\mathrm{Cmp}_n$ for comparison of $n$-bit numbers (size of circuit $= O(n)$, depth $= O(\log n)$).

A circuit for the comparison of $n$-bit numbers is constructed recursively. We compare the first $n/2$ (high) bits and the last $n/2$ (low) bits separately, and then combine the results, see Figure S2.1. (This "divide and conquer" method will be used for the solution of many other problems.)

We estimate the size $L_n$ and depth $d_n$ of this circuit. It is easy to see that

$$L_n = 2L_{n/2} + 3, \qquad d_n = d_{n/2} + 2.$$

Therefore, $L_n = O(n)$ and $d_n = O(\log n)$.
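The recursive comparison and its size and depth recurrences can be mirrored in a short program. This is a sketch under our own conventions: bits are listed most significant first, the one-bit base case is evaluated directly, and its gates are not counted (so the reported size and depth cover only the combining gates $e := e' \wedge e''$, $g := g' \vee (e' \wedge g'')$).

```python
def compare(x, y):
    """Compare two equal-length bit lists (most significant bit first, length a power of two).

    Returns (e, g, size, depth): e = 1 iff x == y, g = 1 iff x > y,
    plus the number and depth of combining gates used (base case not counted).
    """
    n = len(x)
    if n == 1:
        return int(x[0] == y[0]), int(x[0] > y[0]), 0, 0
    half = n // 2
    e1, g1, s1, d1 = compare(x[:half], y[:half])   # high bits: e', g'
    e2, g2, s2, d2 = compare(x[half:], y[half:])   # low bits: e'', g''
    e = e1 & e2                                    # equal iff both halves are equal
    g = g1 | (e1 & g2)                             # greater in high half, or tie there and greater in low half
    return e, g, s1 + s2 + 3, max(d1, d2) + 2      # three new gates, two new layers

def bits(value, n):
    """n-bit binary representation of value, most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(n))]
```

With this counting, 8-bit inputs use $3 \cdot (8 - 1) = 21$ combining gates in $2 \log_2 8 = 6$ layers, in line with $L_n = O(n)$ and $d_n = O(\log n)$.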

Remark. The inequality $x > y$ holds if and only if the number $x + (2^n - 1 - y)$ is greater than $2^n - 1$, which can be checked by looking at the $n$-th bit of this number. Note that $2^n - 1 - y$ is computed very easily (by negating all bits of $y$), so the comparison can be reduced to addition (see Problem 2.12). However, the solution we gave is simpler.

2.10. a) Let $j = j_{l-1} \cdots j_0$. We will gradually narrow the table by taking into account the values of $j_{l-1}$, $j_{l-2}$, and so on. For example, if $j_{l-1} = 0$, we select the first half of the table; otherwise we select the second half. It is clear that the choice is made between $x_{0 j_{l-2} \cdots j_0}$ and $x_{1 j_{l-2} \cdots j_0}$ for each combination of $j_{l-2}, \dots, j_0$. Such choices are described by the function

$$f(a, b, c) = \begin{cases} b & \text{if } a = 0,\\ c & \text{if } a = 1, \end{cases}$$

which is applied simultaneously to all pairs of table entries. The operation is then repeated with the resulting table, so $f$ is used $2^{l-1} + 2^{l-2} + \dots + 1 = O(2^l)$ times in $l$ parallel steps. Note that the function $f$ can be computed by a circuit of size $O(1)$.

However, before we can actually apply $f$ multiple times, we need to prepare $2^p$ copies of $j_p$ for each $p$ (recall the bounded fan-out condition). This requires $O(2^l)$ trivial gates arranged in $O(l)$ layers.
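The table-narrowing procedure of part a) can be sketched as follows. The function names `f` and `select` and the counting of applications of $f$ are ours; the sketch ignores the fan-out bookkeeping (copying the bits $j_p$) and only illustrates the halving of the table.

```python
def f(a, b, c):
    """The selection function: b if a == 0, c if a == 1."""
    return c if a else b

def select(table, j_bits):
    """Pick table[j], where j_bits = [j_{l-1}, ..., j_0] and len(table) == 2**l.

    Returns (x_j, number of applications of f).
    """
    uses = 0
    for a in j_bits:                      # most significant index bit first
        half = len(table) // 2
        # One parallel layer: entry i of the first half is paired with entry i of the second half.
        table = [f(a, table[i], table[half + i]) for i in range(half)]
        uses += half
    return table[0], uses
```

For a table of size $2^3$ this uses $4 + 2 + 1 = 7$ applications of `f` in 3 layers.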

b) The solution is very similar to that of Problem 2.9. Let us construct a circuit $\mathrm{Search}_n$ that outputs $l = \log_2 n$ copies of $y = x_0 \vee \dots \vee x_{n-1}$, as well as the smallest $j$ such that $x_j = 1$ (if such $j$ exists). The circuit $\mathrm{Search}_n$ can be obtained from two copies of $\mathrm{Search}_{n/2}$ applied to the first and the second half of the string $x$, respectively. Let the results of these applications be $y', \dots, y', j'$ and $y'', \dots, y'', j''$. Then we make one additional copy of $y'$ and $y''$ and compute

$$y = y' \vee y'', \qquad j = \begin{cases} j' & \text{if } y' = 1,\\ n/2 + j'' & \text{if } y' = 0 \end{cases}$$

by a circuit of size $O(l)$ and depth $O(1)$ (each copy of $y'$ controls a single bit of $j$).

2.11. Let $\bigl(f_j(q), g_j(q)\bigr) \stackrel{\mathrm{def}}{=} D(q, x_j)$. Then the intermediate states of the automaton and its output symbols are

$$q_{j+1} = f_j f_{j-1} \cdots f_0 (q_0) \ \text{(composition of functions)}, \qquad y_j = g_j(q_j).$$

The solution of the problem is divided into 4 stages.

1. We tabulate the functions $f_j$ and $g_j$ (by making $m$ copies of the table of $D$ and narrowing the $j$-th copy according to $x_j$; see the solution to Problem 2.10a). This is done by a circuit of size $\exp(O(k))\,m$ and depth $O(k + \log m)$.
2. We compute a certain composition of the functions $f_0, \dots, f_{m-1}$ (see diagram and text below).
3. We compute $q_j$ in a parallel fashion (see below).
4. We apply the functions $g_j$; this is done by a circuit of size $\exp(O(k))\,m$ and depth $O(k)$.


[Diagram for $l = 3$: the states $q_8, q_7, \dots, q_1, q_0$ are linked by the functions $f_7, \dots, f_0$ (so that $q_{j+1} = f_j(q_j)$). Braces group the functions into a binary tree: $F_{0,p} = f_p$ for $p = 0, \dots, 7$; $F_{1,0}, \dots, F_{1,3}$ cover consecutive pairs; $F_{2,0}$ and $F_{2,1}$ cover the two halves; $F_{3,0}$ covers all eight functions.]

At stages 2 and 3 we assume that $m = 2^l$ (since we can always augment the sequence $f_0, \dots, f_{m-1}$ by identity functions). We organize the functions $f_p$ into a binary tree and compute their compositions $F_{r,p}$ (the case $l = 3$ is shown in the diagram). First we define $F_{0,p} = f_p$. At step 1 we compute the compositions $F_{1,0} = f_1 f_0$ through $F_{1,m/2-1} = f_{m-1} f_{m-2}$; we continue this process for $l$ steps until we get the function $F_{l,0} = f_{m-1} \cdots f_0$. The general formula for step $r$ is as follows:

$$F_{r,p} = F_{r-1,\,2p+1}\, F_{r-1,\,2p}, \qquad r = 1, \dots, l, \quad 0 \le p < 2^{l-r}.$$

In this computation functions are represented by value tables. Composing two functions, say $u$ and $v$, amounts to computing $u(v(q))$ for all values of $q$; this is done by a circuit of size $\exp(O(k))$ and depth $O(k)$.

With the functions $F_{r,p}$, the transition from $q_0$ to $q_j$ becomes much quicker. For example, if $j$ lies between $2^{l-1}$ and $2^l$, we can get to $q_{2^{l-1}}$ in one leap; then we make smaller jumps until we stop at the right place. Doing the same thing for all $j$, we compute $q_j$ in the following order (for $l = 3$), starting from the given $q_0$:

step 0: $q_8$
step 1: $q_4$
step 2: $q_6$, $q_2$
step 3: $q_7$, $q_5$, $q_3$, $q_1$

In general, the computation consists of $l + 1$ steps. At step $s$ we obtain the values of $q_j$ for every $j$ of the form $j = 2^{l-s}(2p+1)$, using the recurrence relation

$$q_{2^{l-s}(2p+1)} = F_{l-s,\,2p}\bigl(q_{2^{l-s+1}p}\bigr), \qquad s = 0, \dots, l, \quad 0 \le 2p+1 \le 2^s.$$

The computation of $F_{r,p}$ and $q_j$ is performed by a circuit of size $2^l \exp(O(k))$ and depth $O(lk)$. (Note that each $q_j$ is used only once at each of the following steps, $s+1, \dots, l$, so that we can make copies as we need them, while keeping the fan-out bounded. Therefore, the bounded fan-out condition can be satisfied without increase in depth.)
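The data flow of stages 2 and 3 can be illustrated by a short Python sketch. It is a sequential program, not a circuit, and it only mirrors the order in which the tables $F_{r,p}$ and the states $q_j$ are obtained. The names `compose` and `all_states` are hypothetical, and each function is assumed to be given as a value table (a tuple indexed by the state).

```python
def compose(u, v):
    """Value table of the composition q -> u(v(q))."""
    return tuple(u[vq] for vq in v)


def all_states(f_tables, q0):
    """Return [q_0, ..., q_m], where q_{j+1} = f_j(q_j).

    f_tables is a non-empty list of value tables over states 0..K-1.
    """
    n_states = len(f_tables[0])
    identity = tuple(range(n_states))

    # Pad the sequence with identity functions up to m = 2^l.
    m, l = 1, 0
    while m < len(f_tables):
        m, l = 2 * m, l + 1
    fs = list(f_tables) + [identity] * (m - len(f_tables))

    # Stage 2: F[r][p] = F[r-1][2p+1] composed with F[r-1][2p].
    F = [fs]
    for r in range(1, l + 1):
        prev = F[r - 1]
        F.append([compose(prev[2 * p + 1], prev[2 * p]) for p in range(m >> r)])

    # Stage 3: at step s compute q_j for j = 2^(l-s) * (2p+1).
    q = {0: q0}
    for s in range(l + 1):
        step = 1 << (l - s)
        for p in range(1 << max(s - 1, 0)):  # all p with 2p+1 <= 2^s
            j = step * (2 * p + 1)
            q[j] = F[l - s][2 * p][q[2 * step * p]]
    return [q[j] for j in range(len(f_tables) + 1)]
```

In a circuit all tables of one level, and all states of one step, would be computed in parallel; the loops above merely list them.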

190 3. Solutions

2.12. Suppose we want to compute $z = x + y$. Let $x = x_{n-1} \cdots x_0$, $y = y_{n-1} \cdots y_0$, $z = z_n \cdots z_0$. The standard addition algorithm is described by the formulas

$$q_0 := 0; \qquad q_{j+1} := \begin{cases} 0 & \text{if } x_j + y_j + q_j < 2,\\ 1 & \text{if } x_j + y_j + q_j \ge 2 \end{cases} \quad (j = 0, \dots, n-1);$$

$$z_j := x_j \oplus y_j \oplus q_j; \qquad z_n := q_n,$$

where $q_0, \dots, q_n$ are the carry bits. This sequence of assignments is a circuit of size and depth $O(n)$. Note that it also corresponds to a finite-state automaton with the input alphabet $A' = \mathbb{B}^2$ (pairs of input bits), the output alphabet $A'' = \mathbb{B}$ (bits of the result) and the state set $\mathbb{B}$ (the value of the carry bit). Hence the result of Problem 2.11 applies.
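As an illustration, the assignments above translate directly into the following Python sketch. The function name `add_bits` and the convention that bit lists are given least significant bit first are assumptions made for this example only.

```python
def add_bits(x_bits, y_bits):
    """Add two n-bit numbers given as lists of bits, least significant first.

    Returns the n+1 bits z_0, ..., z_n of the sum.
    """
    q = 0  # carry bit q_0
    z = []
    for xj, yj in zip(x_bits, y_bits):
        z.append(xj ^ yj ^ q)               # z_j = x_j XOR y_j XOR q_j
        q = 1 if xj + yj + q >= 2 else 0    # carry q_{j+1}
    z.append(q)                             # z_n = q_n
    return z
```

The loop body is exactly the transition of the carry automaton: the state is `q`, the input symbol is the pair `(xj, yj)`, and the output symbol is the appended bit.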

2.13. Part b) follows from Part a), so it is enough to solve a). The first idea is obvious: we need to organize the input numbers into a binary tree of depth $\lceil \log_2 m \rceil$. Addition of two numbers can be done with depth $O(\log n)$, hence we get a circuit of depth $O(\log m \log n)$. However, this is not exactly what we want.

Fortunately, addition of two numbers can be done with depth $O(1)$ if we use a different encoding for the numbers. Specifically, let us represent a number by the sum of two numbers (obviously, such a representation is not unique).

Lemma. There exists a circuit of size $O(n)$ and depth $O(1)$ that converts the sum of three $n$-digit numbers into the sum of two numbers, i.e., computes a function

$$F : (x, y, z) \mapsto (u, v) \ \text{ such that } \ u + v = x + y + z \ \text{ for any } x, y, z.$$

(We will find a particular $F$ such that $u$ has $n+1$ digits, whereas $v$ has $n$ digits. If $x$ has $n+1$ digits instead of $n$, then both $u$ and $v$ will be $(n+1)$-digit numbers.)

Proof. Let $x = x_n x_{n-1} \cdots x_0$, $y = y_{n-1} \cdots y_0$, $z = z_{n-1} \cdots z_0$. We can perform the addition bitwise, without carrying bits between binary places. Thus at each place $j$ we get the number $w_j = x_j + y_j + z_j \in \{0, 1, 2, 3\}$ (for $j = 0, \dots, n-1$). This number can be represented by two binary digits: $w_j = u_j v_j$. Then we put

$$u = u_{n-1} \cdots u_0 0, \qquad v = x_n v_{n-1} \cdots v_0. \qquad \square$$
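The construction in the proof can be checked with a small Python sketch. The function name `three_to_two` is hypothetical; the numbers are ordinary Python integers, $y$ and $z$ are assumed to have at most $n$ binary digits, and $x$ may have an extra top digit $x_n$.

```python
def three_to_two(x, y, z, n):
    """Return (u, v) with u + v == x + y + z, using no carry propagation.

    Assumes 0 <= y, z < 2**n; x may have one additional digit x_n.
    """
    u = v = 0
    for j in range(n):
        w = ((x >> j) & 1) + ((y >> j) & 1) + ((z >> j) & 1)  # w_j in {0,1,2,3}
        u |= (w >> 1) << (j + 1)   # high digit u_j, shifted one place left
        v |= (w & 1) << j          # low digit v_j stays in place j
    v |= (x >> n) << n             # copy the top digit(s) of x into v
    return u, v
```

Each output bit depends on three input bits only, which is why the corresponding circuit has depth $O(1)$.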

Now let $(x_1, x_2)$ and $(y_1, y_2)$ be pairs of $n$-digit numbers that represent $x = x_1 + x_2$ and $y = y_1 + y_2$. Applying the lemma twice, we get a pair of $(n+1)$-digit numbers, $(z_1, z_2)$, such that $z_1 + z_2 = x + y$. Therefore we can build a circuit of size $O(nm)$ and depth $O(\log m)$ that adds $m$ $n$-digit numbers represented by pairs. This way, we obtain two $n'$-digit numbers, where $n' = n + \lceil \log_2 m \rceil$. At the end, we need to actually add these numbers so that the result appears in the usual form. This can be done by a circuit of size $O(n')$ and depth $O(\log n')$ (see Problem 2.12).

2.14. The standard division algorithm can be described as a sequence of $n$ subtractions, each of which is skipped under certain conditions. This corresponds to a circuit of size $O(n^2)$ and depth $O(n \log n)$ (assuming that each subtraction is performed in a parallel fashion, as in Problem 2.12). Unfortunately, this algorithm cannot be parallelized further, not even with tricks like in the previous problem. So, we will use a completely different method which allows parallelization at the cost of some increase in circuit size (by a factor of order $O(\log n)$).

a) Let $y = 1 - x/2$ (it can be represented as $0.0y_1y_2\cdots$, where $y_j = 1 - x_j$). Note that $0 \le y \le 1/2$, so we can express $x^{-1}$ by a rapidly convergent series:

$$x^{-1} = \frac12 (1-y)^{-1} = \frac12 \Bigl( \sum_{k=0}^{m-1} y^k + r_m \Bigr), \quad \text{where } 0 \le r_m = \frac{y^m}{1-y} \le 2^{-(m-1)}.$$

We may set $m = 2^l$, $l = \lceil \log_2(n+2) \rceil$, so that $r_m \le 2^{-(n+1)}$ and

$$\sum_{k=0}^{m-1} y^k = (1+y)(1+y^2)(1+y^4) \cdots (1+y^{2^{l-1}});$$

let us denote the last expression by $u_m$.

We need to compute $x^{-1} = \frac12 (u_m + r_m)$ with precision $2^{-n}$. By neglecting $r_m$, we introduce an error $\le 2^{-(n+2)}$ in the result. An additional error, bounded by $2^{-(n+1)}$, comes from the inaccurate knowledge of $x$. Therefore, it suffices to compute $u_m$ with precision $2^{-(n+1)}$. This calculation involves $O(l) = O(\log n)$ multiplications and additions. In doing it, we must take into account that round-off errors may accumulate; therefore we need to keep $n + \Theta(\log n)$ digits. Each multiplication or addition can be done by a circuit of size $O(n^2)$ and depth $O(\log n)$, hence the total size and depth are $O(n^2 \log n)$ and $O((\log n)^2)$, respectively.
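The product formula for $u_m$ can be tried out numerically. The sketch below uses exact rational arithmetic from Python's `fractions` module, so it illustrates only the truncation error $\frac12 r_m$ and deliberately ignores the round-off issue discussed above. The function name `approx_inverse` is hypothetical, and $x$ is assumed to satisfy $1 \le x \le 2$.

```python
from fractions import Fraction


def approx_inverse(x, n):
    """Approximate 1/x for 1 <= x <= 2 by u_m / 2, where m = 2^l >= n + 2."""
    y = 1 - Fraction(x) / 2
    l = (n + 1).bit_length()       # equals ceil(log2(n + 2)) for n >= 0
    u = Fraction(1)
    power = y                      # holds y^(2^i) at iteration i
    for _ in range(l):
        u *= 1 + power             # multiply by the factor (1 + y^(2^i))
        power *= power             # squaring: y^(2^i) -> y^(2^(i+1))
    return u / 2
```

Only $l$ squarings and $l$ multiplications are performed, in line with the $O(\log n)$ operation count stated in the text.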

b) First, we find an integer $s$ such that $2^s \le b < 2^{s+1}$ (see Problem 2.10b).

The next step is to find an approximate value of $\lfloor a/b \rfloor$, which is done by a circuit of size $O(k^2 \log k)$ and depth $O((\log k)^2)$. We set $x = 2^{-s} b$ and compute $x^{-1}$ with precision $2^{-k-3}$ by the circuit described above. (The computed value, as well as the exact value, does not exceed 1.) Similarly, we approximate the number $y = 2^{-s} a < 2^{k+1}$ with precision $2^{-2}$. Now we can calculate $a/b = y x^{-1}$ with the overall error $< 1/2$ and replace the result by the closest integer $q$.


Let $r = a - qb$. It is clear that either $q = \lfloor a/b \rfloor$ and $r = (a \bmod b)$, or $q = \lfloor a/b \rfloor + 1$ and $r = (a \bmod b) - b$. We can determine which case holds by checking whether $r$ is negative. If it is, we replace $q$ by $q - 1$, and $r$ by $r + b$.

2.15. To compute the function MAJ we count 1s among the inputs, i.e., compute the sum of the inputs. This can be done by the circuit from Problem 2.13a. Then we compare the result with $\lceil n/2 \rceil$ (see Problem 2.9).

2.16. We begin with some elementary graph theory. Graphs can be described by their adjacency matrices. Rows and columns of the adjacency matrix $A(G)$ of a graph $G$ are indexed by the vertices of the graph. If $(j, k)$ is an edge of $G$, then $a_{jk} = 1$; otherwise $a_{jk} = 0$. We regard the matrix elements as Boolean variables.

We can define the operations $\vee$ and $\wedge$ on Boolean matrices by analogy with the usual matrix addition and multiplication:

$$(P \vee Q)_{uv} = P_{uv} \vee Q_{uv}, \qquad (P \wedge Q)_{uv} = \bigvee_w (P_{uw} \wedge Q_{wv}).$$

Then $P^k$ is a short notation for $P \wedge \cdots \wedge P$ ($k$ times).

What is the meaning of the matrix $A^k$, where $A = A(G)$ is the adjacency matrix of a graph $G$? Each element of this matrix, $(A^k)_{uv}$, says whether there is a path of length $k$ (i.e., a chain of $k$ edges) between the vertices $u$ and $v$. Similarly, each element of the matrix $(A \vee I)^k$, where $I$ is the "identity matrix" ($I_{uv} = \delta_{uv}$), corresponds to the existence of a path of length at most $k$. Note that if there is a path between $u$ and $v$, then there is a path of length $\le n$, where $n$ is the number of vertices. Therefore, to solve the problem, it suffices to compute $B^k$, where $B = A \vee I$ and $k \ge n$.

Multiplication of $(n \times n)$-matrices (i.e., the operation $\wedge$) can be performed by a circuit of depth $O(\log n)$ and size $O(n^3)$. Let $l = \lceil \log_2 n \rceil$, $k = 2^l$. All we need is to compute $B^2, B^4, \dots, B^{2^l}$ by repeated squaring. Since there are $l = O(\log n)$ squarings, this can be done by a circuit of size $O(n^3 \log n)$ and depth $O((\log n)^2)$.
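The repeated squaring can be sketched in Python as follows. The names `bool_mat_mul` and `reachability` are hypothetical; matrices are lists of lists of Booleans, and the code is a sequential illustration rather than a circuit description.

```python
def bool_mat_mul(P, Q):
    """Boolean matrix product: (P AND Q)_{uv} = OR over w of (P_{uw} AND Q_{wv})."""
    n = len(P)
    return [[any(P[u][w] and Q[w][v] for w in range(n)) for v in range(n)]
            for u in range(n)]


def reachability(A):
    """Return B^k for B = A OR I and some k >= n, computed by repeated squaring."""
    n = len(A)
    B = [[bool(A[u][v]) or u == v for v in range(n)] for u in range(n)]
    k = 1
    while k < n:            # about log2(n) squarings
        B = bool_mat_mul(B, B)
        k *= 2
    return B
```

Entry `[u][v]` of the result is true exactly when there is a path (possibly of length 0) from `u` to `v`; the code works for directed graphs as well.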

2.17. As with many problems, a detailed solution of this one is tedious (we would have to choose a particular encoding for circuits in the first place), but the idea is rather simple. To begin with, we describe an algorithm for evaluating a formula of depth $d$. Then we will extend that algorithm to circuits of depth $d$ with one output variable (the size of such a circuit is $\exp(O(d))$). The generalization to several output variables is trivial.

The algorithm is simply an implementation of recursion on a Turing machine. Suppose we need to compute a subformula $A = f(A_0, A_1)$, where $f$ denotes an arbitrary gate from the basis. (The specific choice of $f$ is determined by an oracle query.) Let us assume that the computation of $A_0$ and $A_1$ can be done with space $s$. We compute $A_0$ first and keep the result (which is a single bit), while freeing the rest of the space used in the computation. Then we compute $A_1$. Finally, we find $A$ and free the space occupied by the value of $A_0$. Thus the computation of $A$ requires only $s + O(1)$ bits. Likewise, each level of recursion requires a constant number of bits, hence $O(d)$ bits will suffice for the $d$ levels.

Now let $C$ be a circuit of depth $d$ with one output variable. According to Problem 2.5, such a circuit can be "expanded" into a formula $F$ of the same depth. Specifically, subformulas of $F$ are in one-to-one correspondence with paths from the output variable to nodes (i.e., input and auxiliary variables) of the circuit $C$. Note that we do not have enough space to hold the expansion. Instead, we need to look up a path in the circuit $C$ each time we would access a subformula in $F$ according to the original algorithm. Thus the algorithm for circuits involves traversing all paths. Note that space $O(d)$ is sufficient to hold a description of a path. We also need to allocate additional $O(d)$ bits to build an oracle query on this description.

2.18. The idea is to consider the machine $M$ running in space $s$ as an automaton with $\exp(O(s))$ states. This is not quite straightforward, because the machine actively reads bits from random places rather than just receiving a sequence of symbols. However, the difference is not so dramatic. We can arrange that the automaton repeatedly receives the same input string, $xxx\cdots$, each time waiting for a needed bit to come and skipping the others.

Let $V$ be the set of configurations of the machine $M$ with space $s$. Denote by $v_k$ the initial configuration when the machine is asked to compute the $k$-th bit of $f_{n,m,s}(x)$. (Note that $v_k$ does not depend on $x$.) We will construct an automaton with the input and output alphabet $A' = A'' = \mathbb{B}$ and the state set

$$Q = \{0, \dots, m-1\} \times V \times \{0, \dots, |V|-1\} \times \{0, \dots, n-1\},$$

elements of which are denoted by $(k, v, t, j)$. The initial state is $q_0 = (0, v_0, 0, 0)$. What follows is a description of the automaton transition function.

The variable $j$ counts bits of the input string; each time it is incremented by one. Whenever $j$ matches the contents of the machine's supplementary tape (which is a part of $v$), the current input bit is accepted as the answer from the oracle; otherwise it is ignored. When $j$ "turns over", i.e., changes from $n-1$ to 0, the Turing machine clock ticks, meaning that $t$ gets incremented and the machine configuration $v$ changes according to its transition function. Finally, whenever $t$ turns over, the output bit is set to the computation result (contained in $v$), $k$ gets incremented, and the machine configuration is set to $v_{k+1}$.

Let $x$ be a binary string of length $n$. If we feed the string $x^{m|V|} = xx \cdots x$ ($m|V|$ times) to the automaton, and select every $l$-th bit of the output ($l = |V|n$), then we obtain the value of the desired function, $y = f_{n,m,s}(x)$. Therefore we can apply the result of Problem 2.11. Our automaton has $\exp(O(s))$ states and receives $m|V|n = \exp(O(s))$ symbols, hence it can be simulated by a circuit of size $\exp(O(s))$ and depth $O(s^2)$.

2.19. The solution consists of the following steps.

1. We transform $C$ into an equivalent formula $\Gamma$ which operates with elements of a fixed finite group $G$ rather than 0's and 1's. The basis of $\Gamma$ consists of the group operations, i.e., MULTIPLICATION and INVERSION. The Boolean value 0 is represented by the unit element $e$ of the group, whereas 1 is represented by an element $u \ne e$. More formally, let $\varphi : \mathbb{B} \to G$ be the function defined by $\varphi(0) = e$, $\varphi(1) = u$. Then $\Gamma$ computes a function $F(g_1, \dots, g_N)$ ($N = \exp(O(d))$) such that

$$\varphi\bigl(f(x_1, \dots, x_n)\bigr) = F(g_1, \dots, g_N), \quad \text{where } g_j = \varphi(x_{p_j}) \text{ or } g_j = \text{const}.$$

Each variable $g_j$ is used in the formula $\Gamma$ only once.

2. Using the identity $(ab)^{-1} = b^{-1}a^{-1}$, we transform $\Gamma$ to a form in which the inversion is applied only to input variables. Due to the associativity of multiplication, the tree structure of the formula does not matter. Thus we arrive at the equation

$$\varphi\bigl(f(x_1, \dots, x_n)\bigr) = h_1 \cdots h_N, \quad \text{where } h_j = \varphi(x_{t_j}) \text{ or } h_j = \varphi(x_{t_j})^{-1} \text{ or } h_j = \text{const}.$$

3. The product of group elements $h_1 \cdots h_N$ is computed by a finite-state automaton with $|G| = O(1)$ states.

4. The work of this automaton is simulated by a circuit of width $O(1)$ and size $O(N) = \exp(O(d))$.

Steps 2, 3 and 4 are rather straightforward, so we only explain the first step.

Let $G = A_5$ be the group of even permutations on 5 elements; it consists of $5!/2 = 60$ elements. (A smaller group will not suffice: our construction works only for unsolvable groups.) We will use the standard cycle notation for permutations, e.g., $(245) : 2 \mapsto 4,\ 4 \mapsto 5,\ 5 \mapsto 2,\ 1 \mapsto 1,\ 3 \mapsto 3$.

Lemma. There are elements $u, v, w \in A_5$ that are conjugate to each other (i.e., $v = aua^{-1}$, $w = bub^{-1}$) and $w = uvu^{-1}v^{-1}$.

Proof. The conditions of the lemma are satisfied by

$$u = (12345), \quad v = (13542), \quad w = (14352), \quad a = (235), \quad b = (245). \qquad \square$$

We will assume that the original formula $C$ is written in the basis $\{\neg, \wedge\}$. Let us set $\varphi(0) = e$, $\varphi(1) = u$ and use the following relations:

$$\varphi(\neg x) = u\,\varphi(x)^{-1}, \qquad \varphi(x \wedge y) = b^{-1} \varphi(x)\, a\, \varphi(y)\, a^{-1} \varphi(x)^{-1} a\, \varphi(y)^{-1} a^{-1} b.$$

Thus any Boolean formula of depth $d$ is transformed into a formula over $A_5$ with $N \le N(d)$ once-only variables, where $N(d)$ satisfies the following recurrence relation:

$$N(d+1) = \max\{ N(d) + 1,\ 4N(d) + 6 \}.$$

From this we get $N(d) = \exp(O(d))$.
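The lemma and the two relations can be verified mechanically. The following Python sketch is an illustration only; it assumes that a product of permutations acts as a composition of functions, with the rightmost factor applied first. Permutations are stored as dictionaries on $\{1, \dots, 5\}$, and all helper names (`cycle`, `mul`, `inv`, `phi`, `phi_not`, `phi_and`) are hypothetical.

```python
def cycle(*elems):
    """Permutation of {1,...,5} given by one cycle, e.g. cycle(2, 4, 5)."""
    perm = {i: i for i in range(1, 6)}
    for s, t in zip(elems, elems[1:] + elems[:1]):
        perm[s] = t
    return perm


def mul(*perms):
    """Product p1 p2 ... pk acting as x -> p1(p2(...pk(x)...))."""
    result = {i: i for i in range(1, 6)}
    for p in reversed(perms):
        result = {i: p[result[i]] for i in result}
    return result


def inv(p):
    return {image: i for i, image in p.items()}


E = cycle()                      # unit element e
U = cycle(1, 2, 3, 4, 5)
V = cycle(1, 3, 5, 4, 2)
W = cycle(1, 4, 3, 5, 2)
A = cycle(2, 3, 5)
B = cycle(2, 4, 5)


def phi(bit):
    """Encoding of a Boolean value: 0 -> e, 1 -> u."""
    return U if bit else E


def phi_not(g):
    return mul(U, inv(g))


def phi_and(gx, gy):
    return mul(inv(B), gx, A, gy, inv(A), inv(gx), A, inv(gy), inv(A), B)
```

With these definitions, `phi_and(phi(x), phi(y))` equals `phi(x and y)` for all four combinations of bits, because the commutator $uvu^{-1}v^{-1}$ collapses to $e$ as soon as one of the arguments is $e$.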


3.1. The key to the solution is the following property of Boolean functions: $A \vee \neg B$ and $B \vee C$ imply $A \vee C$ (i.e., if the first two expressions are true for given values of $A$, $B$, $C$, then the third one is also true). Applying this property, called the resolution rule, we can derive new disjunctions from old ones. This process terminates at some set $\overline{F}$ of disjunctions that is not extendible any more; we call it the closure of the original CNF $F$. By construction, $F$ and $\overline{F}$ represent the same Boolean function. The presence of the empty disjunction in $\overline{F}$ indicates that this function is identical to 0, i.e., $F$ is not satisfiable. This way of checking the satisfiability is called the resolution method. It has polynomial running time for 2-CNFs and exponential running time in general.

Let us describe the resolution method in more detail. Recall that a CNF is a formula of the form $F = D_1 \wedge \cdots \wedge D_m$, where $D_1, \dots, D_m$ are clauses, i.e., disjunctions of literals. A literal is either a variable or its negation. To be precise, each clause is represented in the standard form, which is a set of literals not containing $x_j$ and $\neg x_j$ at the same time. The idea is that a disjunction like $x_1 \vee x_1$ is reduced to $x_1$, whereas $x_1 \vee \neg x_1 \vee x_2$ (which is equal to 1) is removed from the list of clauses. The empty set is identified with the logical constant 0. We will regard $F$ as a set of clauses (i.e., each clause enters $F$ at most once, and the order does not matter).

Suppose $F$ contains clauses $A \vee \neg x_j$ and $C \vee x_j$ for some variable $x_j$. Let $A \vee C \ne 1$. Then we can reduce $D = A \vee C$ to the standard form described above. The resolution rule takes $F$ to $F' = F \cup \{D\}$. The repeated application of this rule takes $F$ to its closure $\overline{F}$. Note that applying the resolution rule to 2-clauses and 1-clauses, one can only get 2-clauses, 1-clauses, or the empty clause. (But if some clauses contain 3 or more literals, the size can grow even further.)

Theorem. $F$ is not satisfiable if and only if $\overline{F}$ contains the empty clause.


Proof. We will prove a more general statement. Let $Y = l_1 \vee \cdots \vee l_k$ be an arbitrary clause. Then $F$ implies $Y$ if and only if $\overline{F}$ contains some clause $D$ that implies $Y$ (i.e., $D \subseteq Y$). The theorem corresponds to the $Y = 0$ case of this assertion.

The "if" part is obvious since $F$ implies every clause in $\overline{F}$.

To prove the "only if" part, we will use induction on $k$, going from $k = n$ (the total number of variables) down to 0. If $k = n$, then the function $Y(x)$ takes value 0 at exactly one point $x = x^*$. The condition "$F$ implies $Y$" means that $F(x^*) = 0$. Therefore $D(x^*) = 0$ for some $D \in F \subseteq \overline{F}$.

If $k < n$, let $x_j$ be a variable that does not enter $Y$. By the induction hypothesis, the condition "$F$ implies $Y$" means that there are some $D_1, D_2 \in \overline{F}$ such that $D_1$ implies $Y_1 = Y \vee x_j$, and $D_2$ implies $Y_2 = Y \vee \neg x_j$. If we regard clauses as sets of literals, this condition becomes $D_1 \subseteq Y_1$, $D_2 \subseteq Y_2$. If $x_j \notin D_1$ or $\neg x_j \notin D_2$, then $D_1 \subseteq Y$ or $D_2 \subseteq Y$, so we are done. Otherwise we apply the resolution rule to $D_1$ and $D_2$ to obtain a new clause $D \subseteq Y$. $\square$
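The resolution method can be sketched in a few lines of Python. The encoding is an assumption of this example: the literal $x_j$ is the integer `j`, its negation is `-j`, and a clause is a set of such integers. The names `closure` and `satisfiable` are hypothetical, and the code makes no attempt at efficiency (the closure may be exponentially large in general).

```python
def is_tautology(clause):
    """True if the clause contains some literal together with its negation."""
    return any(-lit in clause for lit in clause)


def closure(clauses):
    """Closure of a set of clauses under the resolution rule."""
    closed = {frozenset(c) for c in clauses if not is_tautology(c)}
    while True:
        new = set()
        for c1 in closed:
            for c2 in closed:
                for lit in c1:
                    if -lit in c2:
                        # Resolve c1 (contains lit) with c2 (contains -lit).
                        d = (c1 - {lit}) | (c2 - {-lit})
                        if not is_tautology(d) and d not in closed:
                            new.add(d)
        if not new:
            return closed
        closed |= new


def satisfiable(clauses):
    """By the theorem: satisfiable iff the closure lacks the empty clause."""
    return frozenset() not in closure(clauses)
```

Using sets for clauses gives the standard form for free: repeated literals collapse, and tautologies are filtered out explicitly.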

The resolution method can be optimized for 2-CNFs. To any 2-CNF $F$ we associate a directed graph $\Gamma(F)$. The vertices of $\Gamma(F)$ are literals. Each 2-clause $a \vee b$ is represented by two edges, $(\neg a, b)$ and $(\neg b, a)$. Each 1-clause $a$ is represented by the edge $(\neg a, a)$. Let $\widetilde{F}$ be $\overline{F}$ without the empty clause. It is easy to see that $\Gamma(\widetilde{F})$ consists of all pairs of vertices that are connected by paths in $\Gamma(F)$. According to the above theorem, $F$ is not satisfiable if and only if $\widetilde{F}$ contains clauses $x_j$ and $\neg x_j$ for some $j$. This is equivalent to the condition that $\Gamma(F)$ contains paths from $x_j$ to $\neg x_j$ and from $\neg x_j$ to $x_j$. Therefore, we can use the algorithm from Problem 2.16.
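The graph criterion can be sketched in Python. To keep the example self-contained, reachability is checked by a plain depth-first search instead of the matrix algorithm of Problem 2.16; this substitution, the integer encoding of literals (`j` for $x_j$, `-j` for $\neg x_j$) and the name `two_sat_satisfiable` are assumptions of the sketch.

```python
def two_sat_satisfiable(clauses):
    """Decide satisfiability of a 2-CNF given as clauses of 1 or 2 literals."""
    edges = {}
    variables = set()

    def add_edge(s, t):
        edges.setdefault(s, set()).add(t)

    for clause in clauses:
        lits = tuple(clause)
        a, b = (lits[0], lits[0]) if len(lits) == 1 else lits
        add_edge(-a, b)   # edge (not a, b)
        add_edge(-b, a)   # edge (not b, a); for a 1-clause this is (not a, a)
        variables.update(abs(l) for l in lits)

    def reachable(src, dst):
        seen, stack = {src}, [src]
        while stack:
            v = stack.pop()
            if v == dst:
                return True
            for w in edges.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    # Unsatisfiable iff some x_j and its negation reach each other.
    return not any(reachable(x, -x) and reachable(-x, x) for x in variables)
```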

3.2. A necessary condition for the existence of an Euler cycle is the connectivity of the graph: any two vertices are connected by a path. Another necessary condition: each vertex has even degree. (The degree of a vertex in a graph is the number of edges incident to that vertex.) Indeed, if an Euler cycle visits some vertex $k$ times, then the degree of this vertex is $2k$.

Together these two conditions are sufficient: if a graph is connected and all vertices have even degrees, it has an Euler cycle. To prove this, let us start at any vertex and extend the path by adding edges that have not been used before. Since all vertices have even degrees, this extension process terminates only when we come to the starting point (i.e., get a cycle). If not all edges are used, we repeat the process and find another cycle (note that the unused edges form a graph where each vertex has even degree), etc. After that our connected graph is covered by several edge-disjoint cycles, and these cycles can be combined into an Euler cycle.


Fig. S3.1. Extending a partial matching by an alternating path: a) old matching; b) alternating path; c) new matching. (The two vertex parts are labeled $U$ and $V$.)

It remains to note that it is easy to compute the degrees of all vertices in polynomial time. One can also find out in polynomial time whether the graph is connected or not.

3.3. Let $F(x_1, \dots, x_n)$ be a propositional formula. How do we find a satisfying assignment? First, we ask the oracle whether such an assignment exists. If the answer is "yes", we next learn whether there is a satisfying assignment with $x_1 = 0$. To this end, we submit the query $F' = F \wedge \neg x_1$. If the answer is "no" (but $F$ is satisfiable), then there is a satisfying assignment with $x_1 = 1$. Then we try to fix the value of $x_2$ and so forth.

Similar arguments can be used for the Hamiltonian cycle problem (and other NP-problems).
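The bit-by-bit search can be written out as a short Python sketch. Here the formula is assumed to be a CNF given as a list of clauses over integer literals (`j` for $x_j$, `-j` for $\neg x_j$), so that $F \wedge \neg x_i$ is obtained by appending the clause `[-i]`. The oracle is passed in as a function; `brute_force_oracle` is merely a stand-in used to make the example runnable, and both names are hypothetical.

```python
from itertools import product


def brute_force_oracle(clauses, n):
    """Stand-in SAT oracle: tries all 2^n assignments."""
    for bits in product((0, 1), repeat=n):
        if all(any(bits[abs(l) - 1] == (1 if l > 0 else 0) for l in c)
               for c in clauses):
            return True
    return False


def find_assignment(clauses, n, oracle):
    """Find a satisfying assignment using n + 1 oracle queries, or None."""
    if not oracle(clauses, n):
        return None
    current = [list(c) for c in clauses]
    assignment = {}
    for i in range(1, n + 1):
        if oracle(current + [[-i]], n):   # is there a solution with x_i = 0?
            current.append([-i])
            assignment[i] = 0
        else:                             # otherwise x_i = 1 must work
            current.append([i])
            assignment[i] = 1
    return assignment
```

The invariant is that `current` stays satisfiable after every step, so the values fixed so far can always be extended to a full satisfying assignment.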

3.4. We will describe a polynomial algorithm for a more general problem: given a bipartite graph $\Gamma$, find a maximal partial matching, where a partial matching is a set of graph edges that do not share any vertex.

(A bipartite graph is a graph whose vertex set is divided into two parts, $U$ and $V$, so that the edges connect only vertices from different parts. Thus the edge set is $E \subseteq U \times V$.)

We construct a maximal matching stepwise. At each step we have a set of edges $C \subseteq E$ that provides a one-to-one correspondence between $A \subseteq U$ and $B \subseteq V$. To extend $C$ by one edge, i.e., to find a matching $C'$ of size $|C| + 1$, we try to find a so-called alternating path (see Figure S3.1).

Definition. An alternating path is a sequence of consecutively connected vertices $x_0, \dots, x_l$ such that
(1) all vertices in the sequence are distinct;
(2) edges from $C$ and from $E \setminus C$ alternate;
(3) $x_0 \in U \setminus A$, $x_l \in V \setminus B$.
Thus $(x_0, x_1)$ and $(x_{l-1}, x_l)$ belong to $E \setminus C$, and the path length $l$ is odd.


Lemma. The matching $C$ can be extended if and only if an alternating path exists.

Proof. If an alternating path exists, we can extend $C$ by replacing the edges $(x_{2j-1}, x_{2j}) \in C$ with the edges $(x_{2j}, x_{2j+1}) \in E \setminus C$. There is one more edge of the second type than of the first type.

Conversely, suppose a larger matching $C'$ exists. Let us superimpose $C$ and $C'$ to form the symmetric difference $X = C \mathbin{\triangle} C' = (C \setminus C') \cup (C' \setminus C)$. Each vertex is incident to at most one edge from $C \setminus C'$ and at most one edge from $C' \setminus C$. Therefore the connected components of $X$ are paths and cycles in which edges from $C \setminus C'$ and $C' \setminus C$ alternate. Since $C'$ is larger than $C$, at least one of the components contains more edges from $C' \setminus C$ than from $C \setminus C'$. Such a component is an alternating path. $\square$

Therefore, if there is no alternating path, then $C$ is maximal, and the algorithm stops. It remains to explain how to find an alternating path when it exists.

Let us introduce direction on edges of the graph according to the following rule: the edges in $C$ go from $V$ to $U$, and all other edges go from $U$ to $V$ (as is shown in Figure S3.1b). Then the existence of an alternating path is equivalent to the existence of a pair of vertices $u \in U \setminus A$ and $v \in V \setminus B$ such that there is a directed path from $u$ to $v$. (Indeed, we may assume that the path from $u$ to $v$ is simple, i.e., it does not visit the same vertex twice, and therefore is an alternating path.) The existence of a directed path between two vertex subsets can be checked by a slightly modified algorithm of Problem 2.16 (now the graph is directed, but this does not matter). The algorithm can also be extended to find the path.

Thus we have proved that the perfect matching problem belongs to P. (The algorithm described above is not optimal.)
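For illustration, here is a compact Python sketch of matching by alternating paths. It searches for the paths by depth-first search, which is a common textbook variant and not the matrix-based search mentioned above; the function name `maximum_matching` and the adjacency-list input format are assumptions of the example.

```python
def maximum_matching(adj, n_right):
    """Maximum matching in a bipartite graph.

    adj[u] lists the vertices of V (numbered 0..n_right-1) adjacent to
    vertex u of U. Returns (size, match_right), where match_right[v] is
    the vertex of U matched to v, or None.
    """
    match_right = [None] * n_right

    def try_augment(u, visited):
        # Look for an alternating path that starts at the vertex u of U.
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # Either v is free, or its current partner can be re-matched.
            if match_right[v] is None or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    size = 0
    for u in range(len(adj)):
        if try_augment(u, set()):
            size += 1
    return size, match_right
```

Each successful call of `try_augment` flips the edges along one alternating path, so the matching grows by exactly one edge, as in the lemma.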

3.5. It is sufficient to solve (b); however, we provide a (somewhat simpler) solution for (a) first.

The Clique problem becomes an Independent set problem if we replace the graph by its complement. (Definition of the terms: For each graph $G$ the complementary graph $\overline{G}$ has the same vertices, whereas its edges are nonedges of $G$. An independent set is a set $I$ of vertices such that no two vertices in $I$ are joined by an edge.) Clearly, cliques in $G$ are independent sets in $\overline{G}$, and vice versa.

For any 3-CNF $F$ with $n$ variables and $m$ clauses we construct a graph $H$ such that $H$ has an independent set of cardinality $m$ if and only if $F$ is satisfiable. The vertices of $H$ are the occurrences of literals in $F$. (We assume that each clause is a disjunction of three literals; therefore $H$ has $3m$ vertices.) Literals in one clause are joined by edges. (Therefore an independent set of cardinality $m$ should include one literal from each clause.) Two literals from different clauses are joined by an edge if they are contradictory (i.e., the edge goes between $x_j$ and $\neg x_j$).

If an independent set $I$ of size $m$ exists, it contains one literal from each clause, and no two literals are contradictory. Therefore we can assign values to the variables so that all literals in $I$ are true. This will be a satisfying assignment for the 3-CNF. Conversely, if the 3-CNF has a satisfying assignment, we choose a true literal from each clause to be a member of $I$.

This construction, however, does not solve (b), because several independent sets may correspond to the same satisfying assignment.
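The graph $H$ of part (a) is easy to build explicitly. In the Python sketch below, literals are encoded as nonzero integers (`j` for $x_j$, `-j` for $\neg x_j$), a vertex is a pair (clause index, literal), and each clause is assumed to consist of three distinct literals. The function name `independent_set_graph` is hypothetical.

```python
from itertools import combinations


def independent_set_graph(clauses):
    """Build the graph H for a 3-CNF given as a list of 3-literal clauses.

    Returns (vertices, edges): vertices are pairs (clause index, literal),
    edges are unordered pairs stored as 2-tuples of vertices.
    """
    vertices = [(c, lit) for c, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for a, b in combinations(vertices, 2):
        same_clause = a[0] == b[0]
        contradictory = a[1] == -b[1]
        if same_clause or contradictory:
            edges.add((a, b))
    return vertices, edges
```

An independent set of size $m$ in this graph picks one literal per clause with no contradictions, which is exactly a choice of true literals under some satisfying assignment.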

Fig. S3.2. Gadgets for a) a variable; b) a clause.

(b) To construct a reduction that preserves cardinality we must be more cautious. Assume that we have a 3-CNF with $n$ variables and $m$ clauses. For each variable $x_i$ there are two vertices (called V-vertices) labeled $x_i$ and $\neg x_i$ (see Figure S3.2a); exactly one of them will be in the independent set $I$ of the required size. Informally, $x_i \in I$ [$\neg x_i \in I$] means that $x_i = 1$ [resp. $x_i = 0$] in the assignment.

In addition to these $2n$ vertices, we have $4m$ vertices called C-vertices ($m$ is the number of clauses). Specifically, for each clause we have a group of 4 vertices connected by edges (so that an independent set can include only one vertex of the four). Figure S3.2b shows the C-vertices (unlabeled vertices that form a small square in the middle) for a generic clause $l_1 \vee l_2 \vee l_3$. Here $l_1, l_2, l_3$ are literals, i.e., variables or their negations. (If $l_s$ is $\neg x_i$, then $\neg l_s$ denotes the vertex $x_i$.) The figure also shows edges between C-vertices and V-vertices (labeled by $l_1, l_2, \neg l_1, \neg l_2, \neg l_3$). Note that in the graph each V-vertex may be connected with several groups of C-vertices, as each variable may enter several clauses.

We will be looking for an independent set of size $n + m$. Such a set must include exactly one V-vertex (labeled $x_i$ or $\neg x_i$) for each variable $x_i$ and exactly one of the four C-vertices for each clause. Depending on whether $x_i$ or $\neg x_i$ is included, we set $x_i = 1$ or $x_i = 0$. The choice of C-vertices is determined uniquely by this assignment. Indeed, let us look at Figure S3.2b, not paying attention to the V-vertex $\neg l_3$. It is easy to check that for each pair of values of the literals $l_1$ and $l_2$ there is exactly one C-vertex in the picture that can be included in the independent set. For example, if $l_1$ is true and $l_2$ is false, then the vertices $l_1$ and $\neg l_2$ are in the independent set $I$; therefore only the rightmost C-vertex can be in $I$.


With the vertex $\neg l_3$ taken into account, the constructed set $I$ may turn out not to be independent. This happens when both $\neg l_3$ and the top C-vertex (the one between $l_1$ and $l_2$) belong to $I$. Then we have $l_1 = l_2 = l_3 = 0$. But this is exactly the case where the clause $l_1 \vee l_2 \vee l_3$ is false.

Therefore, an independent set of size $m + n$ in the graph exists if and only if there is a satisfying assignment for the given 3-CNF. The preceding argument shows that the correspondence between independent sets and satisfying assignments is one-to-one.

3.6. (a) See solution for (b) (though in fact (a) has a simpler solution that can be found, e.g., in [67]).

(b) Note that the number of 3-colorings is a multiple of 6 (we can permute colors).

Consider a 3-CNF $C$ in $n$ variables with $m$ clauses. We construct a graph $G$ that has $7m + 2n + 3$ vertices and admits 3-coloring if and only if $C$ is satisfiable. Moreover, the number of 3-colorings (up to permutations, i.e., divided by 6) equals the number of satisfying assignments.

Three vertices of $G$ (called 0, 1, 2 in the sequel) are connected to each other. In any 3-coloring they have different colors; we assume that they have colors 0, 1, 2 (see Figure S3.3a); this requirement reduces the number of colorings by a factor of 6.

Fig. S3.3. Gadgets for a) constants; b) variables; c) clauses.

For each of the $n$ variables $x_i$ we have two vertices, which are connected to each other and to the vertex 2 (see Figure S3.3b, where the vertex 2 is shown but the vertices 0 and 1 are not). These two vertices are labeled $x_i$ and $\neg x_i$. In a 3-coloring they will have either colors 1 and 0 ("$x_i$ is true") or 0 and 1 ("$x_i$ is false").

If no other vertices and edges are added, the graph has $2^n$ 3-colorings that correspond to all possible assignments. We now add some "gadgets" that correspond to clauses of $C$. For a given assignment each gadget either has no colorings (if the clause is false for that assignment) or has exactly one coloring.


Figure S3.3 shows such a gadget that corresponds to the clause $l_1 \vee l_2 \vee l_3$, where $l_1, l_2, l_3$ are literals (i.e., variables or their negations). One can check that the required property (no colorings or unique coloring, depending on $l_1 \vee l_2 \vee l_3$) is indeed true.

3.7. The tiling problem belongs to NP (if Merlin shows a correct tiling, then Arthur can check its correctness).

Let us show that 3-SAT can be reduced to tiling. For any 3-CNF $C$ we need to construct an instance of the tiling problem (i.e., select tile types and boundary conditions) that has a solution if and only if $C$ is satisfiable.

Each tile type will be usable in only one place of the square (this restriction can be enforced by appropriate labeling); for each position in the square we will have several types that are usable in this position. Each position can be regarded as a device that receives information from neighboring devices, processes it, and sends it further.

Tiles in the bottom row correspond to the variables of $C$. For each of them we have two possibilities that correspond to values 0 and 1; these values are propagated to the top unchanged. The other rows are used to compute the values of clauses, passing intermediate results from left to right. One additional column on the right side is needed to compute the value of $C$ (the result appears in the upper right corner). Finally, we add a condition asserting that $C = 1$.

Another possibility is to reduce an arbitrary NP-problem to tiling directly: the computation table of a TM is a two-dimensional table that satisfies some local rules that can be interpreted as tiling rules.

3.8. Merlin tells Arthur the prime factors of $x$ (each factor may repeat several times). Since the multiplication of integers can be performed in polynomial time, Arthur can check whether or not the factorization provided by Merlin is correct.

3.9. We prove that Primality $\in$ NP by showing how Merlin constructs a polynomial size "certificate" of primality that Arthur can verify in polynomial time.

Let $p > 2$ be a prime number. It is enough to convince Arthur that $(\mathbb{Z}/p\mathbb{Z})^*$ is a cyclic group of order $p - 1$ (see Appendix A, especially Theorem A.10). Merlin can show a generator $g$ of this group, and Arthur can verify whether $g^{p-1} \equiv 1 \pmod p$ (this requires $O(\log p)$ multiplications; see Section 4.2.2).

This is still not enough since the order of $g$ may be a nontrivial divisor of $p - 1$. If for some reason Arthur knows the factorization of $p - 1$ (which includes prime factors $q_1, q_2, \dots$), he can check whether $g^{(p-1)/q_j} \not\equiv 1 \pmod p$. But factoring is a difficult problem, so Arthur cannot compute the numbers $q_j$ himself.

However, Merlin may ommuni ate the fa torization to Arthur. The

only problem is that Merlin has to onvin e Arthur that the fa tors are

indeed prime. Therefore Merlin should re ursively provide erti� ates of

primality for all fa tors.

Thus the complete certificate of primality is a tree. Each node of this tree is meant to certify that some number q is prime (the root corresponding to q = p). All leaves are labeled by q = 2. Each of the remaining nodes is labeled by a prime number q > 2 and also carries a generator h of the group $(\mathbb{Z}/q\mathbb{Z})^*$; the children of this node are labeled by prime factors of q − 1.

Let us estimate the total size of the certificate. The tree we have just described has at most $n = \lceil \log_2 p \rceil$ leaves (since the product of all factors of q − 1 is less than q). The total number of nodes is at most twice the number of leaves (this is true for any tree). Each node carries a pair of n-bit numbers (q and h). Therefore the total number of bits in the certificate is $O(n^2)$.

Now we estimate the certificate verification complexity. For each of O(n) nodes Arthur checks whether $h^{q-1} \equiv 1 \pmod q$. This requires O(log q) = O(n) multiplications, which is done by a circuit of size $O(n^3)$. Similar checks are performed for each parent-child pair, but the number of such pairs is also O(n). Therefore the certificate check is done by a circuit of size $O(n^4)$, which can be constructed and simulated on a TM in time poly(n).
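Arthur's check of the certificate tree can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `CertNode` and `verify` are hypothetical, and the children of a node are assumed to list the prime factors of q − 1 with multiplicity, so that their product equals q − 1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CertNode:
    """One node of the certificate tree: claims that q is prime."""
    q: int
    h: int = 0                      # claimed generator of (Z/qZ)*, unused for q = 2
    children: List["CertNode"] = field(default_factory=list)


def verify(node: CertNode) -> bool:
    """Arthur's recursive check of a certificate tree (hypothetical helper)."""
    q = node.q
    if q == 2:                      # leaves are labeled by q = 2
        return not node.children
    if q < 3 or not node.children:
        return False
    # Assumption: children list the prime factors of q - 1 with multiplicity.
    product = 1
    for child in node.children:
        product *= child.q
    if product != q - 1:
        return False
    # h^(q-1) = 1 (mod q): O(log q) multiplications via fast exponentiation.
    if pow(node.h, q - 1, q) != 1:
        return False
    for child in node.children:
        # The order of h must not divide (q-1)/q_j for any prime factor q_j.
        if pow(node.h, (q - 1) // child.q, q) == 1:
            return False
        if not verify(child):       # each factor needs its own certificate
            return False
    return True


# Certificate for p = 7: generator 3, factors of 6 are 2 and 3 (3 has generator 2).
two = CertNode(2)
three = CertNode(3, h=2, children=[CertNode(2)])
seven = CertNode(7, h=3, children=[two, three])
```

Each node costs one modular exponentiation for the node itself and one per child, which matches the count of O(n) checks of O(n) multiplications in the estimate above.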


5.1. If there is an accepting (i.e., ending with "yes") computational path, then there is such a path of length exp(O(s)). Indeed, we may think of machine configurations as points, and possible transitions as edges. Thus an NTM with space s is described by a directed graph with N = exp(O(s)) vertices, and an accepting path is just a path between two given vertices. If there is such a path, we can eliminate loops from it and get a path of length $\le N$. Therefore the proof of Theorem 5.2 is still valid for nondeterministic Turing machines.

Thus we reduce the NTM to a polynomial game; then we simulate this game on a deterministic machine (see the first part of Theorem 5.2).

5.2. Negations of predicates from $\Sigma_k$ belong to $\Pi_k$ and vice versa, so that $\mathrm{P}^{\Sigma_k} = \mathrm{P}^{\Pi_k}$.

Similarly to P, the class $\mathrm{P}^{\Sigma_k}$ is closed under negations. Therefore it remains to prove the inclusion $\mathrm{P}^{\Sigma_k} \subseteq \Sigma_{k+1}$.


Consider a predicate $F \in \mathrm{P}^{\Sigma_k}$ and a polynomial algorithm A that computes it using an oracle $G \in \Sigma_k$. Then F(x) is true if and only if there exists a (polynomial size) sequence $\sigma$ of pairs (query to an oracle, answer bit) such that (1) it forces A to produce the answer 1; (2) for each pair $(x, 1) \in \sigma$ the string x is indeed in G; (3) for each pair $(x, 0) \in \sigma$ the string x does not belong to G.

Condition (1) has the $\Sigma_1$-form, condition (2) belongs to $\Sigma_k$, and condition (3) belongs to $\Pi_k$. Standard rules for quantifiers say that any predicate of the type
\[ \exists x\, [\Sigma_1(x) \wedge \Sigma_k(x) \wedge \Pi_k(x)] \]
belongs to $\Sigma_{k+1}$.

Note that we have used the following property of classes $\Sigma_k$ and $\Pi_k$: if a predicate G belongs to $\Sigma_k$/$\Pi_k$, then "any element of a finite sequence $z = \langle z_1, \ldots, z_n \rangle$ belongs to G" is a $\Sigma_k$/$\Pi_k$-property of z. (Indeed, polynomially many games can be played in parallel.)


6.1. Let $\mathcal{F}$ be an arbitrary space, and $F : \mathcal{L} \times \mathcal{M} \to \mathcal{F}$ a bilinear function. The bilinearity implies that $F(u, v) = \sum_{j,k} u_j v_k F(e_j, f_k)$ for any $u = \sum_j u_j e_j$ and $v = \sum_j v_j f_j$. If we set
\[ G\Bigl( \sum_{j,k} w_{jk}\, e_j \otimes f_k \Bigr) = \sum_{j,k} w_{jk} F(e_j, f_k), \]
then the equation $G(u \otimes v) = F(u, v)$ will hold. Conversely, if $G' : \mathcal{L} \otimes \mathcal{M} \to \mathcal{F}$ is a linear function satisfying $G'(u \otimes v) = F(u, v)$, then $G'(e_j \otimes f_k) = F(e_j, f_k)$, hence $G' = G$ by linearity.

6.2. Let $\mathcal{F} = \mathcal{L}' \otimes \mathcal{M}'$ and
\[ F : \mathcal{L} \times \mathcal{M} \to \mathcal{F}, \qquad F(u, v) = A(u) \otimes B(v). \]
Then the required map C is exactly the function G from the universality property.

6.3. The abstract tensor product is unique in the following sense. Suppose that two pairs, $(\mathcal{N}_1, H_1)$ and $(\mathcal{N}_2, H_2)$, satisfy the universality property. Then there is a unique linear map $G_{21} : \mathcal{N}_1 \to \mathcal{N}_2$ such that $G_{21}(H_1(u, v)) = H_2(u, v)$ for any u and v. This map is an isomorphism of linear spaces.

The existence and uniqueness of $G_{21}$ follows from the universality of the pair $(\mathcal{N}_1, H_1)$: we set $\mathcal{F} = \mathcal{N}_2$, $F = H_2$, and get $G_{21} = G$. Note that if $(\mathcal{N}_3, H_3)$ also satisfies the universality property, then $G_{31} = G_{32} G_{21}$ (the composition of maps). Therefore $G_{12} G_{21} = G_{11}$ and $G_{21} G_{12} = G_{22}$. But $G_{11}$ and $G_{22}$ are the identity maps on $\mathcal{N}_1$ and $\mathcal{N}_2$, respectively. Thus $G_{12}$ and $G_{21}$ are mutually inverse isomorphisms.

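6.4. Let $\lambda_j^2$ ($\lambda_j > 0$) and $|\nu_j\rangle$ be the nonzero eigenvalues and the corresponding (orthonormal) eigenvectors of $X^\dagger X$. Then the vectors $|\xi_j\rangle = \lambda_j^{-1} X |\nu_j\rangle$ also form an orthonormal system. Thus we have
\[ X|\nu_j\rangle = \lambda_j |\xi_j\rangle; \qquad X|\psi\rangle = 0 \ \text{ if } \ \langle \psi | \nu_j \rangle = 0 \text{ for all } j. \]
This implies (6.2). Finally, we check that $(X X^\dagger)|\xi_j\rangle = \lambda_j^2 |\xi_j\rangle$.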

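6.5.
\[ H[2] = \frac{1}{\sqrt{2}} \begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1
\end{pmatrix}, \]
\[ U[3,1] = \begin{pmatrix}
u_{00,00} & u_{00,10} & 0 & 0 & u_{00,01} & u_{00,11} & 0 & 0 \\
u_{10,00} & u_{10,10} & 0 & 0 & u_{10,01} & u_{10,11} & 0 & 0 \\
0 & 0 & u_{00,00} & u_{00,10} & 0 & 0 & u_{00,01} & u_{00,11} \\
0 & 0 & u_{10,00} & u_{10,10} & 0 & 0 & u_{10,01} & u_{10,11} \\
u_{01,00} & u_{01,10} & 0 & 0 & u_{01,01} & u_{01,11} & 0 & 0 \\
u_{11,00} & u_{11,10} & 0 & 0 & u_{11,01} & u_{11,11} & 0 & 0 \\
0 & 0 & u_{01,00} & u_{01,10} & 0 & 0 & u_{01,01} & u_{01,11} \\
0 & 0 & u_{11,00} & u_{11,10} & 0 & 0 & u_{11,01} & u_{11,11}
\end{pmatrix}. \]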


7.1. Since the conjunction $\wedge$ and the negation $\neg$ form a complete basis for Boolean circuits, Lemmas 7.1, 7.2 show that it is sufficient to realize the functions $\wedge_\oplus$ (i.e., the Toffoli gate), $\neg_\oplus : (x, y) \mapsto (x, x \oplus y \oplus 1)$ and the controlled negation $\mathrm{CNOT} : (x, y) \mapsto (x, x \oplus y)$. But the Toffoli gate is already in the basis, $\neg_\oplus[1,2] = \neg[2]\, \mathrm{CNOT}[1,2]$, so it suffices to realize CNOT. Let us introduce an auxiliary bit u initialized by 0. Then the action of $\mathrm{CNOT}[1,2]$ can be represented as $\neg[u]\, \wedge_\oplus[u,1,2]\, \neg[u]$.


8.1. Any rotation in three-dimensional space can be represented as a composition of three rotations: through an angle $\alpha$ about the z axis, then through an angle $\beta$ about the x axis, and then through an angle $\gamma$ about the z axis. Therefore any operator acting on one qubit can be represented in the form
\[ U = e^{i\varphi}\, e^{i(\gamma/2)\sigma^z}\, e^{i(\beta/2)\sigma^x}\, e^{i(\alpha/2)\sigma^z}. \tag{S8.1} \]
Each of the operators on the right-hand side of (S8.1) can be expressed in terms of H and a controlled phase shift:
\[ e^{i\varphi} = \Lambda(e^{i\varphi})\, \sigma^x\, \Lambda(e^{i\varphi})\, \sigma^x, \qquad e^{i\varphi\sigma^z} = \Lambda(e^{-i\varphi})\, \sigma^x\, \Lambda(e^{i\varphi})\, \sigma^x, \]
\[ \sigma^x = H\, \Lambda(e^{i\pi})\, H, \qquad e^{i\varphi\sigma^x} = H\, e^{i\varphi\sigma^z}\, H. \]
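The four identities above are easy to confirm numerically. The following sketch uses plain Python complex arithmetic; it assumes that the controlled phase shift acts on its single qubit as the diagonal matrix diag(1, e^{iφ}), and all helper names (`matmul`, `phase_gate`, `check_identities`) are hypothetical.

```python
import cmath
import math


def matmul(*ms):
    """Product of 2x2 matrices, applied as written (left to right)."""
    result = [[1, 0], [0, 1]]
    for m in ms:
        result = [[sum(result[i][k] * m[k][j] for k in range(2))
                   for j in range(2)] for i in range(2)]
    return result


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


SX = [[0, 1], [1, 0]]
S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]


def phase_gate(phi):
    # Assumption: the controlled phase acts on its qubit as diag(1, e^{i phi}).
    return [[1, 0], [0, cmath.exp(1j * phi)]]


def exp_z(phi):
    return [[cmath.exp(1j * phi), 0], [0, cmath.exp(-1j * phi)]]


def exp_x(phi):
    return [[math.cos(phi), 1j * math.sin(phi)],
            [1j * math.sin(phi), math.cos(phi)]]


def check_identities(phi):
    scalar = [[cmath.exp(1j * phi), 0], [0, cmath.exp(1j * phi)]]
    return (close(matmul(phase_gate(phi), SX, phase_gate(phi), SX), scalar)
            and close(matmul(phase_gate(-phi), SX, phase_gate(phi), SX), exp_z(phi))
            and close(matmul(H, phase_gate(math.pi), H), SX)
            and close(matmul(H, exp_z(phi), H), exp_x(phi)))
```

Calling `check_identities` with any real angle exercises all four formulas at once.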

8.2. Let $U = e^{i\varphi} Z$, where $Z \in \mathbf{SU}(2)$. Then $\Lambda(U) = \Lambda(e^{i\varphi})\, \Lambda(Z)$. The operator $\Lambda(e^{i\varphi})$ acts only on the control qubit, so it remains to realize $\Lambda(Z)$. Any operator $Z \in \mathbf{SU}(2)$ can be represented in the form
\[ Z = A \sigma^x A^{-1} B \sigma^x B^{-1}, \qquad A, B \in \mathbf{SU}(2). \tag{S8.2} \]
Therefore $\Lambda(Z)$ is realized by the circuit shown in Figure S8.1a.

Geometrically, equation (S8.2) is equivalent to the assertion that any rotation of the three-dimensional space is the composition of two rotations through 180°. The proof of this assertion is shown in Figure S8.1b.

Fig. S8.1. Realization of the operator $\Lambda(Z)$, $Z \in \mathbf{SU}(2)$: a) the circuit; b) a rotation as the composition of two rotations through 180°.

8.3. Let $e^{i\theta}$ be the required phase shift. The operators $X = e^{i\varphi_1} \Lambda(e^{i\theta})$ and $Y = e^{i\varphi_2} \sigma^x$ can be realized over the basis $\mathcal{A}$. Although the phases $\varphi_1$, $\varphi_2$ are arbitrary, we can realize $X^{-1} = e^{-i\varphi_1} \Lambda(e^{-i\theta})$ and $Y^{-1} = e^{-i\varphi_2} \sigma^x$ with exactly the same phases by inverting the circuits for X and Y. Therefore we can realize the operator
\[ e^{i\theta\sigma^z} = X^{-1} Y^{-1} X Y; \]
the unknown phases cancel out. It remains to apply this operator to an ancilla in the state $|0\rangle$.

8.4. We can construct a circuit for the operator $\Lambda(U)$ using the Fredkin gate $F = \Lambda(\leftrightarrow)$, a controlled bit exchange. It is defined and realized as follows:
\[ F|a, b, c\rangle \stackrel{\mathrm{def}}{=} \begin{cases} |0, b, c\rangle & \text{if } a = 0, \\ |1, c, b\rangle & \text{if } a = 1, \end{cases} \]
\[ F[1,2,3] = \Lambda^2(\sigma^x)[1,2,3]\ \Lambda^2(\sigma^x)[1,3,2]\ \Lambda^2(\sigma^x)[1,2,3]. \]
Figure S8.2 shows how, given an arbitrary gate U preserving $|0\rangle$, one can construct a circuit for $\Lambda(U)$. The controlled exchange (shown in rectangles) is performed with two n-qubit registers by applying n copies of the Fredkin gate. If the controlling qubit contains 1, the state $|\xi\rangle$ will be submitted to U, otherwise the input to U is $|0\rangle$.

Fig. S8.2. Realization of the operator $\Lambda(U)$, assuming that $U|0\rangle = |0\rangle$.
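The realization of the Fredkin gate by three Toffoli gates can be checked on all eight basis states. The sketch below treats the gates as classical reversible maps on bit triples; the function names are hypothetical and bits are indexed from 0 rather than from 1.

```python
from itertools import product


def toffoli(bits, c1, c2, target):
    """Flip bits[target] if and only if bits[c1] and bits[c2] are both 1."""
    out = list(bits)
    if out[c1] and out[c2]:
        out[target] ^= 1
    return tuple(out)


def fredkin_via_toffoli(a, b, c):
    # Toffoli[1,2,3], then Toffoli[1,3,2], then Toffoli[1,2,3] (0-based below).
    bits = (a, b, c)
    for c1, c2, target in [(0, 1, 2), (0, 2, 1), (0, 1, 2)]:
        bits = toffoli(bits, c1, c2, target)
    return bits


def fredkin(a, b, c):
    """Definition of the controlled exchange."""
    return (a, c, b) if a else (a, b, c)


all_match = all(fredkin_via_toffoli(*x) == fredkin(*x)
                for x in product((0, 1), repeat=3))
```

When the first bit is 1, the three gates reduce to the familiar exchange of two bits by three controlled negations; when it is 0, none of them fires.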

8.5 (cf. [7]). We are going to give a solution with r = 1, but this will require some preparation. Our final circuit will be built of subcircuits realizing $\Lambda^n(\sigma^x)$ for n < k with r = O(n) ancillas. The idea is that if a subcircuit operates only on some of the qubits, it can borrow the remaining qubits and use them as indicated, i.e., realizing $U \otimes I_{\mathcal{B}^{\otimes r}}$ instead of U. Ancillas used in this special way will be called dummy qubits.

Fig. S8.3. a) Graphic notation for $\Lambda^k(\sigma^x)$; b) the key construction; c) realization of $\Lambda^k(\sigma^x)$ by $\Lambda^m(\sigma^x)$ and $\Lambda^{k-m+1}(\sigma^x)$.


Let us introduce a graphic notation for the operator $\Lambda^k(\sigma^x)$; see Figure S8.3a. The key construction is shown in Figure S8.3b. This circuit performs the following actions:

1. The value a of the fourth (second to bottom) bit is changed if and only if the first two bits are 1. (This change can be compensated by applying the Toffoli gate one more time.)

2. The bottom bit is altered if and only if the first three bits are 1.

The most obvious use of this primitive is shown in Figure S8.3c, where the operator $\Lambda^k(\sigma^x)$ is realized by two copies of $\Lambda^m(\sigma^x)$ and two copies of $\Lambda^{k-m+1}(\sigma^x)$, using one dummy qubit (the second to the bottom one).

Now recall the idea we mentioned at the beginning. In implementing the operators $\Lambda^n(\sigma^x)$ for n = m and n = k − m + 1 we can use k − m + 1 and m dummy qubits, respectively. If we set $m = \lceil k/2 \rceil$, then in both cases we will have at least n − 1 dummy qubits available. Therefore it remains to realize $\Lambda^n(\sigma^x)$ using at most n − 1 dummy qubits.

Fig. S8.4. Realization of $\Lambda^n(\sigma^x)$ with n − 2 dummy qubits.

The circuit in Figure S8.4 realizes $\Lambda^n(\sigma^x)$ with n − 2 dummy qubits (number n + 1 through 2n − 2). The part of the circuit on the right (which is applied first) changes the values of the bits as follows:
\[ x_{n+j} \mapsto x_{n+j} \oplus x_1 x_2 \cdots x_{j+1}, \qquad j = 1, \ldots, n-2. \]
The left part acts similarly, but j runs from 1 to n − 1. These two actions cancel each other, except for the change in the last bit,
\[ x_{2n-1} \mapsto x_{2n-1} \oplus x_1 x_2 \cdots x_n. \]

8.6. The operator $\Lambda^k(U)$ has been realized by the circuit shown in Figure 8.5. Each of the operators $\Lambda^j(Z_{k-j})$ in that circuit is represented by two copies of $\Lambda^j(\sigma^x)$ (or one copy of $\Lambda^j(i\sigma^x)$ and one copy of $\Lambda^j(-i\sigma^x)$) and four applications of one-qubit gates (cf. Figure S8.1a). Let us examine this construction once again. We note that for the realization of the operator $\Lambda^j(\sigma^x)$ one can use up to k − j dummy qubits (see the solution to Problem 8.5). Thus for j = 1, ..., k − 1 each of these operators can be realized by a circuit of size O(k). The remaining constant number of operators $\Lambda^j(\pm i\sigma^x)$ are realized by circuits of size $O(k^2)$ as suggested in the main text (see Figure 8.4). The total size of the resulting circuit is $O(k^2)$.

8.7. Property (8.10) follows from this chain of inequalities:
\[ \bigl\| XY|\xi\rangle \bigr\| \le \|X\|\, \bigl\| Y|\xi\rangle \bigr\| \le \|X\|\, \|Y\|\, \bigl\| |\xi\rangle \bigr\|. \]
To prove (8.11), we note that the operators $XX^\dagger$ and $X^\dagger X$ have the same nonzero eigenvalues (see Problem 6.4).

Equation (8.12) follows from the fact that the eigenvalues of the operator $(X \otimes Y)^\dagger (X \otimes Y) = (X^\dagger X) \otimes (Y^\dagger Y)$ have the form $x_j y_k$, where $x_j$ and $y_k$ are the eigenvalues of $X^\dagger X$ and $Y^\dagger Y$, respectively.

Equation (8.13) is obvious.

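8.8. a) The condition that $\tilde{U}$ approximates U with precision $\delta$ is expressed by inequality (8.16). Multiplying the expression under the norm by $\tilde{U}^{-1}$ on the left and by $U^{-1}$ on the right, we get $\| V U^{-1} - \tilde{U}^{-1} V \| \le \delta$.

b) It is sufficient to verify the statement for L = 2:
\begin{align*}
\bigl\| \tilde{U}_2 \tilde{U}_1 V - V U_2 U_1 \bigr\|
&= \bigl\| \tilde{U}_2 (\tilde{U}_1 V - V U_1) + (\tilde{U}_2 V - V U_2) U_1 \bigr\| \\
&\le \bigl\| \tilde{U}_2 (\tilde{U}_1 V - V U_1) \bigr\| + \bigl\| (\tilde{U}_2 V - V U_2) U_1 \bigr\| \\
&\le \|\tilde{U}_2\|\, \bigl\| \tilde{U}_1 V - V U_1 \bigr\| + \bigl\| \tilde{U}_2 V - V U_2 \bigr\|\, \|U_1\| \\
&= \bigl\| \tilde{U}_1 V - V U_1 \bigr\| + \bigl\| \tilde{U}_2 V - V U_2 \bigr\|.
\end{align*}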

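8.9. We rephrase the problem as follows. Let $Q = U \otimes I_{\mathcal{B}^{\otimes (N-n)}}$ and $\mathcal{M} = \mathcal{B}^{\otimes n} \otimes |0^{N-n}\rangle$. We have
\[ \| \tilde{U} \Pi_{\mathcal{M}} - Q \Pi_{\mathcal{M}} \| \le \delta, \qquad Q\mathcal{M} = \mathcal{M}, \]
and we are looking for a unitary operator W such that $W \Pi_{\mathcal{M}} = Q \Pi_{\mathcal{M}}$ and $\| W - \tilde{U} \| \le O(\delta)$. Let $\mathcal{L} = \tilde{U}\mathcal{M}$. We will try to find a unitary operator X such that
\[ X\mathcal{L} = \mathcal{M}, \qquad \| X - I \| \le O(\delta). \tag{S8.3} \]
If such an X exists, then the following operator will serve as a solution:
\[ W = Q \Pi_{\mathcal{M}} + X \tilde{U} (I - \Pi_{\mathcal{M}}). \]
To satisfy the conditions (S8.3), we show first that $\mathcal{L}$ and $\mathcal{M}$ are close enough:
\[ \| \Pi_{\mathcal{L}} - \Pi_{\mathcal{M}} \| = \bigl\| \tilde{U} \Pi_{\mathcal{M}} \tilde{U}^\dagger - Q \Pi_{\mathcal{M}} Q^\dagger \bigr\| \le 2\delta, \]
since $\| \tilde{U} \Pi_{\mathcal{M}} - Q \Pi_{\mathcal{M}} \| \le \delta$. It remains to apply the following lemma.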


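Lemma. Let $\mathcal{L}, \mathcal{M} \subseteq \mathcal{N}$ and $\| \Pi_{\mathcal{L}} - \Pi_{\mathcal{M}} \| \le \varepsilon < 1/2$. Then there is a unitary operator X such that $X\mathcal{L} = \mathcal{M}$ and $\| X - I \| = O(\varepsilon)$.

Proof. Let $Y = \Pi_{\mathcal{M}} \Pi_{\mathcal{L}} + (I - \Pi_{\mathcal{M}})(I - \Pi_{\mathcal{L}})$. It is immediately evident that Y takes $\mathcal{L}$ into $\mathcal{M}$ and $\mathcal{L}^\perp$ into $\mathcal{M}^\perp$. The norm $\| Y - I \|$ can be estimated as follows:
\begin{align*}
\| Y - I \| &= \| \Pi_{\mathcal{M}} \Pi_{\mathcal{L}} - \Pi_{\mathcal{M}} - \Pi_{\mathcal{L}} + \Pi_{\mathcal{M}} \Pi_{\mathcal{L}} \| \\
&\le \| (\Pi_{\mathcal{M}} - \Pi_{\mathcal{L}}) \Pi_{\mathcal{L}} \| + \| \Pi_{\mathcal{M}} (\Pi_{\mathcal{M}} - \Pi_{\mathcal{L}}) \| \le 2\varepsilon < 1.
\end{align*}
The operator Y is not unitary, but the above estimate shows that it is nondegenerate. Therefore we can define the operator $X = Y (Y^\dagger Y)^{-1/2}$, which is unitary. Since $Y^\dagger Y$ leaves the subspace $\mathcal{L}$ invariant, X takes $\mathcal{L}$ into $\mathcal{M}$. To estimate the norm of $X - I$ we expand $(Y^\dagger Y)^{-1/2}$ into a Taylor series:
\[ (Y^\dagger Y)^{-1/2} = I + \tfrac{1}{2} Z + \tfrac{3}{8} Z^2 + \cdots, \qquad \text{where } Z = I - Y^\dagger Y. \]
Therefore $\bigl\| (Y^\dagger Y)^{-1/2} - I \bigr\| \le (1 - \|Z\|)^{-1/2} - 1 = O(\varepsilon)$, from which we obtain $\| X - I \| = O(\varepsilon)$. $\square$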

8.10. Each of the considered rotations generates an everywhere dense subset in the group of rotations about a fixed axis. Indeed, this group is isomorphic to $\mathbb{R}/\mathbb{Z}$, and the generated subset consists of elements of the form $n\alpha \bmod 1$, $n \in \mathbb{Z}$, where $\alpha$ is irrational ($2\pi\alpha$ is the rotation angle). Therefore it remains to prove that the rotations about two different axes generate $\mathbf{SO}(3)$. For this it suffices to show that the subgroup generated by all rotations about two different lines acts transitively on the sphere. This fact becomes obvious by looking at Figure S8.5 (if we can move along two families of circles, then from any point of the sphere we can reach any other point). A rigorous proof is obtained similarly to the solution to Problem 8.11.

Fig. S8.5. Rotating about two axes in $\mathbb{R}^3$.

Remark. This solution is nonconstructive: we cannot give an upper bound for the number of generators $X, X^{-1}, Y, Y^{-1}$ whose product approximates a given element $U \in \mathbf{SO}(3)$ with a given precision $\delta$, even if X and Y are fixed. Therefore the implied algorithm for finding such an approximation may exhibit arbitrarily long running times. The reason for this nonconstructiveness is as follows. Although the number $\alpha$ is irrational, it might be very closely approximated by rational numbers (this occurs when the coefficients of the continued fraction expansion of $\alpha$ grow rapidly). Then any $r \in \mathbb{R}/\mathbb{Z}$ is approximated by elements of the form $n\alpha$ ($n \in \mathbb{Z}$) with arbitrary precision $\delta > 0$, but the number n can be arbitrarily large: larger than any specified function of $\delta$.

A constructive proof and an effective (for fixed X and Y) algorithm for finding an approximation are rather complicated; see Section 8.3.

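8.11. If we set $|\xi'\rangle = V^{-1}|\xi\rangle$, then $H' = V^{-1} H V$ is the stabilizer of $\mathbb{C}(|\xi'\rangle)$. Then the problem takes the following form: prove that the union of the stabilizers of two distinct one-dimensional subspaces generates $\mathbf{U}(\mathcal{M})$.

It suffices to show that the group G generated by $H \cup H'$ acts transitively on the set of unit vectors. Indeed, suppose that for each unit vector $|\psi\rangle$ there exists $U_\psi \in G$ such that $U_\psi |\xi\rangle = |\psi\rangle$. Then
\[ \mathbf{U}(\mathcal{M}) = \bigcup_{|\psi\rangle \in \mathcal{M}} U_\psi H. \]
We prove the transitivity of the action of the group G. We note that for any unit vector $|\psi\rangle$,
\[ H|\psi\rangle = Q(\vartheta) \stackrel{\mathrm{def}}{=} \bigl\{ |\eta\rangle : |\langle \eta | \xi \rangle| = \cos\vartheta \bigr\}, \]
\[ H'|\psi\rangle = Q'(\vartheta') \stackrel{\mathrm{def}}{=} \bigl\{ |\eta\rangle : |\langle \eta | \xi' \rangle| = \cos\vartheta' \bigr\}, \]
where $\vartheta = \vartheta(|\psi\rangle)$, $\vartheta' = \vartheta'(|\psi\rangle)$ denote the angles between $|\psi\rangle$ and $|\xi\rangle$, $|\xi'\rangle$, respectively:
\[ \cos\vartheta = |\langle \psi | \xi \rangle|, \qquad \cos\vartheta' = |\langle \psi | \xi' \rangle|, \qquad 0 \le \vartheta, \vartheta' \le \pi/2. \]
In further formulas we will also use the angle $\alpha$ between the vectors $|\xi\rangle$ and $|\xi'\rangle$: $\cos\alpha = |\langle \xi | \xi' \rangle|$, $0 \le \alpha \le \pi/2$.

It can be verified that for $\dim \mathcal{M} \ge 3$ the angle between the vector $|\xi\rangle$ and elements of $Q'(\vartheta')$ varies from $|\alpha - \vartheta'|$ to $\min\{\alpha + \vartheta', \pi/2\}$. Similarly, the angle between $|\xi'\rangle$ and elements of $Q(\vartheta)$ varies from $|\alpha - \vartheta|$ to $\min\{\alpha + \vartheta, \pi/2\}$. Therefore
\[ H Q'(\vartheta') = \bigcup_{|\alpha - \vartheta'| \le \vartheta \le \min\{\alpha + \vartheta', \pi/2\}} Q(\vartheta), \qquad
   H' Q(\vartheta) = \bigcup_{|\alpha - \vartheta| \le \vartheta' \le \min\{\alpha + \vartheta, \pi/2\}} Q'(\vartheta'). \]
It follows that
\[ H'|\xi\rangle = Q'(\alpha), \qquad
   H H'|\xi\rangle = \bigcup_{0 \le \vartheta \le \min\{2\alpha, \pi/2\}} Q(\vartheta), \qquad
   H' H H'|\xi\rangle = \bigcup_{0 \le \vartheta' \le \min\{3\alpha, \pi/2\}} Q'(\vartheta'), \]
and so forth. Hence, acting on the vector $|\xi\rangle$ alternately with elements from $H'$ and H sufficiently many times, we can obtain any unit vector $|\psi\rangle$.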

8.12. First, we will realize the operator $\Lambda^2(i) \otimes I_{\mathcal{B}}$, i.e., the application of $\Lambda^2(i)$ to the first two qubits. To this end, we will use the operators $\Lambda^2(\sigma^\alpha)$ ($\alpha = x, y, z$). We note that $\sigma^y = K \sigma^x K^{-1}$, $\sigma^z = H \sigma^x H$, hence
\[ \Lambda^2(\sigma^y)[1,2,3] = K[3]\ \Lambda^2(\sigma^x)[1,2,3]\ K^{-1}[3], \]
\[ \Lambda^2(\sigma^z)[1,2,3] = H[3]\ \Lambda^2(\sigma^x)[1,2,3]\ H[3]. \]
Using the identity $\sigma^x \sigma^y \sigma^z = i I_{\mathcal{B}}$, we obtain
\[ \Lambda^2(\sigma^x)\, \Lambda^2(\sigma^y)\, \Lambda^2(\sigma^z) = \Lambda^2(i I_{\mathcal{B}}) = \Lambda^2(i) \otimes I_{\mathcal{B}}. \]
The circuit for the realization of $\Lambda^2(i) \otimes I_{\mathcal{B}}$ is shown in Figure S8.6. The inverse operator, $\Lambda^2(-i) \otimes I_{\mathcal{B}}$, can be realized in a similar way.

Fig. S8.6. Realization of the operator $\Lambda^2(i) \otimes I_{\mathcal{B}}$ over the standard basis.

Now we consider a new basis,
\[ \mathcal{Q}' = \bigl\{ H,\ \Lambda^2(i),\ \Lambda^2(-i) \bigr\}. \]
It is sufficient to show that the applications of elements of this basis to two qubits generate a dense subset of $\mathbf{U}(\mathcal{B}^{\otimes 2})/\mathbf{U}(1)$. Let $X = \Lambda(HKH)$; this operator can be realized as follows:
\[ X[1,2] = H[2]\ \Lambda(K)[1,2]\ H[2], \qquad \Lambda(K) = \Lambda^2(i). \]
We act with X on $\mathcal{B}^{\otimes 2}$ in two possible ways: $X_1 = X[1,2]$ and $X_2 = X[2,1]$. The operators $Y_1 = X_1 X_2^{-1}$, $Y_2 = X_2^{-1} X_1$ are also realizable over the basis $\mathcal{Q}'$.

The operators $X_1$, $X_2$ (and consequently also $Y_1$, $Y_2$) preserve the vectors $|00\rangle$ and $|\eta\rangle = |01\rangle + |10\rangle + |11\rangle$. Direct calculations show that $Y_1$ and $Y_2$ do not commute and have eigenvalues $1, 1, \lambda_+, \lambda_-$, where $\lambda_\pm = (1 \pm \sqrt{-15})/4 = e^{\pm i\varphi/2}$. The last two eigenvalues characterize the action of $Y_1$, $Y_2$ on the subspace $\mathcal{L} = \mathbb{C}\bigl(|00\rangle, |\eta\rangle\bigr)^\perp$. In $\mathbf{SO}(3) \cong \mathbf{U}(\mathcal{L})/\mathbf{U}(1)$ an operator with such eigenvalues corresponds to a rotation through the angle $\varphi$ about some axis. We will show that this angle is incommensurate with $\pi$, so that $Y_1$, $Y_2$ generate an everywhere dense subset in $\mathbf{U}(\mathcal{L})/\mathbf{U}(1)$ (see Problem 8.10).

If $\varphi/\pi$ were rational, then $\lambda_+$, $\lambda_-$ would be roots of 1, and hence algebraic integers. The minimal (over rationals) polynomial for an algebraic integer $\lambda$ has the form $f_\lambda(x) = x^n + a_1 x^{n-1} + \cdots + a_n$, where $a_j \in \mathbb{Z}$. However, the minimal polynomial for $\lambda_\pm$ is $x^2 - \tfrac{1}{2} x + 1$; therefore $\lambda_\pm$ are not algebraic integers.

To complete the proof, we apply the result of Problem 8.11 twice. The operators $Y_1$, $Y_2$ generate an everywhere dense subset in $\mathbf{U}(\mathcal{L})/\mathbf{U}(1)$, and the operator $V = \Lambda(K)$ preserves $\mathbb{C}(|00\rangle)$ but not $\mathbb{C}(|\eta\rangle)$, so that $Y_1$, $Y_2$, $V^{-1} Y_1 V$, $V^{-1} Y_2 V$ generate an everywhere dense set in $\mathbf{U}\bigl(\mathcal{L} \oplus \mathbb{C}(|\eta\rangle)\bigr)/\mathbf{U}(1)$. The operator $H[1]$ does not preserve $\mathbb{C}(|00\rangle)$; applying the result of Problem 8.11 once again, we obtain an everywhere dense subset in $\mathbf{U}(\mathcal{B}^{\otimes 2})/\mathbf{U}(1)$.

8.13. It is clear that powers of the operator $R^4 = \exp(4\pi i \alpha \sigma^x) \in \mathbf{SU}(2)$ approximate $\exp(i\varphi\sigma^x)$ for any given $\varphi$ with any given precision. Hence powers of R approximate operators of the form
\[ i^s \exp(i\varphi\sigma^x) = i^s \begin{pmatrix} \cos\varphi & i\sin\varphi \\ i\sin\varphi & \cos\varphi \end{pmatrix}, \qquad s = 0, 1, 2, 3. \]
For s = 3 and $\varphi = \pi/2$ this expression yields $\sigma^x$, so that we obtain the Toffoli gate: $(\Lambda^2(R))^k \approx \Lambda^2(\sigma^x)$ for suitable k. For s = 1, 3 and $\varphi = 0$ we get $\Lambda^2(\pm i I_{\mathcal{B}}) = \Lambda^2(\pm i) \otimes I_{\mathcal{B}}$ (the identity factor may be ignored).

Now we show how to eliminate unnecessary control qubits:
\[ \sigma^x[1]\ \Lambda(X)\ \sigma^x[1]\ \Lambda(X) = I_{\mathcal{B}} \otimes X. \]
Thus we can realize $\Lambda(\sigma^x)$, $K = \Lambda(i)$, $K^{-1} = \Lambda(-i)$ and any operator of the form $\exp(i\varphi\sigma^x)$. According to Problem 8.2, these operators form a complete basis.

8.14. a) Let $\Gamma' \subseteq \Gamma$ be a maximal subset with the property that the distance between any pair of distinct elements is greater than $\alpha\delta'$, where $\delta' = \delta/(1 - \alpha)$ ("maximal" means that any larger subset does not have this property). Then $\Gamma'$ is an $\alpha\delta'$-net for $\Gamma$. But $\Gamma$ is a $\delta$-net for R, so $\Gamma'$ is an $(\alpha\delta' + \delta)$-net for R. Note that $\alpha\delta' + \delta = \delta'$. The intersection of $\Gamma'$ with the $\delta'$-neighborhood of R is an $\alpha$-sparse $\delta'$-net for R.

b) The proof is based on a volume consideration. For our purposes it is sufficient to consider the volume on the ambient space $P = \mathbf{L}(\mathbb{C}^M) \supseteq \mathbf{SU}(M)$ rather than the intrinsic volume on $\mathbf{SU}(M)$. We regard P as a $2M^2$-dimensional real space endowed with a norm. The volume of the a-neighborhood of any point in P is $a^{2M^2}$ (up to an arbitrary constant factor).

Let $a = \alpha r/(2q)$, $b = r + r/q + a$. Then the a-neighborhoods of the elements of the net do not overlap and are contained in the b-neighborhood of I. Therefore the number of elements does not exceed $(b/a)^{2M^2}$. But
\[ \frac{b}{a} = \frac{q}{\alpha} \Bigl( 2 + \frac{2 + \alpha}{q} \Bigr) \le \frac{5q}{\alpha}, \]
since $\alpha < 1$ and $q > 1$.

8.15. The inclusion $[R_a, R_b] \subseteq R_{2ab}$ is almost obvious: if $\|X\| \le a$, $\|Y\| \le b$, then
\[ \bigl\| [X, Y] \bigr\| = \| XY - YX \| \le 2 \|X\|\, \|Y\| \le 2ab. \]
So, we need to prove that $R_{ab/4} \subseteq [R_a, R_b]$. Without loss of generality we may assume that a = b = 2. Let us consider an arbitrary $Z \in \mathbf{su}(M)$ such that $\|Z\| \le 1$. Our goal is to represent Z as $[X, Y]$, where $\|X\|, \|Y\| \le 2$.

Let us choose an arbitrary basis in $\mathbb{C}^M$ in which Z is diagonal,
\[ Z = i \begin{pmatrix} z_1 & & 0 \\ & \ddots & \\ 0 & & z_M \end{pmatrix}, \qquad z_1, \ldots, z_M \in \mathbb{R}, \qquad \sum_{j=1}^{M} z_j = 0, \qquad |z_j| \le 1. \]
Lemma. There exists a permutation $\tau : \{1, \ldots, M\} \to \{1, \ldots, M\}$ such that $0 \le \sum_{j=1}^{k} z_{\tau(j)} \le 2$ for k = 0, ..., M.

Proof. We will pick elements from the set $\{z_1, \ldots, z_M\}$ one by one. Suppose that k − 1 elements $z_{\tau(1)}, \ldots, z_{\tau(k-1)}$ have already been chosen so that $w_{k-1} = \sum_{j=1}^{k-1} z_{\tau(j)}$ satisfies the inequality $0 \le w_{k-1} \le 2$. The sum of the remaining elements equals $-w_{k-1}$, so there are some $z_p \ge -w_{k-1}$ and $z_q \le 0$ among them. We set $\tau(k) = p$ if $w_{k-1} < 1$, and $\tau(k) = q$ otherwise. In both cases the number $w_k = w_{k-1} + z_{\tau(k)}$ satisfies $0 \le w_k \le 2$. $\square$

By applying the permutation $\tau$ to the basis vectors, we can arrange that the partial sums
\[ w_k = \sum_{j=1}^{k} z_{\tau(j)}, \qquad k = 0, \ldots, M, \]
satisfy the inequality $0 \le w_k \le 2$. Let $v_k = \sqrt{w_k/2}$. Then $Z = XY - YX$, where
\[ X = i \begin{pmatrix}
0 & v_1 & 0 & \cdots & 0 \\
v_1 & 0 & v_2 & \ddots & \vdots \\
0 & v_2 & 0 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & v_{M-1} \\
0 & \cdots & 0 & v_{M-1} & 0
\end{pmatrix}, \qquad
Y = \begin{pmatrix}
0 & -v_1 & 0 & \cdots & 0 \\
v_1 & 0 & -v_2 & \ddots & \vdots \\
0 & v_2 & 0 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -v_{M-1} \\
0 & \cdots & 0 & v_{M-1} & 0
\end{pmatrix}. \]
We will estimate the norms of X and Y, using the fact that $0 \le v_k \le 1$. Note that X and Y are conjugate (by the matrix that has $i^k$ on the diagonal and 0 off the diagonal), so $\|X\| = \|Y\|$. Let us examine the matrix $A = i^{-1} X$, which obviously has the same norm, $\|A\| = \max_j |\lambda_j|$, where $\{\lambda_j\}$ is the spectrum of A. All entries of A are nonnegative, so the Perron-Frobenius theorem [50] applies. Thus there exists an eigenvalue $\lambda_{\max} = \lambda_{\max}(A)$ such that $|\lambda_j| \le \lambda_{\max}$ for all j. The corresponding eigenvector has nonnegative coefficients. It is easy to see that
\[ \lambda_{\max}(A) = \lim_{N \to \infty} \bigl( \langle \xi | A^N | \xi \rangle \bigr)^{1/N}, \]
where $|\xi\rangle$ is an arbitrary vector with positive coefficients. Therefore $\lambda_{\max}(A)$ is a monotone function of the matrix entries, i.e., $\lambda_{\max}$ cannot decrease if the entries increase. It follows that $\|A\|$ does not exceed the largest eigenvalue of the matrix
\[ B = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
1 & 0 & 1 & \ddots & \vdots \\
0 & 1 & 0 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & 1 \\
0 & \cdots & 0 & 1 & 0
\end{pmatrix}. \]
The latter is equal to $2\cos(\pi/(M+1)) < 2$.
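The greedy argument of the lemma and the construction of X and Y can be traced on a concrete example. The following sketch is illustrative only: `arrange` and `commutator_from` are hypothetical names, the greedy step takes the maximum (respectively minimum) remaining element as one admissible choice of $z_p$ (respectively $z_q$), and exact `Fraction` inputs are assumed so that the partial sums carry no rounding error.

```python
from fractions import Fraction
import math


def arrange(z):
    """Greedy ordering from the lemma: all partial sums stay in [0, 2].

    Assumes sum(z) == 0 and |z_j| <= 1 for every element.
    """
    remaining, order, w = list(z), [], 0
    while remaining:
        # If w < 1 take an element >= -w (the maximum works);
        # otherwise take an element <= 0 (the minimum works).
        pick = max(remaining) if w < 1 else min(remaining)
        remaining.remove(pick)
        order.append(pick)
        w += pick
    return order


def commutator_from(order):
    """Build X, Y from the partial sums and return XY - YX."""
    m = len(order)
    partial, w = [], 0
    for value in order:
        w += value
        partial.append(w)
    v = [math.sqrt(float(w_k) / 2) for w_k in partial[:-1]]   # v_1 .. v_{M-1}
    x = [[0j] * m for _ in range(m)]
    y = [[0j] * m for _ in range(m)]
    for k, v_k in enumerate(v):
        x[k][k + 1] = x[k + 1][k] = 1j * v_k                   # X = i * (symmetric part)
        y[k][k + 1], y[k + 1][k] = -v_k + 0j, v_k + 0j         # Y is antisymmetric
    xy = [[sum(x[i][l] * y[l][j] for l in range(m)) for j in range(m)] for i in range(m)]
    yx = [[sum(y[i][l] * x[l][j] for l in range(m)) for j in range(m)] for i in range(m)]
    return [[xy[i][j] - yx[i][j] for j in range(m)] for i in range(m)]


z = [Fraction(1, 2), Fraction(-1), Fraction(3, 4), Fraction(-1, 4)]
order = arrange(z)
comm = commutator_from(order)
```

For this input the greedy order is 3/4, 1/2, −1, −1/4 with partial sums 3/4, 5/4, 1/4, 0, and the commutator comes out as i times the diagonal matrix of the reordered values.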

8.16. By bringing X to the diagonal form, the inequality (8.19) can be derived from the inequality $|e^{ix} - 1 - ix| \le O(x^2)$, $x \in \mathbb{R}$.

To prove (8.20), let us fix X and Y and use the formulas
\[ \exp(X + Y) = \lim_{n \to \infty} \bigl( e^{X/n} e^{Y/n} \bigr)^n, \qquad \exp(X)\exp(Y) = \bigl( e^{X/n} \bigr)^n \bigl( e^{Y/n} \bigr)^n. \]
To pass from $P = \bigl( e^{X/n} e^{Y/n} \bigr)^n$ to $Q = \bigl( e^{X/n} \bigr)^n \bigl( e^{Y/n} \bigr)^n$, one needs to pull all $e^{X/n}$ factors to the left, commuting them with $e^{Y/n}$ factors on the way. Thus the difference Q − P can be represented as a sum of n(n − 1)/2 terms of the form $U \bigl( e^{X/n} e^{Y/n} - e^{Y/n} e^{X/n} \bigr) V$, where U, V are unitary. The norm of each term equals
\[ \bigl\| e^{X/n} e^{Y/n} - e^{Y/n} e^{X/n} \bigr\| \le \frac{1}{n^2} \bigl\| [X, Y] \bigr\| + O\Bigl( \frac{1}{n^3} \Bigr), \]
hence
\[ \bigl\| \exp(X)\exp(Y) - \exp(X + Y) \bigr\| \le \frac{1}{2} \bigl\| [X, Y] \bigr\| \le \|X\|\, \|Y\|. \]
The inequality (8.21) follows from (8.19). Indeed, let $A \approx B$ denote that $\| A - B \| \le O\bigl( \|X\|\, \|Y\| (\|X\| + \|Y\|) \bigr)$ (assuming that X and Y are fixed). Then
\[ [[e^X, e^Y]] - I = \bigl( (e^X - I)(e^Y - I) - (e^Y - I)(e^X - I) \bigr) e^{-X} e^{-Y} \approx XY - YX = [X, Y] \approx \exp([X, Y]) - I. \]

8.17. a) In view of the result of Problem 8.14a), it is sufficient to show that $[[\Gamma_1, \Gamma_2]]$ is an $\bigl( r_1 r_2/4,\ (25/6)\, r_1 r_2/q \bigr)$-net.

Let $V \in S_{r_1 r_2/4}$. Due to the property of the group commutator (8.18), there are some $V_1 \in S_{r_1}$ and $V_2 \in S_{r_2}$ such that
\[ d\bigl( [[V_1, V_2]], V \bigr) \le O(r_1 r_2 (r_1 + r_2)). \]
Each $V_j$ (j = 1, 2) can be approximated by an element $U_j \in \Gamma_j$ with precision $\delta_j = r_j/q$. Using the biinvariance of the distance function and the property of the group commutator (8.17), we obtain
\begin{align*}
d\bigl( [[U_1, U_2]], [[V_1, V_2]] \bigr)
&\le d\bigl( [[U_1, U_2]], [[V_1, U_2]] \bigr) + d\bigl( [[V_1, U_2]], [[V_1, V_2]] \bigr) \\
&= d\bigl( [[V_1^{-1} U_1, U_2]], I \bigr) + d\bigl( I, [[V_1, U_2^{-1} V_2]] \bigr) \\
&\le 2 d(V_1, U_1)\, d(U_2, I) + 2 d(V_1, I)\, d(V_2, U_2) \\
&\le 2\delta_1 (r_2 + \delta_2) + 2 r_1 \delta_2 = 4 r_1 r_2/q + 2 r_1 r_2/q^2.
\end{align*}
Therefore
\[ d\bigl( [[U_1, U_2]], V \bigr) \le O(r_1 r_2 (r_1 + r_2)) + 4 r_1 r_2 (q^{-1} + q^{-2}/2) \le (4 + f(r, q))\, r_1 r_2/q, \]
where $f(r, q) = 2q^{-1} + O(r_1 + r_2)\, q$. If $r_1$, $r_2$ are small enough (as in the condition of the problem), then f(r, q) is small too; we may assume that $f(r, q) \le 1/6$. Thus $d\bigl( [[U_1, U_2]], V \bigr) \le (25/6)\, r_1 r_2/q$.

b) Let $V \in S_{r_1}$. Then there is some $U_1 \in \Gamma_1$ such that $d(V, U_1) \le \delta_1$. Therefore $U_1^{-1} V \in S_{\delta_1} \subseteq S_{r_2}$. It follows that $d(U_1^{-1} V, U_2) \le \delta_2$ for some $U_2 \in \Gamma_2$, but $d(U_1^{-1} V, U_2) = d(V, U_1 U_2)$.

c) We just need to iterate the previous argument. Note that the elements $Z_j \in \Gamma_j$ can be found by an effective procedure which involves O(n) group operations and calculations of the distance function.


9.1. First, we apply k copies of the circuit U; then we compute the majority function bitwise, i.e., we apply m copies of an operator M that realizes $\mathrm{MAJ}_\oplus$ with s ancillas. Thus the complete circuit can be represented symbolically as $W = M^{\otimes m} U^{\otimes k}$. We need to estimate the probability of obtaining the correct result y = F(x), given by
\[ p(y|x) = \sum_{\substack{y_1, \ldots, y_k \\ z_1, \ldots, z_k}} \Bigl| \bigl\langle y_1, z_1, \ldots, y_k, z_k, y, 0^{ms} \bigr| W \bigl| (x, 0^{N-n})^k, 0^m, 0^{ms} \bigr\rangle \Bigr|^2, \]
assuming that $\sum_z \bigl| \langle F(x), z | U | x, 0^{N-n} \rangle \bigr|^2 = 1 - \varepsilon_x$, where $\varepsilon_x \le \varepsilon < 1/2$ for each x.

If more than half of the output registers of the initial circuit contain F(x), then the result of the bitwise majority vote will necessarily be F(x). Therefore, similarly to (4.1) on p. 37, we have
\begin{align*}
1 - p(F(x)|x)
&\le \sum_{\substack{S \subseteq \{1, \ldots, k\} \\ |S| \le k/2}}\ \sum_{\substack{y_1, \ldots, y_k \\ y_j = F(x) \Leftrightarrow j \in S}}\ \sum_{z_1, \ldots, z_k} \Bigl| \bigl\langle y_1, z_1, \ldots, y_k, z_k \bigr| U^{\otimes k} \bigl| (x, 0^{N-n})^k \bigr\rangle \Bigr|^2 \\
&= \sum_{\substack{S \subseteq \{1, \ldots, k\} \\ |S| \le k/2}} (1 - \varepsilon_x)^{|S|}\, \varepsilon_x^{k - |S|} < \lambda^k, \qquad \text{where } \lambda = 2\sqrt{(1 - \varepsilon)\varepsilon}.
\end{align*}

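9.2. Let $|\xi\rangle = |x, 0^{N-n}\rangle$ and $\mathcal{M} = |F(x)\rangle \otimes \mathcal{B}^{\otimes (N-m)}$ (cf. Remark 10.1). We have
\[ \langle \xi | U^\dagger \Pi_{\mathcal{M}} U | \xi \rangle \ge 1 - \varepsilon, \qquad \| \tilde{U} - U \| \le L\delta. \]
Thus, we do the following estimate:
\begin{align*}
\Bigl| \langle \xi | \tilde{U}^\dagger \Pi_{\mathcal{M}} \tilde{U} | \xi \rangle - \langle \xi | U^\dagger \Pi_{\mathcal{M}} U | \xi \rangle \Bigr|
&= \Bigl| \langle \xi | (\tilde{U}^\dagger - U^\dagger) \Pi_{\mathcal{M}} \tilde{U} | \xi \rangle + \langle \xi | U^\dagger \Pi_{\mathcal{M}} (\tilde{U} - U) | \xi \rangle \Bigr| \\
&\le \Bigl| \langle \xi | (\tilde{U}^\dagger - U^\dagger) \Pi_{\mathcal{M}} \tilde{U} | \xi \rangle \Bigr| + \Bigl| \langle \xi | U^\dagger \Pi_{\mathcal{M}} (\tilde{U} - U) | \xi \rangle \Bigr| \\
&\le \| \tilde{U}^\dagger - U^\dagger \| + \| \tilde{U} - U \| \le 2L\delta,
\end{align*}
which implies that $\langle \xi | \tilde{U}^\dagger \Pi_{\mathcal{M}} \tilde{U} | \xi \rangle \ge 1 - \varepsilon - 2L\delta$.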

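9.3. If we change the basis in the controlled qubit, we get the operator $\Lambda^2(-1) = \Lambda(\sigma^z)$. Indeed,
\[ H[2]\ \Lambda(\sigma^x)[1,2]\ H[2] = \Lambda(H\sigma^x H)[1,2] = \Lambda(\sigma^z)[1,2]. \]
This operator multiplies $|1, 1\rangle$ by −1 and does not change the remaining basis vectors.

Let us see what happens if we change the basis in the controlling qubit. The resulting operator is $H[1]\ \Lambda(\sigma^x)[1,2]\ H[1]$; we calculate its matrix elements:
\begin{align*}
\bigl\langle a, b \bigr| H[1]\ \Lambda(\sigma^x)[1,2]\ H[1] \bigl| c, d \bigr\rangle
&= \sum_{x, y} \Bigl( \frac{1}{\sqrt{2}} (-1)^{ax} \Bigr) \bigl\langle x, b \bigr| \Lambda(\sigma^x)[1,2] \bigl| y, d \bigr\rangle \Bigl( \frac{1}{\sqrt{2}} (-1)^{cy} \Bigr) \\
&= \frac{1}{2} \sum_{x, y} (-1)^{ax + cy}\, \delta_{x,y}\, \delta_{b,\, y \oplus d} = \frac{1}{2} (-1)^{(a+c)(b+d)}.
\end{align*}
Thus
\[ H[1]\ \Lambda(\sigma^x)[1,2]\ H[1] = \frac{1}{2} \begin{pmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & -1 & 1 \\ 1 & -1 & 1 & 1 \\ -1 & 1 & 1 & 1 \end{pmatrix} = -\sigma^x[1]\, \sigma^x[2]\, V, \]
where $V = I - 2|\xi\rangle\langle\xi|$ and $|\xi\rangle = \frac{1}{2} \sum_{x, y} |x, y\rangle$. (Recall that a multiqubit version of the operator V was defined on page 85.)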
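The matrix of $H[1]\,\Lambda(\sigma^x)[1,2]\,H[1]$ can be recomputed directly. The sketch below builds it from 4×4 matrices in the basis |00>, |01>, |10>, |11> and compares it both with the closed formula $(1/2)(-1)^{(a+c)(b+d)}$ and with the product $\sigma^x[1]\sigma^x[2]V$ taken with a global factor −1; all names are hypothetical.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]

# Controlled sigma^x with qubit 1 as control: |a, b> -> |a, b XOR a>.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

H1 = kron(H, I2)                                   # Hadamard on the controlling qubit
conjugated = matmul(H1, matmul(CNOT, H1))

# Closed formula: entry <a,b| ... |c,d> = (1/2) * (-1)^((a+c)(b+d)).
formula = [[0.5 * (-1) ** ((a + c) * (b + d))
            for c in (0, 1) for d in (0, 1)]
           for a in (0, 1) for b in (0, 1)]

# V = I - 2|xi><xi| with |xi> the uniform superposition, so V = I - J/2.
V = [[(1 if i == j else 0) - 0.5 for j in range(4)] for i in range(4)]
flipped = matmul(kron(SX, SX), V)
minus_flipped = [[-entry for entry in row] for row in flipped]
```

The global sign has no physical effect, but it has to be tracked for the matrices to agree entry by entry.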

9.4. Part a) follows from part b).

b) Let us describe a circuit that gives an approximate solution. First of all, we write the recursive formula
\[ |\eta_{n,q}\rangle = \cos\vartheta\,|0\rangle\otimes|\eta_{n-1,q'}\rangle + \sin\vartheta\,|1\rangle\otimes|\eta_{n-1,q''}\rangle, \tag{S9.1} \]
where
\[ q' = 2^{n-1},\quad q'' = q - 2^{n-1},\quad \vartheta = \arccos\sqrt{q'/q} \qquad \text{if } q > 2^{n-1}; \]
\[ q' = q,\quad q'' = 1,\quad \vartheta = 0 \qquad \text{if } q \le 2^{n-1}. \]

We describe a computation procedure based on formula (S9.1). It consists of the following steps.

1. Compute $q'$, $q''$ and $\vartheta/\pi$, with the latter number represented as an approximation by $l$ binary digits. Store the results of the computation in supplementary qubits.

2. Apply the operator
\[ R(\vartheta) = \begin{pmatrix} \cos\vartheta & -\sin\vartheta \\ \sin\vartheta & \cos\vartheta \end{pmatrix} \]
to the first qubit of the register in which we are composing $|\eta_{n,q}\rangle$. (It initially contains $|0^n\rangle$.)

3. In the remaining $n-1$ bits, produce a state depending on the value of the first bit: if it equals 0, then create the state $|\eta_{n-1,q'}\rangle$ (by calling the procedure recursively); otherwise create $|\eta_{n-1,q''}\rangle$.

4. Reverse step 1 to clear the supplementary memory.

The operator $R(\vartheta)$ is realized approximately. Let $\vartheta/\pi = \sum_{k=1}^{l} a_k 2^{-k}$ be the $l$-digit approximation computed in step 1. Then $R(\vartheta) \approx R(\pi/2^l)^{a_l} \cdots R(\pi/2)^{a_1}$ with precision $O(2^{-l})$. Thus the action of the operator $R(\vartheta)$, controlled by $\vartheta$, is represented as the product of operators $\Lambda(R(\pi/2^k))$, where the $k$-th bit of the number $\vartheta/\pi$ controls the application of the operator $R(\pi/2^k)$.

The overall precision of the constructed circuit is $\delta = O(n2^{-l})$; its size, expressed in terms of the length of the input and the precision, is $\mathrm{poly}(n + \log(1/\delta))$.
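The recursion (S9.1) can be checked classically by building the amplitude vector it describes. The sketch below assumes that $|\eta_{n,q}\rangle$ denotes the uniform superposition of $|0\rangle,\dots,|q-1\rangle$ on $n$ qubits with the first qubit as the most significant bit; the function name `eta` is hypothetical, and exact angles are used instead of $l$-digit approximations.

```python
import math

def eta(n, q):
    """Amplitudes of |eta_{n,q}> on n qubits, built by the recursion (S9.1).

    Assumes 1 <= q <= 2**n; the first qubit is the most significant bit.
    """
    if n == 0:
        return [1.0]                      # zero qubits: the only state (q = 1)
    half = 1 << (n - 1)
    if q > half:
        q1, q2 = half, q - half
        theta = math.acos(math.sqrt(q1 / q))
    else:
        q1, q2, theta = q, 1, 0.0         # the |1> branch gets zero amplitude
    low = [math.cos(theta) * amp for amp in eta(n - 1, q1)]
    high = [math.sin(theta) * amp for amp in eta(n - 1, q2)]
    return low + high

amps = eta(3, 5)
print([round(a, 4) for a in amps])
```

For `eta(3, 5)` the first five amplitudes equal $1/\sqrt5$ and the remaining three vanish, as the recursion requires.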

c) We describe the realization of the Fourier transform operator found by D. Coppersmith [19] and, independently, by D. Deutsch.

We enumerate the qubits in descending order from $n-1$ to $0$. Thus a number $x$, $0 \le x < 2^n$, is represented as $x_{n-1}\cdots x_0 = \sum_{k=0}^{n-1} 2^k x_k$, so that the exponent in the definition of the operator $F_q$ ($q = 2^n$) can be written as follows:
\[ \frac{xy}{2^n} = \sum_{k=0}^{n-1}\sum_{j=0}^{n-1} 2^{k+j-n} x_k y_j \equiv \sum_{k+j<n} 2^{k+j-n} x_k y_j \pmod 1. \]
It is convenient to reverse the bit order in $x$, i.e., to replace $k$ by $n-1-k$. This way the Fourier transform operator is represented in the form
\[ F_{2^n} = V_n R_n, \]
where $R_n : |x_{n-1},\dots,x_0\rangle \mapsto |x_0,\dots,x_{n-1}\rangle$, and $V_n$ has the following matrix elements:
\[ \langle y_{n-1},\dots,y_0|\,V_n\,|x_{n-1},\dots,x_0\rangle = \frac{1}{2^{n/2}} \exp\Bigl( 2\pi i \sum_{0\le j\le k<n} 2^{-(k-j+1)} x_k y_j \Bigr). \]

[Fig. S9.1. Realization of the Fourier transform operator $F_{2^n}$ (for $n = 4$): Hadamard gates on the qubits $x_0, x_1, x_2, x_3$, interleaved with controlled phase shifts $\omega_4$, $\omega_8$, $\omega_{16}$, where $\omega_n = e^{2\pi i/n}$.]

Let us analyze the above equation. If we only keep terms with $j = k$ in the sum, we will get the matrix elements of the operator $H^{\otimes n}$. It is seen by inspection (and can also be proved by induction) that the remaining terms are reproduced by this formula:
\[ V_n = H[n-1]\,P_{n-1}\,H[n-2]\,P_{n-2}\cdots H[1]\,P_1\,H[0], \qquad \langle y|P_k|x\rangle = \exp\Bigl( 2\pi i \sum_{j=0}^{k-1} 2^{-(k-j+1)} x_k y_j \Bigr)\,\delta_{y,x}. \tag{S9.2} \]
Indeed, computing the matrix element $\langle y|V_n|x\rangle$ from formula (S9.2) amounts to the summation over paths from $x$ to $y$. Looking at any path with nonzero contribution, we see that $x_k$ passes through $H[0],\dots,H[k-1]$ unchanged, whereas $y_0,\dots,y_{k-1}$ pass through $H[n-1],\dots,H[k]$ unchanged.

It remains to realize the operators $P_k$:
\[ P_k = S_{k,k-1}\,S_{k,k-2}\cdots S_{k,0}, \quad \text{where } S_{k,j} = \Lambda^2\bigl(e^{2\pi i/2^{k-j+1}}\bigr)[k,j] \quad (k > j). \]
The resulting circuit for the operator $F_{2^n}$ is shown in Figure S9.1.
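The factorization $F_{2^n} = V_n R_n$ with $V_n$ given by (S9.2) can be verified by a direct state-vector simulation. The sketch below is an illustration under stated assumptions: basis vectors are indexed by $x = \sum_k 2^k x_k$, the diagonal operator $P_k$ is applied with phase $\exp\bigl(2\pi i \sum_{j<k} 2^{-(k-j+1)} x_k x_j\bigr)$, and all function names are hypothetical.

```python
import cmath
import math

def apply_h(state, k):
    """Apply the Hadamard gate H[k] to qubit k (bit k of the index)."""
    out = state[:]
    bit = 1 << k
    s = 1 / math.sqrt(2)
    for x in range(len(state)):
        if not x & bit:
            a, b = state[x], state[x | bit]
            out[x], out[x | bit] = s * (a + b), s * (a - b)
    return out

def apply_p(state, k):
    """Apply P_k = S_{k,k-1} ... S_{k,0}: controlled phase shifts between k and j < k."""
    out = []
    for x, amp in enumerate(state):
        phase = 0.0
        if (x >> k) & 1:
            phase = sum(2.0 ** -(k - j + 1) for j in range(k) if (x >> j) & 1)
        out.append(amp * cmath.exp(2j * math.pi * phase))
    return out

def reverse_bits(x, n):
    return sum(((x >> k) & 1) << (n - 1 - k) for k in range(n))

def qft_circuit(state, n):
    """F = V_n R_n: bit reversal first, then H[0], P_1, H[1], ..., P_{n-1}, H[n-1]."""
    out = [0j] * len(state)
    for x, amp in enumerate(state):
        out[reverse_bits(x, n)] = amp
    out = apply_h(out, 0)
    for k in range(1, n):
        out = apply_h(apply_p(out, k), k)
    return out

def qft_direct(state, n):
    """Matrix definition: <y|F|x> = 2^{-n/2} exp(2 pi i x y / 2^n)."""
    q = 1 << n
    return [sum(state[x] * cmath.exp(2j * math.pi * x * y / q) for x in range(q))
            / math.sqrt(q) for y in range(q)]

def max_error(n):
    """Largest deviation between circuit and definition over all basis inputs."""
    q = 1 << n
    err = 0.0
    for x in range(q):
        basis = [1 + 0j if i == x else 0j for i in range(q)]
        c, d = qft_circuit(basis, n), qft_direct(basis, n)
        err = max(err, max(abs(u - v) for u, v in zip(c, d)))
    return err

print(max_error(4) < 1e-9)
```

The gate count is visible in `qft_circuit`: $n$ Hadamard gates and $n(n-1)/2$ controlled phase shifts, matching the triangular layout of the figure.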

9.5. $\mathrm{BPP} \subseteq \mathrm{BQP}$. A classical probabilistic computation can be represented by an invertible circuit $U_L \cdots U_1$, which, together with the input word $x$, uses a random sequence $r \in \mathbb{B}^s$ of 0s and 1s. (In addition to the result, the circuit can also produce some garbage; this does not matter.) For the quantum simulation of this circuit, we regard $U_j$ as unitary operators permuting the basis vectors and, instead of the random word $r$, we prepare the state
\[ |\psi\rangle = H^{\otimes s}|0^s\rangle = 2^{-s/2}\sum_{r\in\mathbb{B}^s} |r\rangle. \]


$\mathrm{BQP} \subseteq \mathrm{PP}$. Let a circuit $U_L \cdots U_1$ evaluate the predicate $Q(x)$ for $|x| = n$ with error probability $\le 1/3$, the total number of bits in the circuit being equal to $N$. The probability $p(x)$ of obtaining the result 1 can be expressed in terms of the projection $\Pi^{(1)} = |1\rangle\langle 1|$ applied to the first qubit:
\[
\begin{aligned}
p(x) &= \langle x,0^{N-n}|\,U_1^\dagger U_2^\dagger \cdots U_L^\dagger\,\Pi^{(1)}[1]\,U_L U_{L-1}\cdots U_1\,|x,0^{N-n}\rangle \\
&= 2^{-h}\,\langle x,0^{N-n}|\,V_L V_{L-1}\cdots V_{-L+1}V_{-L}\,|x,0^{N-n}\rangle.
\end{aligned}
\tag{S9.3}
\]
Here $V_L,\dots,V_{-L}$ are the operators $U_1^\dagger,\dots,\Pi^{(1)}[1],\dots,U_1$, which are renumbered and also renormalized as follows: if $U_k = H[m]$ (or if $U_k^\dagger = H[m]$), then the corresponding operator $V_j$ equals $\sqrt2\,H[m]$. The number of $H$ gates in the circuit is denoted by $h$.

The matrix elements of the operators $V_j \in \bigl\{\sqrt2 H,\ K,\ K^\dagger,\ \Lambda(\sigma^x),\ \Lambda^2(\sigma^x),\ \Pi^{(1)}\bigr\}$ belong to the set
\[ M = \{0, +1, -1, +i, -i\}. \]
Multiplying the matrices, we obtain a sum of numbers from the set $M$. Since the quantity $p(x)$ we are interested in is real, we can limit ourselves to summing $\pm1$. The multiplicities of the summands are expressed in the form
\[ \#_a(x) = \bigl|\{w : C_a(x,w) = 1\}\bigr|, \qquad a \in \{1,-1\}, \]
where the predicates $C_a(x,w)$ will be defined below. Thus we obtain the representation
\[ p(x) = 2^{-h}\bigl(\#_1(x) - \#_{-1}(x)\bigr). \]

The remaining part of the proof does not involve any quantum mechanics at all. First, we need an explicit description of the predicates $C_a(x,w)$. To this end, we express the matrix elements of the product $V_L \cdots V_{-L}$ in (S9.3) as a sum over all paths from the initial to the final state:
\[ (V_L\cdots V_{-L})_{ab} = \sum_{x_{L-1},\dots,x_{-L+1}} (V_L)_{a x_{L-1}} \cdots (V_{-L})_{x_{-L+1} b}. \]
By definition, $C_a(u_{-L},\dots,u_L)$ equals 1 if and only if
\[ u_{-L} = u_L = (x,0^{N-n}) \quad\text{and}\quad \prod_{j=-L+1}^{L} (V_j)_{u_j u_{j-1}} = a. \]
It is easy to see that $C_a \in \mathrm{P}$: we only need to represent the matrix elements as powers of $i$ and sum the exponents modulo 4.

If $Q(x) = 0$, then $p(x) \le 1/3$; if $Q(x) = 1$, then $p(x) \ge 2/3$. Thus $Q(x) = 1$ if and only if
\[ p(x) = 2^{-h}\bigl(\#_1(x) - \#_{-1}(x)\bigr) > \frac12. \]
This is equivalent to the condition
\[ \#_{-1}(x) + 2^{h-1} < \#_1(x). \]
It remains to verify that both sides of the last inequality can be represented in the form
\[ f(x) = \bigl|\{y : F(x,y) = 1\}\bigr|, \qquad F \in \mathrm{P}. \]
(Functions $f$ of this form constitute the so-called class $\#\mathrm{P}$.) We have already proved that $\#_a(x)$ has the specified form, so it will suffice to show that the class $\#\mathrm{P}$ is closed under addition. Let $g(x) = \bigl|\{y : G(x,y) = 1\}\bigr|$, $G \in \mathrm{P}$. Then $f(x) + g(x) = \bigl|\{(y,z) : T(x,y,z) = 1\}\bigr|$, where $T(x,y,0) = F(x,y)$ and $T(x,y,1) = G(x,y)$. The proof is completed.

$\mathrm{PP} \subseteq \mathrm{PSPACE}$. This is obvious. We introduce two counters: one for $R_0$, the other for $R_1$. We scan through all possible values of $y$ and add 1 to the $k$-th counter ($k = 0, 1$) if $R_k(x,y) = 1$. Then we compare the values of the counters.
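The counter scan just described can be written out as a short loop. This is only a sketch: the predicates `r0` and `r1`, the witness length `m`, and the direction of the final comparison (accept when the $R_0$ count is smaller than the $R_1$ count) are assumptions made for the illustration. The point is that only the current $y$ and two counters are kept in memory, although the running time is exponential in `m`.

```python
from itertools import product

def pp_decide(x, r0, r1, m):
    """Compare #{y : r0(x, y)} with #{y : r1(x, y)}, y ranging over all m-bit words."""
    counters = [0, 0]
    for y in product((0, 1), repeat=m):   # one candidate y at a time
        if r0(x, y):
            counters[0] += 1
        if r1(x, y):
            counters[1] += 1
    return counters[0] < counters[1]

# Toy predicates (hypothetical): compare the number of ones in y with x.
few_ones = lambda x, y: sum(y) < x
many_ones = lambda x, y: sum(y) >= x

print(pp_decide(1, few_ones, many_ones, 3), pp_decide(3, few_ones, many_ones, 3))
```

With these toy predicates and $m = 3$, the input $x = 1$ gives counts 1 and 7, while $x = 3$ gives counts 7 and 1.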


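10.1. Let $\rho = \sum_k p_k\,|\xi_k\rangle\langle\xi_k|$. We verify conditions 1)–3) for $\rho$.

1): This is obvious.

2): $\langle\eta|\rho|\eta\rangle = \sum_k p_k\,\langle\eta|\xi_k\rangle\langle\xi_k|\eta\rangle = \sum_k p_k\,\bigl|\langle\eta|\xi_k\rangle\bigr|^2 \ge 0$.

3): $\operatorname{Tr}\rho = \sum_k p_k\,\langle\xi_k|\xi_k\rangle = \sum_k p_k = 1$.

Conversely, if $\rho$ satisfies 1)–3), then $\rho = \sum_k \lambda_k\,|\xi_k\rangle\langle\xi_k|$, where $\lambda_k$ are the eigenvalues and $\{|\xi_k\rangle\}$ is an orthonormal basis of eigenvectors of $\rho$.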

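10.2. We note that $\mathcal{N}\otimes\mathcal{F} \cong \mathbf{L}(\mathcal{F}^*,\mathcal{N})$, so that the vector $|\psi\rangle \in \mathcal{N}\otimes\mathcal{F}$ can be translated to a linear map $X : \mathcal{F}^* \to \mathcal{N}$. The Schmidt decomposition for $|\psi\rangle$ is basically the singular value decomposition for $X$ (see formula (6.2)); we just need to change the designation of the bra-vectors $\langle\nu_j| \in (\mathcal{F}^*)^* \cong \mathcal{F}$ to $|\eta_j\rangle$. The condition $\lambda_j \le 1$ follows from the fact that $\lambda_j^2$ are the nonzero eigenvalues of the operator $XX^\dagger = \operatorname{Tr}_{\mathcal{F}}\bigl(|\psi\rangle\langle\psi|\bigr)$, which implies that $\sum_j \lambda_j^2 = 1$.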

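10.3. As follows from the solution of the previous problem, the condition $\operatorname{Tr}_{\mathcal{F}}\bigl(|\psi_1\rangle\langle\psi_1|\bigr) = \operatorname{Tr}_{\mathcal{F}}\bigl(|\psi_2\rangle\langle\psi_2|\bigr)$ allows us to choose Schmidt decompositions for $|\psi_1\rangle$ and $|\psi_2\rangle$ with identical $\lambda_j$ and $|\xi_j\rangle$. We write down these decompositions:
\[ |\psi_k\rangle = \sum_j \lambda_j\,|\xi_j\rangle\otimes|\eta_j^{(k)}\rangle, \qquad k = 1, 2. \]
Since $\bigl\{|\eta_j^{(1)}\rangle\bigr\}$ and $\bigl\{|\eta_j^{(2)}\rangle\bigr\}$ are orthonormal families, there exists a unitary operator $U$ such that $U|\eta_j^{(1)}\rangle = |\eta_j^{(2)}\rangle$ for all $j$. Then
\[ (I_{\mathcal{N}}\otimes U)|\psi_1\rangle = \sum_j \lambda_j\,|\xi_j\rangle\otimes U|\eta_j^{(1)}\rangle = \sum_j \lambda_j\,|\xi_j\rangle\otimes|\eta_j^{(2)}\rangle = |\psi_2\rangle. \]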

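10.4. First, we prove (10.6). Let $A = \sum_j \lambda_j\,|\psi_j\rangle\langle\nu_j|$ be a singular value decomposition of $A$ (see Problem 6.4). Recall that $\lambda_j > 0$ and $\langle\psi_j|\psi_k\rangle = \langle\nu_j|\nu_k\rangle = \delta_{jk}$. The numbers $\lambda_j$ are precisely the nonzero eigenvalues of $\sqrt{A^\dagger A}$, so that $\|A\|_{\mathrm{tr}} = \sum_j \lambda_j$. Therefore
\[ |\operatorname{Tr} AB| = \Bigl|\sum_j \lambda_j\,\langle\nu_j|B|\psi_j\rangle\Bigr| \le \sum_j \lambda_j\,\bigl|\langle\nu_j|B|\psi_j\rangle\bigr| \le \sum_j \lambda_j\,\|B\| = \|A\|_{\mathrm{tr}}\,\|B\| \]
for any $B$. On the other hand, if $U$ is a unitary operator that takes $|\psi_j\rangle$ to $|\nu_j\rangle$, then $\operatorname{Tr} AU = \|A\|_{\mathrm{tr}}$.

To prove (10.7), suppose that $\sum_k |\xi_k\rangle\langle\eta_k| = A$, and $U$ is the operator defined above. Then
\[ \|A\|_{\mathrm{tr}} = |\operatorname{Tr} AU| = \Bigl|\sum_k \langle\eta_k|U|\xi_k\rangle\Bigr| \le \sum_k \bigl|\langle\eta_k|U|\xi_k\rangle\bigr| \le \sum_k \bigl\||\xi_k\rangle\bigr\|\,\bigl\||\eta_k\rangle\bigr\|. \]
But if we set $|\xi_k\rangle = \lambda_k|\psi_k\rangle$ and $|\eta_k\rangle = |\nu_k\rangle$, then $\sum_k \bigl\||\xi_k\rangle\bigr\|\,\bigl\||\eta_k\rangle\bigr\| = \|A\|_{\mathrm{tr}}$.

Finally, we prove that $\|\cdot\|_{\mathrm{tr}}$ is a norm. The only nontrivial property is the triangle inequality. It can be derived as follows:
\[
\begin{aligned}
\|A_1 + A_2\|_{\mathrm{tr}} &= \max_{U\in\mathbf{U}(\mathcal{N})} \bigl|\operatorname{Tr}(A_1 + A_2)U\bigr| \\
&\le \max_{U\in\mathbf{U}(\mathcal{N})} |\operatorname{Tr} A_1U| + \max_{U\in\mathbf{U}(\mathcal{N})} |\operatorname{Tr} A_2U| = \|A_1\|_{\mathrm{tr}} + \|A_2\|_{\mathrm{tr}}.
\end{aligned}
\]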

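10.5. Property a):
\[ \|AB\|_{\mathrm{tr}} = \max_{U\in\mathbf{U}(\mathcal{N})} |\operatorname{Tr} ABU| \le \max_{U\in\mathbf{U}(\mathcal{N})} \|A\|_{\mathrm{tr}}\,\|BU\| \le \|A\|_{\mathrm{tr}}\,\|B\|. \]
Property b) is a special case of c), so we prove c). Let $A \in \mathbf{L}(\mathcal{N}\otimes\mathcal{M})$; then
\[ \|\operatorname{Tr}_{\mathcal{M}} A\|_{\mathrm{tr}} = \max_{U\in\mathbf{U}(\mathcal{N})} \bigl|\operatorname{Tr}\bigl((\operatorname{Tr}_{\mathcal{M}} A)\,U\bigr)\bigr| = \max_{U\in\mathbf{U}(\mathcal{N})} \bigl|\operatorname{Tr}\bigl(A\,(U\otimes I_{\mathcal{M}})\bigr)\bigr| \le \|A\|_{\mathrm{tr}}. \]
Property d):
\[ \|A\otimes B\|_{\mathrm{tr}} = \operatorname{Tr}\sqrt{(A\otimes B)^\dagger(A\otimes B)} = \operatorname{Tr}\Bigl(\sqrt{A^\dagger A}\otimes\sqrt{B^\dagger B}\Bigr) = \|A\|_{\mathrm{tr}}\,\|B\|_{\mathrm{tr}}. \]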

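10.6. a) Let $|\xi\rangle$ and $|\eta\rangle$ be two unit vectors, $a = \langle\xi|\eta\rangle$. We can represent $|\eta\rangle$ as $a|\xi\rangle + \sqrt{1-|a|^2}\,|\psi\rangle$, where $|\psi\rangle$ is another unit vector orthogonal to $|\xi\rangle$. Hence
\[ \bigl\||\xi\rangle - |\eta\rangle\bigr\|^2 = |1-a|^2 + (1-|a|^2) = 2(1 - \operatorname{Re} a). \]
In the definition of the fidelity distance, one can multiply $|\xi\rangle$ by an arbitrary phase factor without leaving the minimization domain. This corresponds to multiplying $a$ by the same factor. Therefore the minimum is achieved when $a$ is real, nonnegative and the largest possible, i.e., $a = \sqrt{F(\rho,\gamma)}$.

b) Let $\mathcal{F} = \mathcal{N}^*$, so that the vectors $|\xi\rangle, |\eta\rangle \in \mathcal{N}\otimes\mathcal{N}^*$ can be associated with operators $X, Y \in \mathbf{L}(\mathcal{N})$ (due to the isomorphism $\mathcal{N}\otimes\mathcal{N}^* \cong \mathbf{L}(\mathcal{N})$). The condition $\operatorname{Tr}_{\mathcal{F}}(|\xi\rangle\langle\xi|) = \rho$ becomes $XX^\dagger = \rho$. One solution to this equation is $X = \sqrt\rho$; the most general solution is $X = \sqrt\rho\,U$, where $U$ is an arbitrary unitary operator (cf. Problem 10.3). Similarly, $Y = \sqrt\gamma\,V$, where $V$ is unitary. Thus
\[ \langle\xi|\eta\rangle = \operatorname{Tr}(X^\dagger Y) = \operatorname{Tr}\bigl(U^\dagger\sqrt\rho\sqrt\gamma\,V\bigr) = \operatorname{Tr}\bigl(\sqrt\rho\sqrt\gamma\,W\bigr), \quad\text{where } W = VU^\dagger; \]
\[ F(\rho,\gamma) = \max_{W\in\mathbf{U}(\mathcal{N})} \bigl|\operatorname{Tr}\bigl(\sqrt\rho\sqrt\gamma\,W\bigr)\bigr|^2 = \bigl\|\sqrt\rho\sqrt\gamma\bigr\|_{\mathrm{tr}}^2. \]

c) Let $|\xi\rangle$ and $|\eta\rangle$ provide the maximum in (10.8). Then
\[ \|\rho-\gamma\|_{\mathrm{tr}} = \bigl\|\operatorname{Tr}_{\mathcal{F}}\bigl(|\xi\rangle\langle\xi| - |\eta\rangle\langle\eta|\bigr)\bigr\|_{\mathrm{tr}} \le \bigl\||\xi\rangle\langle\xi| - |\eta\rangle\langle\eta|\bigr\|_{\mathrm{tr}} = 2\sqrt{1 - \bigl|\langle\xi|\eta\rangle\bigr|^2} = 2\sqrt{1 - F(\rho,\gamma)}. \]
Thus $F(\rho,\gamma) \le 1 - \frac14\|\rho-\gamma\|_{\mathrm{tr}}^2$, which is the required upper bound for the fidelity.

To obtain the lower bound, we will need the following lemma.

Lemma. Let $X$ and $Y$ be nonnegative Hermitian operators. Then
\[ \operatorname{Tr}(X-Y)^2 \le \|X^2 - Y^2\|_{\mathrm{tr}}. \]

Proof. Let $|\psi_j\rangle$, $\lambda_j$ be orthonormal eigenvectors and the corresponding eigenvalues of the operator $X - Y$. We have the following bound:
\[ \|X^2 - Y^2\|_{\mathrm{tr}} \ge \sum_j \bigl|\langle\psi_j|(X^2-Y^2)|\psi_j\rangle\bigr|. \]
(Indeed, the right-hand side can be represented as $\operatorname{Tr}\bigl((X^2-Y^2)U\bigr)$, where $U = \sum_j \pm|\psi_j\rangle\langle\psi_j|$.) To proceed, we need to estimate each term in the sum,
\[
\begin{aligned}
\langle\psi_j|(X^2-Y^2)|\psi_j\rangle &= \tfrac12\langle\psi_j|(X-Y)(X+Y)|\psi_j\rangle + \tfrac12\langle\psi_j|(X+Y)(X-Y)|\psi_j\rangle \\
&= \lambda_j\,\langle\psi_j|(X+Y)|\psi_j\rangle, \\
\langle\psi_j|(X+Y)|\psi_j\rangle &\ge |\lambda_j|.
\end{aligned}
\]
Thus
\[ \sum_j \bigl|\langle\psi_j|(X^2-Y^2)|\psi_j\rangle\bigr| \ge \sum_j \lambda_j^2 = \operatorname{Tr}(X-Y)^2. \qquad\square \]

Now we use the lemma:
\[ \sqrt{F(\rho,\gamma)} = \bigl\|\sqrt\rho\sqrt\gamma\bigr\|_{\mathrm{tr}} \ge \operatorname{Tr}\bigl(\sqrt\rho\sqrt\gamma\bigr) = 1 - \frac12\operatorname{Tr}\bigl(\sqrt\rho-\sqrt\gamma\bigr)^2 \ge 1 - \frac{\|\rho-\gamma\|_{\mathrm{tr}}}{2}. \]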
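The two bounds of Problem 10.6c can be sanity-checked on commuting density matrices, where everything reduces to classical probability vectors. The sketch below makes that simplifying assumption explicitly: for diagonal $\rho = \mathrm{diag}(p)$ and $\gamma = \mathrm{diag}(q)$ one has $\sqrt{F(\rho,\gamma)} = \sum_i \sqrt{p_i q_i}$ and $\|\rho-\gamma\|_{\mathrm{tr}} = \sum_i |p_i - q_i|$. The function names are hypothetical, and this does not test the non-commuting case.

```python
import math

def fidelity(p, q):
    """F(rho, gamma) for diagonal (commuting) states with spectra p and q."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q)) ** 2

def trace_norm_diff(p, q):
    """||rho - gamma||_tr for diagonal states."""
    return sum(abs(a - b) for a, b in zip(p, q))

def bounds_hold(p, q, tol=1e-12):
    f, d = fidelity(p, q), trace_norm_diff(p, q)
    upper = f <= 1 - d * d / 4 + tol           # F <= 1 - (1/4) ||rho - gamma||^2
    lower = math.sqrt(f) >= 1 - d / 2 - tol    # sqrt(F) >= 1 - ||rho - gamma|| / 2
    return upper and lower

pairs = [([0.5, 0.5], [0.9, 0.1]),
         ([1.0, 0.0], [0.0, 1.0]),
         ([0.2, 0.3, 0.5], [0.5, 0.3, 0.2]),
         ([0.25, 0.75], [0.25, 0.75])]
print(all(bounds_hold(p, q) for p, q in pairs))
```

For orthogonal states (second pair) both bounds are tight: $F = 0$ and $\|\rho-\gamma\|_{\mathrm{tr}} = 2$.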


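11.1. We will solve this problem together with Problem 11.2 if we prove the following three things:

a) Any superoperator of type 2 or 3 (as described in the main text) has an operator sum decomposition (11.2).

b) The set of superoperators of the form (11.2) is closed under multiplication.

c) Any such superoperator can be represented as $\operatorname{Tr}_{\mathcal{F}}(V\cdot V^\dagger)$.

We proceed with the proofs.

a) Superoperators of type 3 already have the required form. For the partial trace $\operatorname{Tr}_{\mathcal{F}}$ we have the following representation:
\[ \operatorname{Tr}_{\mathcal{F}}\rho = \sum_m W_m\,\rho\,W_m^\dagger, \qquad W_m = I_{\mathcal{N}}\otimes\langle m| : \mathcal{N}\otimes\mathcal{F} \to \mathcal{N}, \]
where $\{|m\rangle\}$ is an orthonormal basis of $\mathcal{F}$. We note that $W_m\bigl(|j,k\rangle\bigr) = \delta_{mk}|j\rangle$ and $W_m^\dagger\bigl(|j\rangle\bigr) = |j,m\rangle$.

b) Let $T = \sum_m A_m\cdot A_m^\dagger$ and $R = \sum_k B_k\cdot B_k^\dagger$. Then
\[ RT = \sum_k B_k\Bigl(\sum_m A_m\cdot A_m^\dagger\Bigr)B_k^\dagger = \sum_{k,m} (B_kA_m)\cdot(B_kA_m)^\dagger, \]
\[ \sum_{k,m} (B_kA_m)^\dagger(B_kA_m) = \sum_m A_m^\dagger\Bigl(\sum_k B_k^\dagger B_k\Bigr)A_m = \sum_m A_m^\dagger A_m = I. \]

c) Let a superoperator $T$ be decomposed into the sum (11.2) of $s$ terms, and let $\mathcal{F}$ be an $s$-dimensional space with the basis vectors denoted by $|m\rangle$. The linear map
\[ V = \sum_m A_m\otimes|m\rangle : |\xi\rangle \mapsto \sum_m A_m|\xi\rangle\otimes|m\rangle \]
is an isometric embedding since
\[ V^\dagger V = \sum_{k,m} \bigl(A_k^\dagger\otimes\langle k|\bigr)\bigl(A_m\otimes|m\rangle\bigr) = \sum_{k,m} A_k^\dagger A_m\,\langle k|m\rangle = I. \]
Moreover, $T = \operatorname{Tr}_{\mathcal{F}}(V\cdot V^\dagger)$. Indeed,
\[ \operatorname{Tr}_{\mathcal{F}}(V\rho V^\dagger) = \sum_{m,k} \operatorname{Tr}_{\mathcal{F}}\bigl(A_m\rho A_k^\dagger\otimes|m\rangle\langle k|\bigr) = \sum_m A_m\rho A_m^\dagger. \]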

11.2. See the solution to the previous problem.

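11.3.
\[
\begin{aligned}
\operatorname{Tr}_{\mathcal{F}}\bigl((U\otimes Y)\rho(U\otimes Y)^\dagger\bigr)
&= \operatorname{Tr}_{\mathcal{F}}\Bigl((U\otimes Y)\sum_{j,l,k,m}\rho_{jlkm}\,|j,l\rangle\langle k,m|\,(U\otimes Y)^\dagger\Bigr) \\
&= \sum_{j,l,k,m}\rho_{jlkm}\,\operatorname{Tr}_{\mathcal{F}}\Bigl(\bigl(U|j\rangle\langle k|U^\dagger\bigr)\otimes\bigl(Y|l\rangle\langle m|Y^\dagger\bigr)\Bigr) \\
&= \sum_{j,l,k,m}\rho_{jlkm}\,\bigl(U|j\rangle\langle k|U^\dagger\bigr)\,\underbrace{\operatorname{Tr}\bigl(Y|l\rangle\langle m|Y^\dagger\bigr)}_{\delta_{lm}}
= U(\operatorname{Tr}_{\mathcal{F}}\rho)U^\dagger.
\end{aligned}
\]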

11.4. The physical realizability of $T$ is equivalent to the existence of a decomposition $T = \sum_m A_m\cdot A_m^\dagger$ such that $\sum_m A_m^\dagger A_m = I$. In the coordinate form, these equations read as follows:
\[ T_{(j'j)(k'k)} = \langle j'|\,T\bigl(|j\rangle\langle k|\bigr)\,|k'\rangle = \sum_m \langle j'|A_m|j\rangle\,\langle k|A_m^\dagger|k'\rangle = \sum_m a_{m(j'j)}\,a^*_{m(k'k)}, \tag{S11.1} \]
\[ \sum_{m,l} a^*_{m(lk)}\,a_{m(lj)} = \delta_{kj}, \tag{S11.2} \]
where $a_{m(j'j)} = \langle j'|A_m|j\rangle$. If we replace each index pair in parentheses by a single index, (S11.1) becomes $T_{JK} = \sum_m a_{mJ}\,a^*_{mK}$. This is a general form of a nonnegative Hermitian matrix; therefore (S11.1) is equivalent to conditions b) and c) in question. Equation (S11.2) is equivalent to condition a), provided (S11.1) is the case.

11.5. Properties a) and b) are equivalent to properties a) and b) in the previous problem.

It follows from the operator sum decomposition that a physically realizable superoperator takes nonnegative operators to nonnegative operators. On the other hand, if $T$ is physically realizable, it has the form $T : \rho \mapsto \operatorname{Tr}_{\mathcal{F}}(V\rho V^\dagger)$, hence the superoperator
\[ T\otimes I_{\mathbf{L}(\mathcal{G})} : \rho \mapsto \operatorname{Tr}_{\mathcal{F}}\bigl((V\otimes I_{\mathcal{G}})\,\rho\,(V\otimes I_{\mathcal{G}})^\dagger\bigr) \]
is also physically realizable. Therefore $T\otimes I_{\mathbf{L}(\mathcal{G})}$ takes nonnegative operators to nonnegative operators.

For the proof of the converse assertion, we will deduce property c) of the previous problem from property c) of the present problem. Specifically, we will show that the matrix $(T_{JK})$ (where $J = (j'j)$ and $K = (k'k)$) corresponds to an operator $Y$ that can be represented as $(T\otimes I_{\mathbf{L}(\mathcal{G})})X$ for some nonnegative Hermitian operator $X \in \mathbf{L}(\mathcal{N}\otimes\mathcal{G})$.

Let $\dim\mathcal{G} = \dim\mathcal{N}$, and let $|j\rangle$ denote the basis vectors in both spaces. We set
\[ X = \sum_{j,k} |j\rangle\langle k|\otimes|j\rangle\langle k| = |\psi\rangle\langle\psi|, \quad\text{where } |\psi\rangle = \sum_j |j\rangle\otimes|j\rangle. \]
Then
\[ Y = (T\otimes I_{\mathbf{L}(\mathcal{G})})X = \sum_{j',j,k',k} T_{(j'j)(k'k)}\,|j'\rangle\langle k'|\otimes|j\rangle\langle k|, \]
hence $\langle j',j|\,Y\,|k',k\rangle = T_{(j'j)(k'k)}$.

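11.6. Let us represent $T$ in the form $T = \operatorname{Tr}_{\mathcal{F}'}(V\cdot V^\dagger)$, where $V : \mathcal{N} \to \mathcal{N}\otimes\mathcal{F}\otimes\mathcal{F}'$ is a unitary embedding (cf. Problem 11.1). By assumption, for any pure state $|\xi\rangle \in \mathcal{N}$ we have
\[ \operatorname{Tr}_{\mathcal{F}\otimes\mathcal{F}'}\bigl(V|\xi\rangle\langle\xi|V^\dagger\bigr) = \operatorname{Tr}_{\mathcal{F}}\Bigl(\operatorname{Tr}_{\mathcal{F}'}\bigl(V|\xi\rangle\langle\xi|V^\dagger\bigr)\Bigr) = \operatorname{Tr}_{\mathcal{F}}\bigl(T(|\xi\rangle\langle\xi|)\bigr) = |\xi\rangle\langle\xi|. \]
Therefore $|\psi\rangle = V|\xi\rangle$ is a product state: $V|\xi\rangle = |\xi\rangle\otimes|\eta\rangle$ (this follows from the observation made after the formulation of Problem 10.2). A priori, $|\eta\rangle$ depends on the vector $|\xi\rangle$, but the linearity of $V$ implies that $|\eta\rangle$ is actually constant. Therefore $TX = X\otimes\gamma$, where $\gamma = \operatorname{Tr}_{\mathcal{F}'}(|\eta\rangle\langle\eta|)$.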

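11.7. A superoperator $T$ of the type $\mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N})\times\{1,\dots,r\}$ has the form $T = \sum_m T_m\otimes|m\rangle\langle m|$, where $T_m : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N})$. If $T$ is physically realizable, then it satisfies conditions a)–c) of Problem 11.5, hence each superoperator $T_m$ satisfies conditions b) and c). If $T$ is consistent with (11.4), then $T_m(|\xi\rangle\langle\xi|) = \delta_{mj}\,|\xi\rangle\langle\xi|$ for any $|\xi\rangle \in \mathcal{L}_j$; this condition extends by linearity as follows:
\[ T_m X = \delta_{mj}\,X \quad\text{for any } X \in \mathbf{L}(\mathcal{L}_j). \tag{S11.3} \]
Under these assumptions, we will prove that $T_m = \Pi_{\mathcal{L}_m}\cdot\Pi_{\mathcal{L}_m}$.

Let $\bigl\{|\xi_{j,p}\rangle : p = 1,\dots,\dim\mathcal{L}_j\bigr\}$ be an orthonormal basis of the space $\mathcal{L}_j$. Formula (S11.3) determines the value of $T_m(|\xi_{j,p}\rangle\langle\xi_{k,s}|)$ in the case $j = k$. Therefore it is sufficient to prove that if $j \ne k$, then $T_m(|\xi_{j,p}\rangle\langle\xi_{k,s}|) = 0$.

Let us fix $m, j, p, k, s$ and denote the operator $T_m(|\xi_{j,p}\rangle\langle\xi_{k,s}|)$ by $A$. It suffices to prove that $\operatorname{Tr} AB = 0$ for any $B$ or, equivalently, for any $B$ of the form $|\eta\rangle\langle\eta|$. Let $a = \operatorname{Tr}(A|\eta\rangle\langle\eta|) = \langle\eta|\,T_m(|\xi_{j,p}\rangle\langle\xi_{k,s}|)\,|\eta\rangle$. Consider the function
\[ f(x,y) \stackrel{\mathrm{def}}{=} \langle\eta|\,T_m(|\psi\rangle\langle\psi|)\,|\eta\rangle, \quad\text{where } |\psi\rangle = x|\xi_{j,p}\rangle + y|\xi_{k,s}\rangle; \]
\[ f(x,y) = \delta_{mj}\,\bigl|\langle\eta|\xi_{j,p}\rangle\bigr|^2|x|^2 + a\,xy^* + a^*x^*y + \delta_{mk}\,\bigl|\langle\eta|\xi_{k,s}\rangle\bigr|^2|y|^2. \]
Since the operator $T_m(|\psi\rangle\langle\psi|)$ is nonnegative, $f(x,y) \ge 0$ for any $x$ and $y$. But the condition $j \ne k$ implies that $\delta_{mj} = 0$ or $\delta_{mk} = 0$. Therefore $a = 0$.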

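11.8. In formula (11.7), let $k$ run from 1 to $r$, whereas $\rho \in \mathbf{L}(\mathcal{N})$. We define the "larger space" to be $\mathcal{N}' = \mathcal{N}\otimes\mathcal{M}$, where $\mathcal{M} = \mathbb{C}\bigl(|1\rangle,\dots,|r\rangle\bigr)$. The isometric embedding $V : \mathcal{N} \to \mathcal{N}'$ (which takes $\rho$ to $\gamma = V\rho V^\dagger$) and the subsequent projective measurement $\gamma \mapsto \sum_j \operatorname{Tr}(\gamma\,\Pi_{\mathcal{L}_j})\cdot(j)$ are defined by the formulas
\[ V|\xi\rangle = \sum_k \sqrt{X_k}\,|\xi\rangle\otimes|k\rangle, \qquad \mathcal{L}_j = \mathcal{N}\otimes\mathbb{C}\bigl(|j\rangle\bigr). \]
It is clear that $\Pi_{\mathcal{L}_j} = I_{\mathcal{N}}\otimes|j\rangle\langle j|$, hence $\operatorname{Tr}(V\rho V^\dagger\,\Pi_{\mathcal{L}_j}) = \operatorname{Tr}(\rho X_j)$.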

11.9. We assume that the measurement is destructive (which is indicated in Figure S11.1 by the measured qubits being discarded to the trash bin). Thus the measurement is the following transformation of two quantum bits into two classical bits:
\[ T : \rho \mapsto \sum_{a,b} \langle\xi_{ab}|\rho|\xi_{ab}\rangle\cdot(a,b). \]
To realize the transformation $T$, Alice applies the unitary operator
\[ H[1]\,\Lambda(\sigma^x)[1,2] : |\xi_{ab}\rangle \mapsto |b,a\rangle, \]
interchanges the qubits and measures them in the basis $\{|0\rangle, |1\rangle\}$. Then she sends the measurement results to Bob.

[Fig. S11.1. The circuit for quantum teleportation, built from the gates $\Lambda(\sigma^x)$, $H$ and the classically controlled $\sigma^x$, $\sigma^z$. One symbol indicates the quantum state being teleported; two further symbols mark Alice's and Bob's halves of the auxiliary state $|\xi_{00}\rangle$. The dashed lines represent classical bits sent by Alice to Bob.]

Suppose the initial state of the first qubit was $\rho$. After the measurement, the overall state of the system becomes
\[ \gamma = \sum_{a,b} \bigl(a, b, \gamma_{ab}\bigr) = \sum_{a,b} p_{ab}\cdot\bigl(a, b, p_{ab}^{-1}\gamma_{ab}\bigr), \qquad p_{ab} = \operatorname{Tr}\gamma_{ab}, \]
\[ \gamma_{ab} = \bigl(\langle\xi_{ab}|\otimes I_{\mathcal{B}}\bigr)\bigl(\rho\otimes|\xi_{00}\rangle\langle\xi_{00}|\bigr)\bigl(|\xi_{ab}\rangle\otimes I_{\mathcal{B}}\bigr), \]
where $\gamma_{ab}$ is the $(a,b)$ component of $(T\otimes I_{\mathbf{L}(\mathcal{B})})\bigl(\rho\otimes|\xi_{00}\rangle\langle\xi_{00}|\bigr)$. Here $p_{ab}$ is the probability to get the measurement outcome $(a,b)$, whereas $p_{ab}^{-1}\gamma_{ab}$ is the corresponding conditional quantum state. Note that $\langle\xi_{ab}|$ and $|\xi_{ab}\rangle$ are regarded as operators of types $\mathcal{B}^{\otimes2} \to \mathbb{C}$ and $\mathbb{C} \to \mathcal{B}^{\otimes2}$ (resp.), so that $\langle\xi_{ab}|\otimes I_{\mathcal{B}} : \mathcal{B}^{\otimes3} \to \mathcal{B}$ and $|\xi_{ab}\rangle\otimes I_{\mathcal{B}} : \mathcal{B} \to \mathcal{B}^{\otimes3}$.

We now describe Bob's actions aimed at the recovery of the initial state $\rho$. Without loss of generality we may assume that $\rho = |\psi\rangle\langle\psi|$. (Indeed, the whole process of measurement and recovery can be described by a superoperator. If it preserves pure states, it also preserves mixed states, due to the linearity.) In this case the state after the measurement is also pure: $\gamma_{ab} = |\psi_{ab}\rangle\langle\psi_{ab}|$. Let $|\psi\rangle = z_0|0\rangle + z_1|1\rangle$; then
\[
\begin{aligned}
|\psi_{ab}\rangle &= \bigl(\langle\xi_{ab}|\otimes I_{\mathcal{B}}\bigr)\bigl(|\psi\rangle\otimes|\xi_{00}\rangle\bigr)
= \bigl(\langle\xi_{ab}|\otimes I_{\mathcal{B}}\bigr)\Bigl(\sum_{c,d} z_c\,|c\rangle\otimes\frac{1}{\sqrt2}|d,d\rangle\Bigr) \\
&= \frac{1}{\sqrt2}\sum_{c,d} z_c\,\langle\xi_{ab}|c,d\rangle\,|d\rangle
= \frac12\sum_{c,d} z_c\,(-1)^{bc}\,\delta_{c\oplus a,\,d}\,|d\rangle \\
&= \frac12\sum_c z_c\,(-1)^{bc}\,|a\oplus c\rangle = \frac12\,(\sigma^x)^a(\sigma^z)^b\,|\psi\rangle.
\end{aligned}
\]
Note that the probability $p_{ab} = \langle\psi_{ab}|\psi_{ab}\rangle = \frac14$ does not depend on the initial state $|\psi\rangle$. (In fact, if it depended on $|\psi\rangle$, then the recovery would not be possible; this follows from the result of Problem 11.6.) To restore the initial state, Bob applies the operators $\sigma^x$ and $\sigma^z$ with classical control: the controlling parameters are the measured values of $a$ and $b$. The result is as follows:
\[ (\sigma^z)^b(\sigma^x)^a\,\frac{|\psi_{ab}\rangle}{\sqrt{p_{ab}}} = |\psi\rangle. \]

Remark. One may ask this question: what if there is some other quantum system, which is not involved in the teleportation procedure, but the qubit being teleported is initially entangled with it? Will the state be preserved in this case? The answer is "yes". Indeed, we have proved that the measurement followed by the recovery effects the superoperator $R = I_{\mathbf{L}(\mathcal{B})}$ on the teleported qubit. When the additional system is taken into account, the superoperator becomes $R\otimes I_{\mathbf{L}(\mathcal{G})} = I_{\mathbf{L}(\mathcal{B}\otimes\mathcal{G})}$.
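The teleportation calculation of Problem 11.9 can be replayed numerically for a single qubit. The sketch below assumes the Bell basis $|\xi_{ab}\rangle = 2^{-1/2}\sum_c (-1)^{bc}|c, c\oplus a\rangle$, which is the form consistent with the matrix elements $\langle\xi_{ab}|c,d\rangle$ used in the solution; the function names are hypothetical.

```python
import math

def xi(a, b):
    """Amplitudes of |xi_ab> = 2^{-1/2} sum_c (-1)^{bc} |c, c xor a>, keyed by (c, d)."""
    return {(c, c ^ a): (-1) ** (b * c) / math.sqrt(2) for c in (0, 1)}

def bob_state(z, a, b):
    """Unnormalized |psi_ab> = (<xi_ab| x I)(|psi> x |xi_00>) for |psi> = z[0]|0> + z[1]|1>."""
    out = [0j, 0j]
    for (c, d), amp in xi(a, b).items():
        # |xi_00> = (|00> + |11>)/sqrt(2): Bob's qubit equals Alice's half d
        out[d] += amp * z[c] / math.sqrt(2)
    return out

def recover(state, a, b):
    """Apply (sigma^z)^b (sigma^x)^a and renormalize; returns (state, p_ab)."""
    s = list(state)
    if a:
        s = [s[1], s[0]]          # sigma^x
    if b:
        s = [s[0], -s[1]]         # sigma^z
    p = sum(abs(v) ** 2 for v in s)
    return [v / math.sqrt(p) for v in s], p

z = [0.6, 0.8j]                   # an arbitrary normalized input state
for a in (0, 1):
    for b in (0, 1):
        restored, p = recover(bob_state(z, a, b), a, b)
        print(a, b, round(p, 3), all(abs(u - v) < 1e-12 for u, v in zip(restored, z)))
```

Each of the four outcomes occurs with probability $1/4$, independently of $z$, and the corrected state coincides with the input.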


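11.10. To express the fidelity $F\bigl(\operatorname{Tr}_{\mathcal{M}}(A\rho A^\dagger),\ \operatorname{Tr}_{\mathcal{M}}(B\gamma B^\dagger)\bigr)$, we start with some purifications of $\rho$ and $\gamma$ over the auxiliary space $\mathcal{G} = \mathcal{N}^*$, i.e., $\rho = \operatorname{Tr}_{\mathcal{G}}(|\xi\rangle\langle\xi|)$, $\gamma = \operatorname{Tr}_{\mathcal{G}}(|\eta\rangle\langle\eta|)$, where $|\xi\rangle, |\eta\rangle \in \mathcal{N}\otimes\mathcal{G}$. The states
\[ |\xi'\rangle = (A\otimes I_{\mathcal{G}})|\xi\rangle, \qquad |\eta'\rangle = (B\otimes I_{\mathcal{G}})|\eta\rangle, \qquad |\xi'\rangle, |\eta'\rangle \in \mathcal{F}\otimes\mathcal{M}\otimes\mathcal{G} \]
are particular purifications of $\rho' = \operatorname{Tr}_{\mathcal{M}}(A\rho A^\dagger)$ and $\gamma' = \operatorname{Tr}_{\mathcal{M}}(B\gamma B^\dagger)$ ($\rho', \gamma' \in \mathbf{L}(\mathcal{F})$) over the auxiliary space $\mathcal{M}\otimes\mathcal{G}$. The fidelity can be defined in terms of general purifications over the same space.$^1$ As follows from the result of Problem 10.3, all purifications of $\rho'$ and $\gamma'$ have the form $(I_{\mathcal{F}}\otimes U)|\xi'\rangle$ or $(I_{\mathcal{F}}\otimes V)|\eta'\rangle$ for some unitary $U$ and $V$. Therefore
\[
\begin{aligned}
\sqrt{F(\rho',\gamma')} &= \max_{U,V\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \bigl|\langle\eta'|(I_{\mathcal{F}}\otimes V^\dagger)(I_{\mathcal{F}}\otimes U)|\xi'\rangle\bigr|
= \max_{W\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \bigl|\langle\eta'|(I_{\mathcal{F}}\otimes W)|\xi'\rangle\bigr| \\
&= \max_{W\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \bigl|\langle\eta|(B^\dagger\otimes I_{\mathcal{G}})(I_{\mathcal{F}}\otimes W)(A\otimes I_{\mathcal{G}})|\xi\rangle\bigr| \\
&= \max_{W\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \Bigl|\operatorname{Tr}\Bigl(W\,\operatorname{Tr}_{\mathcal{F}}\bigl((A\otimes I_{\mathcal{G}})|\xi\rangle\langle\eta|(B^\dagger\otimes I_{\mathcal{G}})\bigr)\Bigr)\Bigr| \\
&= \max_{W\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \Bigl|\operatorname{Tr}\Bigl(W\,(T\otimes I_{\mathbf{L}(\mathcal{G})})(|\xi\rangle\langle\eta|)\Bigr)\Bigr|
= \bigl\|(T\otimes I_{\mathbf{L}(\mathcal{G})})(|\xi\rangle\langle\eta|)\bigr\|_{\mathrm{tr}}.
\end{aligned}
\]
The fidelity should be maximized over $\rho$ and $\gamma$, which is equivalent to maximizing the last expression over all unit vectors $|\xi\rangle$ and $|\eta\rangle$. On the other hand, Theorem 11.1 and formula (10.7) imply that
\[
\begin{aligned}
\|T\|_\diamond = \|T\otimes I_{\mathbf{L}(\mathcal{G})}\|_1
&= \sup_{A\ne0} \frac{\|(T\otimes I_{\mathbf{L}(\mathcal{G})})A\|_{\mathrm{tr}}}{\|A\|_{\mathrm{tr}}}
= \sup_{|\xi_k\rangle,|\eta_k\rangle} \frac{\bigl\|(T\otimes I_{\mathbf{L}(\mathcal{G})})\sum_k|\xi_k\rangle\langle\eta_k|\bigr\|_{\mathrm{tr}}}{\sum_k \bigl\||\xi_k\rangle\bigr\|\,\bigl\||\eta_k\rangle\bigr\|} \\
&= \max\Bigl\{ \bigl\|(T\otimes I_{\mathbf{L}(\mathcal{G})})(|\xi\rangle\langle\eta|)\bigr\|_{\mathrm{tr}} : \bigl\||\xi\rangle\bigr\| = \bigl\||\eta\rangle\bigr\| = 1 \Bigr\}.
\end{aligned}
\]
The two results agree.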


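12.1. We will use the following simple property of the operator norm: if $X_j \in \mathbf{L}(\mathcal{N}_j,\mathcal{M}_j)$ and $X = \bigoplus_j X_j : \bigoplus_j\mathcal{N}_j \to \bigoplus_j\mathcal{M}_j$, then $\|X\| = \max_j\|X_j\|$. Indeed, the set of the eigenvalues for $X^\dagger X$ is the union of the corresponding sets for $X_j^\dagger X_j$.

Let $V : \mathcal{K} \to \mathcal{K}\otimes\mathcal{B}^{\otimes N}$ be the standard embedding, i.e., $V|\xi\rangle = |\xi\rangle\otimes|0^N\rangle$. Then $\|\tilde U_jV - VU_j\| \le \delta$ for each $j$. Therefore
\[ \bigl\|\tilde W\bigl(I_{\mathcal{N}}\otimes V\bigr) - \bigl(I_{\mathcal{N}}\otimes V\bigr)W\bigr\| = \Bigl\|\bigoplus_j \Pi_{\mathcal{L}_j}\otimes\bigl(\tilde U_jV - VU_j\bigr)\Bigr\| \le \delta. \]

$^1$ (Footnote to the solution of Problem 11.10.) The definition applies directly only if $\dim\mathcal{F} = (\dim\mathcal{N})(\dim\mathcal{M})$. As far as the general case is concerned, we refer to the remark after formula (10.8).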

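12.2. We have
\[ \tilde U = \sum_j \Pi_{\mathcal{L}_j}\otimes P_j, \qquad P_j = \sum_y Q_y\otimes\bigl(R_j^\dagger\,\Pi_{\mathcal{M}_y}R_j\bigr). \]
Due to the result of Problem 12.1, it is sufficient to show that $P_j$ approximates $Q_{f(j)}$ for each $j$.

Let $|\xi\rangle \in \mathcal{K}$ be a unit vector. We need to estimate the norm of the vector $|\psi\rangle = |\tilde\eta\rangle - |\eta\rangle\otimes|0^N\rangle$, where $|\eta\rangle = Q_{f(j)}|\xi\rangle$ and $|\tilde\eta\rangle = P_j\bigl(|\xi\rangle\otimes|0^N\rangle\bigr)$. Such an estimate follows from the calculation
\[
\begin{aligned}
\langle\psi|\psi\rangle &= 2 - \bigl(\langle\eta|\otimes\langle0^N|\bigr)|\tilde\eta\rangle - \langle\tilde\eta|\bigl(|\eta\rangle\otimes|0^N\rangle\bigr)
= 2 - 2\operatorname{Re}\bigl(\langle\eta|\otimes\langle0^N|\bigr)|\tilde\eta\rangle \\
&= 2 - 2\operatorname{Re}\sum_y \langle\xi|Q_{f(j)}^\dagger Q_y|\xi\rangle\,\langle0^N|R_j^\dagger\,\Pi_{\mathcal{M}_y}R_j|0^N\rangle \\
&= 2\sum_y \bigl(1 - \operatorname{Re}\langle\xi|Q_{f(j)}^\dagger Q_y|\xi\rangle\bigr)\,\mathbf{P}(y|j)
\le 2\sum_{y\ne f(j)} 2\,\mathbf{P}(y|j) \le 4\varepsilon.
\end{aligned}
\]
Thus $\bigl\||\psi\rangle\bigr\| \le 2\sqrt\varepsilon$.

In the case where $V$ is the copy operator, $Q_{f(j)}^\dagger Q_y = \delta_{y,f(j)}\,I_{\mathcal{K}}$, so we get $2\varepsilon$ instead of $4\varepsilon$ in the above estimate. Hence $\bigl\||\psi\rangle\bigr\| \le \sqrt{2\varepsilon}$.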


13.1. We denote the required probability by $p(X,l)$. If $h_1,\dots,h_l$ do not generate the whole group $X$, then they are contained in some maximal proper subgroup $Y \subset X$. For each $Y$ the probability of such an event does not exceed $2^{-l}$, because $|Y| \le |X|/2$. Therefore $p(X,l) \ge 1 - K(X)\cdot2^{-l}$, where $K(X)$ is the number of maximal proper subgroups of the group $X$.

Subgroups of an Abelian group $X$ are in one-to-one correspondence with subgroups of the character group $X^*$; maximal proper subgroups correspond to minimal nonzero subgroups. Each minimal nonzero subgroup is generated by a single nonzero element, so that $K(X) < |X^*| = |X|$.
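The bound $p(X,l) \ge 1 - K(X)\cdot2^{-l}$ can be tested by brute force on a small example. The sketch below takes the cyclic group $X = \mathbb{Z}_n$ (an assumption made only for this illustration), where $h_1,\dots,h_l$ generate $X$ exactly when $\gcd(h_1,\dots,h_l,n) = 1$ and the maximal proper subgroups are $p\mathbb{Z}_n$ for the prime divisors $p$ of $n$. The function names are hypothetical.

```python
from itertools import product
from math import gcd

def generation_probability(n, l):
    """Exact probability that l uniformly random elements generate Z_n."""
    hits = 0
    for hs in product(range(n), repeat=l):
        g = n
        for h in hs:
            g = gcd(g, h)
        hits += (g == 1)
    return hits / n ** l

def count_maximal_subgroups(n):
    """K(Z_n): one maximal proper subgroup p*Z_n per prime divisor p of n."""
    return sum(1 for p in range(2, n + 1)
               if n % p == 0 and all(p % d for d in range(2, p)))

n, l = 12, 3
p = generation_probability(n, l)
k = count_maximal_subgroups(n)
print(round(p, 4), k, p >= 1 - k * 2 ** -l, p >= 1 - n * 2 ** -l)
```

For $\mathbb{Z}_{12}$ and $l = 3$ the exact probability is $182/216 \approx 0.8426$, comfortably above the bound $1 - 2\cdot2^{-3} = 0.75$.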

13.2. We construct a classical operator $V_b \in \mathbf{L}(\mathcal{B}\otimes\mathcal{B}^{\otimes n})$ (the basis vectors in $\mathcal{B}^{\otimes n}$ are numbered from 0 to $2^n - 1$) such that
\[ V_b|0,0\rangle = |0,1\rangle, \qquad V_b|1,0\rangle = |1,b\rangle. \]
Then the circuit $V_b^{-1}[0,B]\;U[B,A]\;V_b[0,B]$ realizes the operator $\Lambda(U_b)[0,A]$, where $B$ denotes a set of $n$ ancillas.

13.3. Let us calculate the expected value of $\exp\bigl(h\bigl(\sum_{r=1}^s v_r - sp\bigr)\bigr)$ and choose $h$ so as to minimize the result:
\[ \mathbf{E}\Bigl[\exp\Bigl(h\Bigl(\sum_{r=1}^s v_r - sp\Bigr)\Bigr)\Bigr] = \Bigl(e^{-hp}\bigl((1-p_*) + p_*e^h\bigr)\Bigr)^s = \exp\bigl(-H(p,p_*)\,s\bigr) \]
for $h = \ln\frac{p}{1-p} - \ln\frac{p_*}{1-p_*}$, where $H(p,p_*) = (1-p)\ln\frac{1-p}{1-p_*} + p\ln\frac{p}{p_*}$. Now we use the obvious inequality $\mathbf{E}[e^f] \ge \Pr[f \ge 0]$, which holds for any random variable $f$. Note that $h \ge 0$ if $p \ge p_*$, and $h \le 0$ if $p \le p_*$. Thus we get the inequalities
\[ \Pr\Bigl[s^{-1}\sum_{r=1}^s v_r \ge p\Bigr] \le \exp\bigl(-H(p,p_*)\,s\bigr) \quad\text{if } p \ge p_*; \]
\[ \Pr\Bigl[s^{-1}\sum_{r=1}^s v_r \le p\Bigr] \le \exp\bigl(-H(p,p_*)\,s\bigr) \quad\text{if } p \le p_*. \]
This is a sharper version of Chernoff's bound (cf. [61]).

The function $H(p,p_*)$ (called the relative entropy) can be represented as $(1-p)\ln\frac{1}{1-p_*} + p\ln\frac{1}{p_*} - H(p)$, where $H(p) = (1-p)\ln\frac{1}{1-p} + p\ln\frac1p$ is the entropy of the probability distribution $(w_0 = 1-p,\ w_1 = p)$. It is easy to check that
\[ H(p_*,p_*) = 0, \qquad \frac{\partial H(p,p_*)}{\partial p}\Bigr|_{p=p_*} = 0, \qquad \frac{\partial^2H(p,p_*)}{\partial p^2} = -\frac{\partial^2H(p)}{\partial p^2} \ge 4; \]
hence $H(p,p_*) \ge 2(p-p_*)^2$. The inequality (13.4) follows.
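The bound derived in Problem 13.3 can be compared with an exact binomial tail. The sketch below assumes that $v_1,\dots,v_s$ are independent 0/1 variables with $\Pr[v_r = 1] = p_*$; the function names are hypothetical.

```python
import math

def relative_entropy(p, p_star):
    """H(p, p*) = (1-p) ln((1-p)/(1-p*)) + p ln(p/p*), for 0 < p, p* < 1."""
    return (1 - p) * math.log((1 - p) / (1 - p_star)) + p * math.log(p / p_star)

def upper_tail(s, p_star, p):
    """Exact Pr[(1/s) * sum v_r >= p] for s independent Bernoulli(p*) variables."""
    k0 = math.ceil(s * p - 1e-12)          # smallest count k with k/s >= p
    return sum(math.comb(s, k) * p_star ** k * (1 - p_star) ** (s - k)
               for k in range(k0, s + 1))

s, p_star, p = 50, 0.3, 0.5
tail = upper_tail(s, p_star, p)
sharp = math.exp(-relative_entropy(p, p_star) * s)   # bound with the relative entropy
weak = math.exp(-2 * (p - p_star) ** 2 * s)          # bound using H >= 2 (p - p*)^2
print(tail <= sharp <= weak)
```

The chain `tail <= sharp <= weak` shows both the relative-entropy bound and the weaker quadratic form that follows from $H(p,p_*) \ge 2(p-p_*)^2$.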

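13.4. The Fourier transform operator $F = F_q$ acts on $n = \lceil\log_2 q\rceil$ qubits; more precisely, it acts on the subspace $\mathcal{N} = \mathbb{C}\bigl(|0\rangle,\dots,|q-1\rangle\bigr) \subseteq \mathcal{B}^{\otimes n}$. Let
\[ |\psi_x\rangle = \frac{1}{\sqrt q}\sum_{y=0}^{q-1} \exp\Bigl(-2\pi i\,\frac{xy}{q}\Bigr)|y\rangle, \qquad x = 0,\dots,q-1. \]
These are the eigenvectors of the operator $X : |y\rangle \mapsto |(y+1) \bmod q\rangle$, the corresponding eigenvalues being equal to $\lambda_x = e^{2\pi i\varphi_x}$, $\varphi_x = x/q$. Obviously, $F|x\rangle = |\psi_{-x}\rangle$. The Fourier transform operator is also characterized by its action on the vectors $|\psi_x\rangle$: $F|\psi_x\rangle = |x\rangle$.

Thus, we need to transform $|\psi_x\rangle$ into $|x\rangle$. The general scheme of the solution is as follows:
\[ |\psi_x\rangle\otimes|0\rangle \xmapsto{\;W\;} |\psi_x\rangle\otimes|x\rangle \xmapsto{\;\leftrightarrow\;} |x\rangle\otimes|\psi_x\rangle \xmapsto{\;V\;} |x\rangle\otimes|\psi_0\rangle \xmapsto{\;I\otimes U^{-1}\;} |x\rangle\otimes|0\rangle. \]
(The extra $|0\rangle$ in the first and the last expression corresponds to ancillas.) We will realize each of the operators $W$, $V$, $U$ with precision $\delta' = \delta/3$.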


$W$ is a measuring operator with respect to the orthogonal decomposition of $\mathcal{N}$ into the eigenspaces of $X$; it performs a garbage-free measurement of the parameter $x$ that corresponds to the eigenvalue $\lambda_x$. To realize $W$ with precision $\delta'$, we need to measure $x$ with error probability $\le \varepsilon = (\delta')^2/2 = \Theta(2^{-2l})$ and remove the garbage (see Section 12.3). To measure $x$, we estimate the corresponding phase $\varphi_x = x/q$ with precision $2^{-(n+1)}$ (this is a different kind of precision!), multiply by $q$ and round to an integer.

According to Theorem 13.3, the phase estimation is done by a circuit of size $O(n\log n + nl)$ and depth $O(\log n + \log l)$ with the aid of the operator
\[ \Upsilon_m(X) : |p,y\rangle \mapsto |p,\ (y+p) \bmod q\rangle, \qquad p = 0,\dots,2^m-1; \quad y = 0,\dots,q-1, \]
where $m = n + k$, $k = O(\log l + \log\log n)$. Note that the operator $\Upsilon_m(X)$ itself has smaller complexity: it can be realized by a circuit of size $O(nk + k^2)$ and depth $O(\log n + (\log k)^2)$ (see Problem 2.14b). However, the multiplication of the estimated phase by $q$ makes a significant contribution to the overall size, namely, $O(n^2)$.

The operator $V$ must take $|x\rangle\otimes|\psi_x\rangle$ to $|x\rangle\otimes|\psi_0\rangle$, so it cancels the phases in the definition of $|\psi_x\rangle$:
\[ V|x,y\rangle = \exp\Bigl(2\pi i\,\frac{xy}{q}\Bigr)|x,y\rangle. \]
To realize this operator, we compute $xy/q$ with precision $2^{-m}$, where $m = l + O(1)$. More exactly, we find a $p \in \{0,\dots,2^m-1\}$ such that
\[ \bigl|2^{-m}p - xy/q\bigr|_{\bmod 1} \le 2^{-m}. \]
Then we apply the operator $\Upsilon_m(e^{2\pi i/2^m})$ controlled by this $p$, and uncompute $p$.

When estimating the value of $xy/q$, we operate with $\Theta(l)$-digit real numbers; therefore $p$ is computed by a circuit of size $O(l^2\log l)$ and depth $O((\log l)^2)$ (see Problem 2.14a). The operator $\Upsilon_m(e^{2\pi i/2^m})$ has approximately the same complexity (see Lemma 13.4). Thus $V$ is realized by an $O(l^2\log l)$-size, $O((\log l)^2)$-depth circuit over the standard basis.

Finally, we need to realize the operator $U : |0\rangle \mapsto |\psi_0\rangle$. If $q = 2^n$, this is very simple: $U = H^{\otimes n}$. The general case was considered in Problem 9.4a, but the procedure proposed in the solution does not parallelize. We now describe a completely different procedure, which lends itself to parallelization. Instead of constructing the vector $|\psi_0\rangle = |\eta_{0,q}\rangle$, we will try to create the vector
\[ |\eta_{a,q}\rangle = \sqrt{\frac{1-e^{-2a}}{1-e^{-2aq}}}\;\sum_{x=0}^{q-1} e^{-ax}|x\rangle \]
for $a = c_1 2^{-(n+l)}$, where $c_1$ is a suitable constant. It is obvious that
\[ \bigl\||\eta_{a,q}\rangle - |\eta_{0,q}\rangle\bigr\| \le O(aq), \]
so the desired precision $\Theta(2^{-l})$ is achieved this way.

Let us consider the vector $|\eta_{a,\infty}\rangle$, which belongs to the infinite-dimensional Hilbert space $\mathcal{H} = \mathbb{C}^{\mathbb{N}}$ (where $\mathbb{N} = \{0, 1, 2, \dots\}$). Of course, it is impossible to create this state exactly, but the $m$-qubit state $|\eta_{a,2^m}\rangle$ may be a good approximation, provided $m$ is large enough. Clearly,
\[ \bigl\||\eta_{a,r}\rangle - |\eta_{a,\infty}\rangle\bigr\| \le O(e^{-ar}), \]
so it suffices to choose $r = 2^m$ such that $e^{-ar} \le c_2 2^{-l}$ for a suitable constant $c_2$. Thus $m = \bigl\lceil\log_2\bigl(-a^{-1}\ln(c_2 2^{-l})\bigr)\bigr\rceil = n + \Theta(l)$. We will show how to construct the state $|\eta_{a,2^m}\rangle$ later.

Now, we invoke the function
\[ G : \mathbb{N} \to \mathbb{N}\times\{0,\dots,q-1\}, \qquad G(x) = \bigl(\lfloor x/q\rfloor,\ (x \bmod q)\bigr) \]
and the corresponding linear map $G_q : \mathcal{H} \to \mathcal{H}\otimes\mathbb{C}\bigl(|0\rangle,\dots,|q-1\rangle\bigr)$. It transforms the state $|\eta_{a,\infty}\rangle$ into $|\eta_{aq,\infty}\rangle\otimes|\eta_{a,q}\rangle$. Thus, the state $|\eta_{a,q}\rangle$ can be obtained as follows: we create $|\eta_{a,\infty}\rangle$, split it into $|\eta_{a,q}\rangle$ and $|\eta_{aq,\infty}\rangle$, and get rid of the last state (which is as difficult as creating it).

In the computation, we must replace the operator $G_q$ by its finite version, $|x,0^n\rangle \mapsto \bigl|\lfloor x/q\rfloor,\ (x \bmod q)\bigr\rangle$, where $x \in \{0,\dots,2^m-1\}$. Note that the ratio $x/q$ ranges from 0 to $2^{m-n+1} = 2^{O(l)}$, hence the corresponding circuit has size $O(nl + l^2\log l)$ and depth $O(\log n + (\log l)^2)$.

Thus, it remains to show how to create the state $|\eta_{b,2^m}\rangle$ for $b = a$ and $b = aq$. This is not hard because $|\eta_{b,2^m}\rangle$ is a product state:
\[ |\eta_{b,2^m}\rangle = |\zeta(\theta_{m-1})\rangle\otimes\cdots\otimes|\zeta(\theta_0)\rangle, \qquad |\zeta(\theta)\rangle = \cos(\pi\theta)|0\rangle + \sin(\pi\theta)|1\rangle, \]
where $\theta_j = (1/\pi)\arctan\bigl(e^{-2^jb}\bigr)$. Each vector $|\zeta(\theta_j)\rangle$ is obtained from $|0\rangle$ as follows:
\[ |\zeta(\theta_j)\rangle = \exp(-i\pi\theta_j\sigma^y)|0\rangle = K^{-1}H\exp(i\pi\theta_j\sigma^z)H|0\rangle. \]
Moreover, all these vectors can be easily constructed at once by a circuit over the standard basis. Indeed,
\[ \bigl(\exp(i\pi\theta_{m-1}\sigma^z)\otimes\cdots\otimes\exp(i\pi\theta_0\sigma^z)\bigr)\,|x_{m-1},\dots,x_0\rangle = \exp\Bigl(2\pi i\sum_{j=0}^{m-1}(1/2 - x_j)\,\theta_j\Bigr)|x_{m-1},\dots,x_0\rangle. \]
Therefore we just need to compute the sum in the exponent with precision $\Theta(2^{-l})$, and proceed by analogy with the operator $V$.
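The product-state claim can be confirmed numerically for small $m$. The sketch below assumes the normalization $|\eta_{b,r}\rangle = \sqrt{(1-e^{-2b})/(1-e^{-2br})}\,\sum_{x<r} e^{-bx}|x\rangle$ and single-qubit angles with $\tan(\pi\theta_j) = e^{-2^jb}$, where qubit $j$ carries the binary weight $2^j$; the function names are hypothetical.

```python
import math

def eta(b, r):
    """Amplitudes of |eta_{b,r}>: normalized geometric weights e^{-b x}, x = 0..r-1."""
    norm = math.sqrt((1 - math.exp(-2 * b)) / (1 - math.exp(-2 * b * r)))
    return [norm * math.exp(-b * x) for x in range(r)]

def product_state(b, m):
    """Tensor product of cos(pi theta_j)|0> + sin(pi theta_j)|1>, j = 0..m-1."""
    amps = [1.0]
    for j in range(m):
        angle = math.atan(math.exp(-(2 ** j) * b))       # pi * theta_j
        # qubit j becomes the new most significant bit: index = x_j * 2**j + old index
        amps = [math.cos(angle) * a for a in amps] + [math.sin(angle) * a for a in amps]
    return amps

b, m = 0.1, 4
diff = max(abs(u - v) for u, v in zip(eta(b, 2 ** m), product_state(b, m)))
print(diff < 1e-12)
```

The agreement rests on the identity $\prod_{j<m}\bigl(1 + t^{2^j}\bigr) = (1 - t^{2^m})/(1 - t)$ with $t = e^{-2b}$, which matches the normalization factor of $|\eta_{b,2^m}\rangle$.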


The ir uits for the reation of the ve tors j�

a;2

m

i and j�

aq;2

m

i have size

O(nl+ l

2

log l) and depth O(log n+(log l)

2

). This estimate does not in lude

the omplexity of omputing the numbers �

j

be ause su h omputation be-

longs to the pre-pro essing stage. We mention, however, that it an be done

in time poly(l).

To summarize, the Fourier transform operator $F_q$ is realized by a circuit with the following parameters:
$$\text{size} = O(n^2 + l^2 \log l), \qquad \text{depth} = O(\log n + (\log l)^2).$$


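15.1. We represent the total Hilbert space in the form $\mathcal{B}^{\otimes n} = \mathcal{K} \otimes \mathcal{L}$, where $\mathcal{K}$ is the state space of qubits in $A$, and $\mathcal{L}$ is the state space of the remaining qubits. We need to reconstruct the state $\rho$ from $\operatorname{Tr}_{\mathcal{K}} \rho$.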

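Let us write an operator sum decomposition for the superoperator $T : \rho \mapsto \operatorname{Tr}_{\mathcal{K}} \rho$ (cf. Problem 11.2):
$$T = \sum_m A_m \cdot A_m^\dagger, \qquad A_m = \langle m| \otimes I_{\mathcal{L}} : \mathcal{K} \otimes \mathcal{L} \to \mathcal{L},$$
where $\{|m\rangle\}$ is an orthonormal basis of $\mathcal{K}$. We see that $T \in \mathcal{D} \cdot \mathcal{D}^\dagger$, where
$$\mathcal{D} = \mathcal{K}^* \otimes I_{\mathcal{L}} \subseteq \mathbf{L}(\mathcal{K} \otimes \mathcal{L}, \mathcal{L}).$$
If $X, Y \in \mathcal{D}$, then $Y^\dagger X \in \mathbf{L}(\mathcal{K}) \otimes I_{\mathcal{L}} = \mathcal{E}(A)$. Consequently, the code $\mathcal{M}$ corrects errors from $\mathcal{D}$. It remains to apply Theorem 15.3.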

15.2 (Cf. [11, 41]). Suppose that $\mathcal{M}$ is a code of type $(4,1)$ correcting one-qubit errors. Then it must detect two-qubit errors, in particular errors in qubits 1, 2 as well as in qubits 3, 4. This means that an arbitrary state $\rho \in \mathbf{L}(\mathcal{M})$ can be restored both from the first two qubits and from the last two qubits (cf. Problem 15.1). Let us show that this is impossible.

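Let $\mathcal{N}_1$ be the space of qubits 1, 2, and $\mathcal{N}_2$ the space of qubits 3, 4. Then $\mathcal{M}$ is a subspace of $\mathcal{N}_1 \otimes \mathcal{N}_2$. Denote the inclusion map $\mathcal{M} \to \mathcal{N}_1 \otimes \mathcal{N}_2$ by $V$. We have assumed the existence of error-correcting transformations, i.e., physically realizable superoperators $P_1 : \mathcal{N}_1 \to \mathcal{M}$ and $P_2 : \mathcal{N}_2 \to \mathcal{M}$ satisfying
$$P_1 \operatorname{Tr}_{\mathcal{N}_2}(V \rho V^\dagger) = P_2 \operatorname{Tr}_{\mathcal{N}_1}(V \rho V^\dagger) = \rho \quad \text{for any } \rho \in \mathbf{L}(\mathcal{M}).$$
Therefore we can define a physically realizable superoperator
$$P = (P_1 \otimes P_2)(V \cdot V^\dagger) : \mathcal{M} \to \mathcal{M} \otimes \mathcal{M},$$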


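which has the following properties:
$$\operatorname{Tr}_{\mathcal{N}_2} P\rho = \operatorname{Tr}_{\mathcal{N}_2}\bigl((P_1 \otimes P_2)(V \rho V^\dagger)\bigr) = P_1 \operatorname{Tr}_{\mathcal{N}_2}(V \rho V^\dagger) = \rho,$$
$$\operatorname{Tr}_{\mathcal{N}_1} P\rho = \operatorname{Tr}_{\mathcal{N}_1}\bigl((P_1 \otimes P_2)(V \rho V^\dagger)\bigr) = P_2 \operatorname{Tr}_{\mathcal{N}_1}(V \rho V^\dagger) = \rho.$$
According to Problem 11.6, the first identity implies that $P\rho = \rho \otimes \gamma_2$, where $\gamma_2$ does not depend on $\rho$. Similarly, $P\rho = \gamma_1 \otimes \rho$. We have arrived at a contradiction: $\gamma_1 \otimes \rho = \rho \otimes \gamma_2$ for any $\rho$.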

15.3. We will only give the idea of the solution. It is sufficient to examine the phase component $g^{(z)}$ of the error $g = g^{(x)} + g^{(z)}$ (the classical component $g^{(x)}$ is treated similarly). A syndrome bit equals 1 if and only if the star of the corresponding vertex contains an odd number of edges from $g^{(z)}$. Therefore we obtain the following problem. Let $D$ be the boundary of a 1-chain $C$ with $\mathbb{Z}_2$-coefficients (i.e., $D$ is a set of an even number of lattice vertices); we need to find such a chain $C_{\min}$ which contains the smallest number of edges.

It is not difficult to figure out that $C_{\min}$ is the disjoint union of paths that connect pairs of vertices of the set $D$ (two different paths cannot have a common edge). Therefore the problem of determining the error by its syndrome is reduced to the following weighted matching problem. There is a graph $G$ (in our case, the complete graph whose vertices are the vertices of the lattice) with a weight assigned to each edge (in our case, the shortest path length on the lattice). It is required to find a perfect matching with minimal total weight. (A perfect matching on a bipartite graph was defined in Problem 3.4, but here we talk about matching on an arbitrary undirected graph.)

There exist polynomial algorithms solving the weighted matching problem (see, for example, [52, Chapter 11], where an algorithm based on ideas of linear programming is described).
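The reduction to weighted matching can be illustrated on a toy instance. The sketch below is not the polynomial algorithm referred to above: it is an exponential-time dynamic program over subsets, suitable only for a handful of syndrome vertices. It also assumes a planar square lattice, so that the shortest path length is the Manhattan distance; on a torus the distance function would need wrap-around. The names `manhattan` and `min_weight_perfect_matching` are hypothetical.

```python
from functools import lru_cache


def manhattan(u, v):
    """Shortest path length between lattice vertices u and v (planar lattice assumed)."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def min_weight_perfect_matching(points, dist=manhattan):
    """Return (total_weight, pairs) of a minimum-weight perfect matching.

    Brute-force DP over bitmasks: exponential in len(points),
    intended only as an illustration for small inputs.
    """
    n = len(points)
    if n % 2 != 0:
        raise ValueError("a perfect matching needs an even number of vertices")

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0, ()
        # Always match the lowest-indexed unmatched vertex i.
        i = (mask & -mask).bit_length() - 1
        rest = mask & ~(1 << i)
        best_cost, best_pairs = None, None
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                cost, pairs = best(rest & ~(1 << j))
                cost += dist(points[i], points[j])
                if best_cost is None or cost < best_cost:
                    best_cost, best_pairs = cost, ((i, j),) + pairs
        return best_cost, best_pairs

    return best((1 << n) - 1)


# Four syndrome vertices: the cheapest pairing joins neighbours.
D = [(0, 0), (0, 1), (3, 3), (3, 5)]
weight, pairs = min_weight_perfect_matching(D)
```

Each matched pair would then be connected by a shortest lattice path, and the union of these paths plays the role of $C_{\min}$.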

Appendix A

Elementary Number Theory

In this Appendix we outline some basic definitions and theorems of arithmetic. This, of course, is not a substitute for more detailed books; see, e.g., [70, 33].

A.1. Modular arithmetic and rings. One says that $a$ is congruent to $b$ modulo $q$ and writes
$$a \equiv b \pmod{q}$$
if $a - b$ is a multiple of $q$. This condition can also be expressed by the notation $q \mid (a - b)$, which reads "$q$ divides $a - b$". A set of all (mod $q$)-congruent integers is called a congruence class. (For example, the set of even numbers and the set of odd numbers are congruence classes modulo 2.) Each class can be characterized by its canonical representative, i.e., an integer $r \in \{0,\dots,q-1\}$. We write
$$a \bmod q = r,$$
which means precisely that $r$ is the residue of $a$ (the remainder of integer division of $a$ by $q$), i.e., $a = mq + r$, where $m \in \mathbb{Z}$, $r \in \{0,\dots,q-1\}$. In most cases we do not need to make a distinction between congruence classes and their canonical representatives, so the term "residue" refers to both. Thus $7 \bmod 3 = 1$, but we may also say that $(7 \bmod 3)$ is the congruence class containing 7 (i.e., the set $\{\dots,-5,-2,1,4,7,\dots\}$) which has 1 as its canonical representative. In any case, the operation $x \mapsto (x \bmod q)$ takes an integer to a residue.
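As a small illustration of the canonical representative, here is a sketch in Python (the helper names `residue` and `congruent` are ours, not the text's). It relies on the fact that Python's `%` operator returns a value in $\{0,\dots,q-1\}$ for positive $q$, even when the left operand is negative.

```python
def residue(a, q):
    """Canonical representative of a modulo q, an integer in {0, ..., q-1}."""
    if q <= 0:
        raise ValueError("the modulus q must be positive")
    return a % q  # non-negative for q > 0, also when a < 0


def congruent(a, b, q):
    """True if a and b lie in the same congruence class modulo q."""
    return (a - b) % q == 0


# The class of 7 modulo 3 contains ..., -5, -2, 1, 4, 7, ...
members = [-5, -2, 1, 4, 7]
representatives = {residue(a, 3) for a in members}
```

All members of the class share the single representative 1, matching the example $7 \bmod 3 = 1$.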


Residues can be added, subtracted or multiplied by performing the corresponding operation on the integers they represent. Thus $r = r_1 r_2$ (modular multiplication) if and only if $a \equiv a_1 a_2 \pmod{q}$, where $a \bmod q = r$, $a_1 \bmod q = r_1$, and $a_2 \bmod q = r_2$. It is important to note that $a_1$ and $a_2$ can be replaced by any (mod $q$)-congruent numbers, the product being congruent too. Indeed,
$$\text{if } a_1 \equiv b_1 \text{ and } a_2 \equiv b_2, \text{ then } a_1 a_2 \equiv b_1 b_2 \pmod{q}.$$

What are the common properties of integer arithmetic operations and operations with residues? In the most abstract form, the answer is that both integers and (mod $q$) residues form commutative rings.

Definition A.1. A ring is a set $R$ equipped with two binary operations, "$+$" and "$\cdot$" (the dot is usually suppressed in writing), and two special elements, 0 and 1, so that the following relations hold:
$$(a+b)+c = a+(b+c), \qquad a+b = b+a, \qquad a+0 = a,$$
$$(ab)c = a(bc), \qquad 1 \cdot a = a \cdot 1 = a,$$
$$(a+b)c = ac + bc, \qquad c(a+b) = ca + cb.$$
For any $a \in R$ there exists an element $v$ such that $a + v = 0$.

If, in addition, $ab = ba$ for any $a$ and $b$, then $R$ is called a commutative ring.

In what follows we consider only commutative rings, so we omit the adjective "commutative".

Note that the element $v$ in the above definition is unique. Indeed, if another element $v'$ satisfies $a + v' = 0$, then
$$v' = v' + 0 = v' + (a + v) = (v' + a) + v = (a + v') + v = 0 + v = v + 0 = v.$$
Such $v$ is denoted by $-a$. The relations on the list imply other well-known relations, for example, $a \cdot 0 = 0$. Indeed,
$$a \cdot 0 = a \cdot 0 + a \cdot 0 + (-(a \cdot 0)) = a(0 + 0) + (-(a \cdot 0)) = a \cdot 0 + (-(a \cdot 0)) = 0.$$

The difference between two elements of a ring is defined as $a - b \overset{\text{def}}{=} a + (-b)$. Any ring becomes an Abelian group if we forget about the multiplication and 1, but keep $+$ and 0. This group is called the additive group of the ring. The ring of residues modulo $q$ is denoted by $\mathbb{Z}/q\mathbb{Z}$, whereas the corresponding additive group is denoted by $\mathbb{Z}_q$ (this is just the cyclic group of order $q$).

What are the differences between the ring of integers $\mathbb{Z}$ and residue rings $\mathbb{Z}/q\mathbb{Z}$? There are many; for example, $\mathbb{Z}$ is infinite while $\mathbb{Z}/q\mathbb{Z}$ is finite.


Another important distinction is as follows. For integers, $xy = 0$ implies that $x = 0$ or $y = 0$. This is not true for all residue rings (specifically, this is false in the case where $q$ is a composite number). Example: $2 \cdot 5 \equiv 0 \pmod{10}$, although both 2 and 5 represent nonzero elements of $\mathbb{Z}/10\mathbb{Z}$. We say that an element $x$ of a ring $R$ is a zero divisor if $\exists\, y \ne 0\ (xy = 0)$. For example, 0, 2, 3, 4, 6, 8, 9, 10 are zero divisors in $\mathbb{Z}/12\mathbb{Z}$, whereas 1, 5, 7, 11 are not. It will be shown below that $r$ is a zero divisor in $\mathbb{Z}/q\mathbb{Z}$ if and only if $r$ and $q$ (regarded as integers) have a nontrivial common divisor.$^{1}$

Let us introduce another important concept. An element $x \in R$ is called invertible if there exists $y$ such that $xy = 1$; in this case we write $y = x^{-1}$. For example, $7 = 4^{-1}$ in $\mathbb{Z}/9\mathbb{Z}$, since $4 \cdot 7 \equiv 1 \pmod{9}$. It is obvious that if $a$ and $b$ are invertible, then $ab$ is also invertible, and that $(ab)^{-1} = a^{-1} b^{-1}$. Therefore invertible elements form an Abelian group with respect to multiplication, which is denoted by $R^*$. For example, $\mathbb{Z}^* = \{1, -1\}$ and $(\mathbb{Z}/12\mathbb{Z})^* = \{1, 5, 7, 11\}$. In the latter case, invertible elements are exactly the elements which are not zero divisors. This is not a coincidence.
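The two examples for $\mathbb{Z}/12\mathbb{Z}$ can be checked by direct enumeration. The following sketch simply applies the definitions by brute force; the function names `zero_divisors` and `units` are illustrative and do not come from the text.

```python
def zero_divisors(q):
    """Residues x in Z/qZ for which some nonzero y satisfies x*y = 0 (mod q)."""
    return [x for x in range(q) if any(x * y % q == 0 for y in range(1, q))]


def units(q):
    """Residues x in Z/qZ for which some y satisfies x*y = 1 (mod q)."""
    return [x for x in range(q) if any(x * y % q == 1 for y in range(q))]


zd12 = zero_divisors(12)
u12 = units(12)
# In a finite ring every element is either a zero divisor or invertible.
partition_ok = sorted(zd12 + u12) == list(range(12))
```

For $q = 12$ the two lists are disjoint and together exhaust all residues, which is the phenomenon the next proposition explains.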

Proposition A.1. If an element $x \in R$ is invertible, then $x$ is not a zero divisor. In the case where $R$ is finite, the converse is also true.

Proof. Suppose that $x$ is invertible and $xy = 0$. Then $y = (x^{-1}x)y = x^{-1}(xy) = x^{-1} \cdot 0 = 0$, so $x$ is not a zero divisor.

Now let us assume that $x$ is not a zero divisor, and $R$ is finite. Then some elements in the sequence $1, x, x^2, x^3, \dots$ must repeat, so there are some $n > m \ge 0$ such that $x^n = x^m$. Therefore $x^m(x^{n-m} - 1) = 0$. This implies that $x^{m-1}(x^{n-m} - 1) = 0$ because $x$ is not a zero divisor. Iterating this argument, we get $x^{n-m} - 1 = 0$, i.e., $x^{n-m} = 1$. Hence $x^{-1} = x^{n-m-1}$. $\square$

A.2. Greatest common divisor and unique factorization. One of the most fundamental properties of integers is as follows.

Theorem A.2. Let $a$ and $b$ be integers, at least one of which is not 0. Then there exists $d \ge 1$ such that
$$d \mid a, \qquad d \mid b, \qquad d = ma + nb, \quad \text{where } m, n \in \mathbb{Z}.$$
Such $d$ is called the greatest common divisor of $a$ and $b$ and denoted by $\gcd(a, b)$. Explanation of the name: firstly, $d$ divides $a$ and $b$ by definition. Inasmuch as $d = ma + nb$, any common divisor of $a$ and $b$ also divides $d$; therefore it is not greater than $d$.

There are several ways to prove Theorem A.2. There is a constructive proof which actually provides an efficient algorithm for finding the numbers $d$, $m$ and $n$. This algorithm will be described in Section A.6. For now, we will use a shorter but more abstract argument. We note that the set $M = \{ma + nb : m, n \in \mathbb{Z}\}$ is a group with respect to addition, and that $a, b \in M$. Therefore Theorem A.2 can be obtained from the following more general result.

$^{1}$ One can easily prove this assertion by factoring $r$ and $q$ into prime numbers. The argument, however, relies on the fact that the factorization is unique. The uniqueness of factorization is actually a theorem which requires a proof. We will derive this theorem from an equivalent of the above assertion.

Theorem A.3. Let $M \ne \{0\}$ be a subgroup in the additive group of integers. Then $M$ is generated by a single element $d \ge 1$, i.e., $M = (d)$, where $(d) \overset{\text{def}}{=} \{kd : k \in \mathbb{Z}\}$.

Proof. It is clear that $M$ contains at least one positive element. Let $d$ be the smallest positive element of $M$. Obviously, $(d) \subseteq M$, so it suffices to prove that any integer $x \in M$ is contained in $(d)$.

Let $r = x \bmod d$, i.e., $x = kd + r$, where $0 \le r < d$. Then $r = x - kd \in M$ (because $x, d \in M$, and $M$ is a group). Since $d$ is the smallest positive element in $M$, we conclude that $r = 0$. $\square$

Now we derive a few corollaries from Theorem A.2.

Corollary A.2.1. The residue $r = (a \bmod b) \in \mathbb{Z}/b\mathbb{Z}$ is invertible if and only if $\gcd(a, b) = 1$.

Proof. The residue $r = (a \bmod b)$ being invertible means that $ma \equiv 1 \pmod{b}$ for some $m$, i.e., the equation $ma + nb = 1$ is satisfied for some integers $m$ and $n$. But this is exactly the condition that $\gcd(a, b) = 1$. $\square$

If $p$ is a prime number, then every nonzero element of the ring $\mathbb{Z}/p\mathbb{Z}$ is invertible. A ring with this property is called a field.$^{2}$ The field $\mathbb{Z}/p\mathbb{Z}$ is also denoted by $\mathbb{F}_p$.

Corollary A.2.2. If $\gcd(x, q) = 1$ and $\gcd(y, q) = 1$, then $\gcd(xy, q) = 1$.

Proof. Using Corollary A.2.1, we reformulate the statement as follows: if $(x \bmod q)$ and $(y \bmod q)$ are invertible residues, then $(xy \bmod q)$ is also invertible. But this is obvious. $\square$

Theorem A.4 ("Unique factorization"). Any nonzero integer $x$ can be represented in the form $x = \pm p_1 \cdots p_m$, where $p_1, \dots, p_m$ are prime numbers. This representation is unique up to the order of factors.

Proof. Existence. Without loss of generality we may assume that $x$ is positive. If $x = 1$, then the factorization is trivial: the number of factors is zero. For $x > 1$ we use induction. If $x$ is a prime, then we are done; otherwise $x = yz$ for some $1 < y, z < x$, and $y$ and $z$ are already factored into primes.

$^{2}$ Fields play an important role in many parts of mathematics, e.g., in linear algebra: vectors with coefficients in an arbitrary field have essentially the same properties as real or complex vectors.


Uniqueness. This is less trivial. Again, we use induction: assume that $x > 1$ and that the uniqueness holds for all numbers from 1 through $x - 1$. Suppose that $x$ has two factorizations:
$$x = p_1 \cdots p_m = q_1 \cdots q_n.$$
First, we show that $p_1 \in \{q_1, \dots, q_n\}$. Indeed, if it were not the case, we would have $\gcd(p_1, q_j) = 1$ for all $j$. Hence $\gcd(p_1, x) = 1$ (due to Corollary A.2.2), which contradicts the fact that $p_1 \mid x$ (due to the first factorization).

By changing the order of $q$'s, we can arrange that $p_1 = q_1$. Then we infer that there is a number $y < x$ with two factorizations, namely, $y = p_2 \cdots p_m = q_2 \cdots q_n$. By the induction assumption, the factorizations of $y$ coincide (up to the order of factors), so that the same is true for $x = p_1 y$. $\square$

It is often convenient to gather repeating prime factors, i.e., to write
$$x = \pm p_1^{\alpha_1} \cdots p_k^{\alpha_k},$$
where all $p_j$ are distinct. Theorem A.4 implies many other "obvious" properties of integers, e.g., the following one.

Corollary A.4.1. Let $a$ and $b$ be nonzero integers, and
$$c = \frac{|ab|}{\gcd(a, b)}.$$
Then, for any integer $x$, the conditions $a \mid x$ and $b \mid x$ imply that $c \mid x$.

The number $c$ is called the least common multiple of $a$ and $b$.

A.3. Chinese remainder theorem. Let $b$ and $q$ be positive integers such that $b \mid q$. Then (mod $q$)-residues can be unambiguously converted to (mod $b$)-residues. Indeed,
$$\text{if } x' \equiv x'' \pmod{q}, \text{ then } x' \equiv x'' \pmod{b}.$$
Therefore a map $\lambda_{q,b} : \mathbb{Z}/q\mathbb{Z} \to \mathbb{Z}/b\mathbb{Z}$ is defined, for example,
$$\lambda_{6,3} :\quad 0 \mapsto 0, \quad 1 \mapsto 1, \quad 2 \mapsto 2, \quad 3 \mapsto 0, \quad 4 \mapsto 1, \quad 5 \mapsto 2.$$
This map is a ring homomorphism, i.e., it is consistent with the arithmetic operations (see Definition A.2 below).

Now, let $b_1 \mid q$ and $b_2 \mid q$. Any (mod $q$)-residue $x$ can be converted into a (mod $b_1$)-residue $x_1$, as well as a (mod $b_2$)-residue $x_2$. Thus a map $x \mapsto (x_1, x_2)$ is defined; we denote it by $\lambda_{q,(b_1,b_2)}$.

Theorem A.5 (Chinese remainder theorem). Let $q = b_1 b_2$, where $b_1$ and $b_2$ are positive integers such that $\gcd(b_1, b_2) = 1$. Then the map
$$\lambda_{q,(b_1,b_2)} : \mathbb{Z}/q\mathbb{Z} \to (\mathbb{Z}/b_1\mathbb{Z}) \times (\mathbb{Z}/b_2\mathbb{Z})$$
is an isomorphism of rings.

(The ring structure on the Cartesian product of rings will be defined below; see Definition A.3.)

Abstract terminology aside, Theorem A.5 says that the map $\lambda_{q,(b_1,b_2)}$ is one-to-one. In other words, for any $a_1$, $a_2$ the system
$$x \equiv a_1 \pmod{b_1}, \qquad x \equiv a_2 \pmod{b_2} \tag{A.1}$$
has a unique, up to (mod $q$)-congruence, solution. Indeed, the existence of a solution says that $\lambda_{q,(b_1,b_2)}$ is a surjective (onto) map, whereas the uniqueness is equivalent to the condition that $\lambda_{q,(b_1,b_2)}$ is injective.

Proof. We will first prove that $\lambda_{q,(b_1,b_2)}$ is injective, i.e., any two solutions to system (A.1) are congruent modulo $q$. Let $x'$ and $x''$ be such solutions; then $x = x' - x''$ satisfies
$$x \equiv 0 \pmod{b_1}, \qquad x \equiv 0 \pmod{b_2},$$
i.e., $b_1 \mid x$ and $b_2 \mid x$. Therefore $c \mid x$, where $c$ is the least common multiple of $b_1$ and $b_2$ (see Corollary A.4.1). But $\gcd(b_1, b_2) = 1$, hence $c = b_1 b_2 = q$.

Thus, $\lambda_{q,(b_1,b_2)}$ maps the set $\mathbb{Z}/q\mathbb{Z}$ to the set $(\mathbb{Z}/b_1\mathbb{Z}) \times (\mathbb{Z}/b_2\mathbb{Z})$ injectively. Both sets consist of the same number of elements, $q = b_1 b_2$; therefore $\lambda_{q,(b_1,b_2)}$ is a one-to-one map. $\square$
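The proof above is a counting argument and does not produce the solution of system (A.1). A constructive sketch is given below under our own naming (`split` for the map of Theorem A.5, `crt_pair` for its inverse); it uses the built-in `pow(b1, -1, b2)` for the modular inverse, which requires Python 3.8 or later.

```python
from math import gcd


def split(x, b1, b2):
    """The map of Theorem A.5: a residue mod b1*b2 to a pair of residues."""
    return (x % b1, x % b2)


def crt_pair(a1, b1, a2, b2):
    """Solve x = a1 (mod b1), x = a2 (mod b2) for coprime b1, b2.

    Returns the unique solution x in {0, ..., b1*b2 - 1}.
    """
    if gcd(b1, b2) != 1:
        raise ValueError("moduli must be coprime")
    inv = pow(b1, -1, b2)  # inverse of b1 modulo b2 (Python 3.8+)
    # x = a1 + b1*s with s chosen so that x = a2 (mod b2).
    s = (a2 - a1) * inv % b2
    return (a1 + b1 * s) % (b1 * b2)


x = crt_pair(2, 3, 3, 5)              # x = 2 (mod 3), x = 3 (mod 5)
images = {split(y, 2, 3) for y in range(6)}
```

The set `images` contains six distinct pairs, in line with the statement that the map is one-to-one for $q = 6 = 2 \cdot 3$.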

We will now explain the abstract terms used in the formulation of Theorem A.5 and derive one corollary.

Definition A.2. Let $A$ and $B$ be rings. Denote the zero and the unit elements in these rings by $0_A$, $0_B$, $1_A$, $1_B$, respectively. A map $f : A \to B$ is called a ring homomorphism if
$$f(x + y) = f(x) + f(y), \qquad f(xy) = f(x)\, f(y), \qquad f(1_A) = 1_B$$
for any $x, y \in A$. (Note that the property $f(0_A) = 0_B$ follows automatically.)

If the homomorphism $f$ is a one-to-one map, it is called an isomorphism. (In this case the inverse map $f^{-1}$ exists and is also an isomorphism.)

Definition A.3. The direct product of rings $A_1$ and $A_2$ is the set of pairs $A_1 \times A_2 = \{(x_1, x_2) : x_1 \in A_1,\ x_2 \in A_2\}$ endowed with componentwise ring operations:
$$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1,\ x_2 + y_2),$$
$$(x_1, x_2) \cdot (y_1, y_2) = (x_1 y_1,\ x_2 y_2),$$
$$0_{A_1 \times A_2} = (0_{A_1}, 0_{A_2}), \qquad 1_{A_1 \times A_2} = (1_{A_1}, 1_{A_2}).$$

Similarly one can define the product of any number of rings.

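Corollary A.5.1. If $q = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ is the factorization of $q$, then there exists a ring isomorphism
$$\mathbb{Z}/q\mathbb{Z} \cong \mathbb{Z}/p_1^{\alpha_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/p_k^{\alpha_k}\mathbb{Z}.$$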

A.4. The structure of finite Abelian groups. We assume that the reader is familiar with the basic concepts of group theory (group homomorphism, cosets, quotient group, etc.) and can use them in simple cases, e.g., $\mathbb{Z}_4/\mathbb{Z}_2 \cong \mathbb{Z}_2$ but $\mathbb{Z}_4 \not\cong \mathbb{Z}_2 \times \mathbb{Z}_2$. Also important is Lagrange's theorem, which says that the order of a subgroup divides the order of the group.

First, we consider a cyclic group $G = \{\zeta^k : k = 0, \dots, q-1\} \cong \mathbb{Z}_q$. Note that the choice of the generator $\zeta \in G$ is not unique: any element of the form $\zeta^k$, where $\gcd(k, q) = 1$, also generates the group. Another simple observation: if $q = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ is the factorization of $q$, then
$$\mathbb{Z}_q \cong \mathbb{Z}_{q_1} \times \cdots \times \mathbb{Z}_{q_k}, \qquad q_j = p_j^{\alpha_j}. \tag{A.2}$$
(We emphasize that here $\cong$ stands for an isomorphism of groups, not rings.) This property follows from Corollary A.5.1: we just need to keep the additive structure of the rings and forget about the multiplication. We will call the group $\mathbb{Z}_{p^\alpha}$ (where $p$ is prime) a primitive cyclic group.

Theorem A.6. Let $G$ be a finite Abelian group of order $q = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$. Then $G$ can be decomposed into a direct product of primitive cyclic groups:
$$G \cong \left( \prod_{r=1}^{\alpha_1} \bigl(\mathbb{Z}_{p_1^r}\bigr)^{m(p_1, r)} \right) \times \cdots \times \left( \prod_{r=1}^{\alpha_k} \bigl(\mathbb{Z}_{p_k^r}\bigr)^{m(p_k, r)} \right). \tag{A.3}$$
The numbers $m(p_j, r)$ are uniquely determined by the group $G$.

Note that the isomorphism in (A.3) may not be unique; there is no preferred way to choose one decomposition over another. However, the products in parentheses are defined canonically by the group $G$.

Corollary A.6.1. If an Abelian group $G$ of order $q$ is not cyclic, then there is a nontrivial divisor $n \mid q$ such that $\forall x \in G\ (x^n = 1)$.

Proof. Let $q = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ be the factorization of $q$. The group $G$ is cyclic if and only if
$$m(p_j, r) = \begin{cases} 1 & \text{if } r = \alpha_j, \\ 0 & \text{otherwise.} \end{cases}$$
Since $G$ is not cyclic, the above condition is violated for some $j$. However, $\sum_r r \cdot m(p_j, r) = \alpha_j$; therefore $m(p_j, \alpha_j) = 0$. If $n = q/p_j$, then $x^n = 1$ for all $x \in G$. $\square$


The proof of Theorem A.6 requires some preparation. Since $G$ is Abelian, the map $\varphi_a : x \mapsto x^a$ (where $a$ is an arbitrary integer) is a homomorphism of $G$ into itself. Let us define the following subgroups in $G$:
$$G^{(a)} = \operatorname{Im} \varphi_a = \{x^a : x \in G\}, \qquad G_{(a)} = \operatorname{Ker} \varphi_a = \{x \in G : x^a = 1\}. \tag{A.4}$$

Lemma A.7. If $\gcd(a, b) = 1$, then $G_{(ab)} = G_{(a)} \times G_{(b)}$. In other words, for any $z \in G_{(ab)}$ there are unique $x \in G_{(a)}$ and $y \in G_{(b)}$ such that $xy = z$.

Proof. Let $ma + nb = 1$. Then we can choose $x = z^{nb}$, $y = z^{ma}$, which proves the existence of $x$ and $y$.

On the other hand, $G_{(a)} \cap G_{(b)} = \{1\}$. Indeed, if $u \in G_{(a)} \cap G_{(b)}$, then $u = u^{ma+nb} = u^{ma} u^{nb} = 1 \cdot 1 = 1$. This implies the uniqueness. $\square$

Lemma A.8. If $p$ is a prime factor of $|G|$, then $G$ contains an element of order $p$.

Proof. We will use induction on $q = |G|$. Suppose that $q > 1$ and that the lemma holds for all Abelian groups of order $q' < q$. It suffices to show that $G$ has an element whose order is a multiple of $p$.

Let $x$ be a nontrivial element of $G$. If $p$ divides the order of $x$, then we are done. Otherwise, let $H$ be the subgroup generated by $x$, and $G' = G/H$ the corresponding quotient group. In this case $p$ divides $|G'|$.

By the induction assumption, there is a nontrivial element $y' \in G'$ of order $p$. It is clear that $(y')^k = 1$ if and only if $p \mid k$. Let $y \in G$ be a member of the coset $y'$ (recall that the quotient group is formed by cosets). Then $y^k \in H$ if and only if $p \mid k$. Therefore the order of $y$ is a multiple of $p$. $\square$

Proof of Theorem A.6. Lemma A.7 already shows that
$$G = \prod_{j=1}^{k} G_{(q_j)}, \qquad \text{where } q_j = p_j^{\alpha_j}.$$
By Lemma A.8, $|G_{(q_j)}|$ has no prime factors other than $p_j$; therefore $|G_{(q_j)}| = q_j = p_j^{\alpha_j}$. We need to split the subgroups $G_{(q_j)}$ even further.

Let us fix $j$ and drop it from notation. The subgroup $G_{(p)}$ is special in that all its elements have order $p$ (or 1). Therefore we may regard it as a linear space over the field $\mathbb{F}_p$ (the group multiplication plays the role of vector addition). Let
$$L_{p,r} = G_{(p)} \cap G^{(p^r)}, \qquad m(p, r) = \dim L_{p,r-1} - \dim L_{p,r}.$$
It is clear that $G_{(p)} = L_{p,0} \supseteq L_{p,1} \supseteq \cdots \supseteq L_{p,\alpha-1} \supseteq L_{p,\alpha} = \{1\}$.

We choose a basis in the space $L_{p,0}$ in such a way that the first $m(p, \alpha)$ basis vectors belong to $L_{p,\alpha-1}$, the next $m(p, \alpha-1)$ vectors belong to $L_{p,\alpha-2}$, and so on. For each basis vector $e \in L_{p,r-1} \setminus L_{p,r}$ (where $\setminus$ denotes the set difference, i.e., $e \in L_{p,r-1}$ but $e \notin L_{p,r}$) we find an element $v \in G_{(q)}$ such that $v^{p^{r-1}} = e$. Powers of $v$ form a cyclic subgroup of order $p^r$. One can show that $G_{(q)}$ is the direct product of these subgroups (the details are left to the reader). $\square$

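A.5. The structure of the group $(\mathbb{Z}/q\mathbb{Z})^*$. Let $q = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ be the factorization of $q$. Due to Corollary A.5.1, there is a group isomorphism
$$(\mathbb{Z}/q\mathbb{Z})^* \cong \bigl(\mathbb{Z}/p_1^{\alpha_1}\mathbb{Z}\bigr)^* \times \cdots \times \bigl(\mathbb{Z}/p_k^{\alpha_k}\mathbb{Z}\bigr)^*.$$
Therefore it is sufficient to study the group $(\mathbb{Z}/p^{\alpha}\mathbb{Z})^*$, where $p$ is a prime number. We begin with the case $\alpha = 1$.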

Let $p$ be a prime number. All nonzero (mod $p$)-residues are invertible, hence $\bigl|(\mathbb{Z}/p\mathbb{Z})^*\bigr| = p - 1$. The order of a group element always divides the order of the group (by Lagrange's theorem); therefore the order of any element in $(\mathbb{Z}/p\mathbb{Z})^*$ divides $p - 1$. Thus we have obtained the following result.

Theorem A.9 (Fermat's little theorem). If $p$ is a prime number and $x \not\equiv 0 \pmod{p}$, then $x^{p-1} \equiv 1 \pmod{p}$.

The next theorem fully characterizes the group $(\mathbb{Z}/p\mathbb{Z})^*$.

Theorem A.10. If $p$ is a prime number, then $(\mathbb{Z}/p\mathbb{Z})^*$ is a cyclic group of order $p - 1$.

Proof. Suppose the Abelian group $G = (\mathbb{Z}/p\mathbb{Z})^*$ is not cyclic. By Corollary A.6.1, there exists some integer $n$, $0 < n < p - 1$, such that $x^n = 1$ for all $x \in G$. Therefore the equation $x^n - 1 = 0$ has $p - 1$ solutions in the field $\mathbb{F}_p$ (the solutions are the nonzero elements of the field).

The function $f(x) = x^n - 1$ is a polynomial of degree $n$ with $\mathbb{F}_p$ coefficients. A polynomial of degree $n$ with coefficients in an arbitrary field has at most $n$ roots in that field. Indeed, if $a_1$ is a root, then $f(x) = (x - a_1) f_1(x)$, where $f_1$ is a polynomial of degree $n - 1$. If $a_2$ is another root, then $f_1(a_2) = (a_2 - a_1)^{-1} f(a_2) = 0$; therefore $f_1(x) = (x - a_2) f_2(x)$. This process can continue for at most $n$ steps because each polynomial $f_k$ has degree $n - k$.

But the number of roots is $p - 1 > n$. We have arrived at a contradiction. $\square$

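Example. The group $(\mathbb{Z}/13\mathbb{Z})^*$ is generated by the element 2; see the table:
$$\begin{array}{l|cccccccccccc}
k \in \mathbb{Z}_{12} & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 \\ \hline
(2^k \bmod 13) \in (\mathbb{Z}/13\mathbb{Z})^* & 1 & 2 & 4 & 8 & 3 & 6 & 12 & 11 & 9 & 5 & 10 & 7
\end{array}$$
Note that 6, 11 and 7 are also generators of $(\mathbb{Z}/13\mathbb{Z})^*$. Indeed, if $\zeta$ is a generator of $(\mathbb{Z}/p\mathbb{Z})^*$, then the function $k \mapsto \zeta^k$ maps $\mathbb{Z}_{p-1}$ to $(\mathbb{Z}/p\mathbb{Z})^*$ isomorphically. Thus the element $\zeta^k$ generates the group $(\mathbb{Z}/p\mathbb{Z})^*$ if and only if $k$ generates $\mathbb{Z}_{p-1}$, i.e., if $\gcd(k, p-1) = 1$. In our case, the numbers 2, 6, 11, 7 (the generators of $(\mathbb{Z}/13\mathbb{Z})^*$) correspond to the invertible (mod 12)-residues $k = 1, 5, 7, 11$.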
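The table and the list of generators can be reproduced by a short computation. This is a brute-force sketch with illustrative helper names (`order`, `generators`); it finds generators by computing element orders directly and then compares the result with the criterion $\gcd(k, p-1) = 1$.

```python
from math import gcd


def order(x, p):
    """Multiplicative order of x modulo a prime p (x not divisible by p)."""
    k, y = 1, x % p
    while y != 1:
        y = y * x % p
        k += 1
    return k


def generators(p):
    """All generators of the cyclic group (Z/pZ)^* for a prime p."""
    return [x for x in range(1, p) if order(x, p) == p - 1]


powers_of_2 = [pow(2, k, 13) for k in range(12)]          # second row of the table
gens = generators(13)
from_exponents = sorted(pow(2, k, 13) for k in range(12) if gcd(k, 12) == 1)
```

Both routes give the same set of generators, illustrating that $2^k$ generates the group exactly when $k$ is coprime to 12.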

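Theorem A.11. Let $p$ be a prime number, and $\alpha \ge 1$ an integer.

1. If $p \ne 2$, then $(\mathbb{Z}/p^{\alpha}\mathbb{Z})^* \cong \mathbb{Z}_{p-1} \times \mathbb{Z}_{p^{\alpha-1}} \cong \mathbb{Z}_{(p-1)p^{\alpha-1}}$.

2. $(\mathbb{Z}/2^{\alpha}\mathbb{Z})^* \cong \mathbb{Z}_2 \times \mathbb{Z}_{2^{\alpha-2}}$ for $\alpha \ge 2$.

(The isomorphism $\mathbb{Z}_{p-1} \times \mathbb{Z}_{p^{\alpha-1}} \cong \mathbb{Z}_{(p-1)p^{\alpha-1}}$ is due to formula (A.2).)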

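Proof. An element $x \in \mathbb{Z}/p^{\alpha}\mathbb{Z}$ is invertible if and only if $(x \bmod p) \ne 0$. Therefore $\bigl|(\mathbb{Z}/p^{\alpha}\mathbb{Z})^*\bigr| = (p-1)p^{\alpha-1}$.

Let us denote $G = (\mathbb{Z}/p^{\alpha}\mathbb{Z})^*$ and introduce a sequence of subgroups $G \supset H_0 \supset H_1 \supset \cdots \supset H_{\alpha-1} = \{1\}$ defined as follows:
$$H_r = \bigl\{1 + p^{r+1}x : x \in \mathbb{Z}/p^{\alpha-r-1}\mathbb{Z}\bigr\} \quad \text{for } r = 0, \dots, \alpha-1. \tag{A.5}$$
Note that if $a \in H_{r-1}$, then $a^p \in H_r$ (for $1 \le r \le \alpha-1$). Indeed,
$$(1 + p^r x)^p = 1 + \binom{p}{1} p^r x + \binom{p}{2} p^{2r} x^2 + \cdots \in H_r \qquad (x \in \mathbb{Z}/p^{\alpha-r}\mathbb{Z}). \tag{A.6}$$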

In the rest of the proof we consider the two cases separately.

1. $p \ne 2$. In this case $G = G_{(p-1)} \times G_{(p^{\alpha-1})}$ (the notation was defined in Section A.4; see (A.4)). The subgroup $G_{(p^{\alpha-1})}$ is easily identified: it coincides with $H_0$ defined by (A.5). Indeed, if $a \in H_0$, then $a^{p^{\alpha-1}} = 1$; therefore $H_0 \subseteq G_{(p^{\alpha-1})}$. On the other hand, $|G_{(p^{\alpha-1})}| = |H_0| = p^{\alpha-1}$.

We will not find the subgroup $G_{(p-1)}$ explicitly, but rather give an abstract argument:
$$G_{(p-1)} \cong G/G_{(p^{\alpha-1})} = G/H_0 \cong (\mathbb{Z}/p\mathbb{Z})^* \cong \mathbb{Z}_{p-1}.$$

It remains to check that $G_{(p^{\alpha-1})} = H_0$ is isomorphic to $\mathbb{Z}_{p^{\alpha-1}}$. To this end, it suffices to prove that any element of the set difference $H_0 \setminus H_1$ has order $p^{\alpha-1}$. If we apply the map $\varphi_p : x \mapsto x^p$ repeatedly, $H_0$ is mapped to $H_1$, then to $H_2$, and so on. We need to show that this shift over the subgroups takes place one step at a time, i.e., if $a \in H_{r-1} \setminus H_r$, then $a^p \in H_r \setminus H_{r+1}$.

The condition $a \in H_{r-1} \setminus H_r$ can be represented as follows: $a = 1 + p^r x$, where $x \in \mathbb{Z}/p^{\alpha-r}\mathbb{Z}$ and $x \not\equiv 0 \pmod{p}$. We use Equation (A.6) again. Note that the terms denoted by the ellipsis contain $p$ raised to the power $3r \ge r + 2$ or higher, so they are not important. Moreover, $\binom{p}{2} = \frac{p(p-1)}{2}$ is divisible by $p$. Therefore,
$$a^p = (1 + p^r x)^p \equiv 1 + p^{r+1} x + p^{r+1} \binom{p}{2} p^{r-1} x^2 \equiv 1 + p^{r+1} x \pmod{p^{r+2}},$$
so that $a^p \in H_r$ but $a^p \notin H_{r+1}$.

2. $p = 2$. The only reason why the previous proof does not work is that $\binom{2}{2} = 1$ is not divisible by 2. But the last argument is still correct for $r > 1$; therefore $H_1 \cong \mathbb{Z}_{2^{\alpha-2}}$. On the other hand, $G = \{1, -1\} \times H_1$. $\square$

A.6. Euclid's algorithm. Once again, let $a$ and $b$ be integers, at least one of which is not 0. How does one compute $\gcd(a, b)$ and solve the equation $ma + nb = \gcd(a, b)$ efficiently?

Euclid's algorithm. Without loss of generality, we may assume that $b > 0$. We set $x_0 = a$, $y_0 = b$ and iterate the transformation
$$(x_{j+1}, y_{j+1}) = (y_j,\ x_j \bmod y_j) \tag{A.7}$$
until we get a pair of the form $(x_t, y_t) = (d, 0)$. This $d$ is equal to $\gcd(a, b)$. Indeed, any common divisor of $x$ and $y$ is a common divisor of $y$ and $x \bmod y$, and vice versa. Therefore $\gcd(a, b) = \gcd(d, 0) = d$.

The complexity of the algorithm will be estimated later, after we describe additional steps that are needed to find $m$ and $n$ satisfying $ma + nb = d$.
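Transformation (A.7) translates almost literally into code. The sketch below assumes $b > 0$, as in the text; the function name `euclid_gcd` is ours.

```python
def euclid_gcd(a, b):
    """Greatest common divisor via the iteration (A.7), assuming b > 0."""
    if b <= 0:
        raise ValueError("this sketch assumes b > 0")
    x, y = a, b
    while y != 0:
        # (x_{j+1}, y_{j+1}) = (y_j, x_j mod y_j); Python's % is >= 0 for y > 0
        x, y = y, x % y
    return x  # the pair is now (d, 0)


d = euclid_gcd(12, 18)
```

For $a = 12$, $b = 18$ the pairs visited are $(12, 18)$, $(18, 12)$, $(12, 6)$, $(6, 0)$.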

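We first give some analysis of the algorithm. Procedure (A.7) can be represented as follows:
$$\begin{pmatrix} x_j \\ y_j \end{pmatrix} = \begin{pmatrix} k_j & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_{j+1} \\ y_{j+1} \end{pmatrix}, \qquad \begin{pmatrix} x_t \\ y_t \end{pmatrix} = \begin{pmatrix} d \\ 0 \end{pmatrix}, \tag{A.8}$$
where $k_j = \lfloor x_j / y_j \rfloor$. Note that $k_0$ is an arbitrary integer, while $k_1, \dots, k_{t-1}$ are positive; moreover, $k_{t-1} > 1$ if $t > 1$. Thus we have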

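$$\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} k_0 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} k_{t-1} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} d \\ 0 \end{pmatrix}. \tag{A.9}$$
The product of the matrices here is denoted by $A_t$. Let us also introduce partial products,
$$A_j = \begin{pmatrix} k_0 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} k_{j-1} & 1 \\ 1 & 0 \end{pmatrix}. \tag{A.10}$$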

It is easy to see that $\det(A_j) = (-1)^j$, and that
$$A_j = \begin{pmatrix} p_j & p_{j-1} \\ q_j & q_{j-1} \end{pmatrix}, \qquad \begin{aligned} p_0 &= 1, & p_{-1} &= 0, & p_{j+1} &= k_j p_j + p_{j-1}, \\ q_0 &= 0, & q_{-1} &= 1, & q_{j+1} &= k_j q_j + q_{j-1}. \end{aligned} \tag{A.11}$$
Equation (A.9) says that $a = p_t d$ and $b = q_t d$. (Therefore $p_t/q_t$ is an irreducible fraction representation of the rational number $a/b$.) On the other hand, $p_t q_{t-1} - p_{t-1} q_t = \det(A_t) = (-1)^t$, hence
$$(-1)^t (q_{t-1} a - p_{t-1} b) = d.$$
Thus we have solved the equation $ma + nb = d$. We summarize the result as follows.

Extended Euclid's algorithm (for solving the equation $ma + nb = d$). We iterate transformation (A.7), computing the ratios $k_j = \lfloor x_j / y_j \rfloor$ on the way. Then we compute $p_j$, $q_j$ according to (A.11) (this can be made a part of the iterative procedure as well). The answer to the problem is as follows:
$$m = (-1)^t q_{t-1}, \qquad n = (-1)^{t-1} p_{t-1}. \tag{A.12}$$
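The procedure just summarized can be sketched in a few lines, with the recurrences (A.11) folded into the main loop. The function name `extended_euclid` and the variable names are ours; the sketch assumes $b > 0$ as in the text.

```python
def extended_euclid(a, b):
    """Return (d, m, n) with m*a + n*b == d == gcd(a, b), assuming b > 0.

    Follows (A.7), (A.11) and (A.12): p, q track p_j, q_j and
    p_prev, q_prev track p_{j-1}, q_{j-1}.
    """
    if b <= 0:
        raise ValueError("this sketch assumes b > 0")
    x, y = a, b
    p_prev, p = 0, 1   # p_{-1}, p_0
    q_prev, q = 1, 0   # q_{-1}, q_0
    t = 0
    while y != 0:
        k = x // y                      # k_j = floor(x_j / y_j)
        x, y = y, x - k * y             # transformation (A.7)
        p_prev, p = p, k * p + p_prev   # p_{j+1} = k_j p_j + p_{j-1}
        q_prev, q = q, k * q + q_prev   # q_{j+1} = k_j q_j + q_{j-1}
        t += 1
    d = x
    sign = -1 if t % 2 else 1           # (-1)^t
    m = sign * q_prev                   # (-1)^t q_{t-1}
    n = -sign * p_prev                  # (-1)^{t-1} p_{t-1}
    return d, m, n


d, m, n = extended_euclid(12, 18)
```

For $a = 12$, $b = 18$ the loop runs $t = 3$ times and ends with $p_t = 2$, $q_t = 3$, so that $a = p_t d$ and $b = q_t d$ with $d = 6$.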

Let us estimate the complexity of the algorithm in terms of the problem size $s$, i.e., the total number of digits in the binary representations of $a$ and $b$. Using formula (A.8) and the conditions $k_1, \dots, k_{t-1} \ge 1$, $d \ge 1$, we obtain the componentwise inequality
$$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \ge \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{t-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} F_t \\ F_{t-1} \end{pmatrix}, \qquad \text{where } F_t = \frac{\phi^t - (-\phi)^{-t}}{\sqrt{5}}.$$
(Here $F_0 = 0$, $F_1 = 1$, $F_{j+1} = F_j + F_{j-1}$ are the Fibonacci numbers, whereas $\phi = \frac{1 + \sqrt{5}}{2}$ is the golden ratio.) Therefore $b = x_1 \ge F_t = \Omega(\phi^t)$, and $t \le O(\log b) = O(s)$. Each application of transformation (A.7), as well as the computation of $k_j$, $p_j$, $q_j$, is performed by an $O(s^2)$-size circuit. Therefore the overall complexity is $O(s^3)$.

A.7. Continued fractions. Euclid's algorithm can be viewed as a procedure for converting the fraction $z = a/b$ into the irreducible fraction $p_t/q_t$. It turns out that some steps in this procedure (namely, the computation of $k_0, \dots, k_{t-1}$) can be formulated in terms of rational numbers, or even real numbers. Indeed, let us define $z_j = x_j / y_j$. Then equations (A.7) and (A.8) become
$$z_{j+1} = \frac{1}{\operatorname{frac}(z_j)}, \qquad k_j = \lfloor z_j \rfloor, \qquad \text{where } \operatorname{frac}(x) \overset{\text{def}}{=} x - \lfloor x \rfloor; \tag{A.13}$$
$$z = z_0, \qquad z_j = k_j + \frac{1}{z_{j+1}}, \qquad z_t = \infty. \tag{A.14}$$

(Note that $z_j > 1$ for all $j \ge 1$.) Thus we obtain a representation of $z$ in the form of a finite continued fraction $[k_0, k_1, \dots, k_{t-1}]$ with terms $k_j$, which is defined as follows:
$$[k_0, k_1, \dots, k_{t-1}] \overset{\text{def}}{=} k_0 + \cfrac{1}{k_1 + \cfrac{1}{\ddots + \cfrac{1}{k_{t-1} + \cfrac{1}{\infty}}}}, \qquad k_j \in \mathbb{Z}, \quad k_1, \dots, k_{t-1} \ge 1. \tag{A.15}$$
We call a continued fraction canonical if $t = 1$ or $k_{t-1} > 1$; the procedure described by equation (A.13) guarantees this property. (If we started with an irrational number $z$, we would get an infinite continued fraction, which is always considered canonical.)
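For a rational $z$, the iteration (A.13) can be carried out exactly with Python's `fractions.Fraction`. The sketch below uses our own function names (`continued_fraction`, `evaluate`); it stops when $\operatorname{frac}(z_j) = 0$, which corresponds to $z_{j+1} = \infty$.

```python
from fractions import Fraction
from math import floor


def continued_fraction(z):
    """Terms [k_0, ..., k_{t-1}] of the canonical continued fraction of a rational z."""
    z = Fraction(z)
    terms = []
    while True:
        k = floor(z)          # k_j = floor(z_j)
        terms.append(k)
        frac = z - k          # frac(z_j)
        if frac == 0:         # z_{j+1} would be infinite: stop
            return terms
        z = 1 / frac          # z_{j+1} = 1 / frac(z_j)


def evaluate(terms):
    """Value of the finite continued fraction [k_0, ..., k_{t-1}] as in (A.15)."""
    value = Fraction(terms[-1])
    for k in reversed(terms[:-1]):
        value = k + 1 / value
    return value


terms = continued_fraction(Fraction(415, 93))
```

Evaluating the resulting list with `evaluate` returns the original fraction; the last term exceeds 1 whenever there is more than one term, so the expansion is canonical.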

Proposition A.12.

1. Any real number has exactly one canonical continued fraction representation.

2. A rational number with canonical representation $[k_0, k_1, \dots, k_{t-1}]$ has exactly one noncanonical representation, namely, $[k_0, k_1, \dots, k_{t-1} - 1, 1]$.

(The proof is left as an exercise to the reader.)

What are continued fractions good for? We will see that the first $j$ terms of the canonical continued fraction for $z$ provide a good approximation of $z$ by a rational number $p_j/q_j$, meaning that $|z - p_j/q_j| = O(q_j^{-2})$. All sufficiently good approximations are obtained by this procedure (see Theorem A.13 below). To put it differently: if $z$ is a sufficiently good approximation of a rational number $p/q$, we can find that number by examining the continued fraction representation of $z$.

To deal with partial continued fraction expansions, we define the function

$$[k_0, k_1, \dots, k_{j-1}](u) \stackrel{\text{def}}{=} k_0 + \cfrac{1}{k_1 + \cfrac{1}{\ddots + \cfrac{1}{k_{j-1} + \cfrac{1}{u}}}}. \tag{A.16}$$

It allows us to represent $z$ as follows: $z = [k_0, k_1, \dots, k_{j-1}](z_j)$.

To obtain an explicit formula for function (A.16), we note that it is a composition of fractional linear functions,

$$[k_0, k_1, \dots, k_{j-1}](u) = g_{k_0}(g_{k_1}(\cdots g_{k_{j-1}}(u) \cdots)), \qquad \text{where } g_k(u) = \frac{ku + 1}{u}.$$

Composing fractional linear functions is equivalent to multiplying $2 \times 2$ matrices: if $f_1(u) = (a_1 u + b_1)/(c_1 u + d_1)$ and $f_2(v) = (a_2 v + b_2)/(c_2 v + d_2)$, then $f_1(f_2(v)) = (av + b)/(cv + d)$, where

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix}.$$

Therefore

$$[k_0, k_1, \dots, k_{j-1}](u) = \frac{p_j u + p_{j-1}}{q_j u + q_{j-1}}, \tag{A.17}$$

where the integers $p_j$, $q_j$ are defined by equation (A.11).


Substituting $u = \infty$ into (A.17), we obtain the rational number

$$\frac{p_j}{q_j} = [k_0, k_1, \dots, k_{j-1}]. \tag{A.18}$$

This number is called the $j$-th convergent of $z$. For example, $p_t/q_t = z$, $p_1/q_1 = k_0 = \lfloor z \rfloor$; we may also define the 0-th convergent, $p_0/q_0 = 1/0 = \infty$. Note that the continued fraction in (A.18) is not necessarily canonical.
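The convergents can be computed by accumulating the matrix product behind (A.17). The sketch below makes two assumptions that are not spelled out in this section: the function $g_k$ corresponds to the matrix with rows $(k, 1)$ and $(1, 0)$, and the product starts from the identity matrix, which matches $p_0/q_0 = 1/0$ and $p_1/q_1 = k_0$ as stated above. The name `convergents` is hypothetical.

```python
def convergents(terms):
    """Return [(p_0, q_0), ..., (p_t, q_t)] for the terms k_0, ..., k_{t-1}."""
    # Identity matrix: first column (p_0, q_0) = (1, 0), second column (0, 1).
    p, q, p_prev, q_prev = 1, 0, 0, 1
    result = [(p, q)]
    for k in terms:
        # Right-multiply [[p, p_prev], [q, q_prev]] by the matrix [[k, 1], [1, 0]] of g_k.
        p, q, p_prev, q_prev = k * p + p_prev, k * q + q_prev, p, q
        result.append((p, q))
    return result

print(convergents([2, 3, 1, 4]))
# [(1, 0), (2, 1), (7, 3), (9, 4), (43, 19)]
```

For the terms of $43/19$ the last pair reproduces the fraction itself, in agreement with $p_t/q_t = z$.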

Let us examine the properties of convergents. Inasmuch as $p_j q_{j-1} - p_{j-1} q_j = \det(A_j) = (-1)^j$ and $q_0 < q_1 < q_2 < \cdots$, the following relations hold:

$$\frac{p_j}{q_j} - \frac{p_{j-1}}{q_{j-1}} = \frac{(-1)^j}{q_j q_{j-1}}, \qquad \frac{p_1}{q_1} < \frac{p_3}{q_3} < \cdots \le z \le \cdots < \frac{p_2}{q_2} < \frac{p_0}{q_0}. \tag{A.19}$$

(To put $z$ in the middle, we have used the fact that $z = [k_0, k_1, \dots, k_{j-1}](z_j)$ for $1 < z_j \le \infty$.) This justifies the name "convergent".
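The relations (A.19) are easy to test numerically. The following sketch assumes the same matrix recurrence for $p_j$, $q_j$ as the product of $g_k$ matrices described above, starting from the identity matrix; the function name `check_a19` is made up for this illustration.

```python
from fractions import Fraction

def check_a19(terms):
    """Check the determinant identity and the ordering of (A.19) for given terms."""
    pq = [(1, 0)]                           # (p_0, q_0) = (1, 0)
    p, q, p_prev, q_prev = 1, 0, 0, 1
    for k in terms:
        p, q, p_prev, q_prev = k * p + p_prev, k * q + q_prev, p, q
        pq.append((p, q))
    z = Fraction(p, q)                      # the last convergent p_t/q_t equals z
    # p_j q_{j-1} - p_{j-1} q_j = (-1)^j for j = 1, ..., t
    det_ok = all(pq[j][0] * pq[j - 1][1] - pq[j - 1][0] * pq[j][1] == (-1) ** j
                 for j in range(1, len(pq)))
    odd = [Fraction(*pq[j]) for j in range(1, len(pq), 2)]
    even = [Fraction(*pq[j]) for j in range(2, len(pq), 2)]   # p_0/q_0 = 1/0 is skipped
    below = all(a < b for a, b in zip(odd, odd[1:])) and all(c <= z for c in odd)
    above = all(a > b for a, b in zip(even, even[1:])) and all(c >= z for c in even)
    return det_ok and below and above

print(check_a19([2, 3, 1, 4]))   # True
```

Odd-numbered convergents increase toward $z$ from below and even-numbered ones decrease toward it from above, which is the alternation expressed by the sign $(-1)^j$.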

Theorem A.13. Let $z$ be a real number, $p/q$ an irreducible fraction, $q > 1$.

1. If $p/q$ is a convergent of $z$, then $|z - p/q| < 1/(q(q + 1))$.

2. If $|z - p/q| < 1/(q(2q - 1))$, then $p/q$ is a convergent of $z$.

Proof. Let us consider a more general problem: given the number $w = p/q$, find the set of real numbers $z$ that have $w$ among their convergents. We can represent $w$ as a canonical continued fraction $[k_0, k_1, \dots, k_{t-1}]$ and define $p_j$, $q_j$ ($j = 0, \dots, t$) using this fraction. Note that $t > 1$ (because $w$ is not an integer), and that $p_t = p$, $q_t = q$. It is easy to see that three cases are possible.

1. $z = w = p_t/q_t$.

2. The canonical continued fraction for $z$ has the form $[k_0, k_1, \dots, k_{t-1}, \dots]$. Then $z = (p_t z_t + p_{t-1})/(q_t z_t + q_{t-1})$ for $1 < z_t < \infty$; therefore $z$ lies between $p_t/q_t$ and $(p_t + p_{t-1})/(q_t + q_{t-1})$ (the ends of the interval are not included).

3. The canonical continued fraction for $z$ is $[k_0, k_1, \dots, k_{t-1} - 1, 1, \dots]$. In this case $z = (p_t z_t + p_{t-1})/(q_t z_t + q_{t-1})$, where

$$k_{t-1} + \frac{1}{z_t} = k_{t-1} - 1 + \cfrac{1}{1 + \cfrac{1}{z_{t+1}}}, \qquad 1 < z_{t+1} < \infty.$$

Solving for $z_t$ gives $z_t = -(z_{t+1} + 1)$. Thus $z_t < -2$, so that $z$ lies between $p_t/q_t$ and $(2p_t - p_{t-1})/(2q_t - q_{t-1})$.


Combining these cases, we conclude that $w$ is a convergent of $z$ if and only if $z \in I$, where $I$ is the open interval with these endpoints:

$$\frac{p_t + p_{t-1}}{q_t + q_{t-1}} = \frac{p_t}{q_t} - (-1)^t \frac{1}{q_t (q_t + q_{t-1})}, \qquad \frac{2p_t - p_{t-1}}{2q_t - q_{t-1}} = \frac{p_t}{q_t} + (-1)^t \frac{1}{q_t (2q_t - q_{t-1})}. \tag{A.20}$$

But $1 \le q_{t-1} \le q_t - 1$; therefore

$$S\!\left(\frac{p_t}{q_t}, \frac{1}{q_t (2q_t - 1)}\right) \subseteq I \subseteq S\!\left(\frac{p_t}{q_t}, \frac{1}{q_t (q_t + 1)}\right),$$

where $S(x, \delta)$ stands for the $\delta$-neighborhood of $x$. $\square$
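Both parts of Theorem A.13 can be illustrated on a concrete rational $z$. The sketch below is only a numerical check under stated assumptions: the helper names `convergents_of` and `satisfies_part_1` are hypothetical, $z$ is taken to be a `Fraction` so that all comparisons are exact, and the convergents are generated with the matrix recurrence for $g_k$ starting from the identity matrix.

```python
from fractions import Fraction
from math import floor

def convergents_of(z):
    """Yield the convergents p_j/q_j (j >= 1) of a rational z."""
    p, q, p_prev, q_prev = 1, 0, 0, 1
    while True:
        k = floor(z)                                   # k_j = floor(z_j)
        p, q, p_prev, q_prev = k * p + p_prev, k * q + q_prev, p, q
        yield Fraction(p, q)
        if z == k:                                     # frac(z_j) = 0: expansion ends
            return
        z = 1 / (z - k)                                # z_{j+1} = 1 / frac(z_j)

def satisfies_part_1(z):
    """Part 1: every convergent p/q with q > 1 is within 1/(q(q+1)) of z."""
    return all(abs(z - c) < Fraction(1, c.denominator * (c.denominator + 1))
               for c in convergents_of(z) if c.denominator > 1)

z = Fraction("2.2632")          # a decimal approximation of 43/19
w = Fraction(43, 19)
# Hypothesis of part 2: |z - p/q| < 1/(q(2q - 1)) with q = 19.
close_enough = abs(z - w) < Fraction(1, 19 * (2 * 19 - 1))
print(close_enough, w in list(convergents_of(z)), satisfies_part_1(z))
# True True True
```

Here $43/19$ is recovered from a four-digit decimal approximation, which is the way the theorem is typically applied: a sufficiently precise approximation of an unknown fraction determines it through the convergents.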


