Classical and Quantum Computation
A. Yu. Kitaev, A. H. Shen, M. N. Vyalyi
Contents
Foreword ix
Notation xiii
Introduction 1
Part 1. Classical Computation 9
1. Turing machines 9
1.1. Definition of a Turing machine 10
1.2. Computable functions and decidable predicates 11
1.3. Turing's thesis and universal machines 12
1.4. Complexity classes 14
2. Boolean circuits 17
2.1. Definitions. Complete bases 17
2.2. Circuits versus Turing machines 20
2.3. Basic algorithms. Depth, space and width 23
3. The class NP: Reducibility and completeness 28
3.1. Nondeterministic Turing machines 28
3.2. Reducibility and NP-completeness 30
4. Probabilistic algorithms and the class BPP 36
4.1. Definitions. Amplification of probability 36
4.2. Primality testing 38
4.3. BPP and circuit complexity 42
5. The hierarchy of complexity classes 45
5.1. Games machines play 45
5.2. The class PSPACE 48
Part 2. Quantum Computation 53
6. Definitions and notation 55
6.1. The tensor product 55
6.2. Linear algebra in Dirac's notation 55
6.3. Quantum gates and circuits 58
7. Correspondence between classical and quantum computation 60
8. Bases for quantum circuits 65
8.1. Exact realization 65
8.2. Approximate realization 71
8.3. Efficient approximation over a complete basis 75
9. Definition of Quantum Computation. Examples 82
9.1. Computation by quantum circuits 82
9.2. Quantum search: Grover's algorithm 83
9.3. A universal quantum circuit 88
9.4. Quantum algorithms and the class BQP 90
10. Quantum probability 92
10.1. Probability for state vectors 92
10.2. Mixed states (density matrices) 94
10.3. Distance functions for density matrices 98
11. Physically realizable transformations of density matrices 100
11.1. Physically realizable superoperators: characterization 100
11.2. Calculation of the probability for quantum computation 102
11.3. Decoherence 102
11.4. Measurements 105
11.5. The superoperator norm 108
12. Measuring operators 112
12.1. Definition and examples 112
12.2. General properties 114
12.3. Garbage removal and composition of measurements 115
13. Quantum algorithms for Abelian groups 116
13.1. The problem of hidden subgroup in $(\mathbb{Z}_2)^k$; Simon's algorithm 117
13.2. Factoring and finding the period for raising to a power 119
13.3. Reduction of factoring to period finding 120
13.4. Quantum algorithm for finding the period: the basic idea 122
13.5. The phase estimation procedure 125
13.6. Discussion of the algorithm 130
13.7. Parallelized version of phase estimation. Applications 131
13.8. The hidden subgroup problem for $\mathbb{Z}^k$ 135
14. The quantum analogue of NP: the class BQNP 138
14.1. Modification of classical definitions 138
14.2. Quantum definition by analogy 139
14.3. Complete problems 141
14.4. Local Hamiltonian is BQNP-complete 144
14.5. The place of BQNP among other complexity classes 150
15. Classical and quantum codes 151
15.1. Classical codes 153
15.2. Examples of classical codes 154
15.3. Linear codes 155
15.4. Error models for quantum codes 156
15.5. Definition of quantum error correction 158
15.6. Shor's code 161
15.7. The Pauli operators and symplectic transformations 163
15.8. Symplectic (stabilizer) codes 167
15.9. Toric code 170
15.10. Error correction for symplectic codes 172
15.11. Anyons (an example based on the toric code) 173
Part 3. Solutions 177
S1. Problems of Section 1 177
S2. Problems of Section 2 183
S3. Problems of Section 3 195
S5. Problems of Section 5 202
S6. Problems of Section 6 203
S7. Problems of Section 7 204
S8. Problems of Section 8 204
S9. Problems of Section 9 216
S10. Problems of Section 10 221
S11. Problems of Section 11 224
S12. Problems of Section 12 229
S13. Problems of Section 13 230
S15. Problems of Section 15 234
Appendix A. Elementary Number Theory 237
A.1. Modular arithmetic and rings 237
A.2. Greatest common divisor and unique factorization 239
A.3. Chinese remainder theorem 241
A.4. The structure of finite Abelian groups 243
A.5. The structure of the group $(\mathbb{Z}/q\mathbb{Z})^*$ 245
A.6. Euclid's algorithm 247
A.7. Continued fractions 248
Bibliography 253
Index 257
Foreword
In recent years interest in what is called "quantum computers" has grown
extraordinarily. The idea of using the possibilities of quantum mechanics in
organizing computation looks all the more attractive now that experimental
work has begun in this area.
However, the prospects for physical realization of quantum computers
are presently entirely unclear. Most likely this will be a matter of several
decades. The fundamental achievements in this area bear at present a purely
mathematical character.
This book is intended for a first acquaintance with the mathematical
theory of quantum computation. For the convenience of the reader, we give
at the outset a brief introduction to the classical theory of computational
complexity. The second part includes the descriptions of basic effective
quantum algorithms and an introduction to quantum codes.
The book is based on material from the course "Classical and quantum
computations", given by A. Shen (classical computations) and A. Kitaev
(quantum computations) at the Independent Moscow University in Spring of
1998. In preparing the book we also used materials from the course Physics
229: Advanced Mathematical Methods of Physics (Quantum Computation),
given by John Preskill and A. Kitaev at the California Institute of
Technology in 1998-99 (solutions to some problems included in the course
were proposed by Andrew Landahl). The original version of this book was
published in Russian [37], but the present edition extends it in many ways.
The prerequisites for reading this book are modest. In essence, it is
enough to know the basics of linear algebra (as studied in a standard
university course), elementary probability, basic notions of group theory, and a
few concepts from the theory of algorithms (some computer programming
experience may do as well as the formal knowledge). Some topics require
an acquaintance with Lie groups and homology of manifolds, but only at
the level of definitions.
To reduce the amount of information the reader needs to digest at the
first reading, part of the material is given in the form of problems and
solutions. Each problem is assigned a grade according to its difficulty: 1
for an exercise in use of definitions, 2 for a problem that requires some
work, 3 for a difficult problem which requires a nontrivial idea. (Of course,
the difficulty of a problem is a subjective thing. Also, if several problems
are based on the same idea, only the first of them is marked as difficult.)
The grade appears in square brackets before the problem number. Some
problems are marked with an exclamation sign, which indicates that they
are almost as important as the main text. Thus, [1!] means an easy but
important exercise, whereas [3] is a difficult problem which is safe to skip.
Further reading
In this book we focus on algorithm complexity (in particular, for quantum
algorithms), while many related things are not covered. As a general
reference on quantum information theory we recommend the book by
Michael Nielsen and Isaac Chuang [51], which includes such topics as the
von Neumann entropy, quantum communication channels, quantum
cryptography, fault-tolerant computation, and various proposed schemes for the
realization of a quantum computer. Another book on quantum computation
and information was written by Josef Gruska [30]. Most original papers
on the subject can be found in the electronic archive at http://arXiv.org,
section "Quantum Physics" (quant-ph).
Acknowledgements
A.K. thanks Michael Freedman and John Preskill for many inspiring
discussions on the topics included in this book. We are grateful to Andrew
Landahl for providing the solution to Problem 3.6 and pointing to some
inconsistencies in the original manuscript. Among other people who have
helped us to improve the book are David DiVincenzo and Barbara Terhal.
Thanks to the people at AMS, and especially to our patient editor Sergei
Gelfand and the copy-editor Natalya Pluzhnikov, for their help in bringing
this book into reality.
The book was written while A.K. was a member of Microsoft Research
and Caltech, and while A.S. and M.V. were members of Independent Moscow
University. The preparation of the original Russian version was started
while all three of us were working at IMU, and A.K. was a member of
L.D. Landau Institute for Theoretical Physics.
A.K. gratefully acknowledges the support from the National Science
Foundation through Caltech's Institute for Quantum Information. M.V.
acknowledges the support from the Russian Foundation for Basic Research
under grant 02-01-00547.
Notation
$\vee$                       disjunction (logical OR)
$\wedge$                     conjunction (logical AND)
$\neg$                       negation
$\oplus$                     addition modulo 2 (and also the direct sum
                             of linear subspaces)
[symbol not recoverable]     controlled NOT gate (p. 62)
␣                            blank symbol in the alphabet of a Turing machine
$\delta(\cdot,\cdot)$        transition function of a Turing machine
$\delta_{jk}$                Kronecker symbol
$\chi_S(\cdot)$              characteristic function of the set $S$
$f_\oplus$                   invertible function corresponding to the
                             Boolean function $f$ (p. 61)
$x_{n-1} \cdots x_0$         number represented by binary digits
                             $x_{n-1}, \dots, x_0$
$\gcd(x, y)$                 greatest common divisor of $x$ and $y$
$a \bmod q$                  residue of $a$ modulo $q$
$\frac{a}{b}$                representation of the rational number $a/b$ in the
                             form of an irreducible fraction
$a \mid b$                   $a$ divides $b$
$a \equiv b \pmod{q}$        $a$ is congruent to $b$ modulo $q$
$A \Rightarrow B$            $A$ implies $B$
$A \Leftrightarrow B$        $A$ is logically equivalent to $B$
$L_1 \propto L_2$            Karp reduction of predicates
                             ($L_1$ can be reduced to $L_2$ (p. 31))
$\lfloor x \rfloor$          the greatest integer not exceeding $x$
$\lceil x \rceil$            the least integer not less than $x$
$A^*$                        set of all finite words in the alphabet $A$
$E^*$                        group of characters on the Abelian group $E$,
                             i.e., $\mathrm{Hom}(E, U(1))$
$z^*$                        complex conjugate of $z$
$M^*$                        space of linear functionals on the space $M$
$\langle\xi|$                bra-vector (p. 56)
$|\xi\rangle$                ket-vector (p. 56)
$\langle\xi|\eta\rangle$     inner product
$A^\dagger$                  Hermitian adjoint operator
$\hat{G}$                    unitary operator corresponding to the
                             permutation $G$ (p. 61)
$I_L$                        identity operator on the space $L$
$\Pi_M$                      projection (the operator of projecting
                             onto the subspace $M$)
$\mathrm{Tr}_F A$            partial trace of the operator $A$ over the
                             space (tensor factor) $F$ (p. 96)
$A \cdot B$                  superoperator $\rho \mapsto A \rho B$ (p. 108)
$M^{\otimes n}$              $n$-th tensor degree of $M$
$\mathbb{C}(a, b, \dots)$    space generated by the vectors $a, b, \dots$
$\Lambda(U)$                 operator $U$ with quantum control (p. 65)
$U[A]$                       application of the operator $U$ to a quantum
                             register (set of qubits) $A$ (p. 58)
$E[A]$, $E(n, k)$            error spaces (p. 156)
$\sigma(\alpha_1, \beta_1, \dots, \alpha_n, \beta_n)$
                             basis operators on the space $\mathcal{B}^{\otimes n}$ (p. 162)
$\mathrm{SympCode}(F, \mu)$  symplectic code (p. 168)
$|\cdot|$                    cardinality of a set or modulus of a number
$\|\cdot\|$                  norm of a vector (p. 71)
                             or operator norm (p. 71)
$\|\cdot\|_{\mathrm{tr}}$    trace norm (p. 98)
$\|\cdot\|_\diamond$         superoperator norm (p. 110)
$\Pr[A]$                     probability of the event $A$
$P(\cdot \mid \cdot)$        conditional probability (in various contexts)
$P(\rho, M)$                 quantum probability (p. 95)
$f(n) = O(g(n))$             there exist numbers $C$ and $n_0$
                             such that $f(n) \le C g(n)$ for all $n \ge n_0$
$f(n) = \Omega(g(n))$        there exist numbers $C$ and $n_0$
                             such that $f(n) \ge C g(n)$ for all $n \ge n_0$
$f(n) = \Theta(g(n))$        $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ at the
                             same time
$f(n) = \mathrm{poly}(n)$    means the same as $f(n) = n^{O(1)}$
$\mathrm{poly}(n, m)$        abbreviation for $\mathrm{poly}(n + m)$
$\mathbb{N}$                 set of natural numbers, i.e., $\{0, 1, 2, \dots\}$
$\mathbb{Z}$                 set of integers
$\mathbb{R}$                 set of real numbers
$\mathbb{C}$                 set of complex numbers
$\mathbb{B}$                 classical bit (set $\{0, 1\}$)
$\mathcal{B}$                quantum bit (qubit, space $\mathbb{C}^2$; p. 53)
$\mathbb{F}_q$               finite field of $q$ elements
$\mathbb{Z}/n\mathbb{Z}$     ring of residues modulo $n$
$\mathbb{Z}_n$               additive group of the ring $\mathbb{Z}/n\mathbb{Z}$
$(\mathbb{Z}/n\mathbb{Z})^*$ multiplicative group of invertible elements of
                             $\mathbb{Z}/n\mathbb{Z}$
$\mathrm{Sp}_2(n)$           symplectic group of order $n$ over the field
                             $\mathbb{F}_2$ (p. 165)
$\mathrm{ESp}_2(n)$          extended symplectic group of order $n$ over the
                             field $\mathbb{F}_2$ (p. 164)
$L(N)$                       space of linear operators on $N$
$L(N, M)$                    space of linear operators from $N$ to $M$
$U(M)$                       group of unitary operators in the space $M$
$SU(M)$                      special unitary group in the space $M$
$SO(M)$                      special orthogonal group in the
                             Euclidean space $M$
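Notation for matrices:
$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, $K = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$;
Pauli matrices: $\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma^y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $\sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$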
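A few standard identities for these matrices can be checked numerically. The sketch below is only an illustration in plain Python: the helper names `matmul` and `close` are made up here, and the identities tested (H squared is the identity, K squared equals the Pauli z matrix, and the product of the x and y Pauli matrices equals i times the z matrix) follow directly from the definitions above.

```python
import math

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(2) for c in range(2))

s = 1 / math.sqrt(2)
IDENTITY = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]          # the matrix H from the notation list
K = [[1, 0], [0, 1j]]          # the matrix K from the notation list
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

h_squared = matmul(H, H)                # expected: identity
k_squared = matmul(K, K)                # expected: sigma_z
xy_product = matmul(SIGMA_X, SIGMA_Y)   # expected: i * sigma_z
i_sigma_z = [[1j * entry for entry in row] for row in SIGMA_Z]
```

Each Pauli matrix also squares to the identity, which can be confirmed with the same helpers.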
Notation for complexity classes:
NC (p. 23)        NP (p. 28)           BQP (p. 91)
P (p. 15)         MA (p. 138)          BQNP (p. 139)
BPP (p. 37)       $\Sigma_k$ (p. 46)   PSPACE (p. 15)
PP (p. 92)        $\Pi_k$ (p. 46)      EXPTIME (p. 22)
P/poly (p. 20)
Introduction
All computers, beginning with Babbage's unconstructed "analytical machine"¹
and ending with the Cray, are based on the very same principles. From
the logical point of view a computer consists of bits (variables, taking the
values 0 and 1), and a program, that is, a sequence of operations, each
using some bits. Of course, the newest computers work faster than old ones,
but progress in this direction is limited. It is hard to imagine that the size
of a transistor or another element will ever be smaller than $10^{-8}$ cm (the
diameter of the hydrogen atom) or that the clock frequency will be greater
than $10^{15}$ Hz (the frequency of atomic transitions), so that even the
supercomputers of the future will not be able to solve computational problems
having exponential complexity. Let us consider, for example, the problem
of factoring an integer number $x$ into primes. The obvious method is to
attempt to divide $x$ by all numbers from 2 to $\sqrt{x}$. If $x$ has $n$ digits (as written
in the binary form), we need to go through $\sim \sqrt{x} \sim 2^{n/2}$ trials. There
exists an ingenious algorithm that solves the same problem in approximately
$\exp(c\,n^{1/3})$ steps ($c$ is a constant). But even so, to factor a number of a
million digits, a time equal to the age of the Universe would not suffice. (There
may exist more effective algorithms, but it seems impossible to dispense
with the exponential.)
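The "obvious method" of dividing by all numbers from 2 up to the square root can be sketched in a few lines. This is a minimal illustration, not code from the book; the function names `factor_by_trial_division` and `worst_case_trials` are made up for this example.

```python
from math import isqrt

def factor_by_trial_division(x: int) -> list[int]:
    """Return the prime factors of x (with multiplicity), trying divisors 2, 3, ..."""
    if x < 2:
        raise ValueError("x must be at least 2")
    factors = []
    d = 2
    # Only divisors up to sqrt(x) need to be tried; x shrinks as factors are removed.
    while d <= isqrt(x):
        while x % d == 0:
            factors.append(d)
            x //= d
        d += 1
    if x > 1:
        factors.append(x)  # whatever remains is itself prime
    return factors

def worst_case_trials(n_bits: int) -> int:
    """Roughly sqrt(2**n_bits) = 2**(n_bits / 2) candidate divisors for an n-bit input."""
    return isqrt(2 ** n_bits)

example = factor_by_trial_division(91)   # 91 = 7 * 13
```

The trial count doubles every time the input grows by two bits, which is the exponential growth the paragraph refers to.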
There is, however, another way of speeding up the calculation process
for several special classes of problems. The situation is such that ordinary
computers do not employ all the possibilities that are offered to us
by nature. This assertion may seem extremely obvious: in nature there
exists a multitude of processes that are unlike operations with zeros and ones.
We might attempt to use those processes for the creation of an analog
computer. For example, interference of light can be used to compute the Fourier
transform. However, in most cases the gain in speed is not major, i.e., it
depends weakly on the size of the device. The reason lies in the fact that
the equations of classical physics (for example, Maxwell's equations) can be
effectively solved on an ordinary digital computer. What does "effectively"
mean? The calculation of an interference pattern may require more time
than the real experiment by a factor of a million, because the speed of light
is great and the wavelength is small. However, as the size of the modelled
physical system gets bigger, the required number of computational
operations grows at a moderate rate: as the size raised to some power or, as is
customarily said in complexity theory, polynomially. (As a rule, the number
of operations is proportional to the quantity $Vt$, where $V$ is the volume and
$t$ is the time.) Thus we see that classical physics is too "simple" from the
computational point of view.

¹ Charles Babbage began his work on the "analytical machine" project in 1833. In contrast
to the calculating devices that already existed at the time, his was supposed to be a universal computer.
Babbage devoted his whole life to its development, but was not successful in realizing his dream.
(A simpler, nonuniversal machine was partially constructed. In fact, this smaller project could
have been completed: in 1991 the machine was produced in accordance with Babbage's design.)
Quantum mechanics is more interesting from this perspective. Let us
consider, for example, a system of $n$ spins. Each spin has two so-called basis
states (0 = "spin up" and 1 = "spin down"), and the whole system has $2^n$
basis states $|x_1, \dots, x_n\rangle$ (each of the variables $x_1, \dots, x_n$ takes values 0 or 1).
By a general principle of quantum mechanics,
$\sum_{x_1, \dots, x_n} c_{x_1, \dots, x_n} |x_1, \dots, x_n\rangle$
is also a possible state of the system; here $c_{x_1, \dots, x_n}$ are complex numbers
called amplitudes. The summation sign must be understood as a pure
formality. In fact, the "sum" (also called a superposition) represents a new
mathematical object: a vector in a $2^n$-dimensional complex vector space.
Physically, $|c_{x_1, \dots, x_n}|^2$ is the probability to find the system in the basis state
$|x_1, \dots, x_n\rangle$ by a measurement of the values of the variables $x_j$. (We note
that such a measurement destroys the superposition.) For this to make sense,
the formula $\sum_{x_1, \dots, x_n} |c_{x_1, \dots, x_n}|^2 = 1$ must hold. Therefore, the general state
of the system (i.e., a superposition) is a unit vector in the $2^n$-dimensional
complex space. A state change over a specified time interval is described by
a unitary matrix of size $2^n \times 2^n$. If the time interval is very small ($\ll \hbar/J$,
where $J$ is the energy of spin-spin interaction and $\hbar$ is Planck's constant),
then this matrix is rather easily constructed; each of its elements is easily
calculated knowing the interaction between the spins. If, however, we want
to compute the change of the state over a large time interval, then it is
necessary to multiply such matrices. For this purpose an exponentially large
number of operations is needed. Despite much effort, no method has been
found to simplify this computation (except for some special cases). Most
plausibly, simulation of quantum mechanics is indeed an exponentially hard
computational problem. One may think this is unfortunate, but let us take
a different point of view: quantum mechanics being hard means it is
powerful. Indeed, a quantum system effectively "solves" a complex computational
problem: it models its very self.
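To make the bookkeeping concrete, the following sketch stores one complex amplitude per basis state of n spins, normalizes the vector, and reads off the measurement probabilities as squared moduli. It is an illustration only: the function names and the particular amplitudes are arbitrary choices, not taken from the book.

```python
import itertools
import math

def basis_states(n):
    """All 2**n basis labels (x_1, ..., x_n) with each x_j in {0, 1}."""
    return list(itertools.product((0, 1), repeat=n))

def normalize(amplitudes):
    """Rescale the amplitudes c_x so that the sum of |c_x|**2 equals 1."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in amplitudes.values()))
    if norm == 0:
        raise ValueError("the zero vector is not a valid state")
    return {x: c / norm for x, c in amplitudes.items()}

def measurement_probabilities(state):
    """Probability |c_x|**2 of finding each basis state x when all spins are measured."""
    return {x: abs(c) ** 2 for x, c in state.items()}

# Example with 3 spins; the amplitudes are arbitrary, chosen only for illustration.
n = 3
raw = {x: complex(1 + sum(x), -x[0]) for x in basis_states(n)}
state = normalize(raw)
probs = measurement_probabilities(state)
```

The dictionary has 2**n entries, so storing the state already takes exponential memory; this is the cost of classical simulation that the paragraph describes.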
Can we use quantum systems for solving other computational
problems? What would be a mathematical model of a quantum computer that
is just as independent of physical realization as are models of classical
computation?² It seems that these questions were first posed in 1980 in
the book by Yu. I. Manin [49]. They were also discussed in the works of
R. Feynman [23, 24] and other authors. In 1985 D. Deutsch [20] proposed a
concrete mathematical model, the quantum Turing machine, and in 1989
an equivalent but more convenient model, the quantum circuit [21] (the
latter was largely based on Feynman's ideas).
What exactly is a quantum circuit? Suppose that we have $N$ spins,
each located in a separate numbered compartment and completely isolated
from the surrounding world. At each moment of time (as a computer clock
ticks) we choose, at our discretion, any two spins and act on them with an
arbitrary $4 \times 4$ unitary matrix. A sequence of such operations is called a
quantum circuit. Each operation is determined by a pair of integers, indexing
the spins, and sixteen complex numbers (the matrix entries). So a quantum
circuit is a kind of computer program, which can be represented as text and
written on paper. The word "quantum" refers to the way this program is
executed.
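A circuit in this sense can be written down as a list of triples (matrix, first spin, second spin) and executed classically on a vector of 2**N amplitudes. The sketch below is one possible way to do this, under assumptions made here for illustration: spin j is stored as bit j of the list index, the 4 x 4 matrix is indexed by 2*x_j + x_k, and the name `apply_two_spin_gate` is hypothetical.

```python
def apply_two_spin_gate(state, gate, j, k):
    """Apply a 4x4 matrix `gate` to spins j and k of a state vector.

    `state` is a list of 2**N complex amplitudes; bit j of an index holds x_j.
    Rows and columns of `gate` are indexed by 2*x_j + x_k.
    """
    if j == k:
        raise ValueError("the two spins must be distinct")
    if (1 << max(j, k)) >= len(state):
        raise ValueError("spin index out of range for this state vector")
    new_state = [0j] * len(state)
    for idx, amp in enumerate(state):
        if amp == 0:
            continue
        col = 2 * ((idx >> j) & 1) + ((idx >> k) & 1)
        base = idx & ~((1 << j) | (1 << k))      # clear the bits of spins j and k
        for row in range(4):
            y_j, y_k = row >> 1, row & 1
            target = base | (y_j << j) | (y_k << k)
            new_state[target] += gate[row][col] * amp
    return new_state

# Controlled NOT: |x_j, x_k> -> |x_j, x_j XOR x_k>, used here as a sample unitary.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

n_spins = 3
state = [0j] * (2 ** n_spins)
state[0b001] = 1 + 0j                    # spin 0 is "1", the others are "0"
circuit = [(CNOT, 0, 2), (CNOT, 2, 1)]   # the "program": a sequence of operations
for gate, j, k in circuit:
    state = apply_two_spin_gate(state, gate, j, k)
```

Each step costs time proportional to 2**N on a classical machine, while on the hypothetical quantum device it is a single clock tick.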
Let us try to use a quantum circuit for calculating a function $F : \mathbb{B}^n \to \mathbb{B}^m$,
where $\mathbb{B} = \{0, 1\}$ is the set of values of a classical bit.³ It is necessary to
be able to enter the initial data, perform the computations, and read out the
result. Input into a quantum computer is a sequence $(x_1, \dots, x_n)$ of zeros
and ones, meaning that we prepare an initial state $|x_1, \dots, x_n, 0, \dots, 0\rangle$.
(The amount of initial data, $n$, is usually smaller than the overall number
of "memory cells," i.e., of spins, $N$. The remaining cells are filled with
zeros.) The initial data are fed into a quantum circuit, which depends on
the problem being solved, but not on the specific initial data. The circuit
turns the initial state into a new quantum state,

$$|\psi(x_1, \dots, x_n)\rangle = \sum_{y_1, \dots, y_N} c_{y_1, \dots, y_N}(x_1, \dots, x_n)\, |y_1, \dots, y_N\rangle,$$

which depends on $(x_1, \dots, x_n)$. It is now necessary to read out the result. If
the circuit were classical (and correctly designed to compute $F$), we would
expect to find the answer in the first $m$ bits of the sequence $(y_1, \dots, y_N)$, i.e.,
we seek $(y_1, \dots, y_m) = F(x_1, \dots, x_n)$. To determine the actual result in the
quantum case, the values of all spins should be measured. The measurement
may produce any sequence of zeros and ones $(y_1, \dots, y_N)$, the probability of
obtaining such a sequence being equal to $|c_{y_1, \dots, y_N}(x_1, \dots, x_n)|^2$. A quantum
circuit is "correct" for a given function $F$ if the correct answer $(y_1, \dots, y_m) =
F(x_1, \dots, x_n)$ is obtained with probability that is sufficiently close to 1. By
repeating the computation several times and choosing the answer that is
encountered most frequently, we can reduce the probability of an error to
be as small as we want.

² The standard mathematical model of an ordinary computer is the Turing machine. Most
other models in use are polynomially equivalent to this one and to each other, i.e., a problem
that is solvable in $L$ steps in one model will be solvable in $cL^k$ steps in another model, where
$c$ and $k$ are constants.

³ Any computational problem can be posed in this way. For example, if we wish to solve the
problem of factoring an integer into primes, then $(x_1, \dots, x_n) = x$ (in binary notation) and $F(x)$
is a list of prime factors (in some binary code).
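The effect of repeating the computation and taking the most frequent answer can be quantified with a simple bound. The sketch below assumes, for illustration, that each run independently returns the correct answer with probability `p_correct` greater than 1/2; if the correct answer appears in more than half of the runs it is certainly the most frequent one, so the binomial tail computed by the hypothetical helper `failure_bound` is an upper bound on the failure probability.

```python
from collections import Counter
from math import comb

def most_frequent_answer(answers):
    """Pick the answer that occurs most often among repeated runs."""
    return Counter(answers).most_common(1)[0][0]

def failure_bound(p_correct, repetitions):
    """Probability that the correct answer appears in at most half of the runs.

    Assumes independent runs, each correct with probability p_correct.
    """
    q = 1 - p_correct
    return sum(comb(repetitions, i) * p_correct ** i * q ** (repetitions - i)
               for i in range(repetitions // 2 + 1))

# Five hypothetical measurement outcomes (y_1, y_2); three of them agree.
runs = [(1, 0), (1, 0), (0, 1), (1, 0), (1, 1)]
chosen = most_frequent_answer(runs)

single_run = failure_bound(0.75, 1)     # one run fails with probability 0.25
three_runs = failure_bound(0.75, 3)     # already smaller
many_runs = failure_bound(0.75, 101)    # decreases exponentially with repetitions
```

The bound falls off exponentially in the number of repetitions, which is why a modest number of reruns suffices in practice.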
We have just formulated (omitting some details) a mathematical model
of quantum computation. Now, two questions arise naturally.
1. For which problems does quantum computation have an advantage in
comparison with classical?
2. What system can be used for the physical realization of a quantum
computer? (This does not necessarily have to be a system of spins.)
With regard to the first question we now know the following. First, on
a quantum computer it is possible to model an arbitrary quantum system
in polynomially many steps. This will allow us (when quantum computers
become available) to predict the properties of molecules and crystals and to
design microscopic electronic devices, say, 100 atoms in size. Presently such
devices lie at the edge of technological possibility, but in the future they will
likely be common elements of ordinary computers. So, a quantum computer
will not be a thing to have in every home or office, but it will be used to
make such things.
A second example is factoring integers into primes and analogous
number-theoretic problems. In 1994 P. Shor [62] found a quantum algorithm⁴
which factors an $n$-digit integer in about $n^3$ steps. This beautiful result
could have an outcome that is more harmful than useful: factoring allows
one to break the most commonly used cryptosystem (RSA), to forge
electronic signatures, etc. (But anyway, building a quantum computer is such
a difficult task that cryptography users may sleep soundly, at least for
the next 10 years.) The method at the core of Shor's algorithms deals with
Abelian groups. Some non-Abelian generalizations have been found, but it
remains to be seen if they can be applied to any practical problem.

⁴ Without going into detail, a quantum algorithm is much the same thing as a quantum circuit.
The difference lies in the fact that a circuit is defined for problems of fixed size ($n$ = const), whereas
an algorithm applies to any $n$.
A third example is a search for a needed entry in an unsorted database.
Here the gain is not so significant: to locate one entry in $N$ we need about
$\sqrt{N}$ steps on a quantum computer, compared to $N$ steps on a classical one.
As of this writing, these are all known examples, not because quantum
computers are useless for other problems, but because their theory has not
been worked out yet. We can hope that there will soon appear new
mathematical ideas that will lead to new quantum algorithms.
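The square-root scaling of quantum search can be seen in a small classical simulation of Grover's algorithm (the subject of Section 9.2 in the table of contents). The sketch below is a standard textbook formulation rather than the book's own presentation: it tracks real amplitudes, flips the sign of the marked entry, and reflects all amplitudes about their mean. The function names and the iteration count of roughly (pi/4) * sqrt(N) are the usual conventions, stated here as assumptions.

```python
import math

def grover_success_probability(n_items, marked, iterations):
    """Simulate Grover iterations; return the probability of measuring `marked`."""
    amp = [1 / math.sqrt(n_items)] * n_items      # uniform superposition
    for _ in range(iterations):
        amp[marked] = -amp[marked]                # oracle: flip the marked amplitude
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]         # reflection about the mean
    return amp[marked] ** 2

def suggested_iterations(n_items):
    """About (pi / 4) * sqrt(N) iterations, compared with ~N classical lookups."""
    return math.floor(math.pi / 4 * math.sqrt(n_items))

steps = suggested_iterations(16)
p_found = grover_success_probability(16, marked=5, iterations=steps)
```

For 16 entries this takes 3 iterations and finds the marked entry with probability above 0.96, whereas a classical search needs on the order of 16 lookups.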
The physical realization of a quantum computer is an exceedingly
interesting, but difficult problem. Only a few years ago doubts were
expressed about its solvability in principle. The trouble is that an arbitrary
unitary transformation can be realized only with certain accuracy. Apart
from that, a system of spins or a similar quantum system cannot be fully
protected from the disturbances of the surrounding environment. All this leads
to errors that accumulate in the computational process. In $L \sim \delta^{-1}$ steps
(where $\delta$ is the precision of each unitary transformation) the probability of
an error will be of the order of 1, which renders the computation useless. In
part this difficulty can be overcome using quantum error-correcting codes. In
1996 P. Shor [65] proposed a scheme of error correction in the quantum
computing process (fault-tolerant quantum computation). The original method
was not optimal but it was soon improved by a number of authors. The
end result amounts to the following. There exists some threshold value $\delta_0$
such that for any precision $\delta < \delta_0$ arbitrarily long quantum computation is
possible. However, for $\delta > \delta_0$ errors accumulate faster than we can succeed
in correcting them. By various estimates, $\delta_0$ lies in the interval from $10^{-6}$
to $10^{-2}$ (the exact value depends on the character of the disturbances and
the circuit that is used for error correction).
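The estimate $L \sim \delta^{-1}$ reflects the fact that gate errors add up at most linearly. A short derivation follows, under assumptions made here for illustration: each ideal gate $U_j$ is replaced by a unitary $\tilde U_j$ with $\|U_j - \tilde U_j\| \le \delta$ in the operator norm (the tilde notation is introduced only for this sketch).

```latex
\begin{align*}
\|U_2 U_1 - \tilde U_2 \tilde U_1\|
  &= \|U_2 (U_1 - \tilde U_1) + (U_2 - \tilde U_2)\,\tilde U_1\| \\
  &\le \|U_2\|\,\|U_1 - \tilde U_1\| + \|U_2 - \tilde U_2\|\,\|\tilde U_1\| \\
  &\le 2\delta \qquad (\text{since } \|U_2\| = \|\tilde U_1\| = 1), \\[4pt]
\|U_L \cdots U_1 - \tilde U_L \cdots \tilde U_1\|
  &\le L\,\delta \qquad (\text{by induction on } L).
\end{align*}
```

The bound stays small only while $L\delta \ll 1$; once $L$ approaches $\delta^{-1}$ it no longer guarantees anything, consistent with the statement that the computation becomes useless without error correction.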
So, there are no obstacles in principle for the realization of a quantum
computer. However, the problem is so difficult that it can be compared to
the problem of controlled thermonuclear synthesis. In fact, it is essential to
simultaneously satisfy several almost contradictory demands:
1. The elements of a quantum computer, quantum bits (spins or something
similar), must be isolated from one another and from the
environment.
2. It is essential to have the possibility to act selectively on each pair of
quantum bits (at least, on each neighboring pair). Generally, one needs
to implement several types of elementary operations (called quantum
gates) described by different unitary operators.
3. Each of the gates must be realized with precision $\delta < \delta_0$ (see above).
4. The quantum gates must be sufficiently nontrivial, so that any other
operator is, in a certain sense, expressible in terms of them.
At the present time there exist several approaches to the problem of
realizing a quantum computer.
1. Individual atoms or ions. This first-proposed and best-developed idea
exists in several variants. For representing a quantum bit one can employ
both the usual electron levels and the levels of fine and hyperfine structures.
There is an experimental technique for keeping an individual ion or atom in
the trap of a steady magnetic or alternating electric field for a reasonably
long time (of the order of 1 hour). The ion can be "cooled down" (i.e.,
its vibrational motion eliminated) with the aid of a laser beam. Selecting
the duration and frequency of the laser pulses, it is possible to prepare an
arbitrary superposition of the ground and excited states. In this way it
is rather easy to control individual ions. Within the trap, one can also
place two or more ions at distances of several microns one from another,
and control each of them individually. However, it is rather difficult to
choreograph the interactions between the ions. To this end it has been
proposed that collective vibrational modes (ordinary mechanical vibrations
with a frequency of several MHz) be used. Dipole-dipole interactions could
also be used, with the advantage of being a lot faster. A second method
(for neutral atoms) is as follows: place atoms into separate electromagnetic
resonators that are coupled to one another (at the moment it is unclear how
to achieve this technically). Finally, a third method: using several laser
beams, one can create a periodic potential ("optical lattice") which traps
unexcited atoms. However, an atom in an excited state can move freely.
Thus, by exciting one of the atoms for a certain time, one lets it move
around and interact with its neighbors. This field of experimental physics
is now developing rapidly and seems to be very promising.
2. Nuclear magnetic resonance. In a molecule with several different
nuclear spins, an arbitrary unitary transformation can be realized by a
succession of magnetic field pulses. This has been tested experimentally at
room temperature. However, for the preparation of a suitable initial state,
a temperature $< 10^{-3}$ K is required. Apart from difficulties with the cooling,
undesirable interactions between the molecules increase dramatically as the
liquid freezes. In addition, it is nearly impossible to address a given spin
selectively if the molecule has several spins of the same kind.
3. Superconducting granules and "quantum dots". At very low
temperatures, the unique degree of freedom of a small (submicron size)
superconducting granule is its charge. It can change in magnitude by a
multiple of two electron charges (since electrons in a superconductor are
bound in pairs). Changing the external electric potential, one can achieve
a situation where two charge states have almost the same energy. These
two states can be used as basis states of a quantum bit. The granules
interact with each other by means of Josephson junctions and mutual electric
capacitance. This interaction can be controlled. A quantum dot is a
microstructure which can contain a few electrons or even a single electron. The
spin of this electron can be used as a qubit. The difficulty is that one needs
to control each granule or quantum dot individually with high precision.
This seems harder than in the case of free atoms, because all atoms of the
same type are identical while parameters of fabricated structures fluctuate.
This approach may eventually succeed, but a new technology is required for
its realization.
4. Anyons. Anyons are quasi-particles (excitations) in certain
two-dimensional quantum systems, e.g., in a two-dimensional electron liquid in a
magnetic field. What makes them special is their topological properties, which
are stable to moderate variation of system parameters. One of the authors
(A.K.) considers this approach especially interesting (in view of it being
his own invention, cf. [35]), so that we will describe it in more detail. (At
a more abstract level, the connection between quantum computation and
topology was discussed by M. Freedman [25].)
The fundamental difficulty in constructing a quantum computer is the
necessity for realizing unitary transformations with precision $\delta < \delta_0$, where
$\delta_0$ is between $10^{-2}$ and $10^{-6}$. To achieve this it is necessary, as a rule, to
control the parameters of the system with still greater precision. However,
we can imagine a situation where high precision is achieved automatically,
i.e., where error correction occurs on the physical level. An example is given
by two-dimensional systems with anyonic excitations.
All particles in three-dimensional space are either bosons or fermions.
The wave function of bosons does not change if the particles are permuted.
The wave function of fermions is multiplied by $-1$ under a transposition
of two particles. In any case, the system is unchanged when each of the
particles is returned to its prior position. In two-dimensional systems, more
complex behavior is possible. Note, however, that the discussion is not about
fundamental particles, such as an electron, but about excitations ("defects")
in a two-dimensional electron liquid. Such excitations can move, transform
to each other, etc., just like "genuine" particles.⁵ However, excitations in
the two-dimensional electron liquid display some unusual properties. An
excitation can have a fractional charge (for example, 1/3 of the charge of
an electron). If one excitation makes a full turn around another, the state
of the surrounding electron liquid changes in a precisely defined manner
that depends on the types of the excitations and on the topology of the
path, but not on the specific trajectory. In the simplest case, the wave
function gets multiplied by a number (which is equal to $e^{2\pi i/3}$ for anyons in
the two-dimensional electron liquid in a magnetic field at the filling factor
1/3). Excitations with such properties are called Abelian anyons. Another
example of Abelian anyons is described (in a mathematical language) in
Section 15.11.

⁵ Fundamental particles can also be considered as excitations in the vacuum which is, actually,
a nontrivial quantum system. The difference is that the vacuum is unique, whereas the electron
liquid and other "quantum media" can be designed to meet our needs.
More interesting are non-Abelian anyons, which have not yet been
observed experimentally. (Theory predicts their existence in a two-dimensional
electron liquid in a magnetic field at the filling factor 5/2.) In the presence
of non-Abelian anyons, the state of the surrounding electron liquid is
degenerate, the multiplicity of the degeneracy depending on the number of
anyons. In other words, there exist not one, but many states, which can
display arbitrary quantum superpositions. It is utterly impossible to act on
such a superposition without moving the anyons, so the system is ideally
protected from perturbations. If one anyon is moved around another, the
superposition undergoes a certain unitary transformation. This
transformation is absolutely precise. (An error can occur only if the anyon "gets out of
hand" as a result of quantum tunneling.)
At first glance, the design using anyons seems least realistic. Firstly,
Abelian anyons will not do for quantum computation, and non-Abelian ones
are still awaiting experimental discovery. Secondly, in order to realize a quantum
computer, it is necessary to control (i.e., detect and drag along a specified
path) each excitation in the system, and the excitations will probably be a
fraction of a micron apart from each other. This is an exceedingly complex technical
problem. However, taking into account the high demands for precision, it
may not be at all easier to realize any of the other approaches we have
mentioned. Beyond that, the idea of topological quantum computation, lying at
the foundation of the anyonic approach, might be realized by other means.
For example, a quantum degree of freedom protected from perturbation
might appear at the end of a "quantum wire" (a one-dimensional
conductor with an odd number of propagating electronic modes, placed in contact
with a three-dimensional superconductor).
Thus, the idea of a quantum computer looks so very attractive, and
so very unreal. It is likely that the design of an ordinary computer was
perceived in just that way at the time of Charles Babbage, whose invention
was realized only a hundred years later. We may hope that in our time
science and industry will develop faster, so that we will not have to wait
that long. Perhaps a couple of fresh ideas plus a few years for working out
a new technology will do.
Part 1
Classical Computation
1. Turing machines
Note. In this section we address the abstract notion of computability, of
which we only need a few basic properties. Therefore our exposition here
is very brief. For the most part, the omitted details are simply exercises
in programming a Turing machine, which is but a primitive programming
language. A little programming experience (in any language) suffices to
see that these tasks are doable but tedious.
Informally, an algorithm is a set of instructions; using it, "we need only to
carry out what is prescribed as if we were robots: neither understanding, nor
cleverness, nor imagination is required of us" [39]. Applying an algorithm
to its input (initial data) we get some output (result). (It is quite possible
that the computation never terminates for some inputs; in this case we get no
result.)
Usually inputs and outputs are strings. A string is a finite sequence
of symbols (characters, letters) taken from some finite alphabet. Therefore,
before asking for an algorithm that, say, factors polynomials with integer
coefficients, we should specify the encoding, i.e., specify some alphabet $A$
and the representation of polynomials by strings over $A$. For example, each
polynomial may be represented by a string formed by digits, the letter x, and the signs
+, − and ×. In the answer, two factors can be separated by a special
delimiter, etc.
One should be careful here because sometimes the encoding becomes
really important. For example, if we represent large integers as bit strings (in
binary), it is rather easy to compare them (to find which of two given integers
is larger), but multiplication is more difficult. On the other hand, if an
integer is represented by its remainders modulo different primes $p_1, p_2, \dots, p_n$
(using the Chinese remainder theorem; see Theorem A.5 in Appendix A), it
is easy to multiply them, but comparison is more difficult. So we will specify
the encoding in case of doubt.
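The trade-off between the two encodings can be illustrated by a minimal Python sketch. The moduli in `PRIMES` and the function names are assumptions made for this illustration only; they do not come from the text. Multiplication acts on each remainder separately, whereas the order of two integers cannot be read off from their remainder tuples.

```python
from math import prod

# Hypothetical choice of moduli; integers below 3*5*7*11 = 1155 are
# represented uniquely by their remainders.
PRIMES = (3, 5, 7, 11)

def to_residues(x):
    """Encode x by its remainders modulo each prime."""
    return tuple(x % p for p in PRIMES)

def multiply(a, b):
    """Multiply two encoded integers: one small product per modulus."""
    return tuple((ai * bi) % p for ai, bi, p in zip(a, b, PRIMES))

def from_residues(r):
    """Decode via the Chinese remainder theorem."""
    m = prod(PRIMES)
    total = 0
    for ri, p in zip(r, PRIMES):
        q = m // p
        total += ri * q * pow(q, -1, p)  # pow(q, -1, p): inverse of q mod p
    return total % m

product_encoded = multiply(to_residues(12), to_residues(34))
print(from_residues(product_encoded))     # 12 * 34 = 408
print(to_residues(8), to_residues(9))     # tuples give no hint that 8 < 9
```

To compare two encoded integers, this sketch has no better option than decoding both with `from_residues`, which is far more work than the componentwise product.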
We now give a formal definition of an algorithm.
1.1. Definition of a Turing machine.
Definition 1.1. A Turing machine (TM) consists of the following components:
- a finite set $S$ called the alphabet;
- an element $\sqcup \in S$ (blank symbol);
- a subset $A \subset S$ called the external alphabet; we assume that the blank
  symbol does not belong to $A$;
- a finite set $Q$ whose elements are called states of the TM;
- an initial state $q_0 \in Q$;
- a transition function, specifically, a partial function
$$\delta : Q \times S \to Q \times S \times \{-1, 0, 1\}. \tag{1.1}$$
(The term "partial function" means that the domain of $\delta$ is actually a subset
of $Q \times S$. A function that is defined everywhere is called total.)
Note that there are infinitely many Turing machines, each representing a
particular algorithm. Thus the above components are more like a computer
program. We now describe the "hardware" such programs run on.
A Turing machine has a tape that is divided into cells. Each cell carries
one symbol from the machine alphabet $S$. We assume that the tape is
infinite to the right. Therefore, the content of the tape is an infinite sequence
$\sigma = s_0, s_1, \dots$ (where $s_i \in S$).
A Turing machine also has a read-write head that moves along the tape
and changes symbols: if we denote its position by $p = 0, 1, 2, \dots$, the head
can read the symbol $s_p$ and write another symbol in its place.
Position of head                  ▽
Cells             s_0   s_1  ...  s_p  ...
Cell numbers       0     1   ...   p   ...
The behavior of a Turing machine is determined by a control device,
which is a finite-state automaton. At each step of the computation this
device is in some state $q \in Q$. The state $q$ and the symbol $s_p$ under the
head determine the action performed by the TM: the value of the transition
function, $\delta(q, s_p) = (q', s', \Delta p)$, contains the new state $q'$, the new symbol
$s'$, and the shift $\Delta p$ (for example, $\Delta p = -1$ means that the head moves to
the left).
More formally, the configuration of a TM is a triple $\langle \sigma; p; q \rangle$, where $\sigma$
is an infinite sequence $s_0, \dots, s_n, \dots$ of elements of $S$, $p$ is a nonnegative
integer, and $q \in Q$. At each step the TM changes its configuration $\langle \sigma; p; q \rangle$
as follows:
(a) it reads the symbol $s_p$;
(b) it computes the value of the transition function: $\delta(q, s_p) = (q', s', \Delta p)$
    (if $\delta(q, s_p)$ is undefined, the TM stops);
(c) it writes the symbol $s'$ in cell $p$ of the tape, moves the head by $\Delta p$, and
    passes to state $q'$. In other words, the new configuration of the TM is
    the triple $\langle s_0, \dots, s_{p-1}, s', s_{p+1}, \dots;\ p + \Delta p;\ q' \rangle$. (If $p + \Delta p < 0$, the TM
    stops.)
Perhaps everyone would agree that these actions require neither
cleverness, nor imagination.
It remains to define how the input is given to the TM and how the result
is obtained. Inputs and outputs are strings over $A$. An input string $\alpha$ is
written on the tape and is padded by blanks. Initially the head is at the
left end of the tape; the initial state is $q_0$. Thus the initial configuration is
$\langle \alpha\sqcup\sqcup\cdots;\ 0;\ q_0 \rangle$. Subsequently, the configuration is transformed step by step
using the rules described above, and we get the sequence of configurations
$$\langle \alpha\sqcup\sqcup\cdots;\ 0;\ q_0 \rangle,\quad \langle \sigma_1; p_1; q_1 \rangle,\quad \langle \sigma_2; p_2; q_2 \rangle,\ \dots$$
As we have said, this process terminates if $\delta$ is undefined or the head bumps
into the (left) boundary of the tape ($p + \Delta p < 0$). After that, we read
the tape from left to right (starting from the left end) until we reach some
symbol that does not belong to $A$. The string before that symbol will be
the output of the TM.
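The rules above translate almost line by line into a short simulator. The following Python sketch is only an illustration: the names `run_tm`, `BLANK` and `max_steps`, the dictionary representation of $\delta$, and the sample bit-flipping machine are assumptions of this example and do not appear in the text. The step limit is needed because a real computation may never terminate.

```python
BLANK = "_"  # stands for the blank symbol; assumed not to belong to A

def run_tm(delta, external_alphabet, q0, input_string, max_steps=10_000):
    """Run a one-tape TM and return its output string.

    delta maps (state, symbol) to (new_state, new_symbol, shift) with
    shift in {-1, 0, 1}; a missing key means delta is undefined there.
    """
    tape = list(input_string)  # finite prefix; the rest is implicitly blank
    p, q = 0, q0
    for _ in range(max_steps):
        if p == len(tape):           # extend the tape with a blank on demand
            tape.append(BLANK)
        key = (q, tape[p])           # (a) read the symbol under the head
        if key not in delta:         # (b) delta undefined: the TM stops
            break
        q, tape[p], dp = delta[key]  # (c) write s', pass to state q'
        if p + dp < 0:               # head bumps into the left boundary
            break
        p += dp
    else:
        raise RuntimeError("machine did not stop within max_steps")
    # Output: read from the left end up to the first symbol outside A.
    out = []
    for s in tape:
        if s not in external_alphabet:
            break
        out.append(s)
    return "".join(out)

# Hypothetical example machine: flip every bit, stop at the first blank.
FLIP = {
    ("q0", "0"): ("q0", "1", +1),
    ("q0", "1"): ("q0", "0", +1),
}
print(run_tm(FLIP, {"0", "1"}, "q0", "1100"))  # 0011
```

The machine `FLIP` stops because $\delta$ is undefined on the pair (q0, blank), which is one of the two termination conditions described above.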
1.2. Computable functions and decidable predicates. Every Turing
machine $M$ computes a partial function $\varphi_M : A^* \to A^*$, where $A^*$ is the set
of all strings over $A$. By definition, $\varphi_M(\alpha)$ is the output string for input $\alpha$.
The value $\varphi_M(\alpha)$ is undefined if the computation never terminates.
Definition 1.2. A partial function $f$ from $A^*$ to $A^*$ is computable if there
exists a Turing machine $M$ such that $\varphi_M = f$. In this case we say that $f$ is
computed by $M$.
Not all functions are computable because the set of all functions of type
$A^* \to A^*$ is uncountable, while the set of all Turing machines is countable.
For concrete examples of noncomputable functions see Problems 1.3-1.5.
By a predicate we mean a function with Boolean values: 1 ("true") or
0 ("false"). Informally, a predicate is a property that can be true or false.
Normally we consider predicates whose domain is the set $A^*$ of all strings
over some alphabet $A$. Such predicates can be identified with subsets of $A^*$:
a predicate $P$ corresponds to the set $\{x : P(x)\}$, i.e., the set of strings $x$ for
which $P(x)$ is true. Subsets of $A^*$ are also called languages over $A$.
As has been said, a predicate $P$ is a function $A^* \to \{0, 1\}$. A predicate is
called decidable if this function is computable. In other words, a predicate
$P$ is decidable if there exists a Turing machine that answers the question "is
$P(\alpha)$ true?" for any $\alpha \in A^*$, giving either 1 ("yes") or 0 ("no"). (Note that
this machine must terminate for any $\alpha \in A^*$.)
The notions of a computable function and a decidable predicate can be
extended to functions and predicates in several variables in a natural way.
For example, we can fix some separator symbol # that does not belong to
$A$ and consider a Turing machine $M$ with external alphabet $A \cup \{\#\}$. Then
a partial function $\varphi_{M,n} : (A^*)^n \to A^*$ is defined as follows:
$$\varphi_{M,n}(\alpha_1, \dots, \alpha_n) = \text{output of } M \text{ for the input } \alpha_1 \# \alpha_2 \# \cdots \# \alpha_n.$$
The value $\varphi_{M,n}(\alpha_1, \dots, \alpha_n)$ is undefined if the computation never terminates
or the output string does not belong to $A^*$.
Definition 1.3. A partial function $f$ from $(A^*)^n$ to $A^*$ is computable if
there is a Turing machine $M$ such that $\varphi_{M,n} = f$.
The definition of a decidable predicate can be given in the same way.
We say that a Turing machine works in time $T(n)$ if it performs at most
$T(n)$ steps for any input of size $n$. Analogously, a Turing machine $M$ works
in space $s(n)$ if it visits at most $s(n)$ cells in any computation on inputs of
size $n$.
1.3. Turing's thesis and universal machines. Obviously a TM is an
algorithm in the informal sense. The converse assertion is called the Turing
thesis:
"Any algorithm can be realized by a Turing machine."
It is also called the Church thesis because Church gave an alternative
definition of computable functions that is formally equivalent to Turing's
definition. Note that the Church-Turing thesis is not a mathematical
theorem, but rather a statement about our informal notion of algorithm, or the
physical reality this notion is based upon. Thus the Church-Turing thesis
cannot be proved, but it is supported by empirical evidence. In the early
days of the mathematical theory of computation (the 1930s), different definitions of
algorithm were proposed (Turing machine, Post machine, Church's
lambda-calculus, Gödel's theory of recursive functions), but they all turned out to
be equivalent to each other. The reader can find a detailed exposition of the
theory of algorithms in [5, 39, 40, 48, 54, 60, 61].
We make some informal remarks about the capabilities of Turing
machines. A Turing machine behaves like a person with a restricted memory,
pencil, eraser, and a notebook with an infinite number of pages. Pages are
of fixed size; therefore there are finitely many possible variants of filling a
page, and these variants can be considered as letters of the alphabet of a
TM. The person can work with one page at a time but can then move to the
previous or to the next page. When turning a page, the person has a finite
amount of information (corresponding to the state of the TM) in her head.
The input string is written on the first several pages of the notebook (one
letter per page); the output should be written in a similar way. The
computation terminates when the notebook is closed (the head crosses the left
boundary) or when the person does not know what to do next ($\delta$ is
undefined).
Think about yourself in such a situation. It is easy to realize that by
memorizing a few letters near the head you can perform any action in a
fixed-size neighborhood of the head. You can also put extra information
(in addition to letters from the external alphabet) on pages. This means
that you extend the tape alphabet by taking the Cartesian product with
some other finite set that represents possible notes. You can leaf through
the notebook until you find a note that is needed. You can create a free cell
by moving all information along the tape. You can memorize symbols and
then copy them onto free pages of the notebook. Extra space on pages may
also be used to store auxiliary strings of arbitrary length (like the initial
word, they are written one symbol per page). These auxiliary strings can
be processed by "subroutines". In particular, auxiliary strings can be used
to implement counters (integers that can be incremented and decremented).
Using counters, we can address a memory cell by its number, etc.
Note that in the definition of computability for functions of type $A^* \to A^*$
we restrict neither the number of auxiliary tape symbols (the set of
symbols $S$ could be much bigger than $A$) nor the number of states. It
is easy to see, however, that one auxiliary symbol (the blank) is enough.
Indeed, we can represent each letter from $S \setminus A$ by a combination of blanks
and nonblanks. (The details are left to the reader as an exercise.)
Since a Turing machine is a finite object (according to Definition 1.1),
it can be encoded by a string. (Note that Turing machines with arbitrary
numbers of states and alphabets of any size can be encoded by strings over a
fixed alphabet.) Then for any fixed alphabet $A$ we can consider a universal
Turing machine $U$. Its input is a pair $([M], x)$, where $[M]$ is the encoding
of a machine $M$ with external alphabet $A$, and $x$ is a string over $A$. The
output of $U$ is $\varphi_M(x)$. Thus $U$ computes the function $u$ defined as follows:
$$u\bigl([M], x\bigr) = \varphi_M(x).$$
This function is universal for the class of computable functions of type
$A^* \to A^*$ in the following sense: for any computable function $f : A^* \to A^*$,
there exists some $M$ such that $u([M], x) = f(x)$ for all $x \in A^*$. (The equality
actually means that either both $u([M], x)$ and $f(x)$ are undefined, or they are
defined and equal. Sometimes the notation $u([M], x) \simeq f(x)$ is used to stress
that both expressions can be undefined.)
The existence of a universal machine $U$ is a consequence of the
Church-Turing thesis since our description of Turing machines was algorithmic. But,
unlike the Church-Turing thesis, this is also a mathematical theorem: the
machine $U$ can be constructed explicitly and proved to compute the function
$u$. The construction is straightforward but boring. It can be explained as
follows: the notebook begins with pages where instructions (i.e., $[M]$) are
written; the input string $x$ follows the instructions. The universal machine
interprets the instructions in the following way: it marks the current page,
goes back to the beginning of the tape, finds the instruction that matches the
current state and the current symbol, then returns to the current page and
performs the action required. Actually, the situation is a bit more complex:
both the current state and the current symbol of $M$ have to be represented
in $U$ by several symbols on the tape (because the number of states of the
universal machine $U$ is fixed whereas the alphabet and the number of states
of the simulated machine $M$ are arbitrary). Therefore, we need subroutines
to move strings along the tape, compare them with instructions, etc.
1.4. Complexity classes. The computability of a function does not
guarantee that we can compute it in practice: an algorithm computing it can
require too much time or space. So from the practical viewpoint we are
interested in effective algorithms.
The idea of an effective algorithm can be formalized in different ways,
leading to different complexity classes. Probably the most important is the
class of polynomial algorithms.
We say that a function $f(n)$ is of polynomial growth if $f(n) \le c\,n^d$
for some constants $c$, $d$ and for all sufficiently large $n$. (Notation: $f(n) = \mathrm{poly}(n)$.)
Let $\mathbb{B}$ be the set $\{0, 1\}$. In the sequel we usually consider functions and
predicates defined on $\mathbb{B}^*$, i.e., on binary strings.
Definition 1.4. A function $F$ on $\mathbb{B}^*$ is computable in polynomial time if
there exists a Turing machine that computes it in time $T(n) = \mathrm{poly}(n)$,
where $n$ is the length of the input. If $F$ is a predicate, we say that it is
decidable in polynomial time.
The class of all functions computable in polynomial time, or all predicates
decidable in polynomial time (sometimes we call them "polynomial
predicates" for brevity), is denoted by P. (Some other complexity classes
considered below are only defined for predicates.) Note that if $F$ is computable
in polynomial time, then $|F(x)| = \mathrm{poly}(|x|)$, since the output length cannot
exceed the maximum of the input length and the number of computation
steps. (Here $|t|$ stands for the length of the string $t$.)
Computability in polynomial time is still a theoretical notion: if the
degree of the polynomial is large (or the constant $c$ is large), an algorithm
running in polynomial time may be quite impractical.
One may use other computational models instead of Turing machines to
define the class P. For example, we may use a usual programming language
dealing with integer variables, if we require that all integers used in the
program have at most $\mathrm{poly}(n)$ bits.
In speaking about polynomial time computation, one should be careful
about encoding. For example, it is easy to see that the predicate that is true
for all unary representations of prime numbers (i.e., strings $1 \dots 1$ whose
length $N$ is a prime number) is polynomial. Indeed, the obvious algorithm
that tries to divide $N$ by all numbers $\le \sqrt{N}$ runs in polynomial time, namely,
$\mathrm{poly}(N)$. On the other hand, we do not know whether the predicate $P(x)$ =
"$x$ is a binary representation of a prime number" is polynomial or not. For
this to be true, there should exist an algorithm with running time $\mathrm{poly}(n)$,
where $n = \lfloor \log_2 N \rfloor + 1$ is the length of the binary string $x$. (A probabilistic
polynomial algorithm for this problem is known; see below.)
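The dependence on the encoding can be made concrete with a small Python sketch of the "obvious algorithm". The function name and the choice to count trial divisions as steps are assumptions of this illustration. The count is at most $\sqrt{N}$, which is polynomial in the unary length $N$ but roughly $2^{n/2}$ in the binary length $n$. (A deterministic polynomial-time primality test, due to Agrawal, Kayal and Saxena, was published in 2002; the sketch below is not that algorithm.)

```python
from math import isqrt

def is_prime_trial_division(N):
    """Return (is_prime, number_of_trial_divisions_performed)."""
    if N < 2:
        return False, 0
    steps = 0
    for d in range(2, isqrt(N) + 1):  # all candidate divisors d <= sqrt(N)
        steps += 1
        if N % d == 0:
            return False, steps
    return True, steps

N = 2**31 - 1                       # a Mersenne prime
is_prime, steps = is_prime_trial_division(N)
print(is_prime, steps, N.bit_length())
```

Here `N.bit_length()` is the length $n$ of the binary encoding, so the number of divisions performed for a prime input grows like $2^{n/2}$.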
Definition 1.5. A function (predicate) $F$ on $\mathbb{B}^*$ is computable (decidable)
in polynomial space if there exists a Turing machine that computes $F$ and
runs in space $s(n) = \mathrm{poly}(n)$, where $n$ is the length of the input.
The class of all functions (predicates) computable (decidable) in polynomial
space is called PSPACE.
Note that any machine that runs in polynomial time also runs in
polynomial space, therefore P $\subseteq$ PSPACE. Most experts believe that this inclusion
is strict, i.e., P $\ne$ PSPACE, although nobody has succeeded in proving it so
far. This is a famous open problem.
Problems
[1] 1.1. Construct a Turing machine that reverses its input (e.g., produces
"0010111" from "1110100").
[1] 1.2. Construct a Turing machine that adds two numbers written in
binary. (Assume that the numbers are separated by a special symbol "+"
that belongs to the external alphabet of the TM.)
[3!] 1.3 ("The halting problem is undecidable"). Prove that there is no
algorithm that determines, for a given Turing machine and input string, whether
the machine terminates at that input or not.
[2] 1.4. Prove that there is no algorithm that enumerates all Turing
machines that do not halt when started with the empty tape.
(Informally, enumeration is a process which produces one element of a set
after another so that every element is included in this list. Exact definition:
a set $X \subseteq A^*$ is called enumerable if it is the set of all possible outputs of
some Turing machine $E$.)
[3] 1.5. Let $T(n)$ be the maximum number of steps performed by a Turing
machine with $\le n$ states and $\le n$ symbols before it terminates starting
with the empty tape. Prove that the function $T(n)$ grows faster than any
computable total function $b(n)$, i.e., $\lim_{n\to\infty} T(n)/b(n) = \infty$.
The mode of operation of Turing machines is rather limited and can be
extended in different ways. For example, one can consider multitape Turing
machines that have a finite number of tapes. Each tape has its own head
that can read and write symbols on the tape. There are two special tapes:
an input read-only tape, and an output write-only tape (after writing a
symbol the head moves to the right). A $k$-tape machine has an input tape,
an output tape, and $k$ work tapes.
At each step the machine reads symbols on all of the tapes (except for
the output tape), and its action depends upon these symbols and the current
state. This action is determined by a transition function that says what the
next state is, what symbols should be written on each work tape, what
movement is prescribed for each head (except for the output one), and what
symbol should be written on the output tape (it is also possible that no
symbol is written; in this case the output head does not move).
Initially all work tapes are empty, and the input string is written on the
input tape. The output is the content of the output tape after the TM halts
(this happens when the transition function is undefined or when one of the
heads moves past the left end of its tape).
More tapes allow Turing machines to work faster; however, the difference
is not so great, as the following problems show.
[2] 1.6. Prove that a 2-tape Turing machine working in time $T(n)$ for inputs
of length $n$ can be simulated by an ordinary Turing machine working in time
$O(T^2(n))$.
[3] 1.7. Prove that a 3-tape Turing machine working in time $T(n)$ for
inputs of length $n$ can be simulated by a 2-tape machine working in time
$O\bigl(T(n) \log T(n)\bigr)$.
[3] 1.8. Let $M$ be a (single-tape) Turing machine that duplicates the input
string (e.g., produces "blabla" from "bla"). Let $T(n)$ be its maximum
running time when processing input strings of length $n$. Prove that $T(n) \ge \varepsilon n^2$
for some $\varepsilon > 0$ and for all $n$. What can be said about $T'(n)$, the minimum
running time for inputs of length $n$?
[2] 1.9. Consider a programming language that includes 100 integer
variables, the constant 0, increment and decrement statements, and conditions
of type "variable = 0". One may use if-then-else and while constructs,
but recursion is not allowed. Prove that any computable function of type
$\mathbb{Z} \to \mathbb{Z}$ has a program in this language.
2. Boolean circuits
2.1. Definitions. Complete bases. A Boolean circuit is a representation
of a given Boolean function as a composition of other Boolean functions.
By a Boolean function of $n$ variables we mean a function of type $\mathbb{B}^n \to \mathbb{B}$.
(For $n = 0$ we get two constants 0 and 1.) Assume that some set $\mathcal{A}$ of
Boolean functions (basis) is fixed. It may contain functions with different
arity (number of arguments).
A circuit $C$ over $\mathcal{A}$ is a sequence of assignments. These assignments
involve $n$ input variables $x_1, \dots, x_n$ and several auxiliary variables $y_1, \dots, y_m$;
the $j$-th assignment has the form $y_j := f_j(u_1, \dots, u_r)$. Here $f_j$ is some
function from $\mathcal{A}$, and each of the variables $u_1, \dots, u_r$ is either an input variable
or an auxiliary variable that precedes $y_j$. The latter requirement guarantees
that the right-hand side of the assignment is defined when we perform it (we
assume that the values of the input variables are defined at the beginning;
then we start defining $y_1$, $y_2$, etc.).
The value of the last auxiliary variable is the result of computation.
A circuit with $n$ input variables $x_1, \dots, x_n$ computes a Boolean function
$F : \mathbb{B}^n \to \mathbb{B}$ if the result of computation is equal to $F(x_1, \dots, x_n)$ for any
values of $x_1, \dots, x_n$.
If we select $m$ auxiliary variables (instead of one) to be the output, we
get the definition of the computation of a function $F : \mathbb{B}^n \to \mathbb{B}^m$ by a circuit.
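A circuit in this sense is easy to represent and evaluate directly. The Python sketch below is an illustration under stated assumptions: the tuple representation, the names `BASIS`, `ADDER` and `evaluate`, and the particular choice of gates are not taken from the text. The sample circuit adds two 2-digit binary numbers over the basis $\{\wedge, \oplus\}$ using three AND gates and four XOR gates, with three auxiliary variables selected as outputs.

```python
from itertools import product

# Basis functions, keyed by name (hypothetical naming).
BASIS = {"AND": lambda a, b: a & b, "XOR": lambda a, b: a ^ b}

# Each assignment: (auxiliary variable, basis function, argument names).
# Arguments are inputs or auxiliary variables defined earlier in the list.
ADDER = [
    ("z0", "XOR", ("x0", "y0")),   # low digit of the sum
    ("c0", "AND", ("x0", "y0")),   # carry out of the low position
    ("t",  "XOR", ("x1", "y1")),
    ("z1", "XOR", ("t", "c0")),    # middle digit of the sum
    ("a",  "AND", ("x1", "y1")),
    ("b",  "AND", ("t", "c0")),
    ("z2", "XOR", ("a", "b")),     # high digit (final carry)
]

def evaluate(circuit, inputs, outputs):
    """Perform the assignments in order and return the selected outputs."""
    values = dict(inputs)
    for name, gate, args in circuit:
        values[name] = BASIS[gate](*(values[u] for u in args))
    return tuple(values[o] for o in outputs)

# Check all 16 inputs against ordinary integer addition.
for x1, x0, y1, y0 in product((0, 1), repeat=4):
    z2, z1, z0 = evaluate(ADDER, {"x1": x1, "x0": x0, "y1": y1, "y0": y0},
                          ("z2", "z1", "z0"))
    assert 4 * z2 + 2 * z1 + z0 == (2 * x1 + x0) + (2 * y1 + y0)
print("adder circuit agrees with integer addition; size =", len(ADDER))
```

The requirement that every argument precedes the variable being defined is what allows `evaluate` to make a single pass over the list.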
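[Figure: inputs $x_1$, $x_0$, $y_1$, $y_0$ at the top; three $\wedge$ gates and four $\oplus$ gates; outputs $z_2$, $z_1$, $z_0$.]
Fig. 2.1. Circuit over the basis $\{\wedge, \oplus\}$ for the addition of two 2-digit
numbers: $\overline{z_2 z_1 z_0} = \overline{x_1 x_0} + \overline{y_1 y_0}$.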
A circuit can also be represented by an acyclic directed graph (as in
Figure 2.1), in which vertices of in-degree 0 (inputs; we put them on top
of the figure) are labeled by input variables; all other vertices (gates) are
labeled with functions from $\mathcal{A}$ in such a way that the in-degree of a vertex
matches the arity of the function placed at that vertex, and each incoming
edge is linked to one of the function arguments. Some vertices are called
output vertices. Since the graph is acyclic, any value assignment to the
input variables can be extended (uniquely) to a consistent set of values for
all vertices. Therefore, the set of values at the output vertices is a function
of input values. This function is computed by a circuit. It is easy to see
that this representation of a circuit can be transformed into a sequence of
assignments, and vice versa. (We will not use this representation much, but
it explains the name "circuit".)
A circuit for a Boolean function is called a formula if each auxiliary
variable, except the last one, is used (i.e., appears on the right-hand side of
an assignment) exactly once. The graph of a formula is a tree whose leaves
are labeled by input variables; each label may appear any number of times.
(In a general circuit an auxiliary variable may be used more than once, in
which case the out-degree of the corresponding vertex is more than 1.)
Why the name "formula"? If each auxiliary variable is used only once, we
can replace it by its definition. Performing all these "inline substitutions",
we get an expression for $f$ that contains only input variables, functions from
the basis, and parentheses. The size of this expression approximately equals
the total length of all assignments. (It is important that each auxiliary
variable is used only once; otherwise we would need to replace all occurrences
of each auxiliary variable by their definitions, and the size might increase
exponentially.)
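The exponential increase mentioned in the last remark can be observed on a deliberately artificial circuit in which every auxiliary variable is used twice. The circuit and the function names in this Python sketch are assumptions made for illustration: $y_1 := x_1 \wedge x_2$ and $y_j := y_{j-1} \wedge y_{j-1}$ for $j = 2, \dots, k$.

```python
def inline(k):
    """Expression for y_k after all inline substitutions."""
    expr = "(x1 & x2)"
    for _ in range(k - 1):
        # y_{j-1} occurs twice, so its definition is copied twice.
        expr = "(" + expr + " & " + expr + ")"
    return expr

def circuit_text_length(k):
    """Total length of the k assignments written out as text."""
    lines = ["y1 := x1 & x2"]
    lines += [f"y{j} := y{j - 1} & y{j - 1}" for j in range(2, k + 1)]
    return sum(len(line) for line in lines)

for k in (1, 5, 10):
    print(k, circuit_text_length(k), len(inline(k)))
```

The circuit description grows linearly in $k$, while the inlined expression roughly doubles with each additional assignment.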
A basis $\mathcal{A}$ is called complete if, for any Boolean function $f$, there is a
circuit over $\mathcal{A}$ that computes $f$. (It is easy to see that in this case any
function of type $\mathbb{B}^n \to \mathbb{B}^m$ can be computed by an appropriate circuit.)
The most common basis contains the following three functions:
$$\mathrm{NOT}(x) = \neg x, \qquad \mathrm{OR}(x_1, x_2) = x_1 \vee x_2, \qquad \mathrm{AND}(x_1, x_2) = x_1 \wedge x_2$$
(negation, disjunction, conjunction). Here are the value tables for these
functions:
 x | ¬x        x_1  x_2 | x_1 ∨ x_2        x_1  x_2 | x_1 ∧ x_2
 0 |  1         0    0  |     0             0    0  |     0
 1 |  0         0    1  |     1             0    1  |     0
                1    0  |     1             1    0  |     0
                1    1  |     1             1    1  |     1
Theorem 2.1. The basis $\{\mathrm{NOT}, \mathrm{OR}, \mathrm{AND}\} = \{\neg, \vee, \wedge\}$ is complete.
Proof. Any Boolean function of $n$ arguments is determined by its value
table, which contains $2^n$ rows. Each row contains the values of the arguments
and the corresponding value of the function.
If the function takes value 1 only once, it can be computed by a
conjunction of literals; each literal is either a variable or the negation of a variable.
For example, if $f(x_1, x_2, x_3)$ is true (equals 1) only for $x_1 = 1$, $x_2 = 0$, $x_3 = 1$,
then
$$f(x_1, x_2, x_3) = x_1 \wedge \neg x_2 \wedge x_3$$
(the conjunction is associative, so we omit parentheses; the order of literals
is also unimportant).
In the general case, a function $f$ can be represented in the form
$$f(x) = \bigvee_{\{u \,:\, f(u) = 1\}} \chi_u(x), \tag{2.1}$$
where $u = (u_1, \dots, u_n)$, and $\chi_u$ is the function such that $\chi_u(x) = 1$ if $x = u$,
and $\chi_u(x) = 0$ otherwise. □
A representation of type (2.1) is called a disjunctive normal form (DNF).
By definition, a DNF is a disjunction of conjunctions of literals. Later
we will also need the conjunctive normal form (CNF), a conjunction of
disjunctions of literals. Any Boolean function can be represented by a CNF.
This fact is dual to Theorem 2.1 and can be proved in a dual way (we start
with functions that have only one zero in the table). Or we can represent
$\neg f$ by a DNF and then get a CNF for $f$ by negation using De Morgan's
identities
$$x \wedge y = \neg(\neg x \vee \neg y), \qquad x \vee y = \neg(\neg x \wedge \neg y).$$
These identities show that the basis $\{\neg, \vee, \wedge\}$ is redundant: the subsets
$\{\neg, \vee\}$ and $\{\neg, \wedge\}$ also constitute complete bases. Another useful example
of a complete basis is $\{\wedge, \oplus\}$.
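The construction in the proof of Theorem 2.1 can be carried out mechanically from a value table. In the Python sketch below, the function names and the choice of the majority function as a test case are assumptions of this illustration. Each tuple $u$ with $f(u) = 1$ contributes one conjunction of literals $\chi_u$, and the DNF is their disjunction; the final loop checks De Morgan's identities on all inputs.

```python
from itertools import product

def dnf_terms(f, n):
    """All u in {0,1}^n with f(u) = 1; each yields one conjunction chi_u."""
    return [u for u in product((0, 1), repeat=n) if f(*u) == 1]

def chi(u, x):
    """Conjunction of literals: x_i where u_i = 1, NOT x_i where u_i = 0."""
    return int(all((xi if ui else 1 - xi) for ui, xi in zip(u, x)))

def eval_dnf(terms, x):
    """Disjunction of the conjunctions chi_u over the listed u."""
    return int(any(chi(u, x) for u in terms))

def majority(a, b, c):
    return int(a + b + c >= 2)

terms = dnf_terms(majority, 3)
print(terms)
for x in product((0, 1), repeat=3):
    assert eval_dnf(terms, x) == majority(*x)

# De Morgan's identities, checked on all four input pairs.
for x, y in product((0, 1), repeat=2):
    assert (x & y) == 1 - ((1 - x) | (1 - y))
    assert (x | y) == 1 - ((1 - x) & (1 - y))
```

The number of conjunctions equals the number of ones in the value table, so a DNF obtained this way may have up to $2^n$ terms.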
The number of assignments in a circuit is called its size. The minimal
size of a circuit over $\mathcal{A}$ that computes a given function $f$ is called the circuit
complexity of $f$ (with respect to the basis $\mathcal{A}$) and is denoted by $c_{\mathcal{A}}(f)$. The
value of $c_{\mathcal{A}}(f)$ depends on $\mathcal{A}$, but the transition from one finite complete
basis to another changes the circuit complexity by at most a constant factor:
if $\mathcal{A}_1$ and $\mathcal{A}_2$ are two finite complete bases, then $c_{\mathcal{A}_1}(f) = O(c_{\mathcal{A}_2}(f))$ and vice
versa. Indeed, each $\mathcal{A}_2$-assignment can be replaced by $O(1)$ $\mathcal{A}_1$-assignments
since $\mathcal{A}_1$ is a complete basis.
We are interested in asymptotic estimates for circuit complexity (up to
an $O(1)$-factor); therefore the particular choice of a complete basis is not
important. We use the notation $c(f)$ for the circuit complexity of $f$ with
respect to some finite complete basis.
[2] Problem 2.1. Construct an algorithm that determines whether a given
set of Boolean functions $\mathcal{A}$ constitutes a complete basis. (Functions are
represented by tables.)
[2!] Problem 2.2. Let $c_n$ be the maximum complexity $c(f)$ for Boolean
functions $f$ in $n$ variables. Prove that $1.99^n < c_n < 2.01^n$ for sufficiently
large $n$.
2.2. Circuits versus Turing machines. Any predicate $F$ on $\mathbb{B}^*$ can be
restricted to strings of fixed length $n$, giving rise to the Boolean function
$$F_n(x_1, \dots, x_n) = F(x_1 x_2 \cdots x_n).$$
Thus $F$ may be regarded as the sequence of Boolean functions $F_0, F_1, F_2, \dots$.
Similarly, in most cases of practical importance a (partial) function of
type $F : \mathbb{B}^* \to \mathbb{B}^*$ can be represented by a sequence of (partial) functions
$F_n : \mathbb{B}^n \to \mathbb{B}^{p(n)}$, where $p(n)$ is a polynomial with integer coefficients. We
will focus on predicates for simplicity, though.
Definition 2.1. A predicate $F$ belongs to the class P/poly ("nonuniform
P") if
$$c(F_n) = \mathrm{poly}(n).$$
(The term "nonuniform" indicates that a separate procedure, i.e., a Boolean
circuit, is used to perform computation with input strings of each individual
length.)
Theorem 2.2. P $\subseteq$ P/poly.
Proof. Let $F$ be a predicate decidable in polynomial time. We have to
prove that $F \in$ P/poly. Let $M$ be a TM that computes $F$ and runs in
polynomial time (and therefore in polynomial space). The computation by
$M$ on some input $x$ of length $n$ can be represented as a space-time diagram $\Gamma$
that is a rectangular table of size $T \times s$, where $T = \mathrm{poly}(n)$ and $s = \mathrm{poly}(n)$.
t = 0       Γ_{0,1}
t = 1
...
t = j       Γ_{j,k-1}   Γ_{j,k}     Γ_{j,k+1}
t = j + 1               Γ_{j+1,k}
...
t = T       ...
(each row has s cells)
In this table the $j$-th row represents the configuration of $M$ after $j$ steps:
$\Gamma_{j,k}$ corresponds to cell $k$ at time $j$ and consists of two parts: the symbol
on the tape and the state of the TM if its head is in the $k$-th cell (or a special
symbol $\emptyset$ if it is not). In other words, all $\Gamma_{j,k}$ belong to $S \times \bigl(\{\emptyset\} \cup Q\bigr)$.
(Only one entry in a row can have the second component different from $\emptyset$.)
For simplicity we assume that after the computation stops all subsequent
rows in the table repeat the last row of the computation.
There are local rules that determine the contents of a cell $\Gamma_{j+1,k}$ if we
know the contents of three neighboring cells in row $j$, i.e., $\Gamma_{j,k-1}$, $\Gamma_{j,k}$ and
$\Gamma_{j,k+1}$. Indeed, the head speed is at most one cell per step, so no other cell
can influence $\Gamma_{j+1,k}$. Rules for boundary cells are somewhat special; they
take into account that the head cannot be located outside the table.
Now we construct a circuit that computes $F(x)$ for inputs $x$ of length $n$.
The contents of each table cell can be encoded by a constant (i.e., independent
of $n$) number of Boolean variables. These variables (for all cells) will
be the auxiliary variables of the circuit.
Each variable encoding the cell $\Gamma_{j+1,k}$ depends only on the variables
that encode $\Gamma_{j,k-1}$, $\Gamma_{j,k}$, and $\Gamma_{j,k+1}$. This dependence is a Boolean function
with a constant number of arguments. These functions can be computed
by circuits of size $O(1)$. Combining these circuits, we obtain a circuit that
computes all of the variables which encode the state of every cell. The size
of this circuit is $O(sT) \cdot O(1) = \mathrm{poly}(n)$.
It remains to note that the variables in row 0 are determined by the
input string, and this dependence leads to additional $\mathrm{poly}(n)$ assignments.
Similarly, to find out the result of $M$ it is enough to look at the symbol
written in the 0-th cell of the tape at the end of the computation. So the
output is a function of $\Gamma_{T,0}$ and can be computed by an additional $O(1)$-size
circuit. Finally we get a $\mathrm{poly}(n)$-size circuit that simulates the behavior
of $M$ for inputs of length $n$ and therefore computes the Boolean function
$F_n$. □
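The local rule used in this proof can be illustrated by a small Python sketch. It is only an illustration under stated assumptions: the names `next_cell`, `build_table` and the toy machine `FLIP` are hypothetical, a cell is modeled as a pair (tape symbol, head state or `None`), and the toy machine is assumed never to move its head outside the table, so the special boundary rules are not modeled.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[str, Optional[str]]                     # (tape symbol, head state or None)
Delta = Dict[Tuple[str, str], Tuple[str, str, int]]  # (state, symbol) -> (state, symbol, shift)


def next_cell(left: Optional[Cell], mid: Cell, right: Optional[Cell],
              delta: Delta) -> Cell:
    """Contents of cell (j+1, k) computed from cells (j, k-1), (j, k), (j, k+1)."""
    sym, state = mid
    new_sym, new_state = sym, None
    if state is not None:                  # the head is in this cell
        if (state, sym) in delta:
            q, a, shift = delta[(state, sym)]
            new_sym = a
            if shift == 0:                 # head stays here
                new_state = q
        else:                              # machine has halted: the row repeats
            new_state = state
    # The head may arrive from the left neighbor (shift +1) or the right one (shift -1).
    for neighbor, incoming in ((left, +1), (right, -1)):
        if neighbor is not None and neighbor[1] is not None:
            move = delta.get((neighbor[1], neighbor[0]))
            if move is not None and move[2] == incoming:
                new_state = move[0]
    return new_sym, new_state


def build_table(x: str, delta: Delta, start: str, steps: int, width: int,
                blank: str = "_") -> List[List[Cell]]:
    """Fill the (steps + 1) x width table row by row using only the local rule."""
    row: List[Cell] = [(x[k] if k < len(x) else blank, start if k == 0 else None)
                       for k in range(width)]
    table = [row]
    for _ in range(steps):
        prev = table[-1]
        table.append([
            next_cell(prev[k - 1] if k > 0 else None,
                      prev[k],
                      prev[k + 1] if k + 1 < width else None,
                      delta)
            for k in range(width)
        ])
    return table


# Toy machine (an assumption for this example): flip every bit while moving
# right, and halt on the first blank because no transition is defined there.
FLIP: Delta = {("q", "0"): ("q", "1", +1), ("q", "1"): ("q", "0", +1)}
table = build_table("011", FLIP, "q", steps=4, width=5)
final_tape = "".join(sym for sym, _ in table[-1])
```

Each entry of a new row is computed from three entries of the previous row only, which is why the whole table can be produced by $O(sT)$ constant-size subcircuits.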
Remark 2.1. The class P/poly is bigger than P. Indeed, let $\varphi : \mathbb{N} \to \mathbb{B}$
be an arbitrary function (maybe even a noncomputable one). Consider the
predicate $F_\varphi$ such that $F_\varphi(x) = \varphi(|x|)$, where $|x|$ stands for the length of
string $x$. The restriction of $F_\varphi$ to strings of length $n$ is a constant function
(0 or 1), so the circuit complexity of $(F_\varphi)_n$ is $O(1)$. Therefore $F_\varphi$ for any $\varphi$
belongs to P/poly, although for a noncomputable $\varphi$ the predicate $F_\varphi$ is not
computable and thus does not belong to P.
Remark 2.2. That said, P/poly seems to be a good approximation of P
for many purposes. Indeed, the class P/poly is relatively small: out of
$2^{2^n}$ Boolean functions in $n$ variables only $2^{\mathrm{poly}(n)}$ functions have polynomial
circuit complexity (see solution to Problem 2.2). The difference between
uniform and nonuniform computation is more important for bigger classes.
For example, EXPTIME, the class of predicates decidable in time $2^{\mathrm{poly}(n)}$,
is a nontrivial computational class. However, the nonuniform analog of this
class includes all predicates!
The arguments used to prove Theorem 2.2 can also be used to prove the
following criterion:
Theorem 2.3. $F$ belongs to P if and only if these conditions hold:
(1) $F \in \mathrm{P/poly}$;
(2) the functions $F_n$ are computed by polynomial-size circuits $C_n$ with the
following property: there exists a TM that for each positive integer $n$
runs in time $\mathrm{poly}(n)$ and constructs the circuit $C_n$.
A sequence of circuits $C_n$ with this property is called polynomial-time uniform.
Note that the TM mentioned in (2) is not running in polynomial time
since its running time is polynomial in $n$ but not in $\log n$ (the number of bits
in the binary representation of $n$). Note also that we implicitly use some
natural encoding for circuits when saying "TM constructs a circuit".
Proof. ($\Rightarrow$) The circuit for computing $F_n$ constructed in the proof of
Theorem 2.2 has regular structure, and it is clear that the corresponding
sequence of assignments can be produced in polynomial time when $n$ is known.
($\Leftarrow$) This is also simple. We compute the size of the input string $x$,
then apply the TM to construct a circuit $C_{|x|}$ that computes $F_{|x|}$. Then
we perform the assignments indicated in $C_{|x|}$, using $x$ as the input, and
get $F(x)$. All these computations can be performed in polynomial (in $|x|$)
time. □
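The second half of the proof can be sketched in Python. This is an illustration rather than the construction from the text: the generator `build_circuit` is a hypothetical uniform family (it outputs a chain of XOR gates computing the parity of $n \ge 1$ bits), the gate encoding as triples is an assumption, and `decide` follows the three steps of the argument: measure $|x|$, construct $C_{|x|}$, carry out its assignments.

```python
from typing import List, Tuple

Gate = Tuple[str, str, Tuple[str, ...]]  # (target variable, operation, argument variables)

OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a: 1 - a,
}


def build_circuit(n: int) -> Tuple[List[Gate], str]:
    """Hypothetical uniform family: C_n computes the parity of n >= 1 input bits.

    Runs in time polynomial in n and returns (list of assignments, output variable).
    """
    gates: List[Gate] = []
    prev = "x0"
    for i in range(1, n):
        target = f"y{i}"
        gates.append((target, "XOR", (prev, f"x{i}")))
        prev = target
    return gates, prev


def evaluate(gates: List[Gate], output: str, x: str) -> int:
    """Perform the assignments of the circuit in order, using x as the input."""
    values = {f"x{i}": int(bit) for i, bit in enumerate(x)}
    for target, op, args in gates:
        values[target] = OPS[op](*(values[a] for a in args))
    return values[output]


def decide(x: str) -> int:
    """Compute F(x): construct C_{|x|}, then evaluate it on x."""
    gates, output = build_circuit(len(x))
    return evaluate(gates, output, x)
```

Both the construction and the evaluation take time polynomial in $|x|$, which is all the argument needs.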
[1] Problem 2.3. Prove that there exists a decidable predicate that
belongs to P/poly but not to P.
2.3. Basic algorithms. Depth, space and width. We challenge the
reader to study these topics by working on problems. (Solutions are also
provided, of course.) In Problems 2.9-2.16 we introduce some basic
algorithms which are used universally throughout this book. The algorithms
are described in terms of uniform (i.e., effectively constructed) sequences of
circuits. In this book we will be satisfied with polynomial-time uniformity;
cf. Theorem 2.3. [This property is intuitive and usually easy to check.
Another useful notion is logarithmic-space uniformity: the circuits should be
constructed by a Turing machine with work space $O(\log n)$ (machines with
limited work space are defined below; see 2.3.3). Most of the circuits we
build satisfy this stronger condition, although the proof might not be so
easy.]
2.3.1. Depth. In practice (e.g., when implementing circuits in hardware),
size is not the only circuit parameter that counts. Another important
parameter is depth. Roughly, it is the time that is needed to carry out all
assignments in the circuit, if we can do more than one in parallel.
Interestingly enough, it is also related to the space needed to perform the
computation (see Problems 2.17 and 2.18). In general, there is a trade-off
between size and depth. In our solutions we will be willing to increase the
size of a circuit a little to gain a considerable reduction of the depth (see,
e.g., Problem 2.14). As a result, we come up with circuits of polynomial size
and poly-logarithmic depth. (With a certain notion of uniformity, the
functions computed by such circuits form the so-called class NC, an interesting
subclass of P.)
More formally, the depth of a Boolean circuit is the maximum number
of gates on any path from the input to the output. The depth is $\le d$ if and
only if one can arrange the gates into $d$ layers, so that the input bits of any
gate at layer $j$ come from the layers $1, \ldots, j-1$. For example, the circuit in
Figure 2.1 has depth 3.
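The definition of depth translates directly into a short recursive computation. The sketch below is a hypothetical illustration: the dictionary encoding of a circuit and the three-gate example are assumptions made here and are not the circuit of Figure 2.1.

```python
from functools import lru_cache
from typing import Dict, Iterable, Tuple

# Each auxiliary variable maps to (operation, argument variables);
# any variable that is not a key of the dictionary is an input variable.
Circuit = Dict[str, Tuple[str, Tuple[str, ...]]]


def circuit_depth(gates: Circuit, outputs: Iterable[str]) -> int:
    """Maximum number of gates on any path from an input to an output."""

    @lru_cache(maxsize=None)
    def depth(var: str) -> int:
        if var not in gates:        # input variable: no gates on the path yet
            return 0
        _, args = gates[var]
        return 1 + max(depth(a) for a in args)

    return max(depth(v) for v in outputs)


# Hypothetical example circuit: y3 = (NOT (x1 AND x2)) OR x3.
example: Circuit = {
    "y1": ("AND", ("x1", "x2")),
    "y2": ("NOT", ("y1",)),
    "y3": ("OR", ("y2", "x3")),
}
example_depth = circuit_depth(example, ["y3"])
```

The value `depth(v)` is also the smallest layer in which the gate for `v` can be placed, which matches the layered description above.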
Unless stated explicitly otherwise, we assume that all gates have bounded
fan-in, i.e., the number of input bits. (This is always the case when circuits
are built over a finite basis. Unbounded fan-in can occur, for example, if
one uses OR gates with an arbitrary number of inputs.) We also assume
that the fan-out (the number of times an input or an auxiliary variable is
used) is bounded.¹ If it is necessary to use the variable more times, one may
insert additional "trivial gates" (identity transformations) into the circuit,
at the cost of some increase in size and depth. Note that a formula is a
circuit in which all auxiliary variables have fan-out 1, whereas the fan-out
of the input variables is unbounded.
[1] 2.4. Let $C$ be an $O(\log n)$-depth circuit which computes some function
$f : \mathbb{B}^n \to \mathbb{B}^m$. Prove that after eliminating all extra variables and gates in
$C$ (those which are not connected to the output), we get a circuit of size
$\mathrm{poly}(n + m)$.
[1!] 2.5. Let $C$ be a circuit (over some basis $\mathcal{B}$) which computes a function
$f : \mathbb{B}^n \to \mathbb{B}$. Prove that $C$ can be converted into an equivalent (i.e.,
computing the same function) formula $C'$ of the same depth over the same basis.
(It follows from the solution that the size of $C'$ does not exceed $c^d$, where $d$
is the depth and $c$ is the maximal fan-in.)
[3] 2.6 ("balanced formula"). Let $C$ be a formula of size $L$ over the basis
{NOT, OR, AND} (with fan-in $\le 2$). Prove that it can be converted into an
equivalent formula of depth $O(\log L)$ over the same basis.
(Therefore, it does not matter whether we define the formula complexity
of a function in terms of size or in terms of depth. This is not true for the
circuit complexity.)
[1] 2.7. Show that any function can be computed by a circuit of depth $\le 3$
with gates of type NOT, AND, and OR, if we allow AND- and OR-gates
with arbitrary fan-in and fan-out.
[2] 2.8. By definition, the function PARITY is the sum of $n$ bits modulo 2.
Suppose it is computed by a circuit of depth 3 containing NOT-gates,
AND-gates and OR-gates with arbitrary fan-in. Show that the size of the circuit
is exponential (at least $c^n$ for some $c > 1$ and for all $n$).
2.3.2. Basic algorithms.
[3!] 2.9. Comparison. Construct a circuit of size $O(n)$ and depth $O(\log n)$
that tells whether two $n$-bit integers are equal and if they are not, which
one is greater.
[2] 2.10. Let $n = 2^l$. Construct circuits of size $O(n)$ and depth $O(\log n)$
for the solution of the following problems.
a) Access by index. Given an $n$-bit string $x = x_0 \cdots x_{n-1}$ (a table)
and an $l$-bit number $j$ (an index), find $x_j$.
b) Search. Evaluate the disjunction $y = x_0 \vee \cdots \vee x_{n-1}$, and if it equals
1, find the smallest $j$ such that $x_j = 1$.
¹ This restriction is needed to allow conversion of Boolean circuits into quantum circuits
without extra cost (see Section 7). However, it is in no way standard: in most studies in
computational complexity, unbounded fan-out is assumed.
We now describe one rather general method of parallelizing computation.
A finite-state automaton is a device with an input alphabet $A'$, an output
alphabet $A''$, a set of states $Q$ and a transition function $D : Q \times A' \to Q \times A''$
(recall that such a device is a part of a Turing machine). It is initially set
to some state $q_0 \in Q$. Then it receives input symbols $x_0, \ldots, x_{m-1} \in A'$ and
changes its state from $q_0$ to $q_1$ to $q_2$, etc., according to the rule²
$$(q_{j+1}, y_j) = D(q_j, x_j).$$
The iterated application of $D$ defines a function
$D^m : (q_0, x_0, \ldots, x_{m-1}) \mapsto (q_m, y_0, \ldots, y_{m-1})$.
We may assume without loss of generality that $Q = \mathbb{B}^r$,
$A' = \mathbb{B}^{r'}$, $A'' = \mathbb{B}^{r''}$; then $D : \mathbb{B}^{r+r'} \to \mathbb{B}^{r+r''}$ whereas
$D^m : \mathbb{B}^{r+r'm} \to \mathbb{B}^{r+r''m}$.
The work of the automaton can be represented by this diagram:

        x_{m-1}                               x_1             x_0
           |                                   |               |
           v                                   v               v
q_m <---- [D] <-- q_{m-1} <-- ... <-- q_2 <-- [D] <-- q_1 <-- [D] <-- q_0
           |                                   |               |
           v                                   v               v
        y_{m-1}                               y_1             y_0
[3!] 2.11 ("parallelization of iteration"). Let the integers $r$, $r'$, $r''$ and $m$ be
fixed; set $k = r + r' + r''$. Construct a circuit of size $\exp(O(k))\, m$ and depth
$O(k \log m)$ that receives a transition function $D : \mathbb{B}^{r+r'} \to \mathbb{B}^{r+r''}$ (as a value
table), an initial state $q_0$ and input symbols $x_0, \ldots, x_{m-1}$, and produces the
output $(q_m, y_0, \ldots, y_{m-1}) = D^m(q_0, x_0, \ldots, x_{m-1})$.
[2!] 2.12. Addition. Construct a circuit of size $O(n)$ and depth $O(\log n)$
that adds two $n$-bit integers.
[3!] 2.13. The following two problems are closely related.
a) Iterated addition. Construct a circuit of size $O(nm)$ and depth
$O(\log n + \log m)$ for the addition of $m$ $n$-digit numbers.
b) Multiplication. Construct a circuit with the same parameters for
the multiplication of an $n$-digit number by an $m$-digit number.
² Our definition of an automaton is not standard. We require that the automaton reads and
outputs one symbol at each step. Traditionally, an automaton is allowed to either read a symbol,
or output a symbol, or both, depending on the current state. The operation of such a general
automaton can be represented as the application of an automaton in our sense (with a suitable
output alphabet $B$) followed by substitution of a word for each output symbol.
[3!] 2.14. Division. This problem also comes in two variants.
a) Compute the inverse of a real number
$x = 1.x_1 x_2 \cdots = 1 + \sum_{j=1}^{\infty} 2^{-j} x_j$
with precision $2^{-n}$. By definition, this requires finding a number $z$ such that
$|z - x^{-1}| \le 2^{-n}$. For this to be possible, $x$ must be known with precision
$2^{-n}$ or better; let us assume that $x$ is represented by an $(n+1)$-digit number
$x'$ such that $|x' - x| \le 2^{-(n+1)}$. Construct an $O(n^2 \log n)$-size,
$O((\log n)^2)$-depth circuit for the solution of this problem.
b) Divide two integers with remainder: $(a, b) \mapsto \bigl(\lfloor a/b \rfloor, (a \bmod b)\bigr)$,
where $0 \le a < 2^k b$ and $0 < b < 2^n$. In this case, construct a circuit of size
$O(nk + k^2 \log k)$ and depth $O(\log n + (\log k)^2)$.
[2!] 2.15. Majority. The majority function $\mathrm{MAJ} : \mathbb{B}^n \to \mathbb{B}$ equals 1 for
strings in which the number of 1s is greater than the number of 0s, and
equals 0 elsewhere. Construct a circuit of size $O(n)$ and depth $O(\log n)$ that
computes the majority function.
[3!] 2.16. Connecting path. Construct a circuit of size $O(n^3 \log n)$ and
depth $O((\log n)^2)$ that checks whether two fixed vertices of an undirected
graph are connected by a path. The graph has $n$ vertices, labeled $1, \ldots, n$;
there are $m = n(n-1)/2$ input variables $x_{ij}$ (where $i < j$) indicating whether
there is an edge between $i$ and $j$.
2.3.3. Space and width. In the solution to the above problems we strove
to provide parallel algorithms, which were described by circuits of
poly-logarithmic depth. We now show that, if the circuit size is not taken into
account, then uniform computation with poly-logarithmic depth³ is
equivalent to computation with poly-logarithmic space.
We are going to study computation with very limited space, so small
that the input string would not fit into it. So, let us assume that the input
string $x = x_0 \cdots x_{n-1}$ is stored in a supplementary read-only memory, and
the Turing machine can access bits of $x$ by their numbers. We may think
of the input string as a function $X : j \mapsto x_j$ computed by some external
agent, called an "oracle". The length of an "oracle query" $j$ is included in
the machine work space, but the length of $x$ is not. This way we can access
all input bits with space $O(\log n)$ (but no smaller).
Definition 2.2. Let $X : A^* \to A^*$ be a partial function. A Turing machine
with oracle $X$ is an ordinary TM with a supplementary tape, in which it can
write a string $z$ and have the value of $X(z)$ available for inspection at the
next computation step.
In our case $X(z) = x_j$, where $z$ is the binary representation of $j$
($0 \le j \le n-1$) by $\lceil \log_2 n \rceil$ digits; otherwise $X(z)$ is undefined.
³ As is usual, we consider circuits over a finite complete basis; the fan-in and fan-out are
bounded.
The computation of a function $f : \mathbb{B}^n \to \mathbb{B}^m$ by such a machine is defined
as follows. Let $x$ ($|x| = n$) be the input. We write $n$ and another number
$k$ ($0 \le k < m$) on the machine tape and run the machine. We allow it to
query bits of $x$. When the computation is complete, the first cell of the tape
must contain the $k$-th bit of $f(x)$. Note that if the work space is limited
by $s$, then $n \le 2^s$ and $m \le 2^s$. The computation time is also bounded:
the machine either stops within $\exp(O(s))$ steps, or enters a cycle and runs
forever.
[2!] 2.17 ("Small depth circuits can be simulated in small space"). Prove
that there exists a Turing machine that evaluates the output variables of a
circuit of depth $d$ over a fixed basis, using work space $O(d + \log m)$ (where
$m$ is the number of the output variables). The input to the machine consists
of a description of the circuit and the values of its input variables.
[3!] 2.18 ("Computation with small space is parallelizable"). Let $M$ be a
Turing machine. For each choice of $n, m, s$, let $f_{n,m,s} : \mathbb{B}^n \to \mathbb{B}^m$ be the
function computed by the machine $M$ with space $s$ (it may be a partial
function). Prove that there exists a family of $\exp(O(s))$-size, $O(s^2)$-depth
circuits $C_{n,m,s}$ which compute the functions $f_{n,m,s}$.
(These circuits can be constructed by a TM with space $O(s)$, but we
will not prove that.)
The reader might wonder why we discuss the space restriction in terms
of Turing machines while the circuit language is apparently more convenient.
So, let us ask this question: what is the circuit analog of the computation
space? The obvious answer is that it is the circuit width.
Let $C$ be a circuit whose gates are arranged into $d$ layers. The width
of $C$ is the maximum amount of information carried between the layers,
not including the input variables. More formally, for each $l = 1, \ldots, d$ we
define $w_l$ to be the number of auxiliary variables from layers $1, \ldots, l$ that
are output variables or connected to some variables from layers $l+1, \ldots, d$
(i.e., used in the right-hand side of the corresponding assignments). Then
the width of $C$ is $w = \max\{w_1, \ldots, w_d\}$.
But here comes a little surprise: any Boolean function can be computed
by a circuit of bounded width (see Problem 2.19 below). Therefore the width
is a rather meaningless parameter, unless we put some other restrictions on
the circuit. To characterize computation with limited space (e.g., the class
PSPACE), one has to use either Turing machines, or some class of circuits
with regular structure.
[3] 2.19 (Barrington [8]). Let $C$ be a formula of depth $d$ that computes a
Boolean function $f(x_1, \ldots, x_n)$. Construct a circuit of size $\exp(O(d))$ and
width $O(1)$ that computes the same function.
3. The class NP: Reducibility and completeness
3.1. Nondeterministic Turing machines. NP is the class of predicates
recognizable in polynomial time by "nondeterministic Turing machines."
(The word "nondeterministic" is not appropriate but widely used.)
The class NP is defined only for predicates. One says, for example, that
"the property that a graph has a Hamiltonian cycle belongs to NP". (A
Hamiltonian cycle is a cycle that traverses all vertices exactly once.)
We give several definitions of this class. The first uses nondeterministic
Turing machines. A nondeterministic Turing machine (NTM) resembles
an ordinary (deterministic) machine, but can nondeterministically choose
one of several actions possible in a given configuration. More formally, a
transition function of an NTM is multivalued: for each pair (state, symbol)
there is a set of possible actions. Each action has the form (new state, new
symbol, shift). If the set of possible actions has cardinality at most 1 for
each state-symbol combination, we get an ordinary Turing machine.
A computational path of an NTM is determined by a choice of one of the
possible transitions at each step; different paths are possible for the same
input.
Definition 3.1. A predicate $L$ belongs to the class NP if there exists an
NTM $M$ and a polynomial $p(n)$ such that
$L(x) = 1 \Rightarrow$ there exists a computational path that gives answer
"yes" in time not exceeding $p(|x|)$;
$L(x) = 0 \Rightarrow$ (version 1) there is no path with this property;
(version 2) ... and, moreover, there is no path (of any
length) that gives answer "yes".
Remark 3.1. Versions 1 and 2 are equivalent. Indeed, let an NTM $M_1$
satisfy version 1 of the definition. To exclude "yes" answers for long
computational paths, it suffices to simulate $M_1$ while counting its steps, and to
abort the computation after $p(|x|)$ steps.
Remark 3.2. The argument in Remark 3.1 has a subtle error. If the
coefficients of the polynomial $p$ are noncomputable, difficulties may arise when we
have to compare the number of steps with the value of the polynomial. In
order to avoid this complication we will add to Definition 3.1 an additional
requirement: $p(n)$ has integer coefficients.
Remark 3.3. By definition $\mathrm{P} \subseteq \mathrm{NP}$. Is this inclusion strict? Rather intense
although unsuccessful attempts have been made over the past 30 years to
prove the strictness. Recently S. Smale included the $\mathrm{P} \stackrel{?}{=} \mathrm{NP}$ problem in
the list of most important mathematical problems for the new century (the
other problems are the Riemann hypothesis and the Poincaré conjecture).
More practical people can dream of the \$1,000,000 that the Clay Institute
offers for the solution of this problem.
Now we give another definition of the class NP, which looks more
natural. It uses the notion of a polynomially decidable predicate of two variables:
a predicate $R(x, y)$ (where $x$ and $y$ are strings) is polynomially decidable
(decidable in polynomial time) if there is a (deterministic) TM that computes
it in time $\mathrm{poly}(|x|, |y|)$ (which means $\mathrm{poly}(|x| + |y|)$ or
$\mathrm{poly}(\max\{|x|, |y|\})$; these two expressions are equivalent).
Definition 3.2. A predicate $L$ belongs to the class NP if it can be
represented as
$$L(x) = \exists y\, \bigl( (|y| < q(|x|)) \wedge R(x, y) \bigr),$$
where $q$ is a polynomial (with integer coefficients), and $R$ is a predicate of
two variables decidable in polynomial time.
Remark 3.4. Let $R(x, y)$ = "$y$ is a Hamiltonian cycle in the graph $x$".
More precisely, we should say: "$x$ is a binary encoding of some graph, and $y$
is an encoding of a Hamiltonian cycle in that graph". Take $q(n) = n$. Then
$L(x)$ means that graph $x$ has a Hamiltonian cycle. (We assume that the
encoding of any cycle in a graph is shorter than the encoding of the graph
itself.)
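The predicate $R(x, y)$ of Remark 3.4 can be sketched in Python. The encoding is an assumption made for this example: the graph $x$ is given as a vertex count together with a list of undirected edges, and the certificate $y$ is a list of vertices in the order in which the cycle visits them; the function name is hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def is_hamiltonian_cycle(n: int, edges: Iterable[Tuple[int, int]],
                         cycle: Sequence[int]) -> bool:
    """R(x, y): does `cycle` list every vertex 0..n-1 exactly once, with
    consecutive vertices (and the last and first one) joined by edges?"""
    if n < 3 or sorted(cycle) != list(range(n)):
        return False
    undirected = {frozenset(e) for e in edges}
    return all(
        frozenset((cycle[i], cycle[(i + 1) % n])) in undirected
        for i in range(n)
    )


# A square on vertices 0, 1, 2, 3.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
good = is_hamiltonian_cycle(4, square, [0, 1, 2, 3])
bad = is_hamiltonian_cycle(4, square, [0, 2, 1, 3])
```

The check runs in time polynomial in the size of the graph, whereas no polynomial-time method is known for finding a suitable $y$.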
Theorem 3.1. Definitions 3.1 and 3.2 are equivalent.
Proof. Definition 3.1 $\Rightarrow$ Definition 3.2. Let $M$ be an NTM and let $p(n)$ be
the polynomial of the first definition. Consider the predicate $R(x, y)$ = "$y$
is a description of a computational path that starts with input $x$, ends with
answer 'yes', and takes at most $p(|x|)$ steps". Such a description has length
proportional to the computation time if an appropriate encoding is used
(and even if we use a table as in the proof of Theorem 2.2, the description
length is at most quadratic). Therefore for $q(n)$ in the second definition we
can take $O(p(n))$ (or $O(p^2(n))$ if we use a less efficient encoding).
It remains to prove that the predicate $R$ belongs to P. This is almost
obvious. We must check that we are presented with a valid description of some
computational path (this is a polynomial task), and that this path starts
with $x$, takes at most $p(|x|)$ steps, and ends with "yes" (that is also easy).
Definition 3.2 $\Rightarrow$ Definition 3.1. Let $R, q$ be as in Definition 3.2. We
construct an NTM $M$ for Definition 3.1. $M$ works in two stages.
First, $M$ nondeterministically guesses $y$. More precisely, this means
that $M$ goes to the end of the input string, moves one cell to the right,
writes #, moves once more to the right, and then writes some string $y$ ($M$'s
nondeterministic rules allow it to write any symbol and then move to the
right, or finish writing). After that, the tape has the form $x\#y$ for some $y$,
and $M$ goes to the second stage.
At the second stage $M$ checks that $|y| < q(|x|)$ (note that $M$ can write
a very long $y$ for any given $x$) and computes $R(x, y)$ (using the polynomial
algorithm that exists according to Definition 3.2). If $x \in L$, then there
is some $y$ such that $|y| < q(|x|)$ and $R(x, y)$. Therefore $M$ has, for $x$, a
computational path of polynomial length ending with "yes". If $x \notin L$, no
computational path ends with "yes". □
Now we proceed to yet another description of NP that is just a
reformulation of Definition 3.2, but which has a form that can be used to define
other complexity classes.
Definition 3.3. Imagine two persons: King Arthur, whose mental abilities
are polynomially bounded, and a wizard Merlin, who is intellectually
omnipotent and knows everything. A is interested in some property $L(x)$ (he
wants, for example, to be sure that some graph $x$ has a Hamiltonian cycle).
M wants to convince A that $L(x)$ is true. But A does not trust M ("he
is too clever to be loyal") and wants to make sure $L(x)$ is true, not just
believe M.
So Arthur arranges that, after both he and Merlin see input string $x$,
M writes a note to A where he "proves" that $L(x)$ is true. Then A verifies
this proof by some polynomial proof-checking procedure.
The proof-checking procedure is a polynomial predicate
$R(x, y)$ = "$y$ is a proof of $L(x)$".
It should satisfy two properties:
$L(x) = 1 \Rightarrow$ M can convince A that $L(x)$ is true by presenting some
proof $y$ such that $R(x, y)$;
$L(x) = 0 \Rightarrow$ whatever M says, A is not convinced: $R(x, y)$ is false for
any $y$.
Moreover, the proof $y$ should have polynomial (in $|x|$) length, otherwise
A cannot check $R(x, y)$ in polynomial (in $|x|$) time.
In this way, we arrive precisely at Definition 3.2.
3.2. Reducibility and NP-completeness. The notion of reducibility
allows us to verify that a predicate is at least as difficult as some other
predicate.
Definition 3.4 (Karp reducibility). A predicate $L_1$ is reducible to a
predicate $L_2$ if there exists a function $f \in \mathrm{P}$ such that $L_1(x) = L_2(f(x))$ for any
input string $x$.
We say that $f$ reduces $L_1$ to $L_2$. Notation: $L_1 \propto L_2$.
Karp reducibility is also called "polynomial reducibility" (or just
"reducibility").
Lemma 3.2. Let $L_1 \propto L_2$. Then
(a) $L_2 \in \mathrm{P} \Rightarrow L_1 \in \mathrm{P}$;
(b) $L_1 \notin \mathrm{P} \Rightarrow L_2 \notin \mathrm{P}$;
(c) $L_2 \in \mathrm{NP} \Rightarrow L_1 \in \mathrm{NP}$.
Proof. To prove (a) let us note that $|f(x)| = \mathrm{poly}(|x|)$ for any $f \in \mathrm{P}$.
Therefore, we can decide $L_1(x)$ in polynomial time as follows: we compute
$f(x)$ and then compute $L_2(f(x))$.
Part (b) follows from (a).
Part (c) is also simple. It can be explained in various ways. Using the
Arthur-Merlin metaphor, we say that Merlin communicates to Arthur a
proof that $L_2(f(x))$ is true (if it is true). Then Arthur computes $f(x)$ by
himself and checks whether $L_2(f(x))$ is true, using Merlin's proof.
Using Definition 3.2, we can explain the same thing as follows:
$$L_1(x) \Leftrightarrow L_2(f(x)) \Leftrightarrow \exists y\, \bigl( (|y| < q(|f(x)|)) \wedge R(f(x), y) \bigr)
\Leftrightarrow \exists y\, \bigl( (|y| < r(|x|)) \wedge R'(x, y) \bigr).$$
Here $R'(x, y)$ stands for $(|y| < q(|f(x)|)) \wedge R(f(x), y)$, and $r(n) = q(h(n))$,
where $h(n)$ is a polynomial bound for the time needed to compute $f(x)$ for
any string $x$ of length $n$ (and, therefore, $|f(x)| \le h(|x|)$ for any $x$). □
Definition 3.5. A predicate $L \in \mathrm{NP}$ is NP-complete if any predicate in NP
is reducible to it.
If some NP-complete predicate can be computed in time $T(n)$, then any
NP-predicate can be computed in time $\mathrm{poly}(n) + T(\mathrm{poly}(n))$. Therefore, if
some NP-complete predicate belongs to P, then P = NP. Put it this way:
if $\mathrm{P} \ne \mathrm{NP}$ (which is probably true), then no NP-complete predicate belongs
to P.
If we measure running time "up to a polynomial", then we can say that
NP-complete predicates are the "most difficult" ones in NP.
The key result in computational complexity says that NP-complete
predicates do exist. Here is one of them, called satisfiability: $SAT(x)$ means that
$x$ is a propositional formula (containing Boolean variables and operations
$\neg$, $\wedge$, and $\vee$) that is satisfiable, i.e., true for some values of the variables.
Theorem 3.3 (Cook, Levin).
(1) $SAT \in \mathrm{NP}$;
(2) $SAT$ is NP-complete.
Corollary. If $SAT \in \mathrm{P}$, then P = NP.
Proof of Theorem 3.3. (1) To convince Arthur that a formula is
satisfiable, Merlin needs only to show him the values of the variables that make it
true. Then Arthur can compute the value of the whole formula by himself.
(2) Let $L(x)$ be an NP-predicate and
$$L(x) = \exists y\, \bigl( (|y| < q(|x|)) \wedge R(x, y) \bigr)$$
for some polynomial $q$ and some predicate $R$ decidable in polynomial time.
We need to prove that $L$ is reducible to $SAT$. Let $M$ be a Turing machine
that computes $R$ in polynomial time. Consider the computation table (see
the proof of Theorem 2.2) for $M$ working on some input $x\#y$. We will use
the same variables as in the proof of Theorem 2.2. These variables encode
the contents of cells in the computation table.
Now we write a formula that says that values of variables form an
encoding of a successful computation (with answer "yes"), starting with the
input $x\#y$. To form a valid computation table, values should obey some
local rules for each group of four cells: three neighboring cells in one row and
the cell in the next row directly below the middle one (as in the proof of
Theorem 2.2).
These local rules can be written as formulas in $4t$ variables (if $t$ variables are
needed to encode one cell). We write the conjunction of these formulas and
add formulas saying that the first line contains the input string $x$ followed
by # and some binary string $y$, and that the last line contains the answer
"yes".
The satisfying assignment for our formula will be an encoding of a
successful computation of $M$ on input $x\#y$ (for some binary string $y$). On the
other hand, any successful computation that uses at most $S$ tape cells and
requires at most $T$ steps (where $T \times S$ is the size of the computation table
that is encoded) can be transformed into a satisfying assignment.
Therefore, if we consider a computational table that is large enough to
contain the computation of $R(x, y)$ for any $y$ such that $|y| < q(|x|)$, and
write the corresponding formula as explained above, we get a
polynomial-size formula that is satisfiable if and only if $L(x)$ is true. Therefore $L$ is
reducible to $SAT$. □
Other examples of NP-complete problems (predicates) can be obtained
using the following lemma.
Lemma 3.4. If $SAT \propto L$ and $L \in \mathrm{NP}$, then $L$ is NP-complete. More
generally, if $L_1$ is NP-complete, $L_1 \propto L_2$, and $L_2 \in \mathrm{NP}$, then $L_2$ is
NP-complete.
Proof. The reducibility relation is transitive: if $L_1 \propto L_2$ and $L_2 \propto L_3$, then
$L_1 \propto L_3$. (Indeed, the composition of two functions from P belongs to P.)
According to the hypothesis, any NP-problem is reducible to $L_1$, and $L_1$ is
reducible to $L_2$. Therefore any NP-problem is reducible to $L_2$. □
Now let us consider the satisfiability problem restricted to 3-CNF. Recall
that a CNF (conjunctive normal form) is a conjunction of clauses; each
clause is a disjunction of literals; each literal is either a variable or a negation
of a variable. If each clause contains at most three literals, we get a 3-CNF.
By 3-SAT we denote the following predicate:
3-SAT$(x)$ = "$x$ is a satisfiable 3-CNF".
Evidently, 3-SAT is reducible to $SAT$ (because any 3-CNF is a formula).
The next theorem shows that the reverse is also true: $SAT$ is reducible to
3-SAT. Therefore 3-SAT is NP-complete (by Lemma 3.4).
Theorem 3.5. $SAT \propto$ 3-SAT.
Proof. For any propositional formula (and even for any circuit over the
standard basis {AND, OR, NOT}), we construct a 3-CNF that is satisfiable
if and only if the given formula is satisfiable (the given circuit produces
output 1 for some input).
Let $x_1, \ldots, x_n$ be input variables of the circuit, and let $y_1, \ldots, y_s$ be
auxiliary variables (see the definition of a circuit). Each assignment involves
at most three variables (1 on the left-hand side, and 1 or 2 on the right-hand
side).
Now we construct a 3-CNF that has variables $x_1, \ldots, x_n, y_1, \ldots, y_s$ and
is true if and only if the values of all $y_j$ are correctly computed (i.e., they
coincide with the right-hand sides of the assignments) and the last variable is
true. To this end, we replace each assignment by an equivalence (of Boolean
expressions) and represent this equivalence as a 3-CNF:
$$\bigl(y \Leftrightarrow (x_1 \vee x_2)\bigr) = (x_1 \vee x_2 \vee \neg y) \wedge (\neg x_1 \vee x_2 \vee y) \wedge (x_1 \vee \neg x_2 \vee y) \wedge (\neg x_1 \vee \neg x_2 \vee y),$$
$$\bigl(y \Leftrightarrow (x_1 \wedge x_2)\bigr) = (x_1 \vee x_2 \vee \neg y) \wedge (\neg x_1 \vee x_2 \vee \neg y) \wedge (x_1 \vee \neg x_2 \vee \neg y) \wedge (\neg x_1 \vee \neg x_2 \vee y),$$
$$\bigl(y \Leftrightarrow \neg x\bigr) = (x \vee y) \wedge (\neg x \vee \neg y).$$
Finally, we take the conjunction of all these 3-CNF's and the variable $y_s$
(the latter represents the condition that the output of the circuit is 1).
Let us assume that the resulting 3-CNF is satisfied by some $x_1, \ldots, x_n$,
$y_1, \ldots, y_s$. If we plug the same values of $x_1, \ldots, x_n$ into the circuit, then
the auxiliary variables will be equal to $y_1, \ldots, y_s$, so the circuit output will
be $y_s = 1$. Conversely, if the circuit produces 1 for some inputs, then the
3-CNF is satisfied by the same values of $x_1, \ldots, x_n$ and appropriate values
of the auxiliary variables.
So our transformation (of a circuit into a 3-CNF) is indeed a reduction
of $SAT$ to 3-SAT. □
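The transformation in this proof can be sketched in Python. The encoding is an assumption made for this example: variables are numbered 1, 2, ..., a literal is a signed integer (a negative number denotes a negated variable), and a circuit is a list of gates `(op, y, a)` or `(op, y, a, b)` whose last gate produces the output. The brute-force `satisfiable` helper is exponential and serves only to check small cases.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

Clause = List[int]   # signed variable indices; -i stands for the negation of variable i


def gate_clauses(op: str, y: int, a: int, b: Optional[int] = None) -> List[Clause]:
    """3-CNF expressing the equivalence y <=> op(a, b), as in the proof."""
    if op == "OR":
        return [[a, b, -y], [-a, b, y], [a, -b, y], [-a, -b, y]]
    if op == "AND":
        return [[a, b, -y], [-a, b, -y], [a, -b, -y], [-a, -b, y]]
    if op == "NOT":
        return [[a, y], [-a, -y]]
    raise ValueError(f"unknown gate type: {op}")


def circuit_to_3cnf(gates: Sequence[Tuple]) -> List[Clause]:
    """Conjunction of all gate equivalences plus the clause 'output variable is true'."""
    clauses: List[Clause] = []
    for op, y, *args in gates:
        clauses += gate_clauses(op, y, *args)
    clauses.append([gates[-1][1]])
    return clauses


def satisfiable(clauses: List[Clause], num_vars: int) -> bool:
    """Brute-force check over all assignments (for tiny examples only)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


# Variables: 1 = x1, 2 = x2, 3 = y1.  Circuit: y1 = x1 OR x2 (satisfiable).
or_cnf = circuit_to_3cnf([("OR", 3, 1, 2)])
# Variables: 1 = x1, 2 = y1, 3 = y2.  Circuit: y2 = x1 AND (NOT x1) (unsatisfiable).
contradiction_cnf = circuit_to_3cnf([("NOT", 2, 1), ("AND", 3, 1, 2)])
```

The number of clauses is at most four per gate plus one, so the size of the 3-CNF is linear in the size of the circuit.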
Here is another simple example of reduction.
ILP (integer linear programming). Given a system of linear inequalities
with integer coefficients, is there an integer solution? (In other words, is the
system consistent?)
In this problem the input is the coefficient matrix and the vector of the
right-hand sides of the inequalities. It is not obvious that $ILP \in \mathrm{NP}$. Indeed,
the solution might exist, but Merlin might not be able to communicate it to
Arthur because it is not immediately clear that the number of bits needed to
represent the solution is polynomial.
However, it is in fact true that, if a system of inequalities with integer
coefficients has an integer solution, then it has an integer solution whose
binary representation has size bounded by a polynomial in the bit size of
the system; see [55, vol. 2, §17.1]. Therefore, $ILP$ is in NP.
Now we reduce 3-SAT to $ILP$. Assume that a 3-CNF is given. We
construct a system of inequalities that has integer solutions if and only if
the 3-CNF is satisfiable. For each Boolean variable $x_i$ we consider an integer
variable $p_i$. The negation $\neg x_i$ corresponds to the expression $1 - p_i$. Each
clause $X_j \vee X_k \vee X_m$ (where $X_*$ are literals) corresponds to the inequality
$P_j + P_k + P_m \ge 1$, where $P_j, P_k, P_m$ are the expressions corresponding to
$X_j, X_k, X_m$. It remains to add the inequalities $0 \le p_i \le 1$ for all $i$, and we
get a system whose solutions are satisfying assignments for the given 3-CNF.
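The reduction of 3-SAT to ILP described above can be sketched in Python. The representation is an assumption made for this example: literals are signed integers as in DIMACS-style encodings, and an inequality is stored as a pair (coefficients, right-hand side) meaning that the weighted sum of the $p_i$ is at least the right-hand side. Since a negated literal contributes $1 - p_i$, its constant 1 is moved to the right-hand side.

```python
from itertools import product
from typing import Dict, List, Tuple

Inequality = Tuple[Dict[int, int], int]   # (coeffs, rhs): sum(coeffs[i] * p[i]) >= rhs


def clause_to_inequality(clause: List[int]) -> Inequality:
    """Translate P_j + P_k + P_m >= 1, where P = p_i or P = 1 - p_i."""
    coeffs: Dict[int, int] = {}
    negated = 0
    for lit in clause:
        var = abs(lit)
        if lit > 0:
            coeffs[var] = coeffs.get(var, 0) + 1
        else:
            coeffs[var] = coeffs.get(var, 0) - 1
            negated += 1              # the constant 1 of (1 - p_i) goes to the rhs
    return coeffs, 1 - negated


def cnf_to_ilp(clauses: List[List[int]], num_vars: int) -> List[Inequality]:
    system = [clause_to_inequality(c) for c in clauses]
    for i in range(1, num_vars + 1):
        system.append(({i: 1}, 0))    # p_i >= 0
        system.append(({i: -1}, -1))  # -p_i >= -1, i.e. p_i <= 1
    return system


def feasible(system: List[Inequality], p: Dict[int, int]) -> bool:
    return all(sum(c * p[i] for i, c in coeffs.items()) >= rhs
               for coeffs, rhs in system)


def satisfies(clauses: List[List[int]], p: Dict[int, int]) -> bool:
    return all(any((p[abs(l)] == 1) == (l > 0) for l in c) for c in clauses)


# Small check: 0/1 points solve the system exactly when they satisfy the 3-CNF.
cnf = [[1, -2, 3], [-1, 2], [-3, -1, -2]]
system = cnf_to_ilp(cnf, 3)
points = [dict(zip((1, 2, 3), bits)) for bits in product((0, 1), repeat=3)]
agree = all(feasible(system, p) == satisfies(cnf, p) for p in points)
```

The bounds $0 \le p_i \le 1$ force every integer solution to be a 0/1 vector, which is why checking the eight 0/1 points is enough in this small example.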
Remark 3.5. If we do not require the solution to be integer-valued, we get
the standard linear programming problem. Polynomial algorithms for the
solution of this problem (due to Khachiyan and Karmarkar) are described,
e.g., in [55, vol. 1, §§13, 15.1].
An extensive list of NP-complete problems can be found in [29]. Usually
NP-completeness is proved by some reduction. Here are several examples of
NP-complete problems.
3-coloring. For a given graph $G$ determine whether it admits a
3-coloring. (By a 3-coloring we mean a coloring of the vertices with 3 colors
such that each edge has endpoints of different colors.)
(It turns out that a similar 2-coloring problem can be solved in
polynomial time.)
Clique. For a graph $G$ and an integer $k$ determine whether the graph
has a $k$-clique (a set of $k$ vertices such that every two of its elements are
connected by an edge).
Problems
[3] 3.1. Prove that one can check the satisfiability of a 2-CNF (a conjunction
of disjunctions, each containing two literals) in polynomial time.
[2] 3.2. Prove that the problem of the existence of an Euler cycle in an
undirected graph (an Euler cycle is a cycle that traverses each edge exactly
once) belongs to P.
[1!] 3.3. Suppose we have an NP-oracle, a magic device that can immediately
solve any instance of the SAT problem for us. In other words, for
any propositional formula the oracle tells whether it is satisfiable or not.
Prove that there is a polynomial-time algorithm that finds a satisfying
assignment to a given formula by making a polynomial number of queries to
the oracle. (A similar statement is true for the Hamiltonian cycle: finding a
Hamiltonian cycle in a graph is at most polynomially harder than checking
for its existence.)
[3!] 3.4. There are $n$ boys and $n$ girls. It is known which boys agree to
dance with which girls and vice versa. We want to know whether there
exists a perfect matching (the boys and the girls can dance in pairs so that
everybody is satisfied). Prove that this problem belongs not only to NP
(which is almost evident), but also to P.
3.5. Construct
[2!] (a) a polynomial reduction of the 3-SAT problem to the clique problem;
[3] (b) a polynomial reduction of 3-SAT to Clique that conserves the number
of solutions (if a 3-CNF $F$ is transformed into a graph $H$ and an integer
$k$, then the number of satisfying assignments for $F$ equals the number of
$k$-cliques in $H$).
3.6. Construct
[2!] (a) a polynomial reduction of 3-SAT to 3-coloring;
[3] (b) the same as for (a), with the additional requirement that the number
of satisfying assignments is one sixth of the number of 3-colorings of the
corresponding graph.
[2] 3.7. A tile is a $(1 \times 1)$-square whose sides are marked with letters. We
want to cover an $(n \times n)$-square with $n^2$ tiles; it is known which letters
are allowed at the boundary of the $(n \times n)$-square and which letters can be
adjacent.
The tiling problem requires us to find, for a given set of tile types
(containing at most poly($n$) types) and for given restrictions, whether or not
there exists a tiling of the $(n \times n)$-square.
Prove that the tiling problem is NP-complete.
[1] 3.8. Prove that the predicate "$x$ is the binary representation of a
composite integer" belongs to NP.
[3] 3.9. Prove that the predicate "$x$ is the binary representation of a prime
integer" belongs to NP.
4. Probabilistic algorithms and the class BPP
4.1. Definitions. Amplification of probability. A probabilistic Turing
machine (PTM) is somewhat similar to a nondeterministic one; the difference
is that choice is produced by coin tossing, not by guessing. More formally,
some (state, symbol) combinations have two associated actions, and
the choice between them is made probabilistically. Each instance of this
choice is controlled by a random bit. We assume that each random bit is 0
or 1 with probability 1/2 and that different random bits are independent.
(In fact we can replace 1/2 by, say, 1/3 and get almost the same definition;
the class BPP (see below) remains the same. However, if we replace
1/2 by some noncomputable real $p$, we get a rather strange notion which is
better avoided.)
For a given input string a PTM generates not a unique output string,
but a probability distribution on the set of all strings (different values of
the random bits lead to different computation outputs, and each possible
output has a certain probability).
Definition 4.1. Let $\varepsilon$ be a constant such that $0 < \varepsilon < 1/2$. A predicate
$L$ belongs to the class BPP if there exist a PTM $M$ and a polynomial $p(n)$
such that the machine $M$ running on input string $x$ always terminates after
at most $p(|x|)$ steps, and
$L(x) = 1 \Rightarrow$ $M$ gives the answer "yes" with probability $\ge 1 - \varepsilon$;
$L(x) = 0 \Rightarrow$ $M$ gives the answer "yes" with probability $\le \varepsilon$.
In this definition the admissible error probability $\varepsilon$ can be, say, 0.49
or $10^{-10}$; the class BPP will remain the same. Why? Assume that the
PTM has probability of error at most $\varepsilon < 1/2$. Take $k$ copies of this machine,
run them all for the same input (using independent random bits) and let
them vote. Formally, we apply the majority function MAJ
(see Problem 2.15) to $k$ individual outputs. The "majority opinion" will be
wrong with probability
\[
p_{\mathrm{error}} \le \sum_{\substack{S \subseteq \{1,\dots,k\} \\ |S| \le k/2}} (1-\varepsilon)^{|S|}\, \varepsilon^{k-|S|}
= \bigl((1-\varepsilon)\varepsilon\bigr)^{k/2} \sum_{\substack{S \subseteq \{1,\dots,k\} \\ |S| \le k/2}} \Bigl(\frac{\varepsilon}{1-\varepsilon}\Bigr)^{k/2-|S|}
< \Bigl(\sqrt{(1-\varepsilon)\varepsilon}\Bigr)^{k}\, 2^{k} = \lambda^{k},
\quad \text{where } \lambda = 2\sqrt{\varepsilon(1-\varepsilon)} < 1. \tag{4.1}
\]
(Here $S$ is the set of copies that give the correct answer.)
If $k$ is big enough, the effective error probability will be as small as we
wish. This is called amplification of probability. To make the quantity $p_{\mathrm{error}}$
smaller than a given number $\varepsilon'$, we need to set $k = \Theta(\log(1/\varepsilon'))$. (Since $\varepsilon$
and $\varepsilon'$ are constants, $k$ does not depend on the input. Even if we require
that $\varepsilon' = \exp(-p(n))$ for an arbitrary polynomial $p$, the composite TM still
runs in polynomial time.)
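As a numerical sanity check of bound (4.1), the following sketch computes the exact probability that at most half of $k$ independent runs are correct and compares it with $\lambda^k$. The function names are our own, and we assume every run errs with probability exactly $\varepsilon$ (the worst case allowed by the definition).

```python
from math import comb, sqrt

def majority_error(eps, k):
    """Exact probability that at most k/2 of k independent runs are correct,
    assuming each run errs with probability exactly eps."""
    return sum(comb(k, s) * (1 - eps) ** s * eps ** (k - s)
               for s in range(k + 1) if 2 * s <= k)

def lam(eps):
    """The constant lambda = 2*sqrt(eps*(1-eps)) from (4.1)."""
    return 2 * sqrt(eps * (1 - eps))

def repetitions_for(eps, target):
    """Smallest k for which the bound lambda**k drops below target."""
    k, bound = 1, lam(eps)
    while bound >= target:
        k += 1
        bound *= lam(eps)
    return k
```

Since $\lambda$ depends only on $\varepsilon$, the value returned by `repetitions_for` grows like $\log(1/\varepsilon')$, in agreement with $k = \Theta(\log(1/\varepsilon'))$.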
Let us give an equivalent definition of the class BPP using predicates
in two variables (this definition is similar to Definition 3.2).
Definition 4.2. A predicate $L$ belongs to BPP if there exist a polynomial
$p$ and a predicate $R$, decidable in polynomial time, such that
$L(x) = 1 \Rightarrow$ the fraction of strings $r$ of length $p(|x|)$ satisfying $R(x, r)$
is greater than $1 - \varepsilon$;
$L(x) = 0 \Rightarrow$ the fraction of strings $r$ of length $p(|x|)$ satisfying $R(x, r)$
is less than $\varepsilon$.
Theorem 4.1. Definitions 4.1 and 4.2 are equivalent.
Proof. Definition 4.1 $\Rightarrow$ Definition 4.2. Let $R(x, r)$ be the following
predicate: "$M$ says 'yes' for the input $x$ using $r_1, \dots, r_{p(|x|)}$ as the random bits"
(we assume that a coin is tossed at each step of $M$). It is easy to see that
the requirements of Definition 4.2 are satisfied.
Definition 4.2 $\Rightarrow$ Definition 4.1. Assume that $p$ and $R$ are given. Consider
a PTM that (for input $x$) randomly chooses a string $r$ of length $p(|x|)$,
making $p(|x|)$ coin tosses, and then computes $R(x, r)$. This machine satisfies
Definition 4.1 (with a different polynomial $p'$ instead of $p$). □
Fig. 4.1. The characteristic set of the predicate $R(x, y)$. Vertical lines
represent the sets $S_x$. (Axes: $x$ and $y$; labels on the $x$ axis: $x \in L$, $x \notin L$.)
Definition 4.2 is illustrated in Figure 4.1. We represent a pair $(x, y)$ of
strings as a point and draw the set $S = \{(x, y) : (|y| = p(|x|)) \wedge R(x, y)\}$.
For each $x$ we consider the $x$-section of $S$ defined as $S_x = \{y : (x, y) \in S\}$.
The set $S$ is rather special in the sense that, for any $x$, either $S_x$ is large
(contains at least a $1 - \varepsilon$ fraction of all strings of length $p(|x|)$) or is small
(contains at most an $\varepsilon$ fraction of them). Therefore all values of $x$ are divided
into two categories: for one of them $L(x)$ is true and for the other $L(x)$ is
false.
Remark 4.1. Probabilistic Turing machines (unlike nondeterministic ones,
which depend on almighty Merlin for guessing the computational path) can
be considered as real computing devices. Physical processes like the Nyquist
noise or radioactive decay are believed to provide sources of random bits; in
the latter case it is guaranteed by the very nature of quantum mechanics.
4.2. Primality testing. A classic example of a BPP problem is checking
whether a given integer $q$ (represented by $n = \lceil \log_2 q \rceil$ bits) is prime or not.
We call this problem Primality. We will describe a probabilistic primality
test, called the Miller-Rabin test. For the reader's convenience all necessary facts
from number theory are given in Appendix A.
4.2.1. Main idea. We begin with a much simpler "Fermat test", though
its results are generally inconclusive. It is based on Fermat's little theorem
(Theorem A.9), which says that
if $q$ is prime, then $a^{q-1} \equiv 1 \pmod q$ for $a \in \{1, \dots, q-1\}$.
We may regard $a$ as a (mod $q$)-residue and simply write $a^{q-1} = 1$, assuming
that arithmetic operations are performed with residues rather than integers.
So, the test is this: we pick a random $a$ and check whether $a^{q-1} = 1$.
If this is true, then $q$ may be a prime; but if this is false, then $q$ is not a
prime. Such an $a$ can be called a witness saying that $q$ is composite. This kind
of evidence is indirect (it does not give us any factor of $q$) but usually easy
to find: it often suffices to check $a = 2$. But we will do a better job if we
sample $a$ from the uniform distribution over the set $\{1, \dots, q-1\}$ (i.e., each
element of this set is taken with probability $1/(q-1)$).
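The Fermat test can be sketched as follows. This is a minimal illustration, not a reliable primality test: the function name and the `rounds` parameter are our own, and Python's built-in three-argument `pow` is used for modular exponentiation.

```python
import random

def fermat_test(q, rounds=1, rng=random):
    """Return False if some sampled a is a Fermat witness (q is certainly composite),
    and True if no witness was found (q may be prime)."""
    if q < 3:
        return q == 2
    for _ in range(rounds):
        a = rng.randrange(1, q)        # uniform over {1, ..., q-1}
        if pow(a, q - 1, q) != 1:      # a^(q-1) != 1 (mod q): a is a witness
            return False
    return True
```

For instance, $a = 2$ is already a witness for $q = 91 = 7 \cdot 13$, since $2^{90} \equiv 64 \pmod{91}$.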
Suppose $q$ is composite. We want to know if the test actually shows that
with nonnegligible probability. Consider two cases.
1) $\gcd(a, q) = d \ne 1$. Then $a^{q-1} \equiv 0 \not\equiv 1 \pmod d$, therefore
$a^{q-1} \not\equiv 1 \pmod q$. The test detects that $q$ is composite. Unfortunately, the
probability to get such an $a$ is usually small.
2) $\gcd(a, q) = 1$, i.e., $a \in (\mathbb{Z}/q\mathbb{Z})^*$ (where $(\mathbb{Z}/q\mathbb{Z})^*$ denotes the group of
invertible (mod $q$)-residues). This is the typical case; let us consider it
more closely.
Lemma 4.2. If $a^{q-1} \ne 1$ for some element $a \in (\mathbb{Z}/q\mathbb{Z})^*$, then the Fermat
test detects the compositeness of $q$ with probability $\ge 1/2$.
Proof. Let $G = (\mathbb{Z}/q\mathbb{Z})^*$. For any integer $m$ define the following set:
\[
G_{(m)} = \{ x \in G : x^m = 1 \}. \tag{4.2}
\]
This is a subgroup in $G$ (due to the identity $a^m b^m = (ab)^m$ for elements
of an Abelian group). If $a^{q-1} \ne 1$ for some $a$, then $a \notin G_{(q-1)}$, therefore
$G_{(q-1)} \ne G$. By Lagrange's theorem, the ratio $|G| / |G_{(q-1)}|$ is an integer,
hence $|G| / |G_{(q-1)}| \ge 2$. It follows that $a^{q-1} \ne 1$ for at least half of
$a \in (\mathbb{Z}/q\mathbb{Z})^*$. And, as we already know, $a^{q-1} \ne 1$ for all $a \notin (\mathbb{Z}/q\mathbb{Z})^*$. □
Is it actually possible that $q$ is composite but $a^{q-1} = 1$ for all invertible
residues $a$? Such numbers $q$ are rare, but they exist (they are called
Carmichael numbers). Example: $q = 561 = 3 \cdot 11 \cdot 17$. Note that the numbers
$3 - 1$, $11 - 1$ and $17 - 1$ divide $q - 1$. Therefore $a^{q-1} = 1$ for any
$a \in (\mathbb{Z}/q\mathbb{Z})^* \cong \mathbb{Z}_{3-1} \times \mathbb{Z}_{11-1} \times \mathbb{Z}_{17-1}$.
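Both Lemma 4.2 and the Carmichael example can be checked by exhaustive enumeration for small $q$. The sketch below (helper names are ours) lists the invertible residues and those among them that satisfy $a^{q-1} = 1$, i.e., the elements of $G_{(q-1)}$.

```python
from math import gcd

def units(q):
    """The invertible (mod q)-residues, i.e., the group (Z/qZ)^*."""
    return [a for a in range(1, q) if gcd(a, q) == 1]

def fermat_liars(q):
    """Invertible residues a with a^(q-1) = 1 (mod q); for composite q these
    are the bases on which the Fermat test fails to detect compositeness."""
    return [a for a in units(q) if pow(a, q - 1, q) == 1]
```

For $q = 15$ the second list is $\{1, 4, 11, 14\}$, exactly half of the 8 invertible residues, whereas for $q = 561$ the two lists coincide (all 320 invertible residues pass the test).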
We see that the Fermat test alone is not sufficient to detect a composite
number. The Miller-Rabin test uses yet another type of witnesses for the
compositeness: if $b^2 \equiv 1 \pmod q$, and $b \not\equiv \pm 1 \pmod q$ for some $b$, then $q$ is
composite. Indeed, in this case $b^2 - 1 = (b-1)(b+1)$ is a multiple of $q$ but
$b - 1$ and $b + 1$ are not, therefore $q$ has nontrivial factors in common with
both $b - 1$ and $b + 1$.
4.2.2. Required subroutines and their complexity. Addition (or subtraction)
of $n$-bit integers is done by an $O(n)$-size circuit; multiplication and
division are performed by $O(n^2)$-size circuits. These estimates refer to the
standard algorithms learned in school, though they are not the most efficient
for large integers. In the solutions to Problems 2.12, 2.13 and 2.14 we
described alternative algorithms, which are much better in terms of the circuit
depth, but slightly worse in terms of the size. If only the size is important,
the standard addition algorithm is optimal, but the ones for the multiplication
and division are not. For example, an $O(n \log n \log\log n)$-size circuit
for the multiplication exists; see [5, Sec. 7.5] or [43, vol. 2, Sec. 4.3.3].
Euclid's algorithm for $\gcd(a, b)$ has complexity $O(n^3)$, but there is also
a so-called "binary" gcd algorithm (see [43, vol. 2, Sec. 4.5.2]) of complexity
$O(n^2)$. It does not solve the equation $xa + yb = \gcd(a, b)$, though.
We will also use modular arithmetic. It is clear that the addition of
(mod $q$)-residues is done by a circuit of size $O(n)$, whereas modular
multiplication can be reduced to integer multiplication and division; therefore
it is performed by a circuit of size $O(n^2)$ (by the standard technique).
To invert a residue $a \in (\mathbb{Z}/q\mathbb{Z})^*$, we need to find an integer $x$ such that
$xa \equiv 1 \pmod q$, i.e., $xa + yq = 1$. This is done by the extended Euclid's
algorithm, which has complexity $O(n^3)$.
It is also possible to compute $(a^m \bmod q)$ by a polynomial time algorithm.
(Note that we speak about an algorithm that is polynomial in the
length $n$ of its input $(a, m, q)$; therefore performing $m$ multiplications is out
of the question. Note also that the size of the integer $a^m$ is exponential.) But
we can compute $(a^{2^k} \bmod q)$ for $k = 1, 2, \dots, \lfloor \log_2 m \rfloor$ by repeated squaring,
applying the (mod $q$) reduction at each step. Then we multiply some of the
results in such a way that the powers, i.e., the numbers $2^k$, add to $m$ (using
the binary representation of $m$). This takes $O(\log m) = O(n)$ modular
multiplications, which translates to circuit size $O(n^3)$.
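The repeated-squaring procedure just described can be sketched as follows. The function name `power_mod` is our own; Python's built-in `pow(a, m, q)` computes the same value.

```python
def power_mod(a, m, q):
    """Compute a**m mod q using O(log m) modular multiplications."""
    result = 1 % q
    square = a % q                  # invariant: square = a^(2^k) mod q
    while m > 0:
        if m & 1:                   # bit k of m is set: multiply in a^(2^k)
            result = (result * square) % q
        square = (square * square) % q
        m >>= 1
    return result
```

Every intermediate value stays below $q^2$, so the numbers never grow beyond $2n$ bits even though $a^m$ itself has exponentially many bits.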
4.2.3. The algorithm. Assume that a number $q$ is given.
Step 1. If $q$ is even (and $q \ne 2$), then $q$ is composite. If $q$ is odd, we
proceed to Step 2.
Step 2. Let $q - 1 = 2^k l$, where $k > 0$, and $l$ is odd.
Step 3. We choose a random $a \in \{1, \dots, q-1\}$.
Step 4. We compute $a^l, a^{2l}, a^{4l}, \dots, a^{2^k l} = a^{q-1}$ modulo $q$.
Test 1. If $a^{q-1} \ne 1$ (where modular arithmetic is assumed), then $q$ is
composite.
Test 2. If the sequence $a^l, a^{2l}, \dots, a^{2^k l}$ (Step 4) contains a 1 that is
preceded by anything except $\pm 1$, then $q$ is composite. In other words, if
there exists $j$ such that $a^{2^j l} \ne \pm 1$ but $a^{2^{j+1} l} = 1$, then $q$ is composite.
In all other cases the algorithm says that "$q$ is prime" (though in fact it
is not guaranteed).
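Steps 1-4 and Tests 1-2 translate almost line by line into code. The sketch below assumes $q \ge 2$; the function names and the string verdicts are our own choices, and the base $a$ is passed explicitly to one round so that the behaviour can be checked exhaustively.

```python
import random

def miller_rabin_round(q, a):
    """One run of Steps 1-4 and Tests 1-2 for a fixed base a in {1, ..., q-1}.
    Returns 'composite' (always correct) or 'prime' (not a proof). Assumes q >= 2."""
    if q == 2:
        return 'prime'
    if q % 2 == 0:                        # Step 1
        return 'composite'
    k, l = 0, q - 1                       # Step 2: q - 1 = 2^k * l with l odd
    while l % 2 == 0:
        k += 1
        l //= 2
    seq = [pow(a, l, q)]                  # Step 4: a^l, a^(2l), ..., a^(2^k l)
    for _ in range(k):
        seq.append(seq[-1] * seq[-1] % q)
    if seq[-1] != 1:                      # Test 1
        return 'composite'
    for prev, cur in zip(seq, seq[1:]):   # Test 2: a 1 preceded by something other than +-1
        if cur == 1 and prev not in (1, q - 1):
            return 'composite'
    return 'prime'

def miller_rabin(q, rounds=2, rng=random):
    """Repeat the round with independent random bases (Step 3)."""
    for _ in range(rounds):
        if miller_rabin_round(q, rng.randrange(1, q)) == 'composite':
            return 'composite'
    return 'prime'
```

Running `miller_rabin_round(561, a)` over all bases shows that the Carmichael number 561, which passes Test 1 for every invertible $a$, is caught by Test 2 for most of them.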
Theorem 4.3. If $q$ is prime, then the algorithm always (with probability 1)
gives the answer "prime".
If $q$ is composite, then the answer "composite" is given with probability
at least 1/2.
Remark 4.2. To get a probabilistic algorithm in the sense of Definition 4.1,
we repeat this test twice: the probability of an error (a composite number
being undetected twice) is at most $1/4 < 1/2$.
Proof of Theorem 4.3. As we have seen, the algorithm always gives the
answer "prime" for prime $q$.
Assume that $q$ is composite (and odd). If $\gcd(a, q) > 1$ then Test 1
shows that $q$ is composite. So, we may assume that $a$ is uniformly
distributed over the group $G = (\mathbb{Z}/q\mathbb{Z})^*$. We consider two major cases.
Case A: $q = p^\alpha$, where $p$ is an odd prime, and $\alpha > 1$. We show
that there is an invertible (mod $q$)-residue $x$ such that $x^{q-1} \ne 1$, namely
$x = (1 + p^{\alpha-1}) \bmod q$. Indeed, $x^{-1} = (1 - p^{\alpha-1}) \bmod q$, and (see footnote 4)
\[
x^{q-1} \equiv (1 + p^{\alpha-1})^{q-1} = 1 + (q-1)p^{\alpha-1} + \text{higher powers of } p
\equiv 1 - p^{\alpha-1} \not\equiv 1 \pmod q.
\]
Therefore Test 1 detects the compositeness of $q$ with probability $\ge 1/2$ (due
to Lemma 4.2).
Case B: $q$ has at least two distinct prime factors. Then $q = uv$, where $u$
and $v$ are odd numbers, $u, v > 1$, and $\gcd(u, v) = 1$. The Chinese remainder
theorem (Theorem A.5) says that the group $G = (\mathbb{Z}/q\mathbb{Z})^*$ is isomorphic to
the direct product $U \times V$, where $U = (\mathbb{Z}/u\mathbb{Z})^*$ and $V = (\mathbb{Z}/v\mathbb{Z})^*$, and that
each element $x \in G$ corresponds to the pair $\bigl((x \bmod u), (x \bmod v)\bigr)$.
For any $m$ we define the following subgroup (cf. formula (4.2)):
\[
G^{(m)} = \{ x^m : x \in G \} = \operatorname{Im} \varphi_m, \quad \text{where } \varphi_m : x \mapsto x^m.
\]
Note that $G^{(m)} = \{1\}$ if and only if $G_{(m)} = G$ (where $G_{(m)}$ is the subgroup
defined by (4.2)); this is yet another way to formulate the condition that
$a^m = 1$ for all $a \in G$. Also note that if $a$ is
uniformly distributed over $G$, then $a^m$ is uniformly distributed over $G^{(m)}$.
Indeed, the map $\varphi_m : x \mapsto x^m$ is a group homomorphism; therefore the
number of pre-images is the same for all elements of $G^{(m)}$. It is clear that
$G^{(m)} \cong U^{(m)} \times V^{(m)}$. Now we have two subcases.
Footnote 4. A similar argument is used to prove that the group $(\mathbb{Z}/p^\alpha\mathbb{Z})^*$ is
cyclic; see Theorem A.11.
Case 1. $U^{(q-1)} \ne \{1\}$ or $V^{(q-1)} \ne \{1\}$. This condition implies that
$G^{(q-1)} \ne \{1\}$, so that Test 1 detects $q$ being composite with probability at
least 1/2.
Case 2. $U^{(q-1)} = \{1\}$ and $V^{(q-1)} = \{1\}$. In this case Test 1 is always
passed, so we have to study Test 2. Let us define two sequences of subgroups:
\[
U^{(l)} \supseteq U^{(2l)} \supseteq \dots \supseteq U^{(2^k l)} = \{1\}, \qquad
V^{(l)} \supseteq V^{(2l)} \supseteq \dots \supseteq V^{(2^k l)} = \{1\}.
\]
Note that $U^{(l)} \ne \{1\}$ and $V^{(l)} \ne \{1\}$. Specifically, both $U^{(l)}$ and $V^{(l)}$ contain
the residues that correspond to $-1$. Indeed, both $U$ and $V$ contain $-1$, and
$(-1)^l = -1$ since $l$ is odd.
Going from right to left, we find the first place where one of the sets
$U^{(m)}$, $V^{(m)}$ contains an element different from 1. In other words, we find
$t = 2^s l$ such that $0 \le s < k$, $U^{(2t)} = \{1\}$, $V^{(2t)} = \{1\}$, and either $U^{(t)} \ne \{1\}$
or $V^{(t)} \ne \{1\}$.
We will prove that Test 2 shows (with probability at least 1/2) that $q$ is
composite. By our assumption $a^{2t} = 1$, so we need to know the probability
of the event $a^t \ne \pm 1$. Let us consider two possibilities.
Case 2a. One of the sets $U^{(t)}$, $V^{(t)}$ equals $\{1\}$ (for example, let $U^{(t)} = \{1\}$).
This means that for any $a$ the pair $\bigl((a^t \bmod u), (a^t \bmod v)\bigr)$ has 1 as
the first component. Therefore $a^t \ne -1$, since $-1$ is represented by the pair
$(-1, -1)$.
At the same time, $V^{(t)} \ne \{1\}$; therefore the probability that $a^t$ has 1 in
the second component is at most 1/2 (by Lagrange's theorem; cf. proof of
Lemma 4.2). Thus Test 2 says "composite" with probability at least 1/2.
Case 2b. Both sets $U^{(t)}$ and $V^{(t)}$ contain at least two elements:
$|U^{(t)}| = c \ge 2$, $|V^{(t)}| = d \ge 2$. In this case $a^t$ has 1 in the first component with
probability $1/c$ (there are $c$ equiprobable possibilities) and has 1 in the
second component with probability $1/d$. These events are independent due
to the Chinese remainder theorem, so the probability of the event $a^t = 1$
is $1/(cd)$. For similar reasons the probability of the event $a^t = -1$ is either
0 or $1/(cd)$. In any case the probability of the event $a^t = \pm 1$ is at most
$2/(cd) \le 1/2$. □
4.3. BPP and circuit complexity.
Theorem 4.4. BPP $\subseteq$ P/poly.
Proof. Let $L(x)$ be a BPP-predicate, and $M$ a probabilistic TM that decides
$L(x)$ with probability at least $1 - \varepsilon$. By running $M$ repeatedly we
can decrease the error probability. Recall that a polynomial number of
repetitions leads to an exponential decrease. Therefore we can construct a
polynomial probabilistic TM $M'$ that decides $L(x)$ with error probability
$\varepsilon' < 1/2^n$ for inputs $x$ of length $n$.
The machine $M'$ uses a random string $r$ (one random bit for each step).
For each input $x$ of length $n$, the fraction of strings $r$ that lead to an incorrect
answer is less than $1/2^n$. Therefore the overall fraction of "bad" pairs $(x, r)$
among all such pairs is less than $1/2^n$. [If one represents the set of all pairs
$(x, r)$ as a unit square, the "bad" subset has area $< 1/2^n$.] It follows that
there exists $r = r_*$ such that the fraction of bad pairs $(x, r)$ is less than
$1/2^n$ among all pairs with $r = r_*$. However, there are only $2^n$ such pairs
(corresponding to $2^n$ possibilities for $x$). The only way the fraction of bad
pairs can be smaller than $1/2^n$ is that there are no bad pairs at all!
Thus we conclude that there is a particular string $r_*$ that produces correct
answers for all $x$ of length $n$.
The machine $M'$ can be transformed into a polynomial-size circuit with
input $(x, r)$. Then we fix the value of $r$ (by setting $r = r_*$) and obtain a
polynomial-size circuit with input $x$ that decides $L(x)$ for all $n$-bit strings $x$. □
This is a typical nonconstructive existence proof: we know that a "good"
string $r_*$ exists (by probability counting) but have no means of finding it,
apart from exhaustive search.
Remark 4.3. It might well be the case that BPP = P. Let us explain
briefly the motivation of this conjecture and why it is hard to prove.
Speaking about proved results, it is clear that BPP $\subseteq$ PSPACE. Indeed,
the algorithm that counts all values of the string $r$ that lead to the answer
"yes" runs in polynomial space. Note that the running time of this algorithm
is $2^N \mathrm{poly}(n)$, where $N = |r|$ is the number of random bits.
On the other hand, there is empirical evidence that probabilistic algorithms
usually work well with pseudo-random bits instead of truly random
ones. So attempts have been made to construct a mathematical theory of
pseudo-random numbers. The general idea is as follows. A pseudo-random
generator is a function $g : \mathbb{B}^l \to \mathbb{B}^L$ which transforms short truly random
strings (of length $l$, which can be as small as $O(\log L)$) into much longer
pseudo-random strings (of length $L$). "Pseudo-random" means that any
computational device with limited resources (say, any Boolean circuit of a
given size $N$ computing a function $F : \mathbb{B}^L \to \mathbb{B}$) is unable to distinguish
between truly random and pseudo-random strings of length $L$. Specifically,
we require that
\[
\bigl| \Pr[F(g(x)) = 1] - \Pr[F(y) = 1] \bigr| \le \delta, \qquad x \in \mathbb{B}^l, \ y \in \mathbb{B}^L
\]
for some constant $\delta < 1/2$, where $x$ and $y$ are sampled from the uniform
distributions. (In this definition the important parameters are $l$ and $N$,
while $L$ should fit the number of input bits of the circuit. For simplicity
let us require that $L = N$: it will not hurt if the pseudo-random generator
produces some extra bits.)
It is easy to show that pseudo-random generators $g : \mathbb{B}^{O(\log L)} \to \mathbb{B}^L$ exist:
if we choose the function $g$ randomly, it fulfills the above condition with
high probability. What we actually need is a sequence of efficiently
computable pseudo-random generators $g_L : \mathbb{B}^{l(L)} \to \mathbb{B}^L$, where $l(L) = O(\log L)$.
If such pseudo-random generators exist, we can use their output instead of
truly random bits in any probabilistic algorithm. The definition of
pseudo-randomness guarantees that this will work, provided the running time of
the algorithm is limited by $c\sqrt{L}$ (for a suitable constant $c$) and the error
probability $\varepsilon$ is smaller than $1/2 - \delta$. (With pseudo-random bits the error
probability will be $\varepsilon + \delta$, which is still less than 1/2. The estimate $c\sqrt{L}$
comes from the simulation of a Turing machine by Boolean circuits, see
Theorem 2.2.) Thus we decrease the number of needed genuine random
bits from $L$ to $l$. Then we can derandomize the algorithm by trying all
$2^l$ possibilities. If $l = O(\log L)$, the resulting computation has polynomial
complexity.
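The last step, derandomization by enumerating all seeds, can be sketched as follows. Everything here is hypothetical scaffolding: `algorithm(x, r)` stands for a probabilistic algorithm that reads its random bits from the tuple `r`, and `generator(seed)` stands for a pseudo-random generator, whose existence with the required properties is exactly the open question discussed in this remark.

```python
from itertools import product

def derandomize(algorithm, generator, seed_len, x):
    """Run algorithm(x, r) with r = generator(seed) for all 2**seed_len seeds
    and return the majority answer (True means 'yes').

    `algorithm` and `generator` are placeholders supplied by the caller.
    """
    total = 2 ** seed_len
    yes = sum(1 for seed in product((0, 1), repeat=seed_len)
              if algorithm(x, generator(seed)))
    return 2 * yes > total
```

The loop makes $2^l$ calls to the algorithm, so the total running time is polynomial in $L$ precisely when $l = O(\log L)$.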
We do not know whether efficiently computable pseudo-random generators
exist. The trouble is that the definition has to be satisfied for "any
Boolean circuit of a given size"; this condition is extremely hard to deal
with. But we may try to reduce this problem to a more fundamental one:
obtaining lower bounds for the circuit complexity of Boolean functions.
Even this idea, which sets the most difficult part of the problem aside, took
many years to realize. Much work in this area was done in the 1980s and early
1990s, but the results were not as strong as needed. Recently there has
been dramatic progress leading to more efficient constructions of
pseudo-random generators and new derandomization techniques. It has been proved
that BPP = P if there is an algorithm with running time $\exp(O(n))$ that
computes a sequence of functions with circuit complexity $\exp(\Omega(n))$ [26]. We
also mention a new work [69] in which pseudo-random generators are
constructed from arbitrary hard functions in an optimal (up to a polynomial)
way.
5. The hierarchy of complexity classes
Recall that we identify languages (sets of strings) and predicates (and $x \in L$
means $L(x) = 1$).
Definition 5.1. Let $\mathcal{A}$ be some class of languages. The dual class co-$\mathcal{A}$
consists of the complements of all languages in $\mathcal{A}$. Formally,
\[
L \in \text{co-}\mathcal{A} \iff (\mathbb{B}^* \setminus L) \in \mathcal{A}.
\]
It follows immediately from the definitions that P = co-P, BPP =
co-BPP, PSPACE = co-PSPACE.
5.1. Games machines play. Consider a game with two players called
White (W) and Black (B). A string $x$ is shown to both players. After that,
players alternately choose binary strings: W starts with some string $w_1$, B
replies with $b_1$, then W says $w_2$, etc. Each string has length polynomial in
$|x|$. Each player is allowed to see the strings already chosen by his opponent.
The game is completed after some prescribed number of steps, and
the referee, who knows $x$ and all the strings and who acts according to
a polynomial-time algorithm, declares the winner. In other words, there is a
predicate $W(x, w_1, b_1, w_2, b_2, \dots)$ that is true when W is the winner, and we
assume that this predicate belongs to P. If this predicate is false, B is the
winner (there are no ties). This predicate (together with polynomial bounds
for the length of strings and the number of steps) determines the game.
Let us note that in fact the termination rule can be more complicated,
but we always assume that the number of moves is bounded by a polynomial.
Therefore we can "pad" the game with dummy moves that are ignored by
the referee and consider only games where the number of moves is known in
advance and is a polynomial in the input length.
Since this game is finite and has no ties, for each $x$ either B or W has
a winning strategy. (One can formally prove this using induction over the
number of moves.) Therefore, any game determines two complementary
sets,
$L_w = \{x : \text{W has a winning strategy}\}$,
$L_b = \{x : \text{B has a winning strategy}\}$.
Many complexity classes can be defined as classes formed by the sets $L_w$ (or
$L_b$) for some classes of games. Let us give several examples.
P: the sets $L_w$ (or $L_b$) for games of zero length (the referee declares the
winner after he sees the input).
NP: the sets $L_w$ for games that are finished after W's first move. In
other words, NP-sets are sets of the form
$\{x : \exists w_1\, W(x, w_1)\}$.
co-NP: the sets $L_b$ for games that are finished after W's first move. In
other words, co-NP-sets are sets of the form
$\{x : \forall w_1\, B(x, w_1)\}$
(here $B = \neg W$ means that B wins the game).
$\Sigma_2$: the sets $L_w$ for games where W and B make one move each and then
the referee declares the winner. In other words, $\Sigma_2$-sets are sets of the form
$\{x : \exists w_1\, \forall b_1\, W(x, w_1, b_1)\}$
(W can make a winning move $w_1$ after which any move $b_1$ of B makes B
lose).
$\Pi_2$: the sets $L_b$ for the same class of games, i.e., the sets of the form
$\{x : \forall w_1\, \exists b_1\, B(x, w_1, b_1)\}$.
...
$\Sigma_k$: the sets $L_w$ for games of length $k$ (the last move is made by W if $k$
is odd or by B if $k$ is even), i.e., the sets
$\{x : \exists w_1\, \forall b_1 \dots Q_k y_k\, W(x, w_1, b_1, \dots)\}$
(if $k$ is even, then $Q_k = \forall$, $y_k = b_{k/2}$; if $k$ is odd, then $Q_k = \exists$,
$y_k = w_{(k+1)/2}$).
$\Pi_k$: the sets $L_b$ for the same class of games, i.e., the sets
$\{x : \forall w_1\, \exists b_1 \dots Q_k y_k\, B(x, w_1, b_1, \dots)\}$
(if $k$ is even, then $Q_k = \exists$, $y_k = b_{k/2}$; if $k$ is odd, then $Q_k = \forall$,
$y_k = w_{(k+1)/2}$).
...
Evidently, complements of $\Sigma_k$-sets are $\Pi_k$-sets and vice versa:
$\Sigma_k = \text{co-}\Pi_k$, $\Pi_k = \text{co-}\Sigma_k$.
Theorem 5.1 (Lautemann [46]). BPP $\subseteq \Sigma_2 \cap \Pi_2$.
Proof. Since BPP = co-BPP, it suffices to show that BPP $\subseteq \Sigma_2$.
Let us assume that $L \in$ BPP. Then there exist a predicate $R$ (computable
in polynomial time) and a polynomial $p$ such that the fraction
$|S_x| / 2^N$, where
\[
S_x = \{ y \in \mathbb{B}^N : R(x, y) \}, \qquad N = p(|x|), \tag{5.1}
\]
is either large (greater than $1 - \varepsilon$ for $x \in L$) or small (less than $\varepsilon$ for $x \notin L$).
To show that $L \in \Sigma_2$, we need to reformulate the property "$X$ is a large
subset of $G$" (where $G$ is the set of all strings $y$ of length $N$) using existential
and universal quantifiers.
This could be done if we impose a group structure on $G$. Any group
structure will work if the group operations are polynomial-time computable.
For example, we can consider an additive group formed by bit strings of a
given length with bit-wise addition modulo 2.
The property that distinguishes large sets from small ones is the following:
"several copies of $X$ shifted by some elements cover $G$", i.e.,
\[
\exists\, g_1, \dots, g_m \ \Bigl( \bigcup_i (g_i + X) = G \Bigr), \tag{5.2}
\]
where "+" denotes the group operation. To choose an appropriate value for
$m$, let us see when (5.2) is guaranteed to be true (or false).
It is obvious that condition (5.2) is false if
\[
m|X| < |G|. \tag{5.3}
\]
On the other hand, (5.2) is true if for independent random $g_1, \dots, g_m \in G$
the probability of the event $\bigcup_i (g_i + X) = G$ is positive; in other words, if
$\Pr\bigl[\bigcup_i (g_i + X) \ne G\bigr] < 1$. Let us estimate this probability.
The probability that a random shift $g + X$ does not contain a fixed
element $u \in G$ (for a given $X$ and random $g$) is $1 - |X|/|G|$. When $g_1, \dots, g_m$
are chosen independently, the corresponding sets $g_1 + X, \dots, g_m + X$ do not
cover $u$ with probability $(1 - |X|/|G|)^m$. Summing these probabilities over
all $u \in G$, we see that the probability of the event $\bigcup_i (g_i + X) \ne G$ does not
exceed $|G|(1 - |X|/|G|)^m$.
Thus condition (5.2) is true if
\[
|G| \bigl( 1 - |X|/|G| \bigr)^m < 1. \tag{5.4}
\]
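Conditions (5.2)-(5.4) can be explored on small examples. In the sketch below (our own illustration) the group $G = \mathbb{B}^N$ is represented by the integers $0, \dots, 2^N - 1$ with bitwise XOR as the group operation; the function names are hypothetical.

```python
def covers(shifts, X, N):
    """Condition (5.2) for given shifts: do the sets g_i + X cover G = B^N?
    Elements of G are integers in [0, 2**N); '+' is bitwise XOR."""
    covered = {g ^ x for g in shifts for x in X}
    return len(covered) == 2 ** N

def failure_bound(N, size_X, m):
    """Left-hand side of (5.4): |G| * (1 - |X|/|G|)**m, an upper bound on the
    probability that m independent random shifts of X fail to cover G."""
    G = 2 ** N
    return G * (1 - size_X / G) ** m
```

With $N = 4$, a set $X$ of 15 elements is covered by the two shifts 0 and 1, and `failure_bound(4, 15, 2)` equals $1/16 < 1$; a 3-element set cannot be covered by 5 shifts, since $5 \cdot 3 < 16$ as in (5.3).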
Let us now apply these results to the set $X = S_x$ (see formula (5.1)).
We want to satisfy (5.3) and (5.4) when $|S_x|/2^N < \varepsilon$ (i.e., $x \notin L$) and when
$|S_x|/2^N > 1 - \varepsilon$ (i.e., $x \in L$), respectively. Thus we get the inequalities $\varepsilon m < 1$
and $2^N \varepsilon^m < 1$, which should be satisfied simultaneously by a suitable choice
of $m$. This is not always possible if $N$ and $\varepsilon$ are fixed. Fortunately, we have
some flexibility in the choice of these parameters. Using "amplification of
probability" by repeating the computation $k$ times, we increase $N$ by a factor
of $k$, while decreasing $\varepsilon$ exponentially. Let the initial value of $\varepsilon$ be a constant,
and $\lambda$ given by (4.1). The amplification changes $N$ and $\varepsilon$ to $N' = kN$ and
$\varepsilon' = \lambda^k$. Thus we need to solve the system
\[
\lambda^k m < 1, \qquad 2^{kN} \lambda^{km} < 1
\]
by adjusting $m$ and $k$. It is obvious that there is a solution with $m = O(N)$
and $k = O(\log N)$.
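One explicit choice of $m$ and $k$ is worked out below. This is our own filling-in of the "obvious" step; the specific formulas for $m$ and $k$ are not taken from the text, and they rely only on the fact that $\lambda < 1$ is a constant.

```latex
% Second inequality: 2^{kN}\lambda^{km} = (2^{N}\lambda^{m})^{k},
% so for every k >= 1 it is equivalent to 2^{N}\lambda^{m} < 1.
\[
2^{N}\lambda^{m} < 1 \iff m > \frac{N}{\log_2(1/\lambda)},
\qquad \text{so } m = \Bigl\lfloor \frac{N}{\log_2(1/\lambda)} \Bigr\rfloor + 1 = O(N) \text{ suffices.}
\]
% First inequality, with m fixed as above:
\[
\lambda^{k} m < 1 \iff k > \frac{\log_2 m}{\log_2(1/\lambda)},
\qquad \text{so } k = \Bigl\lfloor \frac{\log_2 m}{\log_2(1/\lambda)} \Bigr\rfloor + 1 = O(\log N) \text{ suffices.}
\]
```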
We have proved that $x \in L$ is equivalent to the following $\Sigma_2$-condition:
\[
\exists\, g_1, \dots, g_m \ \forall y \
\Bigl( \bigl(|y| = p'(|x|)\bigr) \Rightarrow \bigl( (y \in g_1 + S'_x) \vee \dots \vee (y \in g_m + S'_x) \bigr) \Bigr).
\]
Here $p'(n) = k\,p(n)$ ($k$ and $m$ also depend on $|x|$), whereas $S'_x$ is the
"amplified" version of $S_x$.
In other words, we have constructed a game where W names $m$ strings
(group elements) $g_1, \dots, g_m$, and B chooses some string $y$. If $y$ is covered by
some $g_i + S'_x$ (which is easy to check: it means that $y - g_i$ belongs to $S'_x$),
then W wins; otherwise B wins. In this game W has a winning strategy if
and only if $S'_x$ is big, i.e., if $x \in L$. □
5.2. The class PSPACE. This class contains predicates that can be computed
by a TM running in polynomial (in the input length) space. The class
PSPACE also has a game-theoretic description:
Theorem 5.2. $L \in$ PSPACE if and only if there exists a polynomial game
such that
$L = \{x : \text{W has a winning strategy for input } x\}$.
By a polynomial game we mean a game where the number of moves is
bounded by a polynomial (in the length of the input), players' moves are
strings of polynomial length, and the referee's algorithm runs in polynomial
time.
Proof. $\Leftarrow$. We show that a language determined by a game belongs to
PSPACE. Let the number of turns be $p(|x|)$. We construct a sequence of
machines $M_1, \dots, M_{p(|x|)}$. Each $M_k$ gets a prefix $x, w_1, b_1, \dots$ of the play
that includes $k$ moves and determines who has the winning strategy in the
remaining game.
The machine $M_{p(|x|)}$ just computes the predicate $W(x, w_1, \dots)$ using
the referee's algorithm. The machine $M_k$ tries all possibilities for the next move
and consults $M_{k+1}$ to determine the final result of the game for each of
them. Then $M_k$ gives an answer according to the following rule, which says
whether W wins. If it is W's turn, then it suffices to find a single move for
which $M_{k+1}$ declares W to be the winner. If it is B's turn, then W needs to
win after all possible moves of B.
The machine $M_0$ says who is the winner before the game starts and
therefore decides $L(x)$. Each machine in the sequence $M_1, \dots, M_{p(|x|)}$ uses
only a small (polynomially bounded) amount of memory, so that the
composite machine runs in polynomial space. (Note that the computation time
is exponential since each of the $M_k$ calls $M_{k+1}$ many times.)
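The recursion $M_k \to M_{k+1}$ can be sketched as a single recursive function. This is an illustration under our own conventions: moves are tuples of bits of a fixed length `move_len`, W moves at even positions (counting from 0), and `referee(x, moves)` is a caller-supplied predicate that returns True when W wins the completed play.

```python
from itertools import product

def white_wins(x, referee, num_moves, move_len, prefix=()):
    """Does W have a winning strategy after the moves in `prefix`?

    A call with len(prefix) == k plays the role of the machine M_k.
    `referee`, `num_moves` and `move_len` describe a hypothetical game.
    """
    k = len(prefix)
    if k == num_moves:                    # the play is complete: ask the referee
        return referee(x, prefix)
    outcomes = (white_wins(x, referee, num_moves, move_len, prefix + (move,))
                for move in product((0, 1), repeat=move_len))
    # W's turn: one winning move suffices; B's turn: W must win after every move.
    return any(outcomes) if k % 2 == 0 else all(outcomes)
```

At any moment only one branch of the game tree is stored (the recursion depth is `num_moves` and each frame holds one prefix), which mirrors the polynomial space bound; the running time, in contrast, is exponential.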
$\Rightarrow$. Let $M$ be a machine that decides the predicate $L$ and runs in
polynomial space $s$. We may assume that computation time is bounded by $2^{O(s)}$.
Indeed, there are $2^{O(s)}$ different configurations, and after visiting the same
configuration twice the computation repeats itself, i.e., the computation
becomes cyclic.
[To see why there are at most $2^{O(s)}$ configurations note that a configuration
is determined by head position (in $\{0, 1, \dots, s\}$), internal state (there are $|Q|$
of them) and the contents of the $s$ cells of the tape ($|A|^s$ possibilities where
$A$ is the alphabet of the TM); therefore the total number of configurations is
$|A|^s \cdot |Q| \cdot s = 2^{O(s)}$.]
Therefore we may assume without loss of generality that the running
time of $M$ on input $x$ is bounded by $2^q$, where $q = \mathrm{poly}(|x|)$.
In the description of the game given below we assume that the TM keeps its
configuration unchanged after the computation terminates.
During the game, W claims that $M$'s result for an input string $x$ is "yes",
and B wants to disprove this. The rules of the game allow W to win if $M(x)$
is indeed "yes" and allow B to win if $M(x)$ is not "yes".
In his �rst move W de lares the on�guration of M after 2
q�1
steps
dividing the omputation into two parts. B an hoose any of the parts:
either the time interval [0; 2
q�1
℄ or the interval [2
q�1
; 2
q
℄. (B tries to at h
W by hoosing the interval where W is heating.) Then W de lares the
on�guration of M at the middle of the interval hosen by B and divides
this interval into two halves, B sele ts one of the halves, W de lares the
on�guration of M at the middle, et .
The game ends when the length of the interval be omes equal to 1. Then
the referee he ks whether the on�gurations orresponding to the ends of
this interval mat h (the se ond is obtained from the �rst a ording to M 's
rules). If they mat h, then W wins; otherwise B wins.
If M 's output on x is really \yes", then W wins if he is honest and
de lares the a tual on�guration of M . If M 's output is \no", then W is
for ed to heat: his laim is in orre t for (at least) one of the halves. If B
sele ts this half at ea h move, than B an �nally at h W \on the spot" and
win. �
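The halving game can be played out on a toy example. The sketch below rests on assumptions that are not in the text: configurations are integers, `step` is a hypothetical transition function standing in for one step of $M$, and W's claim is about the configuration reached after $2^q$ steps rather than about the answer "yes". B is given unlimited computing power, as in the definition of the game.

```python
def step(c):
    """Hypothetical toy transition function: one step of the machine."""
    return (3 * c + 1) % 17


def run(c, t):
    """Configuration reached from c after t steps."""
    for _ in range(t):
        c = step(c)
    return c


def play(start, claimed_end, q, w_declares):
    """Return True if W wins the game about the claim
    'starting from `start`, after 2**q steps the machine is in `claimed_end`'."""
    lo, hi = 0, 2 ** q
    c_lo, c_hi = start, claimed_end
    while hi - lo > 1:
        mid = (lo + hi) // 2
        c_mid = w_declares(c_lo, c_hi, mid - lo, hi - mid)
        # B picks a half on which W's claim is false, if there is one.
        if run(c_lo, mid - lo) != c_mid:
            hi, c_hi = mid, c_mid
        else:
            lo, c_lo = mid, c_mid
    return step(c_lo) == c_hi            # the referee checks a single step


def honest(c_lo, c_hi, left_len, right_len):
    """W declares the configuration that really follows from c_lo."""
    return run(c_lo, left_len)


def stubborn(c_lo, c_hi, left_len, right_len):
    """A cheating W that always repeats the right endpoint."""
    return c_hi
```

With a true claim the honest W wins; with a false claim B wins against either strategy, because the interval kept by B always contains a false sub-claim.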
[2!] Problem 5.1. Prove that any predicate $L(x)$ that is recognized by a nondeterministic machine in space $s = \mathrm{poly}(|x|)$ belongs to PSPACE. (A predicate $L$ is recognized by an NTM $M$ in space $s(|x|)$ if for any $x \in L$ there exists a computational path of $M$ that gives the answer "yes" using at most $s(|x|)$ cells and, for each $x \notin L$, no computational path of $M$ ends with "yes".)
Theorem 5.2 shows that all the classes $\Sigma_k$, $\Pi_k$ are subsets of PSPACE. Relations between these classes are represented by the inclusion diagram in Figure 5.1. The smallest class is P (games of length 0); P is contained in both classes NP and co-NP (which correspond to games of length 1); classes NP and co-NP are contained in $\Sigma_2$ and $\Pi_2$ (games with two moves) and so on. We get the class PSPACE by allowing the number of moves in a game to be polynomial in $|x|$.
[Figure: diagram with nodes P, BPP, NP, co-NP, NP $\cap$ co-NP, $\Sigma_2$, $\Pi_2$, $\Sigma_2 \cap \Pi_2$, ..., PSPACE.]

Fig. 5.1. Inclusion diagram for computational classes. An arrow from A to B means that B is a subset of A.
We do not know whether the inclusions in the diagram are strict. Computer scientists have been working hard for several decades trying to prove at least something about these classes, but the problem remains open. It is possible that P = PSPACE (though this seems very unlikely). It is also possible that PSPACE = EXPTIME, where EXPTIME is the class of languages decidable in time $2^{\mathrm{poly}(n)}$. Note, however, that P $\ne$ EXPTIME; one can prove this by a rather simple "diagonalization argument" (cf. solution to Problem 1.3).
[3!] Problem 5.2. A Turing machine with oracle for language $L$ uses a decision procedure for $L$ as an external subroutine (cf. Definition 2.2). The machine has a supplementary oracle tape, where it can write strings and then ask the "oracle" whether the string written on the oracle tape belongs to $L$.

Prove that any language that is decidable in polynomial time by a TM with oracle for some $L \in \Sigma_k$ (or $L \in \Pi_k$) belongs to $\Sigma_{k+1} \cap \Pi_{k+1}$.
The class PSPACE has complete problems (to which any problem from PSPACE is reducible). Here is one of them.
The TQBF Problem is given by the predicate

$\mathrm{TQBF}(x) \Leftrightarrow x$ is a True Quantified Boolean Formula, i.e., a true statement of type $Q_1 y_1 \dots Q_n y_n\, F(y_1, \dots, y_n)$, where variables $y_i$ range over $\mathbb{B} = \{0, 1\}$, $F$ is some propositional formula (involving $y_1, \dots, y_n$, $\neg$, $\wedge$, $\vee$), and $Q_i$ is either $\forall$ or $\exists$.

By definition, $\forall y\, A(y)$ means $(A(0) \wedge A(1))$ and $\exists y\, A(y)$ means $(A(0) \vee A(1))$.
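The definition translates directly into a recursive evaluator. The sketch below is an illustration only; the encoding of the quantifier prefix as a string of `'A'` and `'E'` characters and of $F$ as a Python function on bit tuples is an assumption made for this example.

```python
def tqbf(quantifiers, formula, assignment=()):
    """Evaluate Q_1 y_1 ... Q_n y_n F(y_1, ..., y_n).

    quantifiers: string over {'A', 'E'} ('A' = forall, 'E' = exists).
    formula:     function taking a tuple of n bits and returning a truth value.
    The recursion depth is n, so the memory used is polynomial in n.
    """
    i = len(assignment)
    if i == len(quantifiers):
        return bool(formula(assignment))
    branches = (tqbf(quantifiers, formula, assignment + (b,)) for b in (0, 1))
    if quantifiers[i] == 'A':
        return all(branches)     # forall y A(y) means A(0) and A(1)
    return any(branches)         # exists y A(y) means A(0) or A(1)


def differ(y):
    """Example formula: y_1 is not equal to y_2."""
    return y[0] != y[1]
```

For instance, $\forall y_1 \exists y_2\, (y_1 \ne y_2)$ is true, while $\exists y_1 \forall y_2\, (y_1 \ne y_2)$ is false.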
Theorem 5.3. TQBF is PSPACE-complete.

Proof. We reduce an arbitrary language $L \in$ PSPACE to TQBF. Using Theorem 5.2, we construct a game that corresponds to $L$. Then we convert a TM that computes the result of the game (a predicate $W$) into a circuit. Moves of the players are encoded by Boolean variables. Then the existence of the winning strategy for W can be represented by a quantified Boolean formula

$$\exists w_1^1\, \exists w_1^2 \dots \exists w_1^{p(|x|)}\ \forall b_1^1 \dots \forall b_1^{p(|x|)}\ \exists w_2^1 \dots \exists w_2^{p(|x|)} \dots\ S(x, w_1^1, w_1^2, \dots),$$

where $S(\cdot)$ denotes the Boolean function computed by the circuit. (Boolean variables $w_1^1, \dots, w_1^{p(|x|)}$ encode the first move of W, variables $b_1^1, \dots, b_1^{p(|x|)}$ encode B's answer, $w_2^1, \dots, w_2^{p(|x|)}$ encode the second move of W, etc.)

In order to convert $S$ into a Boolean formula, recall that a circuit is a sequence of assignments $y_i := R_i$ that determine the values of auxiliary Boolean variables $y_i$. Then we can replace $S(\cdot)$ by a formula

$$\exists y_1, \dots, \exists y_s \bigl( (y_1 \Leftrightarrow R_1) \wedge \dots \wedge (y_s \Leftrightarrow R_s) \wedge y_s \bigr),$$

where $s$ is the size of the circuit.

After this substitution we obtain a quantified Boolean formula which is true if and only if $x \in L$. $\Box$
Remark 5.1. Note the similarity between Theorem 5.3 (which is about polynomial space computation) and Problems 2.17 and 2.18 (which are basically about poly-logarithmic space computation). Also note that a polynomial-size quantified Boolean formula may be regarded as a polynomial depth circuit (though of very special structure): the $\forall$ and $\exists$ quantifiers are similar to the $\wedge$ and $\vee$ gates. It is not surprising that the solutions are based on much the same ideas. In particular, the reduction from NTM to TQBF is similar to the parallel simulation of a finite-state automaton (see Problem 2.11). However, in the case of TQBF we could afford reasoning at a more abstract level: with a greater amount of computational resources we were sure that all bookkeeping constructions could be implemented. This is one of the reasons why "big" computational classes (like PSPACE) are popular among computer scientists, in spite of being apparently impractical. In fact, it is sometimes easier to prove something about big classes, and then scale down the problem parameters while fixing some details.
Part 2
Quantum Computation
As already mentioned in the introduction, ordinary computers do not employ all possibilities offered by Nature. Their internal work is based on operations with 0s and 1s, while in Nature there is the possibility of performing unitary transformations, i.e., of operating on an infinite set.¹ This possibility is described by quantum mechanics. Devices (real or imaginary) using this possibility are called quantum computers.

It is not clear a priori whether the computational power is really increased in passing from Boolean functions to unitary transformations on finite-dimensional spaces. However, there is strong evidence that such an increase is actually achieved. For example, consider the factoring problem: decomposition of an integer into prime factors. No polynomial algorithm is known for solving this problem on ordinary computers. But for quantum computers, such algorithms do exist.
Ordinary computers operate with states built from a finite number of bits. Each bit may exist in one of the two states, 0 or 1. The state of the whole system is given by specifying the values of all the bits. Therefore, the set of states $\mathbb{B}^n = \{0, 1\}^n$ is finite and has cardinality $2^n$.
A quantum computer works with a finite set of objects called qubits. Each qubit has two separate states, also denoted by 0 and 1 (if we think of qubits as spins, then these states are "spin up" and "spin down"). The $2^n$ assignments of individual states to each qubit do not yield all possible states of the system, but they form a basis in a space of states. Arbitrary linear combinations of the basis states, with complex coefficients, are also possible. We will denote the basis states by $|x_1, \dots, x_n\rangle$, where $x_j \in \mathbb{B}$, or by $|x\rangle$, where $x \in \mathbb{B}^n$. An arbitrary state of the system may be represented in the form²

$$|\psi\rangle = \sum_{(x_1,\dots,x_n)\in\mathbb{B}^n} c_{x_1,\dots,x_n}\, |x_1, \dots, x_n\rangle, \quad \text{where} \quad \sum_{(x_1,\dots,x_n)\in\mathbb{B}^n} |c_{x_1,\dots,x_n}|^2 = 1.$$

The state space for such a system is a linear space of dimension $2^n$ over the field $\mathbb{C}$ of complex numbers.

¹ Of course, actual infinity does not occur in Nature. In the given case the essential fact is that a unitary transformation can be performed only with some precision; for details see Section 8.
State of an ordinary computer:
    bits $x_1 x_2 \dots x_n$, where $x_j \in \mathbb{B}$.

State of a quantum computer:
    qubits;
    basis: $|x_1, x_2, \dots, x_n\rangle$, where $x_j \in \mathbb{B}$;
    arbitrary: $\sum_{x\in\mathbb{B}^n} c_x |x\rangle$, where $\sum_{x\in\mathbb{B}^n} |c_x|^2 = 1$.
One detail to add: if we multiply the vector $\sum_x c_x |x\rangle$ by a phase factor $e^{i\varphi}$ ($\varphi$ real), we obtain a physically indistinguishable state. Therefore, a state of a quantum computer is a unit vector defined up to a phase factor.

Computation may be imagined as a sequence of transformations on the set of states of the system. Let us describe which transformations are possible in the classical and in the quantum case:
Classical case: transformations are functions from $\mathbb{B}^n$ to $\mathbb{B}^n$.

Quantum case: transformations are unitary operators, i.e., operators that preserve the length $\sum_{x\in\mathbb{B}^n} |c_x|^2$ of each vector $\sum_{x\in\mathbb{B}^n} c_x |x\rangle$.
Remark. All that has been said pertains only to isolated systems. A real quantum computer is (will be) a part of a larger system (the Universe), interacting with the remaining world. Quantum states and transformations of open systems will be considered in Sections 10, 11.

Now we must give a formal definition of quantum computation. As in the classical case, one can define both quantum Turing machines and quantum circuits. We choose the second approach, which is more convenient for a number of reasons.

² The brackets $|\dots\rangle$ in the notation $|\psi\rangle$ do not signify any operation on the object $\psi$; they merely indicate that $\psi$ represents a vector.
6. Definitions and notation
6.1. The tensor product. A system of $n$ qubits has a state space $\mathbb{C}^{2^n}$, which can be represented as a tensor product, $\mathbb{C}^2 \otimes \dots \otimes \mathbb{C}^2 = (\mathbb{C}^2)^{\otimes n}$. The factors correspond to a space of a single qubit.

The tensor product of linear spaces $\mathcal{L}$ and $\mathcal{M}$ can be defined as an arbitrary space $\mathcal{N}$ of dimension $(\dim \mathcal{L})(\dim \mathcal{M})$. The idea is that if $\mathcal{L}$ and $\mathcal{M}$ are endowed with some bases, $\{e_1, \dots, e_l\} \subseteq \mathcal{L}$ and $\{f_1, \dots, f_m\} \subseteq \mathcal{M}$, then $\mathcal{N}$ possesses a standard basis whose elements are associated with pairs $(e_j, f_k)$. We denote these elements by $e_j \otimes f_k$. Using this standard basis

$$\{\, e_j \otimes f_k : j = 1, \dots, l;\ k = 1, \dots, m \,\},$$

one can define the tensor product of two arbitrary vectors, $u = \sum_j u_j e_j$ and $v = \sum_k v_k f_k$ ($u_j, v_k \in \mathbb{C}$), in such a way that the map $\otimes : (u, v) \mapsto u \otimes v$ is linear in both $u$ and $v$:

(6.1)    $u \otimes v = \sum_{j,k} (u_j v_k)\, e_j \otimes f_k.$
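In coordinates, formula (6.1) is just a table of pairwise products. A minimal sketch follows; the function name `tensor` and the ordering of the basis $e_j \otimes f_k$ with $j$ as the slow index are choices made for this illustration.

```python
def tensor(u, v):
    """Coordinates of u (x) v in the basis e_j (x) f_k (j is the slow index).

    u and v are lists of (possibly complex) coordinates.
    """
    return [uj * vk for uj in u for vk in v]
```

For example, `tensor([1, 2], [3, 4, 5])` lists the six products $u_j v_k$, and scaling `u` by a constant scales the result by the same constant, as bilinearity requires.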
We will mostly use this "pedestrian" definition of the tensor product, although it is not invariant, i.e., it depends on the choice of bases in $\mathcal{L}$ and $\mathcal{M}$. An invariant definition is abstract and hard to grasp, but it is indispensable if we really want to prove something. The tensor product of two spaces, $\mathcal{L}$ and $\mathcal{M}$, is a space $\mathcal{N} = \mathcal{L} \otimes \mathcal{M}$, together with a bilinear map $H : \mathcal{L} \times \mathcal{M} \to \mathcal{N}$ (also denoted by $\otimes$, i.e., $u \otimes v \stackrel{\mathrm{def}}{=} H(u, v)$) which satisfy the following universality property:

    for any space $\mathcal{F}$ and any bilinear function $F : \mathcal{L} \times \mathcal{M} \to \mathcal{F}$, there is a unique linear function $G : \mathcal{L} \otimes \mathcal{M} \to \mathcal{F}$ such that $F(u, v) = G(u \otimes v)$ (for every pair of $u \in \mathcal{L}$, $v \in \mathcal{M}$).

[1] Problem 6.1. Prove that the tensor product defined in the "pedestrian" way (i.e., using bases) satisfies the universality property.

[1] Problem 6.2. Consider two linear maps, $A : \mathcal{L} \to \mathcal{L}'$ and $B : \mathcal{M} \to \mathcal{M}'$. Prove that there is a unique linear map $C = A \otimes B : \mathcal{L} \otimes \mathcal{M} \to \mathcal{L}' \otimes \mathcal{M}'$ such that $C(u \otimes v) = A(u) \otimes B(v)$ for any $u \in \mathcal{L}$, $v \in \mathcal{M}$.

[2] Problem 6.3. Show that the pair $(\mathcal{N}, H)$ in the abstract definition of the tensor product is unique. (Figure out in what sense.)
6.2. Linear algebra in Dirac's notation. In our case, there is a prechosen classical basis: $\{|0\rangle, |1\rangle\}$ for $\mathbb{C}^2$, and $\{|x_1, \dots, x_n\rangle : x_j \in \mathbb{B}\}$ for $(\mathbb{C}^2)^{\otimes n}$. The space $\mathbb{C}^2$ furnished with a basis is denoted by $\mathcal{B}$. The basis is considered orthonormal, which yields an inner product on the space of states. The coefficients $c_{x_1 \dots x_n}$ of the decomposition of a vector $|\psi\rangle$ relative to this basis are called amplitudes. Their physical meaning is that the square of the absolute value $|c_{x_1 \dots x_n}|^2$ of the amplitude is interpreted as the probability of finding the system in the given state of the basis. As must be the case, the sum of the probabilities is equal to 1 since the length of the vector is assumed to be 1. (Probabilities will be fully discussed later; for some time we will be occupied with linear algebra, namely with studying unitary operators on the space $\mathcal{B}^{\otimes n}$.)
We will use (and have already used) notation customary in physics, introduced by Dirac, for vectors and the inner product. Vectors are denoted like this: $|\psi\rangle$; the inner product of two vectors is denoted by $\langle\xi|\eta\rangle$. If $|\xi\rangle = \sum_x a_x |x\rangle$ and $|\eta\rangle = \sum_x b_x |x\rangle$, then $\langle\xi|\eta\rangle = \sum_x a_x^* b_x$. (From now on, $a^*$ stands for the complex conjugate.) In the notation for vectors the brackets are needed only "for elegance": they indicate the type of the object and offer a symmetric designation (see below). In place of $|\xi\rangle$ we could simply write $\xi$, even though this is not customary. Thus $|\xi_1 + \xi_2\rangle = |\xi_1\rangle + |\xi_2\rangle$, and both expressions mean $\xi_1 + \xi_2$.
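The inner product is Hermitian. It is conjugate-linear in the first argument³ and linear in the second, i.e.,

$$\langle\xi_1 + \xi_2|\eta\rangle = \langle\xi_1|\eta\rangle + \langle\xi_2|\eta\rangle, \qquad \langle\xi|\eta_1 + \eta_2\rangle = \langle\xi|\eta_1\rangle + \langle\xi|\eta_2\rangle,$$

$$\langle c\,\xi|\eta\rangle = c^* \langle\xi|\eta\rangle, \qquad \langle\xi|c\,\eta\rangle = c\, \langle\xi|\eta\rangle.$$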
If we take the left half of the inner product symbol, we get a bra-vector $\langle\xi|$, i.e., a linear functional on ket-vectors (i.e., the vectors from our space). Bra- and ket-vectors are in a one-to-one correspondence to one another. (Nonetheless, it is necessary to distinguish them in some way, and it is just for this purpose that the angle brackets were introduced.) Because of the conjugate linearity of the inner product with respect to the first argument, we have the equation $\langle c\,\xi| = c^* \langle\xi|$. A bra-vector may be written as a row, and a ket-vector as a column (so as to be able to multiply it on the left by a matrix):

$$\langle\xi| = c_0^* \langle 0| + c_1^* \langle 1| = (c_0^*,\ c_1^*), \qquad |\xi\rangle = c_0 |0\rangle + c_1 |1\rangle = \begin{pmatrix} c_0 \\ c_1 \end{pmatrix}.$$
The notation $\langle\xi|A|\eta\rangle$, where $A$ is a linear operator, can be interpreted in two ways: either as the product of the bra-vector $\langle\xi|$ and the ket-vector $A|\eta\rangle$, or as the product of $\langle\xi|A$ and $|\eta\rangle$. The first interpretation is completely clear, whereas the second should be viewed as a definition of the linear functional $\langle\psi| = \langle\xi|A$. The corresponding ket-vector $|\psi\rangle$ is related to $|\xi\rangle$ by a linear operator $A^\dagger$ called Hermitian adjoint to $A$. Thus we write $|\psi\rangle = A^\dagger|\xi\rangle$, $\langle\psi| = \langle A^\dagger \xi|$, so the defining property of $A^\dagger$ is

$$\langle A^\dagger \xi|\eta\rangle = \langle\xi|A|\eta\rangle.$$

³ Note that mathematicians often define the inner product to be conjugate-linear in the second argument.
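Operators can be specified as matrices relative to the classical basis (or any other orthonormal basis),

$$A = \sum_{j,k} a_{jk}\, |j\rangle\langle k|, \quad \text{where } a_{jk} = \langle j|A|k\rangle.$$

It is clear that $|j\rangle\langle k|$ is a linear operator: $\bigl(|j\rangle\langle k|\bigr)|\xi\rangle = \langle k|\xi\rangle\, |j\rangle$.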
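The set of linear operators on a space $\mathcal{M}$ is denoted by $\mathbf{L}(\mathcal{M})$. Sometimes we will have to consider linear maps between different spaces, say, from $\mathcal{N}$ to $\mathcal{M}$. The space of such maps is denoted by $\mathbf{L}(\mathcal{N}, \mathcal{M})$. It is naturally isomorphic to $\mathcal{M} \otimes \mathcal{N}^*$: the isomorphism takes an operator $\sum a_{jk} |j\rangle\langle k| \in \mathbf{L}(\mathcal{N}, \mathcal{M})$ to the vector $\sum a_{jk} |j\rangle \otimes \langle k| \in \mathcal{M} \otimes \mathcal{N}^*$.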
A unitary operator on a space $\mathcal{M}$ is an invertible operator that preserves the inner product. The condition

$$\langle\eta|\xi\rangle = \langle U\eta|U|\xi\rangle = \langle\eta|U^\dagger U|\xi\rangle$$

is equivalent to $U^\dagger U = I$ (where $I$ is the identity operator). Since the space $\mathcal{M}$ has finite dimension, the above condition implies that $|\det U| = 1$, so the existence of $U^{-1}$ follows automatically. Unitary operators are also characterized by the property $U^{-1} = U^\dagger$. The set of unitary operators is denoted by $\mathbf{U}(\mathcal{M})$.
Our definition of the inner product in $\mathcal{B}^{\otimes n}$ is consistent with the tensor product:

$$\bigl(\langle\xi_1| \otimes \langle\xi_2|\bigr)\bigl(|\eta_1\rangle \otimes |\eta_2\rangle\bigr) = \langle\xi_1|\eta_1\rangle \langle\xi_2|\eta_2\rangle.$$

Later on we will use the tensor product of operators (cf. Problem 6.2). It is an operator acting on the tensor product of the spaces on which the factors act. The action is defined by the rule

$$(A \otimes B)\, |\xi\rangle \otimes |\eta\rangle = A|\xi\rangle \otimes B|\eta\rangle.$$

If the operators are given in the matrix form relative to some basis, i.e.,

$$A = \sum_{j,k} a_{jk}\, |j\rangle\langle k|, \qquad B = \sum_{j,k} b_{jk}\, |j\rangle\langle k|,$$

then the matrix elements of the operator $C = A \otimes B$ have the form $c_{(jk)(lm)} = a_{jl}\, b_{km}$.
[2!] Problem 6.4. Let $X : \mathcal{L} \to \mathcal{N}$ be a linear map. Prove that the operators $XX^\dagger$ and $X^\dagger X$ have the same set of nonzero eigenvalues $\lambda_j^2$, counted with multiplicities. (The numbers $\lambda_j > 0$ are called the singular values of $X$.) Moreover, the following singular value decomposition holds:

(6.2)    $X = \sum_j \lambda_j\, |\xi_j\rangle\langle\eta_j|,$

where $\{|\xi_j\rangle\}$, $\{|\eta_j\rangle\}$ are orthonormal eigenvector systems for $XX^\dagger$ and $X^\dagger X$, respectively. (There is freedom in the choice of each system, but if one is fixed, the other is determined uniquely.)
6.3. Quantum gates and circuits. Computation consists of transformations, regarded as elementary and performed one at a time.

Elementary transformation in the classical case: a map from $\mathbb{B}^n$ to $\mathbb{B}^n$ which alters and depends upon a small number (not depending on $n$) of bits; the remaining bits are not used.

Elementary transformation in the quantum case: the tensor product of an arbitrary unitary operator acting on a small number ($r = O(1)$) of qubits, denoted altogether by $\mathcal{B}^{\otimes r}$, and the identity operator acting on the remaining qubits.
The tensor product of an operator $U$ acting on an ordered set $A$ of qubits and the identity operator acting on the remaining qubits is denoted by $U[A]$. In this situation, we say that the operator $U$ is applied to the register $A$. This definition is somewhat vague, but the formal construction of the operator $U[A]$ is pretty straightforward.

First, let us define $X[A]$ when $A$ consists of just one qubit, say $p$. In this case, $X[p] = I_{\mathcal{B}^{\otimes (p-1)}} \otimes X \otimes I_{\mathcal{B}^{\otimes (n-p)}}$. Note that $X[p]$ and $Y[q]$ commute if $p \ne q$. In the general case $A = (p_1, \dots, p_r)$, we can represent $U$ as follows:

$$U = \sum_{j_1,\dots,j_r,\,k_1,\dots,k_r} u_{j_1,\dots,j_r;\,k_1,\dots,k_r} \bigl(|j_1\rangle\langle k_1|\bigr) \otimes \dots \otimes \bigl(|j_r\rangle\langle k_r|\bigr).$$

Actually, all we need is a representation of the form

(6.3)    $U = \sum_m X_{m,1} \otimes \dots \otimes X_{m,r},$

where $X_{m,1}, \dots, X_{m,r} \in \mathbf{L}(\mathcal{B})$ are arbitrary one-qubit operators. Then, by definition,

(6.4)    $U[p_1, \dots, p_r] = \sum_m X_{m,1}[p_1] \cdots X_{m,r}[p_r].$

The result does not depend on the choice of the representation (6.3) due to the universality property of the tensor product (see Section 6.1). In the case at hand, we have a multilinear map $F(X_{m,1}, \dots, X_{m,r}) = X_{m,1}[p_1] \cdots X_{m,r}[p_r]$, whereas the corresponding linear map

$$G : U \mapsto U[p_1, \dots, p_r] \ :\ \mathbf{L}(\mathcal{B}^{\otimes r}) \to \mathbf{L}(\mathcal{B}^{\otimes n})$$

is given by (6.4).
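Example 6.1. Let $U = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix}$. Then the operators $U[1]$ and $U[2]$, acting on the space $\mathcal{B}^{\otimes 2}$, are represented by these matrices:

$$U[1] = \begin{pmatrix} u_{00} & 0 & u_{01} & 0 \\ 0 & u_{00} & 0 & u_{01} \\ u_{10} & 0 & u_{11} & 0 \\ 0 & u_{10} & 0 & u_{11} \end{pmatrix}, \qquad U[2] = \begin{pmatrix} u_{00} & u_{01} & 0 & 0 \\ u_{10} & u_{11} & 0 & 0 \\ 0 & 0 & u_{00} & u_{01} \\ 0 & 0 & u_{10} & u_{11} \end{pmatrix}.$$

The rows and columns are associated with the basis vectors arranged in the lexicographic order: $|00\rangle, |01\rangle, |10\rangle, |11\rangle$.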
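The matrices of Example 6.1 can be reproduced mechanically from the formula $X[p] = I \otimes X \otimes I$. The sketch below uses plain nested lists; the helper names `kron`, `identity` and `apply_to_qubit` are chosen for this illustration and do not come from the text.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def identity(d):
    """The d x d identity matrix."""
    return [[1 if i == j else 0 for j in range(d)] for i in range(d)]


def apply_to_qubit(x, p, n):
    """Matrix of X[p] on n qubits (p is 1-based), in lexicographic basis order."""
    left = identity(2 ** (p - 1))
    right = identity(2 ** (n - p))
    return kron(kron(left, x), right)
```

With the numeric stand-in `[[1, 2], [3, 4]]` for $U$, `apply_to_qubit(U, 1, 2)` gives the interleaved pattern of $U[1]$ and `apply_to_qubit(U, 2, 2)` gives the block-diagonal pattern of $U[2]$.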
[1] Problem 6.5.

a) Let $H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. Write the matrix of the operator $H[2]$ acting on the space $\mathcal{B}^{\otimes 3}$.

b) Let $U$ be an arbitrary two-qubit operator with matrix elements $u_{jk} = \langle j|U|k\rangle$, where $j, k \in \{00, 01, 10, 11\}$. Write the matrix for $U[3, 1]$.
At this point the computational complexity begins, which makes quantum computers so powerful. Let $U$ act on two qubits, i.e., $U$ is a $4 \times 4$ matrix. Then $U[1, 2] = U \otimes I$ is a matrix of size $2^n \times 2^n$ that, after a suitable reordering of the basis vectors, consists of $2^{n-2}$ copies of $U$ placed along the principal diagonal. This matrix represents one elementary step. When we apply several such operators to various pairs of qubits, the result will appear considerably more complicated. There is no obvious way of determining this result, apart from direct multiplication of the corresponding matrices. Inasmuch as the size of the matrices is exponentially large, exponential time is required for their multiplication.
We remark, however, that the calculation of matrix elements is possible with polynomially bounded memory. Suppose we need to find the matrix element $U_{xy}$ of the operator

$$U = U^{(l)}[j_l, k_l]\ U^{(l-1)}[j_{l-1}, k_{l-1}] \cdots U^{(2)}[j_2, k_2]\ U^{(1)}[j_1, k_1].$$

It is obvious that

$$\bigl( U^{(l)} \cdots U^{(1)} \bigr)_{x_l x_0} = \sum_{x_{l-1}, \dots, x_1} U^{(l)}_{x_l x_{l-1}} \cdots U^{(1)}_{x_1 x_0}.$$

(Here $x_0, \dots, x_l$ are $n$-bit strings.) To compute this sum, it suffices to allocate $l - 1$ registers for keeping the current values of $x_{l-1}, \dots, x_1$, one register for keeping the partial sum, and some constant number of registers for the calculation of the product $U^{(l)}_{x_l x_{l-1}} \cdots U^{(1)}_{x_1 x_0}$.
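The sum over intermediate strings can be sketched as follows. This is an illustration under assumptions made here: $n$-bit strings are stored as integers with qubit 1 as the most significant bit, each gate is a $4 \times 4$ list of rows, and the names `gate_element` and `matrix_element` are hypothetical. No $2^n \times 2^n$ matrix is ever built; the intermediate strings are packed into a single counter of $n(l-1)$ bits.

```python
def bit(x, p, n):
    """Value of qubit p (1-based, qubit 1 is the most significant) in the n-bit string x."""
    return (x >> (n - p)) & 1


def gate_element(u, j, k, n, x_new, x_old):
    """Matrix element <x_new| u[j, k] |x_old> of a two-qubit gate u applied to qubits j, k."""
    mask = ~((1 << (n - j)) | (1 << (n - k)))
    if (x_new & mask) != (x_old & mask):
        return 0                                   # untouched qubits must agree
    row = 2 * bit(x_new, j, n) + bit(x_new, k, n)
    col = 2 * bit(x_old, j, n) + bit(x_old, k, n)
    return u[row][col]


def matrix_element(gates, n, x_out, x_in):
    """Element (x_out, x_in) of the product of gates = [(u1, j1, k1), ...], u1 applied first."""
    l = len(gates)
    total = 0
    for code in range(2 ** (n * (l - 1))):         # all choices of x_1, ..., x_{l-1}
        prev, term = x_in, 1
        for i, (u, j, k) in enumerate(gates):
            nxt = x_out if i == l - 1 else (code >> (n * i)) & (2 ** n - 1)
            term *= gate_element(u, j, k, n, nxt, prev)
            if term == 0:
                break
            prev = nxt
        total += term
    return total


# Controlled NOT as a 4 x 4 matrix (first qubit of the pair is the control).
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
```

As a sanity check, three alternating Controlled NOT gates on two qubits exchange the two qubits, so the element between $|10\rangle$ and $|01\rangle$ of their product is 1.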
Definition 6.1 (Quantum circuit). Let $\mathcal{A}$ be a fixed set of unitary operators. (We call $\mathcal{A}$ a basis, or a gate set, whereas its elements are called gates.) A quantum circuit over the basis $\mathcal{A}$ is a sequence $U_1[A_1], \dots, U_L[A_L]$, where $U_j \in \mathcal{A}$, and $A_j$ is an ordered set of qubits.

The operator realized by the circuit is $U = U_L[A_L] \cdots U_1[A_1]$ ($U : \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$). The number $L$ is called the size of the circuit.

We usually assume that $\mathcal{A}$ is closed under inversion: if $X \in \mathcal{A}$, then $X^{-1} \in \mathcal{A}$. In this case $U$ and $U^{-1}$ are realized by circuits of the same size.

Note that several gates, say $U_{j_1}, \dots, U_{j_s}$, can be applied simultaneously to disjoint sets of qubits (such that $A_{j_a} \cap A_{j_b} = \emptyset$ if $a \ne b$). We say that a circuit has depth $\le d$ if it can be arranged in $d$ layers of simultaneously applied gates. The depth can also be characterized as the maximum length of a path from input to output. (By a path we mean a sequence of gates $U_{k_1}, \dots, U_{k_d}$ ($k_1 < \dots < k_d$) such that each pair of adjacent gates, $k_l$ and $k_{l+1}$, share a qubit they act upon, but no other gate acts on this qubit between the applications of $U_{k_l}$ and $U_{k_{l+1}}$.)
Definition 6.1 is not perfect because it ignores the possibility to use additional qubits (ancillas) in the computational process. Therefore we give yet another definition.

(Operator realized by a quantum circuit using ancillas). This is an operator $U : \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$ such that the product

$$W = U_L[A_L] \cdots U_1[A_1],$$

acting on $N$ qubits ($N \ge n$), satisfies the condition $W\bigl(|\xi\rangle \otimes |0^{N-n}\rangle\bigr) = (U|\xi\rangle) \otimes |0^{N-n}\rangle$ for any vector $|\xi\rangle \in \mathcal{B}^{\otimes n}$.

In this manner we "borrow" additional memory, filled with zeros, that we must ultimately return to its prior state. What sense does such a definition make? Why is it necessary to insist that the additional qubits return to the state $|0^{N-n}\rangle$? Actually, this condition is rather technical. However, it is important that at the end of the computation the quantum state is a product state, i.e., has the form $|\xi'\rangle \otimes |\eta'\rangle$ (with arbitrary $|\eta'\rangle$). If this is the case, then the first subsystem will be in the specified state $|\xi'\rangle$, so that the second subsystem (the added memory) may be forgotten. In the opposite case, the joint state of the two subsystems will be entangled, so that the first subsystem cannot be separated from the second.
7. Correspondence between classical and quantum computation

Quantum computation is supposed to be more general than classical computation. However, quantum circuits do not include Boolean circuits as a special case. Therefore some work is required to specialize the definition of a quantum circuit and prove that the resulting computational model is equivalent to the Boolean circuit model.
The classical analogue of a unitary operator is an invertible map on a finite set, i.e., a permutation. An arbitrary permutation $G : \mathbb{B}^k \to \mathbb{B}^k$ corresponds naturally to a unitary operator $\widehat{G}$ on the space $\mathcal{B}^{\otimes k}$ acting according to the rule

$$\widehat{G}|x\rangle \stackrel{\mathrm{def}}{=} |Gx\rangle.$$

By analogy with Definition 6.1, we may define reversible classical circuits, which realize permutations.
Definition 7.1 (Reversible classical circuit). Let $\mathcal{A}$ be a set of permutations of the form $G : \mathbb{B}^k \to \mathbb{B}^k$. (The set $\mathcal{A}$ is called a basis; its elements are called gates.) A reversible classical circuit over the basis $\mathcal{A}$ is a sequence of permutations $G_1[A_1], \dots, G_l[A_l]$, where $A_j$ is a set of bits and $G_j \in \mathcal{A}$.

(Permutation realized by a reversible circuit). This is the product of permutations $G_l[A_l] \cdots G_1[A_1]$.

(Permutation realized by a reversible circuit using ancillas). This is a permutation $G$ such that the product of permutations

$$W = G_l[A_l] \cdots G_1[A_1]$$

(acting on $N$ bits, $N \ge n$) satisfies the condition $W(x, 0^{N-n}) = (Gx, 0^{N-n})$ for arbitrary $x \in \mathbb{B}^n$.
In what cases can a function given by a Boolean circuit be realized by a reversible circuit? Reversible circuits realize only permutations, i.e., invertible functions. This difficulty can be overcome in this way: instead of computing a general Boolean function $F : \mathbb{B}^n \to \mathbb{B}^m$, we compute the permutation $F_\oplus : \mathbb{B}^{n+m} \to \mathbb{B}^{n+m}$ given by the formula $F_\oplus(x, y) = (x, y \oplus F(x))$ (here $\oplus$ denotes bitwise addition modulo 2). Then $F_\oplus(x, 0) = (x, F(x))$ contains the value of $F(x)$ we need.
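The passage from $F$ to $F_\oplus$ is easy to write down explicitly. In the sketch below, bit strings are tuples and `xor_extension` is a name invented for this illustration; the example function AND is chosen because it is not invertible.

```python
def xor_extension(F, n):
    """Return F_xor: (x, y) -> (x, y XOR F(x)), acting on tuples of bits.

    F maps an n-bit tuple to an m-bit tuple; F_xor acts on (n + m)-bit tuples.
    """
    def F_xor(bits):
        x, y = bits[:n], bits[n:]
        return x + tuple(a ^ b for a, b in zip(y, F(x)))
    return F_xor


def and_gate(x):
    """AND of two bits, returned as a 1-bit tuple."""
    return (x[0] & x[1],)


and_xor = xor_extension(and_gate, 2)   # (x, y, z) -> (x, y, z XOR xy)
```

Applying `and_xor` twice returns the original input, which shows that $F_\oplus$ is its own inverse and hence a permutation.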
Note that two-bit permutation gates do not suffice to realize all functions of the form $F_\oplus$. It turns out that any permutation on two-bit states, $g : \mathbb{B}^2 \to \mathbb{B}^2$, is a linear function (under the natural identification of the set $\mathbb{B}$ with the two-element field $\mathbb{F}_2$): $g(x, y) = (ax \oplus by \oplus c,\ dx \oplus ey \oplus f)$, where $a, b, c, d, e, f \in \mathbb{F}_2$. Therefore all functions realized by reversible circuits over the basis of permutations on two bits are linear.

However, permutations on three bits already suffice to realize any permutation. In fact, the following two functions form a complete basis for reversible circuits: negation $\neg$ and the Toffoli gate, $\wedge_\oplus : (x, y, z) \mapsto (x, y, z \oplus xy)$. Here we mean realization using ancillas, i.e., it is allowed to borrow bits in the state 0 under the condition that they return to the same state after the computation is done.
Lemma 7.1. Let a function $F : \mathbb{B}^n \to \mathbb{B}^m$ be realized by a Boolean circuit of size $L$ and depth $d$ over some basis $\mathcal{A}$ (the fan-in and fan-out being bounded by a constant). Then we can realize a map of the form $(x, 0) \mapsto (F(x), G(x))$ by a reversible circuit of size $O(L)$ and depth $O(d)$ over the basis $\mathcal{A}_\oplus$ consisting of the functions $f_\oplus$ ($f \in \mathcal{A}$) and the function $\mathrm{CNOT} : (x, y) \mapsto (x, x \oplus y)$.

Remark 7.1. In addition to the "useful" result $F(x)$, the indicated map produces some "garbage" $G(x)$.

Remark 7.2. The gate $\mathrm{CNOT}$ is usually called "Controlled NOT" for reasons that will become clear later. Note that $\mathrm{CNOT} = I_\oplus$, where $I$ is the identity map on a single bit. The essential meaning of the operation $\mathrm{CNOT}$ is reversible copying of the bit $x$ (if the initial value of $y$ is 0).

Remark 7.3. The gate $\mathrm{CNOT}$ allows one to interchange bits in memory, since the function $(\leftrightarrow) : (a, b) \mapsto (b, a)$ can be represented as follows:

$$(\leftrightarrow)[j, k] = \mathrm{CNOT}[j, k]\ \mathrm{CNOT}[k, j]\ \mathrm{CNOT}[j, k].$$
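The identity of Remark 7.3 can be checked directly on a list of bits. The function names below are chosen for this illustration; `cnot(bits, j, k)` leaves bit `j` unchanged and adds it to bit `k` modulo 2.

```python
def cnot(bits, j, k):
    """Controlled NOT on a list of bits: bits[k] becomes bits[k] XOR bits[j]."""
    bits[k] ^= bits[j]


def swap(bits, j, k):
    """Interchange bits j and k using three Controlled NOT gates."""
    cnot(bits, j, k)   # (a, b)     -> (a, a ^ b)
    cnot(bits, k, j)   # (a, a ^ b) -> (b, a ^ b)
    cnot(bits, j, k)   # (b, a ^ b) -> (b, a)
```

The comments trace the pair of values at positions `j` and `k` after each gate.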
Proof of Lemma 7.1. Consider the Boolean circuit that computes $F$. Let the input variables be $x_1, \dots, x_n$, and the auxiliary variables (including the result bits) $x_{n+1}, \dots, x_{n+L}$. A reversible circuit we are to construct will also have $n + L$ bits; the bits $x_{n+1}, \dots, x_{n+L}$ are initialized by 0.

Each assignment in the original (Boolean) circuit has the form $x_{n+k} := f_k(x_{j_k}, \dots, x_{l_k})$, $f_k \in \mathcal{A}$, $j_k, \dots, l_k < n + k$. In the corresponding reversible circuit, the analogue of this assignment will be the action of the permutation $(f_k)_\oplus$, i.e., $x_{n+k} := x_{n+k} \oplus f_k(x_{j_k}, \dots, x_{l_k})$.

Since the initial values of the auxiliary variables were equal to 0, their final values will be just as in the original circuit. In order to obtain the required form of the result, it remains to change positions of the bits.

In this argument, we may assume that the original circuit has a layered structure, so that several assignments can occur simultaneously. However, the concurrent assignments should not share their input variables. If this is not the case, we need to insert explicit copy gates between the layers; each copy gate will be replaced by the Controlled NOT gate in the reversible circuit. This results in a depth increase by at most a constant factor, due to the bounded fan-out condition.
The entire computational process is conveniently represented by the following diagram (above the rectangles is written the number of bits and, inside, their content).

    number of bits:   n   |   L - m   |   m
    content:          x   |     0     |   0
        (assignments by the circuit)
    content:          x   |   x_{n+1} ... x_{L-m}   |   F(x)
        (permutations of bits)
    content:          F(x)   |   G(x)

$\Box$
Lemma 7.2 (Garbage removal). Under the conditions of Lemma 7.1, one can realize the function $F_\oplus$ by a reversible circuit of size $O(L + n + m)$ and depth $O(d)$ using ancillas.

Proof. We perform the computation from the proof of Lemma 7.1, add each bit of the result to the corresponding bit of $y$ with the Controlled NOT gate, and undo the above computation.

    number of bits:   n   |   L   |   m
    content:          x   |   0   |   y
        (computation by the circuit from the proof of Lemma 7.1)
    number of bits:   m   |   L + n - m   |   m
    content:          F(x)   |   G(x)   |   y
        (addition of F(x) to y modulo 2)
    content:          F(x)   |   G(x)   |   F(x) ⊕ y
        (reversal of the computation that was done in the first step)
    content:          x   |   0   |   F(x) ⊕ y

$\Box$
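The three steps of the proof (compute, copy, uncompute) can be traced on a small example. The sketch below is an illustration under assumptions made here: the toy function is $F(x_1, x_2, x_3) = x_1 \wedge x_2 \wedge x_3$, the circuit `FORWARD` and the register layout are hypothetical, and the final permutation of bits from Lemma 7.1 is skipped because the result already sits in a known position.

```python
def toffoli(bits, a, b, t):
    """The gate (AND)_xor: bits[t] ^= bits[a] AND bits[b]. It is its own inverse."""
    bits[t] ^= bits[a] & bits[b]


def cnot(bits, c, t):
    """Controlled NOT: bits[t] ^= bits[c]."""
    bits[t] ^= bits[c]


# Register layout: inputs at positions 0..2, ancillas at 3..4, the bit y at 5.
FORWARD = [(0, 1, 3),   # ancilla 3 := x1 AND x2            (garbage)
           (3, 2, 4)]   # ancilla 4 := (x1 AND x2) AND x3   (the value F(x))


def f_xor(x, y):
    """Return the full register after computing (x, 0, y) -> (x, 0, y XOR F(x))."""
    bits = list(x) + [0, 0] + [y]
    for gate in FORWARD:               # step 1: compute F(x) together with garbage
        toffoli(bits, *gate)
    cnot(bits, 4, 5)                   # step 2: add F(x) to y modulo 2
    for gate in reversed(FORWARD):     # step 3: undo step 1, clearing the ancillas
        toffoli(bits, *gate)
    return bits
```

Step 3 works because every gate used in step 1 is an involution, so running the same gates in reverse order restores the ancillas to 0.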
Remark 7.4. Reversible computation provides an answer to the following question: how much energy is required to compute a given Boolean function [10, 45]? Theoretically, reversible operations can be performed at no energy cost. On the other hand, irreversible operations, like bit erasure, pose a fundamental problem. When such an operation is performed, two different logical states (0 and 1) become identical (0). However, physical laws on a micro-scale are reversible. The solution to this apparent paradox is that the difference between the initial states, 0 and 1, is converted into a difference between two physical states that both correspond to the same logical value 0. This may be interpreted as an increase in disorder (entropy) in physical degrees of freedom beyond our control, which eventually appears in the surrounding environment in the form of heat. The amount of energy required to erase a single bit is very small ($kT \ln 2$), but still nonzero. The theoretical energy cost of information erasure on a hard disk of capacity 1 gigabyte is equal to $2.4 \cdot 10^{-11}$ Joules, which corresponds to the energy spent in moving the disk head by a fraction of the size of an atom. This is many orders of magnitude smaller than the actual displacement of the head through formatting.
On the other hand, if the capacity of disks were to continue growing as fast as now, then at the end of the twenty-third century formatting of a hard disk would require as much energy as the Sun generates in a year.

The garbage removal lemma shows that it is possible to avoid such losses of energy connected with irreversible operations.
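The figure quoted in Remark 7.4 for a 1-gigabyte disk can be reproduced from the bound $kT \ln 2$ per erased bit. The temperature and the meaning of "gigabyte" are not stated in the text; the values below (293 K and $2^{30}$ bytes) are assumptions chosen for this check.

```python
import math

BOLTZMANN = 1.380649e-23     # Boltzmann constant in J/K
TEMPERATURE = 293.0          # assumed room temperature in K (not given in the text)
BITS = 8 * 2 ** 30           # assumed: 1 gigabyte = 2**30 bytes

energy_per_bit = BOLTZMANN * TEMPERATURE * math.log(2)   # kT ln 2, in Joules
total_energy = energy_per_bit * BITS                      # cost of erasing every bit
```

Under these assumptions `total_energy` comes out at roughly $2.4 \cdot 10^{-11}$ Joules, in agreement with the remark.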
It is likewise possible to show that arbitrary computation performed with memory $s$ can be realized in a reversible manner through the use of memory not exceeding $s^{O(1)}$. We will give a sketch of the proof. However, the reader should keep in mind that computation with bounded space is not easily defined in terms of circuits. Indeed, if a circuit is allowed to be exponentially large (though of polynomial "width"), it can contain the value table of the desired function, which makes its computation trivial. Therefore, a rigorous proof should either deal with circuits of some regular structure, or involve a definition of a reversible Turing machine.
An arbitrary omputation with a given memory s an be redu ed to
solving a poly(s)-size instan e of TQBF, sin e TQBF is PSPACE- omplete.
We will show how to ompute reversibly, with a small amount of memory,
the value of the formula
(7.1) 9x
1
8y
1
� � � 9x
M
8y
M
f(x
1
; y
1
; : : : ; x
M
; y
M
; z);
where f(�) is omputed by a Boolean ir uit of size L. A tually, in this ase
the value of the formula (7.1) an be represented by a reversible ir uit with
O(L +M) bits. The omputation will be organized re ursively, beginning
with the innermost quanti�ers.
In order to compute $\forall x\, F(x, z)$, we compute $F(0, z)$ and put the result into a supplementary bit. Then we compute $F(1, z)$ and put the result into another bit. Next we compute $\forall x\, F(x, z) = F(0, z) \wedge F(1, z)$ and save the result in a third bit. In order to remove the garbage, we undo all calculations, except for the final step.
Dealing with the formula $\exists x\, F(x, y)$ in a similar manner, we arrive at the following result: adding a quantifier in one Boolean variable increases the required memory by at most a constant number of bits.
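The recursion over quantifiers can be illustrated by a small classical sketch. It is not a reversible circuit: it only mirrors the bookkeeping of the argument, counting how many supplementary result bits are held at once. The function name `eval_quantified`, the quantifier string (`"E"` for $\exists$, `"A"` for $\forall$) and the counter are illustrative assumptions, not notation from the text.

```python
def eval_quantified(f, quantifiers, z):
    """Evaluate Q1 v1 Q2 v2 ... f(v1, v2, ..., z) recursively.

    quantifiers is a string such as "EA" ("E" = exists, "A" = for all).
    Returns (value, peak), where peak is the largest number of
    supplementary result bits held at the same time.
    """
    live = 0   # supplementary bits currently holding a value
    peak = 0   # largest value of `live` seen so far

    def hold(n):
        nonlocal live, peak
        live += n
        peak = max(peak, live)

    def rec(prefix):
        depth = len(prefix)
        if depth == len(quantifiers):
            value = bool(f(*prefix, z))
            hold(1)                    # output bit of the circuit for f
            return value
        r0 = rec(prefix + [False])     # F(0, ...) kept in one bit
        r1 = rec(prefix + [True])      # F(1, ...) kept in another bit
        if quantifiers[depth] == "A":
            value = r0 and r1
        else:
            value = r0 or r1
        hold(1)                        # third bit: value of the quantified formula
        hold(-2)                       # undo the two sub-computations (garbage removal)
        return value

    return rec([]), peak


value, peak = eval_quantified(lambda x, y, z: x or y, "EA", False)
```

In this model each quantifier adds one bit to the peak, which matches the "constant number of bits per quantifier" statement; the time cost of undoing the sub-computations is not modelled.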
In conclusion, we formulate a theorem on computation of reversible functions, which is a direct generalization of Lemma 7.2.
Theorem 7.3. Let $F$ and $F^{-1}$ be computed by Boolean circuits of size $\le L$ and depth $\le d$. Then $F$ can be realized by a reversible circuit of size $O(L + n)$ and depth $O(d)$ using ancillas.
Proof. The computation is performed according to the following scheme (for simplicity we do not show the ancillas that are used in the computation from Lemma 7.2).
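Both registers consist of $n$ bits; each state is written as (left register, right register).
$(x,\ 0)$
| computation of $F_\oplus$ by the circuit from the proof of Lemma 7.2
$(x,\ F(x))$
| permutation of bits
$(F(x),\ x)$
| applying $(F^{-1})_\oplus$ (by the circuit from the proof of Lemma 7.2) yields $x \oplus F^{-1}(F(x)) = 0$ in the right register
$(F(x),\ 0)$
□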
[1!] Problem 7.1. Prove that negation and the Toffoli gate form a complete basis for reversible circuits.
8. Bases for quantum circuits
How do we choose a basis (gate set) for computation by quantum circuits? There are uncountably many unitary operators. So, either a complete basis must contain an infinite (uncountable) number of gates, or else we have to weaken the condition of exact realization of an operator by a circuit, changing it to a condition of approximate realization. We will examine both possibilities.
8.1. Exact realization.
Theorem 8.1. The basis consisting of all one-qubit and two-qubit unitary operators allows the realization of an arbitrary unitary operator.
The rest of the section constitutes a proof of this theorem.
8.1.1. Operators with quantum control.
Definition 8.1. For each operator $U\colon \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$, an operator $\Lambda(U)\colon \mathcal{B} \otimes \mathcal{B}^{\otimes n} \to \mathcal{B} \otimes \mathcal{B}^{\otimes n}$ ("controlled $U$") is defined by the following relations:
(8.1) $\Lambda(U)\,|0\rangle \otimes |\xi\rangle = |0\rangle \otimes |\xi\rangle, \qquad \Lambda(U)\,|1\rangle \otimes |\xi\rangle = |1\rangle \otimes U|\xi\rangle.$
Graphically, we represent an operator $\Lambda(X)$ as shown in the figure. [Figure: circuit symbol for $\Lambda(X)$.] The top line corresponds to the control qubit (the first tensor factor in (8.1)) while the bottom line represents the other qubits. The direction of the arrows corresponds to the order in which operators act on an input vector. For example, in Figure 8.1 (see below) the first operator is $\Lambda(Y^{-1})$. In this book, we draw arrows from right to left, which is consistent with the convention that $AB|\xi\rangle$ means "take $|\xi\rangle$, apply $B$, then apply $A$".
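We will also need operators with several controlling qubits:
(8.2) $\Lambda^k(U)\,|x_1, \dots, x_k\rangle \otimes |\xi\rangle = \begin{cases} |x_1, \dots, x_k\rangle \otimes |\xi\rangle & \text{if } x_1 \cdots x_k = 0, \\ |x_1, \dots, x_k\rangle \otimes U|\xi\rangle & \text{if } x_1 \cdots x_k = 1. \end{cases}$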
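Example 8.1. Let $\sigma^x \stackrel{\mathrm{def}}{=} \hat{\neg} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Then $\Lambda(\sigma^x)$ is the controlled NOT, $|a, b\rangle \mapsto |a, a \oplus b\rangle$, and $\Lambda^2(\sigma^x)$ is the Toffoli gate, $|a, b, c\rangle \mapsto |a, b, c \oplus ab\rangle$.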
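The definitions (8.1) and (8.2) can be made concrete with a short sketch that builds the matrix of $\Lambda^k(U)$. It assumes the basis vectors are ordered as binary numbers with the control qubits as the most significant bits; the names `controlled` and `apply_to_basis` are illustrative.

```python
def controlled(U, k=1):
    """Matrix of Lambda^k(U): U acts on the target only when all k controls are 1."""
    d = len(U)                       # dimension of the target space
    size = (2 ** k) * d
    M = [[0] * size for _ in range(size)]
    offset = size - d                # the block with all controls equal to 1 comes last
    for i in range(offset):
        M[i][i] = 1                  # identity when some control bit is 0
    for i in range(d):
        for j in range(d):
            M[offset + i][offset + j] = U[i][j]
    return M


def apply_to_basis(M, index):
    """Index of the basis vector that the permutation matrix M sends |index> to."""
    column = [row[index] for row in M]
    return column.index(1)


SIGMA_X = [[0, 1], [1, 0]]
CNOT = controlled(SIGMA_X)           # Lambda(sigma^x)
TOFFOLI = controlled(SIGMA_X, 2)     # Lambda^2(sigma^x)
```

For instance, `apply_to_basis(TOFFOLI, 0b110)` looks up the image of $|1, 1, 0\rangle$, and `apply_to_basis(TOFFOLI, 0b101)` the image of $|1, 0, 1\rangle$.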
8.1.2. The realization of the Toffoli gate. Now we construct the Toffoli gate using transformations on two qubits. To start, we find a pair of operators that satisfy the relation $XYX^{-1}Y^{-1} = i\sigma^x$. For example, the following pair will do:
(8.3) $X = \frac{1}{\sqrt{2}} \begin{pmatrix} -i & -1 \\ 1 & i \end{pmatrix}, \qquad Y = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$
Let us clarify the geometric meaning of this construction. The unitary group $\mathbf{U}(2)$ acts on three-dimensional Euclidean space. To define this action, we note that $2 \times 2$ Hermitian matrices with zero trace form a three-dimensional Euclidean space: the inner product between $A$ and $B$ is given by $\frac{1}{2}\operatorname{Tr}(AB)$ and an orthonormal basis is formed by the Pauli matrices
(8.4) $\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$
A unitary operator $U \in \mathbf{U}(2)$ acts on this space by the rule $U\colon E \mapsto UEU^{-1}$. It is possible to show (see [44, §11.12]) that the action we have just defined yields an isomorphism $\mathbf{U}(2)/\mathbf{U}(1) \cong \mathbf{SO}(3)$, where $\mathbf{U}(1) = \{c \in \mathbb{C} : |c| = 1\}$ is the subgroup of phase shifts, and $\mathbf{SO}(3)$ is the group of rotations of three-dimensional space (i.e., the group of orthogonal transformations with determinant 1).
Under this action, $\sigma^x$ corresponds to a rotation about the $x$ axis by $180^\circ$, $X$ to a rotation about the vector $(0, 1, 1)$ by $180^\circ$, and $Y$ to a rotation about the $y$ axis by $180^\circ$.
Shown in Figure 8.1 is a circuit that realizes the Toffoli gate by using the operators $\Lambda(X)$, $\Lambda(Y)$ and $\Lambda^2(-i)$. The last of these is a phase shift (multiplication by $-i$) controlled by two bits.
Let us test this circuit. Suppose the input vector is $|a, b, c\rangle = |a\rangle \otimes |b\rangle \otimes |c\rangle$, where $a, b, c \in \mathbb{B}$. If $a = b = 1$, then the operator $-iXYX^{-1}Y^{-1} = \sigma^x$ is applied to $|c\rangle$, which changes $|0\rangle$ to $|1\rangle$ and vice versa. However, if at least one of the controlling bits is 0, then $|c\rangle$ is multiplied by the identity operator. This is exactly how the Toffoli gate acts on basis vectors. This action extends to the whole space $\mathcal{B}^{\otimes 3}$ by linearity.
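The commutator relation behind this test can be checked numerically. The sketch below takes the entries of $X$ and $Y$ as they are read from (8.3) in this copy and computes $XYX^{-1}Y^{-1}$. The overall sign of the resulting phase is sensitive to the signs of those entries and to the order of $X$ and $Y$ (exchanging them inverts the commutator), so the code reads the phase off the result instead of assuming it; the helper names are illustrative.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    """Conjugate transpose; for a unitary matrix this is the inverse."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


s = 1 / math.sqrt(2)
X = [[-1j * s, -s + 0j], [s + 0j, 1j * s]]   # entries as read from (8.3)
Y = [[0j, 1 + 0j], [-1 + 0j, 0j]]
SIGMA_X = [[0j, 1 + 0j], [1 + 0j, 0j]]

# Group commutator X Y X^{-1} Y^{-1}.
commutator = matmul(matmul(X, Y), matmul(dagger(X), dagger(Y)))

# The commutator is phase * sigma^x; read the phase off the (0, 1) entry.
phase = commutator[0][1]
# Multiplying by the conjugate phase leaves sigma^x itself.
restored = [[phase.conjugate() * v for v in row] for row in commutator]
```

Whatever the sign, `phase` is a square root of $-1$ and `restored` equals $\sigma^x$, which is what the doubly controlled phase shift in the circuit has to compensate.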
Fig. 8.1. Implementation of the Toffoli gate. [Circuit diagram; gate labels: $X^{-1}$, $Y^{-1}$, $X$, $Y$, $-i$.]
8.1.3. The realization of $\Lambda^k(U)$ for $U \in \mathbf{U}(\mathcal{B})$. Let $U$ be a unitary operator acting on one qubit. We will show how to realize the operator $\Lambda^k(U)$ for arbitrary $k$ by acting only on pairs of qubits. Our first solution uses ancillas. We actually construct an operator $W$ which acts on the space of $N + 1$ qubits $\mathcal{B}^{\otimes (N+1)}$ and satisfies the condition
$W\bigl(|\xi\rangle \otimes |0^{N-k}\rangle\bigr) = \bigl(\Lambda^k(U)|\xi\rangle\bigr) \otimes |0^{N-k}\rangle.$
(Caution: this condition does not mean that $W = \Lambda^k(U) \otimes I$.)
There exists a reversible circuit $P$ of size $O(k)$ and depth $O(\log k)$ that computes the product of $k$ input bits $x_1 \cdots x_k$ (the result being a single bit), and also produces some garbage $G(x_1, \dots, x_k)$ ($N - 1$ bits). It is represented graphically in Figure 8.2 (written above each box is the number of bits in the corresponding memory segment).
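Fig. 8.2. [Diagram of the circuit $P$: the input segments are $x_1, \dots, x_k$ ($k$ bits) and $0$ ($N - k$ bits); the output segments are $x_1 x_2 \cdots x_k$ ($1$ bit) and $G(x_1, \dots, x_k)$ ($N - 1$ bits).]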
Figure 8.3 shows how to construct the operator $W$ using the circuit $P$ and an operator with one controlling qubit. The circuit $P$ is applied first, followed by the reverse circuit $P^{-1}$, so that all $N$ bits return to their initial state. In the meantime, the first bit (the top line in Figure 8.3) takes the value $x_1 \cdots x_k$. It is used as the control qubit for $\Lambda(U)$, whereas qubit $N + 1$ is the target. The circuit in the figure can also be described by the equation $W = P^{-1} \Lambda(U) P$ or, more explicitly,
$W[\,\underbrace{1, \dots, k, N{+}1}_{\Lambda^k(U)},\ \underbrace{k{+}1, \dots, N}_{\text{ancillas}}\,] = P^{-1}[1, \dots, N]\; \Lambda(U)[1, N{+}1]\; P[1, \dots, N].$
Fig. 8.3. Implementation of the operator $\Lambda^k(U)$ using ancillas. [Circuit diagram; labels: $P$, $P^{-1}$, $U$; inputs $x_1, \dots, x_k$ and $0$; intermediate values $x_1 x_2 \cdots x_k$ and $G(x)$.]
The use of ancillas can be avoided at the cost of an increase in the circuit size. Let us consider the operator $\Lambda^k(i\sigma^x)$ first. A circuit $C_k$ for the realization of this operator can be constructed recursively: it consists of two copies of the circuit $C_{\lceil k/2 \rceil}$, two copies of the circuit $C_{\lfloor k/2 \rfloor}$, and a constant number of one-qubit gates. Therefore we get a recurrence relation for the circuit size, $L_k = 2L_{\lfloor k/2 \rfloor} + 2L_{\lceil k/2 \rceil} + c$, so that $L_k = O(k^2)$. The concrete construction is shown in Figure 8.4 (cf. Figure 8.1). We again use the operators $X$ and $Y$ (see (8.3)) satisfying $XYX^{-1}Y^{-1} = i\sigma^x$. Now we apply them with multiple control qubits: $Y$ is controlled by the qubits $1, \dots, \lceil k/2 \rceil$, whereas $X$ is controlled by the qubits $\lceil k/2 \rceil + 1, \dots, k$. It remains to notice that $X$ and $Y$ are conjugate to $i\sigma^x$, i.e., $X = V(i\sigma^x)V^{-1}$, $Y = W(i\sigma^x)W^{-1}$ for some unitary $V$ and $W$. Hence $\Lambda^b(X)$ and $\Lambda^a(Y)$ (where $a = \lceil k/2 \rceil$, $b = \lfloor k/2 \rfloor$) can be obtained if we conjugate $\Lambda^b(i\sigma^x)$ and $\Lambda^a(i\sigma^x)$ by $V$ and $W$ (resp.) applied on the last qubit.
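The quadratic growth claimed for the recurrence can be observed numerically. In the sketch below the additive constant `C` and the base case $L_1 = 1$ are hypothetical values chosen only for illustration; the text fixes neither.

```python
from functools import lru_cache

C = 4  # hypothetical number of one-qubit gates added at each level


@lru_cache(maxsize=None)
def circuit_size(k: int) -> int:
    """L_k = 2 L_floor(k/2) + 2 L_ceil(k/2) + C, with the assumed base case L_1 = 1."""
    if k <= 1:
        return 1
    return 2 * circuit_size(k // 2) + 2 * circuit_size((k + 1) // 2) + C


# Ratio L_k / k^2 along powers of two; it stays bounded as k grows.
ratios = [circuit_size(k) / k ** 2 for k in (2, 16, 128, 1024, 8192)]
```

For powers of two the recurrence reduces to $L_k = 4L_{k/2} + C$, which has the closed form $L_k = (1 + C/3)k^2 - C/3$ under these assumptions, so the ratio approaches $1 + C/3$.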
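Fig. 8.4. Implementation of the operator $\Lambda^k(i\sigma^x)$ without ancillas. [Circuit diagram; gate labels: $X^{-1}$, $Y^{-1}$, $X$, $Y$; control lines $x_1, x_2, \dots, x_{\lceil k/2 \rceil}$ and $x_{\lceil k/2 \rceil + 1}, \dots, x_k$.]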
The operator $\Lambda^k(Z)$ for an arbitrary $Z \in \mathbf{SU}(2)$ can be realized by two applications of $\Lambda^k(\sigma^x)$ and four applications of one-qubit gates, as in the solution to Problem 8.2 (see Figure S8.1a). Note that one copy of $\Lambda^k(\sigma^x)$ can be replaced by $\Lambda^k(i\sigma^x)$, and the other by $\Lambda^k(-i\sigma^x)$.
Consider now the general case, $\Lambda^k(U)$, where $U \in \mathbf{U}(2)$. Let $U = U_0 = e^{i\varphi_1} Z_0$, where $Z_0 \in \mathbf{SU}(2)$. Then $\Lambda^k(e^{i\varphi_1}) = \Lambda^{k-1}(U_1)$, where $U_1 = \Lambda(e^{i\varphi_1}) \in \mathbf{U}(2)$. Thus we have
$\Lambda^k(U)[1, \dots, k, k{+}1] = \Lambda^{k-1}(U_1)[1, \dots, k]\; \Lambda^k(Z_0)[1, \dots, k, k{+}1].$
We proceed by induction, obtaining the equation
(8.5) $\Lambda^k(U)[1, \dots, k, k{+}1] = U_k[1]\; \Lambda^1(Z_{k-1})[1, 2] \cdots \Lambda^k(Z_0)[1, \dots, k, k{+}1].$
It is represented graphically in Figure 8.5. The size of the resulting circuit is $O(k^3)$, since each factor $\Lambda^j(Z_{k-j})$ is realized by a circuit of size $O(j^2)$. (This construction can be made more efficient; see Problem 8.6.)
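Fig. 8.5. Ancilla-free realization of $\Lambda^k(U)$, $U \in \mathbf{U}(2)$. [Circuit diagram; gate labels: $U_k$, $Z_{k-1}$, ..., $Z_2$, $Z_1$, $Z_0$; control lines $x_1, x_2, \dots, x_{k-1}, x_k$.]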
8.1.4. The realization of an arbitrary operator. We continue the proof of Theorem 8.1. The action of $\Lambda^k(U)$ may be described as follows: the operator $U$ acts on the subspace generated by the vectors $|1, \dots, 1, 0\rangle$ and $|1, \dots, 1, 1\rangle$, and the identity operator acts on the orthogonal complement of this subspace. Our next task is to realize a similar operator in which a nontrivial action is carried out on the subspace spanned by an arbitrary pair of basis vectors. Suppose we want to realize an arbitrary operator on the subspace spanned by $|x\rangle$ and $|y\rangle$, where $x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_n)$, $x_j, y_j \in \mathbb{B}$. Let $f$ be a permutation such that $f(x) = (1, \dots, 1, 0)$, $f(y) = (1, \dots, 1, 1)$. We may assume that $f$ is linear, i.e., $f\colon x \mapsto Ax + b$, where $A$ is an invertible matrix, and $b$ is a vector over the two-element field $\mathbb{F}_2$. Such permutations can be realized by reversible circuits over the basis $\{\neg, \oplus\}$ (negation and the controlled NOT) without ancillas. Then the operator we need is represented in the form $\hat{f}^{-1} \Lambda^{n-1}(U) \hat{f}$. (Recall that $\hat{f}$ is the operator corresponding to the permutation $f$.)
Therefore we can act arbitrarily on pairs of basis vectors. Since we only used circuits of size $\mathrm{poly}(n)$, the constructed actions are realized efficiently. The final part in the proof of Theorem 8.1 is not efficient. Now we forget about qubits (i.e., the tensor product structure of our space), so we just have a Hilbert space of dimension $M = 2^n$. We want to represent an arbitrary unitary operator $U$ by the actions on pairs of basis vectors. This will be polynomial in $M$, hence exponential in $n$.
Lemma 8.2. An arbitrary unitary operator $U$ on the space $\mathbb{C}^M$ can be represented as a product of $M(M-1)/2$ matrices of the form
(8.6) $\begin{pmatrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 1 & & & & \\ & & & a & b & & \\ & & & c & d & & \\ & & & & & 1 & \\ & & & & & & \ddots \end{pmatrix}, \quad \text{where } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathbf{U}(2)$
(all entries not shown are zero).
Proof. First we note that for any numbers $c_1, c_2$ there exists a $2 \times 2$ unitary matrix $V$ such that
$V \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} \sqrt{|c_1|^2 + |c_2|^2} \\ 0 \end{pmatrix}.$
Consequently, for a unit vector $|\xi\rangle \in \mathbb{C}^M$ there exists a sequence of unitary operators $V^{(1)}, \dots, V^{(M-1)}$ such that $V^{(1)} \cdots V^{(M-1)} |\xi\rangle = |1\rangle$, where $V^{(s)}$ acts on the subspace $\mathbb{C}(|s\rangle, |s+1\rangle)$ (as the matrix (8.6)) and leaves the remaining basis vectors unchanged.
Now let an $M \times M$ unitary matrix $U$ be given. Multiplying $U^{-1}$ on the left by suitable matrices $U^{(1,1)}, \dots, U^{(1,M-1)}$, we can transform the first column into the vector $|1\rangle$. Since the columns remain orthogonal, the first row becomes $\langle 1|$. Acting in the same way on the remaining columns, we obtain a set of matrices $U^{(j,s)}$, $1 \le j \le s \le M - 1$ (where $U^{(j,s)}$ acts on $|s\rangle$ and $|s+1\rangle$), satisfying the condition
$U^{(M-1,M-1)} \bigl(U^{(M-2,M-2)} U^{(M-2,M-1)}\bigr) \cdots \bigl(U^{(1,1)} \cdots U^{(1,M-1)}\bigr) U^{-1} = I.$
This proof is constructive, i.e., it provides an algorithm for finding the matrices $U^{(j,s)}$. The running time of this algorithm depends on $M$ and another parameter $\delta$, the precision of arithmetic operations with real numbers. Specifically, the algorithm complexity is $O(M^3) \cdot \mathrm{poly}(\log(1/\delta))$. □
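The algorithm implicit in this proof can be sketched in floating-point arithmetic. The version below is an illustration under assumptions, not the book's procedure verbatim: indices are 0-based, the particular $2 \times 2$ matrix $V$ is one convenient choice, the residual phase on the last diagonal entry is folded into the final factor, and precision control (the parameter $\delta$) is ignored. All function names are illustrative.

```python
import cmath
import math


def apply_rows(A, s, V):
    """Left-multiply A in place by the matrix that acts as V on rows s, s+1."""
    for j in range(len(A[0])):
        a, b = A[s][j], A[s + 1][j]
        A[s][j] = V[0][0] * a + V[0][1] * b
        A[s + 1][j] = V[1][0] * a + V[1][1] * b


def two_level_factors(U):
    """Return [(s, V), ...] in order of application, so that U = G_T ... G_1,
    where G_t acts as the 2x2 unitary V on basis vectors s, s+1 (0-based)."""
    M = len(U)
    # Start from A = U^{-1}, the conjugate transpose of the unitary U.
    A = [[complex(U[j][i]).conjugate() for j in range(M)] for i in range(M)]
    factors = []
    for col in range(M - 1):
        for s in range(M - 2, col - 1, -1):      # clear the column bottom-up
            c1, c2 = A[s][col], A[s + 1][col]
            r = math.sqrt(abs(c1) ** 2 + abs(c2) ** 2)
            if r < 1e-15:
                V = [[1, 0], [0, 1]]
            else:
                # V maps (c1, c2) to (r, 0).
                V = [[c1.conjugate() / r, c2.conjugate() / r],
                     [-c2 / r, c1 / r]]
            apply_rows(A, s, V)
            factors.append((s, V))
    # A is now diag(1, ..., 1, phase); fold the phase into the last factor.
    phase = A[M - 1][M - 1]
    s, V = factors[-1]
    factors[-1] = (s, [V[0], [x / phase for x in V[1]]])
    return factors


def product(factors, M):
    """Multiply the factors back together: G_T ... G_1."""
    P = [[1 if i == j else 0 for j in range(M)] for i in range(M)]
    for s, V in factors:
        apply_rows(P, s, V)
    return P


# Example: the 3x3 discrete Fourier transform matrix.
M = 3
w = cmath.exp(2j * cmath.pi / M)
U = [[w ** (j * k) / math.sqrt(M) for k in range(M)] for j in range(M)]
factors = two_level_factors(U)
```

Since the factors reduce $U^{-1}$ to the identity, their product reproduces $U$, and `product(factors, M)` can be compared with `U` entry by entry.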
Problems
[2!] 8.1. Prove that any operator $U \in \mathbf{U}(\mathcal{B})$ can be realized (without ancillas) by a constant size circuit over the basis $\{\Lambda(e^{i\varphi}) : \varphi \in \mathbb{R}\} \cup \{H\}$.
[1!] 8.2. Prove that any operator of the form $\Lambda(U)$, $U \in \mathbf{U}(\mathcal{B})$, can be realized (without ancillas) by a constant size circuit over the basis of one-qubit gates and the gate $\Lambda(\sigma^x)$.
(Therefore, this basis allows the realization of an arbitrary unitary operator. Indeed, in the proof of Theorem 8.1 we only use gates of the form $\Lambda(U)$ and one-qubit gates.)
[2!] 8.3. Suppose that a basis $\mathcal{A}$ is closed under inversion and allows the realization of any one-qubit operator up to a phase factor (e.g., $\mathcal{A} = \mathbf{SU}(2)$). Prove that the multiplication by a phase factor can be realized over $\mathcal{A}$ using one ancilla.
[2!] 8.4. Suppose that a unitary operator $U\colon \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$ satisfies the condition $U|0\rangle = |0\rangle$. Construct a circuit of size $6n + 1$ realizing $\Lambda(U)$ over the basis $\{U, \Lambda^2(\sigma^x)\}$, using ancillas. (The gate $U$ should be applied only once.)
[3] 8.5. Realize the operator $\Lambda^k(\sigma^x)$ ($k \ge 2$) by a circuit of size $O(k)$ consisting of Toffoli gates. It is allowed to use a constant number of ancillas in such a way that their initial state does not matter, i.e., the circuit should actually realize the operator $W = \Lambda^k(\sigma^x) \otimes I_{\mathcal{B}^{\otimes r}}$, where $r = \mathrm{const}$.
[2] 8.6. Realize the operator $\Lambda^k(U)$ by a circuit of size $O(k^2)$ over the basis of all two-qubit gates. The use of ancillas is not allowed.
8.2. Approximate realization. We now pass to finite bases. In this case it is only possible to obtain an approximate representation of operators as products of basis elements. In order to define the approximate realization, we need a norm on the operator space.
On the space of vectors there is the Euclidean norm $\| |\xi\rangle \| = \sqrt{\langle \xi | \xi \rangle}$. By the definition of a norm, it satisfies the following conditions:
(8.7) $\| |\xi\rangle \| = 0$ if $|\xi\rangle = 0$, and $\| |\xi\rangle \| > 0$ if $|\xi\rangle \ne 0$;
(8.8) $\| |\xi\rangle + |\eta\rangle \| \le \| |\xi\rangle \| + \| |\eta\rangle \|$;
(8.9) $\| c|\xi\rangle \| = |c| \, \| |\xi\rangle \|$.
We now introduce a norm on the space of operators $\mathbf{L}(\mathcal{N})$.
Definition 8.2. The norm of an operator $X$ (the so-called operator norm; in general, there are others) is
$\|X\| = \sup_{|\xi\rangle \ne 0} \frac{\| X|\xi\rangle \|}{\| |\xi\rangle \|}.$
We note that $\|X\|^2$ is the largest eigenvalue of the operator $X^\dagger X$.
This norm possesses all the properties of norms indicated above and, beyond these, several special properties:
(8.10) $\|XY\| \le \|X\| \, \|Y\|$;
(8.11) $\|X^\dagger\| = \|X\|$;
(8.12) $\|X \otimes Y\| = \|X\| \, \|Y\|$;
(8.13) $\|U\| = 1$ if $U$ is unitary.
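For a $2 \times 2$ matrix the operator norm can be computed directly from the eigenvalue characterization, which gives a quick way to spot-check properties (8.10), (8.11) and (8.13) on examples. The sketch below uses the closed-form largest eigenvalue of a $2 \times 2$ Hermitian matrix; the matrices `A` and `B` are arbitrary examples, and the helper names are illustrative.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]


def op_norm(X):
    """Operator norm of a 2x2 matrix: sqrt of the largest eigenvalue of X^dagger X."""
    G = matmul(dagger(X), X)              # Hermitian, positive semidefinite
    p, r = G[0][0].real, G[1][1].real
    q = abs(G[0][1])
    largest = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q ** 2)
    return math.sqrt(largest)


s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]      # unitary, so its norm should be 1
A = [[1, 2], [0, 1]]       # a non-unitary example
B = [[0, 1j], [2, 1]]
```

Here `op_norm(H)` illustrates (8.13), comparing `op_norm(matmul(A, B))` with `op_norm(A) * op_norm(B)` illustrates (8.10), and comparing `op_norm(dagger(A))` with `op_norm(A)` illustrates (8.11).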
Now we give the definition of approximate realization. If the operator in question is $U$, then its approximate realization will be denoted by $\tilde{U}$.
Definition 8.3. The operator $\tilde{U}$ approximates the operator $U$ with precision $\delta$ if
(8.14) $\|\tilde{U} - U\| \le \delta.$
This definition has two noteworthy corollaries. First, if $\tilde{U}$ approximates $U$ with precision $\delta$, then $\tilde{U}^{-1}$ approximates $U^{-1}$ with the same precision $\delta$. Indeed, if we multiply the expression $\tilde{U} - U$ by $\tilde{U}^{-1}$ on the left and by $U^{-1}$ on the right, the norm does not increase (due to the properties (8.10) and (8.13)). Thus we obtain a corollary of the inequality (8.14): $\|U^{-1} - \tilde{U}^{-1}\| \le \delta$.
The second property is as follows. Consider the product of several operators, $U = U_L \cdots U_2 U_1$. If each $U_k$ has an approximation $\tilde{U}_k$ with precision $\delta_k$, then the product of these approximations, $\tilde{U} = \tilde{U}_L \cdots \tilde{U}_2 \tilde{U}_1$, approximates $U$ with precision $\sum_k \delta_k$ (i.e., errors accumulate linearly):
$\bigl\| \tilde{U}_L \cdots \tilde{U}_2 \tilde{U}_1 - U_L \cdots U_2 U_1 \bigr\| \le \sum_j \delta_j.$
It suffices to look at the example with two operators:
$\|\tilde{U}_2 \tilde{U}_1 - U_2 U_1\| = \|\tilde{U}_2(\tilde{U}_1 - U_1) + (\tilde{U}_2 - U_2)U_1\| \le \|\tilde{U}_2(\tilde{U}_1 - U_1)\| + \|(\tilde{U}_2 - U_2)U_1\|$
$\le \|\tilde{U}_2\| \, \|\tilde{U}_1 - U_1\| + \|\tilde{U}_2 - U_2\| \, \|U_1\| = \|\tilde{U}_1 - U_1\| + \|\tilde{U}_2 - U_2\|.$
Note that we have used the fact that the norm of a unitary operator is 1. (With nonunitary operators, the approximation errors could accumulate much faster, e.g., exponentially.)
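Linear accumulation of errors can be observed on a small example. The sketch below multiplies five one-qubit unitaries and their slightly perturbed versions and compares the error of the product with the sum of the individual errors. The gate sequence and the perturbation sizes are arbitrary choices made for illustration, and the $2 \times 2$ operator norm is computed from the largest eigenvalue of $X^\dagger X$.

```python
import cmath
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]


def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]


def op_norm(X):
    """Operator norm of a 2x2 matrix via the largest eigenvalue of X^dagger X."""
    G = matmul(dagger(X), X)
    p, r = G[0][0].real, G[1][1].real
    q = abs(G[0][1])
    return math.sqrt((p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q ** 2))


def rot(t):
    """Real rotation by the angle t."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


def phase(t):
    """Diagonal phase gate diag(1, e^{it})."""
    return [[1, 0], [0, cmath.exp(1j * t)]]


def product(gates):
    """U_L ... U_2 U_1 for gates = [U_1, ..., U_L]."""
    P = [[1, 0], [0, 1]]
    for g in gates:
        P = matmul(g, P)
    return P


exact = [rot(0.3), phase(1.1), rot(-0.7), phase(0.4), rot(1.9)]
approx = [rot(0.3 + 1e-3), phase(1.1 - 2e-3), rot(-0.7 + 1e-3),
          phase(0.4 + 3e-3), rot(1.9 - 1e-3)]

deltas = [op_norm(sub(a, e)) for a, e in zip(approx, exact)]
total_error = op_norm(sub(product(approx), product(exact)))
```

By the argument above, `total_error` cannot exceed `sum(deltas)`, up to rounding.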
Remark 8.1. Every model that aims at solving computational problems by real physical processes has to be scrutinized for stability to approximation errors. (In real life the parameters of any physical process can be given only with certain precision.) In particular, computation with exponential error accumulation is almost definitely useless from the practical point of view.
Now we generalize Definition 8.3 to allow the use of ancillas.
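Definition 8.4. The operator $U\colon \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$ is approximated by the operator $\tilde{U}\colon \mathcal{B}^{\otimes N} \to \mathcal{B}^{\otimes N}$ with precision $\delta$ using ancillas if, for arbitrary $|\xi\rangle$ in $\mathcal{B}^{\otimes n}$, the inequality
(8.15) $\bigl\| \tilde{U}\bigl(|\xi\rangle \otimes |0^{N-n}\rangle\bigr) - U|\xi\rangle \otimes |0^{N-n}\rangle \bigr\| \le \delta \, \| |\xi\rangle \|$
is satisfied.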
We can formulate this definition in another way. Let us introduce a linear map $V\colon \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes N}$ which acts by the rule $V\colon |\xi\rangle \mapsto |\xi\rangle \otimes |0^{N-n}\rangle$. The map $V$ is not unitary, but isometric, i.e., $V^\dagger V = I_{\mathcal{B}^{\otimes n}}$. The condition from the last definition may be written as
(8.16) $\|\tilde{U} V - V U\| \le \delta.$
The basic properties of approximation remain true for approximation using ancillas (which, of course, should be verified; see Problem 8.8).
What bases allow the realization of an arbitrary unitary operator with arbitrary precision? What is the size of the circuit that is needed to achieve a given precision $\delta$? How does one construct this circuit efficiently? Unfortunately, we cannot give a universal answer to these questions. In constructing quantum algorithms, we will use the following (widely adopted) standard basis.
Definition 8.5. The basis $\mathcal{Q} = \{H, K, K^{-1}, \Lambda(\sigma^x), \Lambda^2(\sigma^x)\}$, where
$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad K = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix},$
is called standard.
Theorem 8.3. Any unitary operator $U$ on a fixed number of qubits can be realized with precision $\delta$ by a $\mathrm{poly}(\log(1/\delta))$-size, $\mathrm{poly}(\log\log(1/\delta))$-depth circuit over the standard basis, using ancillas. There is a polynomial algorithm that constructs this circuit from the description of $U$.
This theorem will be proved and generalized in Section 13; see Theorem 13.5 on page 134 for a sharper result. The proof is based on a so-called phase estimation procedure (cf. Problem 13.4, quantum Fourier transform).
As far as general bases are concerned, we will use the following definition.
Definition 8.6. Let $\mathcal{A}$ be a gate set that is closed under inversion. We call $\mathcal{A}$ a complete basis (or a universal gate set) if the applications of its elements generate a dense subgroup in the group $\mathbf{U}(\mathcal{B}^{\otimes k})/\mathbf{U}(1)$ for some $k \ge 2$. (Here $\mathbf{U}(1)$ corresponds to multiplication by phase factors.)
(The phase factors are unimportant from the physical point of view, as well as for the definition of quantum computation that will be given in Section 9. If we really need to realize phase shifts, we can use the result of Problem 8.3.)
Remark 8.2. Why don't we accept ancillas in the definition of a complete basis? Indeed, it seems more natural to call a basis $\mathcal{A}$ complete if any unitary operator $U$ can be realized with an arbitrary precision $\delta$ by a quantum circuit over this basis, using ancillas. Unfortunately, with this definition it is not clear how to estimate the size of the circuit in question. By contrast, Definition 8.6 provides a rather general way of obtaining such an estimate; see Section 8.3. It is not known whether the two definitions of a complete basis are equivalent.
Remark 8.3. There is yet another definition of a complete basis, which is based on an even more general notion of realization of a unitary operator than the realization using ancillas. A basis is called complete if it can effect an arbitrary unitary operator on "encoded qubits" with any given precision (see Section 15 for exposition of quantum codes). The idea is that the quantum state of each qubit is represented by a state of several qubits; it is even permitted to have multiple representations of the same state.$^4$ This situation is characterized by an isometric map $V\colon \mathcal{B} \otimes \mathcal{F} \to \mathcal{B}^{\otimes k}$, in which case we say that a single logical qubit is represented by $k$ physical qubits (the space $\mathcal{F}$ corresponds to the nonuniqueness of the representation). The gates of the basis act on physical qubits, whereas the operator we want to realize acts on logical qubits.
In such a general model, it is again possible to estimate the size of the circuit that is needed to achieve the given precision $\delta$. Moreover, the gates of the basis can be specified with a constant precision $\delta_0$, yet arbitrarily accurate realization is possible. This fundamental result is called the threshold theorem for fault-tolerant quantum computation [65, 42, 3, 36].
Theorem 8.4 (cf. [36]). The standard basis $\mathcal{Q}$ is complete.
Note that this theorem does not follow from Theorem 8.3. The proof of the theorem is contained in the solutions to Problems 8.10-8.12.
Remark 8.4. If we remove the Toffoli gate from the basis $\mathcal{Q}$, it ceases to be complete. However, many important computations can be done even with such a reduced basis. In particular, as will be evident later, error-correcting circuits for quantum codes can be realized without the Toffoli gate.
$^4$ Traditionally, a quantum code is defined as a single preferred representation, whereas the other representations are regarded as the preferred one, subjected to a "correctable error". Whatever the terminology, multiple representations allow us to perform computation with inaccurate gates. Such gates introduce "errors", or uncertainty in the resulting state, but one can arrange that it is only the choice of representation that is uncertain, the encoded state remaining intact.
Problems
[1] 8.7. Prove the properties (8.10)-(8.13) of the operator norm.
[1] 8.8. Prove the two basic properties of approximation with ancillas:
a) If $\tilde{U}$ approximates $U$ with precision $\delta$, then $\tilde{U}^{-1}$ approximates $U^{-1}$ with the same precision $\delta$.
b) If unitary operators $\tilde{U}_k$ approximate unitary operators $U_k$ ($1 \le k \le L$) with precision $\delta_k$, then $\tilde{U}_L \cdots \tilde{U}_1$ approximates $U_L \cdots U_1$ with precision $\sum_k \delta_k$.
[3] 8.9. Suppose that a unitary operator $\tilde{U}$ approximates a unitary operator $U$ with precision $\delta$, using ancillas. Prove that there exists an operator $W$ that realizes $U$ precisely (i.e., the equality $W\bigl(|\xi\rangle \otimes |0^{N-n}\rangle\bigr) = (U|\xi\rangle) \otimes |0^{N-n}\rangle$ holds) and satisfies $\|W - \tilde{U}\| \le O(\delta)$.
[2!] 8.10. Suppose $X$ and $Y$ are noncommuting elements of the group $\mathbf{SO}(3)$ that rotate by angles incommensurate with $\pi$. Prove that the group generated by $X$ and $Y$ is an everywhere dense subset of $\mathbf{SO}(3)$.
[3!] 8.11. Let $\mathcal{M}$ be a Hilbert space of finite dimension $M \ge 3$. Consider the subgroup $H \subset \mathbf{U}(\mathcal{M})$, the stabilizer of the 1-dimensional subspace generated by some unit vector $|\xi\rangle \in \mathcal{M}$. Let $V$ be an arbitrary unitary operator not fixing the subspace $\mathbb{C}(|\xi\rangle)$. Prove that the set of operators $H \cup V^{-1}HV$ generates the whole group $\mathbf{U}(\mathcal{M})$.
(Note that under the conditions of this problem $\mathbf{U}(\mathcal{M})$ and $H$ may be factored by the subgroup of phase shifts $\mathbf{U}(1)$.)
[3!] 8.12. Prove that the applications of the operators from the standard basis generate an everywhere dense subset of $\mathbf{U}(\mathcal{B}^{\otimes 3})/\mathbf{U}(1)$.
[2] 8.13. Let $R = -i \exp(\pi i \alpha \sigma^x)$, $\alpha$ irrational. Prove that the negation $\sigma^x$, the Deutsch gate $\Lambda^2(R)$ and its inverse$^5$ form a complete basis.
$^5$ The inverse of the Deutsch gate is not really necessary; it is included solely to conform to Definition 8.6.
8.3. Efficient approximation over a complete basis. How can one estimate the complexity of realizing a unitary operator $U$ over a complete basis $\mathcal{A}$ with a given precision $\delta$? How does one construct the corresponding circuit efficiently? These questions arise if we want to simulate circuits over another basis $\mathcal{C}$ by circuits over $\mathcal{A}$. We would like to prove that such simulation does not increase the size of the circuit too much. In this regard, we may assume that $U \in \mathcal{C}$ is fixed, while $\delta$ tends to zero.
Let $U\colon \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$ be an arbitrary unitary operator. It can be represented by a matrix with complex entries, where each entry is a pair of real numbers, and each number is an infinite sequence of binary digits. We set the question of computing these digits aside. Instead, we assume that each matrix entry is specified with a suitable precision, namely, $\delta/2^{n+1}$. In this case the overall error in $U$, measured by the operator norm, does not exceed $\delta/2$. (Taking this input error into account, the algorithm itself should work with precision $\delta/2$, but we will rather ignore such details.)
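The bound $\delta/2$ on the overall error can be checked in one line. This is a sketch under the assumption that "precision" bounds the modulus of the error in each complex entry; $E$ (a name introduced here) denotes the $2^n \times 2^n$ matrix of entry errors, and $\|E\|_F$ is its Frobenius norm, which dominates the operator norm.

```latex
\|E\| \;\le\; \|E\|_F
  = \Bigl(\sum_{j,k=1}^{2^n} |E_{jk}|^2\Bigr)^{1/2}
  \;\le\; 2^n \max_{j,k} |E_{jk}|
  \;\le\; 2^n \cdot \frac{\delta}{2^{n+1}}
  = \frac{\delta}{2}.
```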
The problem can be divided into two parts. First, we realize $U$ over the infinite basis $\mathcal{A}_0$ that consists of all one-qubit and two-qubit gates. Second, we approximate each gate $V$ of the resulting circuit $C_0$ by a circuit $C$ over the basis $\mathcal{A}$. The first part is mostly done in the proof of Theorem 8.1; we just need to add some details. By examining the proof, we find that the circuit $C_0$ has size $L_0 = \exp(O(n))$. If we want to represent $U$ with precision $\delta$, we need to compute all gates of the circuit with precision $\delta' = \delta/L_0 = \exp(-O(n))\,\delta$, which amounts to computing the entries of the corresponding matrices with precision $\delta'' = \delta'/2^n = \exp(-O(n))\,\delta$. This can be done in time $T = \exp(O(n)) \cdot \mathrm{poly}(\log(1/\delta))$. The presence of the exponential factor should not bother us, since in the practical application $U$ is fixed, and so is $n$. Thus the first part is finished and we proceed to the second part.
8.3.1. Initial (nonconstructive) stage. Let us consider the problem of approximating an element $V \in \mathbf{U}(\mathcal{B}^{\otimes 2}) \subseteq \mathbf{U}(\mathcal{B}^{\otimes k})$ by a circuit $C$ over the basis $\mathcal{A}$ with precision $\delta$ (the number $k$ came from Definition 8.6). We are looking for approximations up to a phase factor; therefore we may assume that $V \in \mathbf{SU}(M)$, $\mathcal{A} \subset \mathbf{SU}(M)$, where $M = 2^k$. Then the circuit $C$ is simply a sequence of elements $U_1, \dots, U_L \in \mathcal{A}$ such that $\|V - U_L \cdots U_1\| \le \delta$. Definition 8.6 guarantees that such a sequence exists, but its minimum length $L$ can be arbitrarily large (e.g., in the case where the elements of $\mathcal{A}$ are very close to the identity). So, before we can construct $C$ in an effective fashion, some initial setup is required. We will now describe it briefly. In this description, we refer to some new concepts and facts that will be explained later.
First, we generate sufficiently many products of elements of $\mathcal{A}$ so that they form an $\varepsilon$-net, where $\varepsilon$ is a suitable constant independent of $M$. This may take an arbitrarily long time. The net may come out too crowded, but we can make it "$\alpha$-sparse" ($\alpha = \mathrm{const}$) by removing redundant points. Such a net has at most $\exp(O(M^2))$ elements; see Problem 8.14. (This bound is tight. For sufficiently small $\varepsilon$, any $\varepsilon$-net has at least $\exp(\Omega(M^2))$ elements; this is due to the fact that $\mathbf{SU}(M)$ is a manifold of dimension $M^2 - 1$.) Then we consider the net as a new basis. It is possible to obtain an upper bound for the approximation complexity relative to this basis, but not to the original one. As a part of our main theorem we will prove that any $\varepsilon$-net in $\mathbf{SU}(M)$ generates a dense subgroup, provided $\varepsilon$ is small enough.$^6$
This is the moment to recall some basic concepts from geometry. A distance function on a set $S$ is a function $d\colon S \times S \to \mathbb{R}$ such that (i) $d(x, x) = 0$, (ii) $d(x, y) > 0$ if $x \ne y$, (iii) $d(x, y) = d(y, x)$, and (iv) $d(x, z) \le d(x, y) + d(y, z)$. A $\delta$-net for $R \subseteq S$ is a set $\Gamma \subseteq S$ whose $\delta$-neighborhood contains $R$, i.e., for any $x \in R$ there is a $y \in \Gamma$ such that $d(x, y) \le \delta$. We say that $\Gamma$ has no extra points if any point of $\Gamma$ belongs to the $\delta$-neighborhood of $R$. The net $\Gamma$ is called $\alpha$-sparse ($0 < \alpha < 1$) if it has no extra points, and the distance between any two distinct points of $\Gamma$ is greater than $\alpha\delta$.
The group $\mathbf{SU}(M)$ is equipped with the distance given by the operator norm, $d(U, V) = \|U - V\|$. Note that the diameter of $\mathbf{SU}(M)$ (the maximum possible distance) is 2. Let $r > \delta > 0$. Then an $(r, \delta)$-net in $\mathbf{SU}(M)$ is a $\delta$-net for the $r$-neighborhood of the identity; this neighborhood is denoted by $S_r$. The ratio $q = r/\delta > 1$, called quality, will play an important role in our arguments.
[2!] Problem 8.14 (sparse nets).
a) (removing redundant points). Let $\Gamma$ be a $\delta$-net for $R \subseteq S$. Prove that there is a subset $\Gamma_\alpha \subseteq \Gamma$ which is an $\alpha$-sparse $\delta/(1 - \alpha)$-net for $R$.
b) (few points are left). Prove that any $\alpha$-sparse $(r, r/q)$-net in $\mathbf{SU}(M)$ has at most $(q/\alpha)^{O(M^2)}$ elements.
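One natural way to remove redundant points is a greedy selection, sketched below for finite sets of points on the real line with $d(x, y) = |x - y|$. This is an illustration of the definitions under those simplifying assumptions, not a proof, and the function names are made up for the example.

```python
def is_net(net, region, delta):
    """True if every point of `region` is within `delta` of some point of `net`."""
    return all(any(abs(x - y) <= delta for y in net) for x in region)


def sparsify(net, region, delta, alpha):
    """Greedily pick a subset of `net` whose points are more than
    alpha * new_delta apart, where new_delta = delta / (1 - alpha).
    Returns (subset, new_delta)."""
    new_delta = delta / (1 - alpha)
    # Drop extra points: keep only points within delta of the region.
    useful = [y for y in net if any(abs(x - y) <= delta for x in region)]
    chosen = []
    for y in useful:
        if all(abs(y - c) > alpha * new_delta for c in chosen):
            chosen.append(y)
    return chosen, new_delta


region = [i / 100 for i in range(101)]           # sample of the segment [0, 1]
net = [i / 1000 for i in range(-500, 1501)]      # a very crowded net
chosen, new_delta = sparsify(net, region, delta=0.05, alpha=0.5)
```

Every discarded useful point lies within $\alpha\delta/(1-\alpha)$ of a chosen one, so a point of the region that was within $\delta$ of the original net is within $\delta + \alpha\delta/(1-\alpha) = \delta/(1-\alpha)$ of the chosen subset.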
8.3.2. Main theorem. Let $\mathcal{A} \subset \mathbf{SU}(M)$ be a finite subset which is closed under inversion. The elements of $\mathcal{A}$ will be called generators. They can be treated in two ways: as elements of $\mathbf{SU}(M)$ (represented by matrices with a suitable precision), or as abstract symbols, i.e., references to the initially specified elements. A circuit is a sequence of such symbols, i.e., a word in the alphabet "$\mathcal{A}$". (We indicate the abstract use of generators by quotation marks, e.g., "$U$" $\in$ "$\mathcal{A}$".) The words constitute the free group $F[$"$\mathcal{A}$"$]$. The product of two words is obtained by concatenation, whereas the inverse of "$U_1$" $\cdots$ "$U_n$" is "$U_n^{-1}$" $\cdots$ "$U_1^{-1}$".
$^6$ A more general statement is true: there is a universal constant $\varepsilon' > 0$ such that any $\varepsilon'$-net in any compact semisimple Lie group generates a dense subgroup, where the distance is measured by the operator norm for the adjoint action [27].
Theorem 8.5. There exists a universal constant $\varepsilon > 0$ such that for any $\alpha > 0$ the following is true:
1. For any $M$, an $\varepsilon$-net $\mathcal{A} \subset \mathbf{SU}(M)$ (closed under inversion), a number $\delta > 0$, and an element $V \in \mathbf{SU}(M)$, there is a sequence of generators $U_1, \dots, U_L \in \mathcal{A}$, $L = O\bigl((\log(1/\delta))^{3+\alpha}\bigr)$, such that $\|V - U_L \cdots U_1\| \le \delta$.
2. Assuming that the net $\mathcal{A}$ is not too redundant, $|\mathcal{A}| = \exp(O(M^2))$, the function $(M, \delta, \mathcal{A}, V) \mapsto$ "$U_L$" $\cdots$ "$U_1$" can be computed by an algorithm with running time $T = \exp(O(M^2))\,(\log(1/\delta))^3 + O(L)$.
Corollary 8.5.1. Let $\mathcal{A}$ be a complete basis, and $\mathcal{C}$ an arbitrary finite basis. Then any circuit $C$ of size $L$ and depth $d$ over the basis $\mathcal{C}$ can be simulated by a circuit $C'$ of size $L' = O\bigl(L\,(\log(L/\delta))^c\bigr)$ and depth $d' = O\bigl(d\,(\log(L/\delta))^c\bigr)$ over the basis $\mathcal{A}$. (Here $c = 3 + \alpha$, whereas $\delta$ denotes the simulation precision: $C$ realizes a unitary operator $U$, $C'$ realizes $U'$, and $\|U - U'\| \le \delta$.)
The corollary is obvious: each gate of $C$ should be approximated with precision $\delta/L$. The simulation is very efficient in terms of size, but it is not so good in terms of depth. In a common situation $d \sim (\log L)^k$, $k \ge 1$, so that $d' = O(d^{1+c/k})$ (assuming that $\delta = \mathrm{const}$).
Remark 8.5. The upper bound on the number of generators $L$ in Theorem 8.5 can be improved if we drop the second condition (the existence of an efficient algorithm). Let $\mathcal{A} \subset \mathbf{SU}(M)$ be an arbitrary subset that is closed under inversion and generates a dense subgroup in $\mathbf{SU}(M)$. Then any element $U \in \mathbf{SU}(M)$ can be approximated with precision $\delta$ by a product of $L \le C(\mathcal{A}) \log(1/\delta)$ generators [31]. On the other hand, the lower bound $L \ge \Omega(M^2) \log(1/\delta) / \log|\mathcal{A}|$ follows from a volume consideration.
Remark 8.6. The presence of the exponential $\exp(O(M^2))$ in the algorithm complexity bound is rather disturbing (recall that $M = 2^k$, where $k$ is the number of qubits). As far as the asymptotic behavior at $\delta \to 0$ is concerned, it seems possible to make the computation polynomial in $M$, that is, the exponential may become an additive term rather than a factor. (To this end, one may try to use bases in the tangent space instead of nets; the reader is welcome to explore this idea.) However, it is a challenge to eliminate the exponential altogether. This may be only possible if one changes the assumptions of the theorem, e.g., by saying that products of $\mathrm{poly}(M)$ elements from $\mathcal{A}$ constitute an $\varepsilon$-net (rather than $\mathcal{A}$ being an $\varepsilon$-net itself). Such a basis $\mathcal{A}$ can consist of only $\mathrm{poly}(M)$ elements, so it is reasonable to ask whether there is an approximation algorithm with running time $\mathrm{poly}(M \log(1/\delta))$. This appears to be a difficult question in global unitary geometry.
8.3.3. Idea of the proof and geometric lemmas. The proof of Theorem 8.5 is based on four geometric properties of the group $\mathbf{SU}(M)$ endowed with the operator norm distance $d$. First, the distance is biinvariant, i.e., $d(WU, WV) = d(U, V) = d(UW, VW)$. The second property is the result of Problem 8.14b; this is the only place where the number $M$ comes in. The remaining two properties are related to the group commutator, $[[U, V]] = UVU^{-1}V^{-1}$. Let us consider the application of the group multiplication and the group commutator to a pair of subsets $A, B \subseteq \mathbf{SU}(M)$,
$AB = \{UV : U \in A,\ V \in B\}, \qquad [[A, B]] = \{[[U, V]] : U \in A,\ V \in B\}.$
Then the following inclusions hold:
$$[\![S_a, S_b]\!] \subseteq S_{2ab}, \tag{8.17}$$
$$S_{ab/4} \subseteq [\![S_a, S_b]\!]\, S_{O(ab(a+b))}. \tag{8.18}$$
(Recall that $S_r$ denotes the $r$-neighborhood of the identity. The right-hand side of the last inclusion represents the $O(ab(a+b))$-neighborhood of $[\![S_a, S_b]\!]$. Indeed, the $r$-neighborhood of any set $T$ can be expressed as $TS_r$.)
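The inclusion (8.17) is easy to prove:
$$\bigl\|[\![U, V]\!] - I\bigr\| = \|UV - VU\| = \bigl\|(U - I)(V - I) - (V - I)(U - I)\bigr\| \le 2\,\|U - I\|\,\|V - I\|.$$
(The first equality holds because $[\![U, V]\!] - I = (UV - VU)\,U^{-1}V^{-1}$ and the operator norm does not change under multiplication by a unitary operator.)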
Formula (8.18) can be proved by approximating the group commutator by the corresponding Lie algebra bracket.$^7$ In our case, the Lie algebra is formed by skew-Hermitian $M \times M$ matrices with zero trace,
$$\mathfrak{su}(M) = \{X : X^\dagger = -X,\ \operatorname{Tr} X = 0\}, \qquad [X, Y] = XY - YX.$$
The exponential map $\exp : \mathfrak{su}(M) \to SU(M)$ is simply the matrix exponentiation.
The eigenvalues of $X \in \mathfrak{su}(M)$ have the form
$$\mathrm{eigenvalues}(X) = \{ix_1, \dots, ix_M\}, \qquad \sum_{k=1}^{M} x_k = 0, \quad x_k \in \mathbb{R}.$$
Using a basis in which $X$ is diagonal, one can see that if $\|X\| = t \le \pi$ then $\|\exp(X) - I\| = 2\sin(t/2)$. Therefore $\|\exp(X) - I\| \approx \|X\|$ if either of these numbers is small. Let $R_t$ denote the $t$-neighborhood of $0$ in $\mathfrak{su}(M)$ (with respect to the operator norm). The map $\exp : R_t \to S_{2\sin(t/2)}$ is bijective for $t < \pi$. So we may represent group elements near the identity as $\exp(X)$ and try to replace $[\![\exp(X), \exp(Y)]\!]$ by $\exp([X, Y])$; see inequality (8.21) below. Thus the inclusion (8.18) can be obtained from the result of the following problem.
[3!] Problem 8.15. Prove that $R_{ab/4} \subseteq [R_a, R_b] \subseteq R_{2ab}$.
(Note that for $\mathfrak{su}(2)$, $[R_a, R_b] = R_{2ab}$; this follows from the standard representation of the bracket in $\mathfrak{su}(2) \cong \mathfrak{so}(3)$ as the vector product in $\mathbb{R}^3$.)
$^7$ We assume that the reader knows some basic facts about Lie groups and Lie algebras, which can be found in the first chapter of any textbook (see e.g. [1, 15, 34, 56]).
[2!] Problem 8.16. Prove that for any $X, Y \in \mathfrak{su}(M)$
$$\|\exp(X) - I - X\| \le O\bigl(\|X\|^2\bigr), \tag{8.19}$$
$$\|\exp(X)\exp(Y) - \exp(X + Y)\| \le O\bigl(\|X\|\,\|Y\|\bigr), \tag{8.20}$$
$$\bigl\|[\![\exp(X), \exp(Y)]\!] - \exp([X, Y])\bigr\| \le O\bigl(\|X\|\,\|Y\|\,(\|X\| + \|Y\|)\bigr). \tag{8.21}$$
(The implicit constants in $O(\dots)$ should not depend on $M$.)
How does one use the above four properties in an effective procedure? We are going to define three important operations with nets in $SU(M)$ (in addition to removing redundant points; see Problem 8.14a). Two operations, called "shrinking" and "telescoping", are used to build increasingly tight nets $\Gamma_0, \Gamma_1, \dots, \Gamma_n$, $n = O(\log(1/\delta))$, in increasingly small neighborhoods of the identity. Specifically, each $\Gamma_j$ is an $(r_j, r_j/q)$-net, where $r_j = r_0\lambda^{-j}$, $q > \lambda > 1$ ($q$ and $\lambda$ are some constants). Elements of $\Gamma_j$ are products of $L_j = O(j^{2+\nu})$ generators. Then we use the constructed nets to approximate an arbitrary element $V \in SU(M)$ in a procedure called "zooming in" (think of the nets as magnifying glasses of different strength).
"Shrinking" is the operation that employs the group commutator. It does what its name suggests, namely makes smaller nets from bigger ones. An $(r, r/q)$-net shrinks to an $(r^2/4,\ 5r^2/q)$-net. Suppose that elements of the original net were products of $l$ generators. Taking the commutator multiplies $l$ by 4, whereas the radius $r$ gets approximately squared, and so does the precision $\delta = r/q$ (we assume that $q$ is bounded by a constant). Repetition of this procedure could yield the desired rate of compression, and even better, $\delta \sim r \sim \exp(-l^{1/2})$. However, the quality of the net degrades at each step, $q \mapsto q/20$. The "telescoping" comes to the rescue, but at some cost. Also, we need to select a sparse subnet after each shrinking to keep the number of points in each net bounded by $\exp(O(M^2))$. The resulting rate of compression is slightly lower, $\delta \sim r \sim \exp(-l^{1/(2+\nu)})$, where $\nu$ can be arbitrarily small. Therefore $l = O\bigl((\log(1/\delta))^{2+\nu}\bigr)$.
[2!] Problem 8.17. In items (b) and (c) below, assume that $G$ is an arbitrary group with a biinvariant distance function $d$. The result of (a) is specific to $SU(M)$.
(a) ("shrinking"). Let $\Gamma_1 \subseteq SU(M)$ be an $(r_1, r_1/q)$-net, and $\Gamma_2 \subseteq SU(M)$ an $(r_2, r_2/q)$-net. Denote by $[\![\Gamma_1, \Gamma_2]\!]_\alpha$ an $\alpha$-sparse subnet selected from $[\![\Gamma_1, \Gamma_2]\!] = \{[\![U_1, U_2]\!] : U_1 \in \Gamma_1,\ U_2 \in \Gamma_2\}$ (see Problem 8.14a). Prove that $[\![\Gamma_1, \Gamma_2]\!]_{1/6}$ is an $(r_1r_2/4,\ 5r_1r_2/q)$-net provided $q > 20$ and $r_1, r_2 \le O(q^{-1})$.
(b) ("telescoping": combining two nets into one of higher quality). Let $\Gamma_1 \subseteq G$ be an $(r_1, \delta_1)$-net, and $\Gamma_2 \subseteq G$ an $(r_2, \delta_2)$-net, where $\delta_1 \le r_2$. Prove that the set $\Gamma_1\Gamma_2 = \{U_1U_2 : U_1 \in \Gamma_1,\ U_2 \in \Gamma_2\}$ is an $(r_1, \delta_2)$-net.
(c) ("zooming in": iterative approximation). Let $\Gamma_0, \Gamma_1, \dots, \Gamma_n \subseteq G$ be a sequence of nets: $\Gamma_0$ is a $\delta_0$-net for the entire $G$, whereas for $j \ge 1$ each $\Gamma_j$ is an $(r_j, \delta_j)$-net. Suppose that $r_j = r_0\lambda^{-j}$, $\delta_j = \delta_0\lambda^{-j}$, where $r_0/\delta_0 = q > \lambda > 1$. Prove that any element $V \in G$ can be approximated by $Z = Z_0Z_1\cdots Z_n$ ($Z_j \in \Gamma_j$) so that $d(V, Z) \le \delta_n$.
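To get a feeling for "zooming in", it may help to run the procedure in the simplest possible group. The sketch below is not the $SU(M)$ construction: it assumes the circle group $\mathbb{R}/\mathbb{Z}$ (written additively, so the product $Z_0Z_1\cdots Z_n$ becomes a sum) with the distance taken modulo 1, and it takes each net to be a uniform grid with spacing $2\delta_j$. The function name `zoom_in` and the default parameters are illustrative choices, not taken from the text.

```python
def zoom_in(v, delta0=1 / 16, q=4.0, lam=2.0, n=10):
    """Approximate a point v of the circle group R/Z by z_0 + z_1 + ... + z_n.

    Toy setting (an assumption of this sketch): Gamma_j is the grid of
    multiples of 2*delta_j, with delta_j = delta0 * lam**(-j) and
    r_j = q * delta_j, where q > lam > 1.
    """
    r0 = q * delta0
    # Residual V - (z_0 + ... + z_j), represented in [-1/2, 1/2).
    residual = (v + 0.5) % 1.0 - 0.5
    parts = []
    for j in range(n + 1):
        delta_j = delta0 * lam ** (-j)
        r_j = r0 * lam ** (-j)
        if j >= 1:
            # The previous step left the residual within delta_{j-1} <= r_j,
            # so it lies in the region that Gamma_j is required to cover.
            assert abs(residual) <= r_j
        step = 2 * delta_j
        z_j = round(residual / step) * step  # nearest grid point of Gamma_j
        parts.append(z_j)
        residual -= z_j
    return parts, abs(residual)
```

Each step reduces the residual to at most $\delta_j$, which is within the radius $r_{j+1}$ handled by the next net because $q > \lambda$; after the last step the error is at most $\delta_n$.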
8.3.4. The algorithm. Without loss of generality, we may assume that $\nu = 1/p$, where $p$ is an integer. The algorithm consists of three stages.
Preprocessing. Computation at this stage does not depend on $V$. We build increasingly tight nets in increasingly small neighborhoods of the identity. This is done by shrinking an initial net $p$ times and "telescoping" it with one of the previous nets to regain the original quality; then the cycle repeats. More precisely, we construct a set of nets $\Gamma_{j,k}$, $j = 0, \dots, n = O(\log(1/\delta))$, $k = 0, \dots, p$, according to the recursion rules
$$\Gamma_{j,k} = [\![\Gamma_{\lceil j/2\rceil,\,k-1},\ \Gamma_{\lfloor j/2\rfloor,\,k-1}]\!]_{1/6} \quad \text{for } k = 1, \dots, p; \qquad \Gamma_{j,0} = \Gamma_{j-1,p}\,\Gamma_{j,p}.$$
Each $\Gamma_{j,k}$ is an $(r_{j,k},\ r_{j,k}/q_k)$-net, where
$$r_{j,k} = r_{0,k}\lambda^{-j}, \qquad r_{0,k} = 4C^{-p2^k/(2^p-1)}, \qquad q_k = C^{2p-k}, \qquad \lambda = C^p, \qquad C = 20.$$
The recursion relations work only for sufficiently large $j$ (namely, $r_{j,k}$ should be small enough to satisfy the condition of Problem 8.17a). The first few nets have to be obtained by picking points from the initial net $A$; hence we need to set the constant $\varepsilon$ small enough. (According to this rule, the constant $\varepsilon$ depends on $p$. To avoid such dependence, we need to run the first few steps using $p = 1$, and then switch to the desired $p$.)
Each element of $\Gamma_{j,k}$ is a product of $L_{j,k}$ generators. The numbers $L_{j,k}$ satisfy the relations
$$L_{j,k} = 2L_{\lceil j/2\rceil,\,k-1} + 2L_{\lfloor j/2\rfloor,\,k-1} \quad (k = 1, \dots, p); \qquad L_{j,0} = L_{j-1,p} + L_{j,p}.$$
An upper bound $L_{j,k} \le j^{2+1/p}\, 2^{-k/p}\, (u_0 - u_1/j)$ (with constants $u_0$ and $u_1$) can be obtained by induction; hence $L_{j,k} = O(j^{2+\nu})$.
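The recursion for the word lengths is easy to tabulate, which gives a quick sanity check of the growth rate. The sketch below makes one assumption that the text leaves open: the nets with $j \le 1$ are taken as base cases whose elements have a fixed length `base` (the text only says that the first few nets are picked from the initial net). The helper name `make_word_length` is hypothetical.

```python
from functools import lru_cache


def make_word_length(p, base=1):
    """Return a function L(j, k) obeying the recursion for word lengths."""

    @lru_cache(maxsize=None)
    def L(j, k):
        if j <= 1:
            # Assumed base case: the first nets come straight from the
            # initial net, so their elements have constant length.
            return base
        if k == 0:
            # Telescoping: Gamma_{j,0} = Gamma_{j-1,p} * Gamma_{j,p}.
            return L(j - 1, p) + L(j, p)
        # Shrinking: a group commutator uses each factor twice.
        half_up, half_down = (j + 1) // 2, j // 2
        return 2 * L(half_up, k - 1) + 2 * L(half_down, k - 1)

    return L
```

With `p = 1` and `base = 1` the first values are `L(2, 0) = 5`, `L(3, 0) = 16`, `L(4, 0) = 32`, and the values stay below $j^{3} = j^{2+1/p}$, in line with the stated bound.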
When constructing elements of the nets $\Gamma_{j,k}$, we do not actually write them as sequences of generators. Instead, we represent them as $M \times M$ matrices and keep a record of the way each element was obtained. This stage involves $\exp(O(M^2))\log(1/\delta)$ matrix multiplications, which amounts to $\exp(O(M^2))(\log(1/\delta))^3$ bit operations.
Iterative approximation. We use the nets $\Gamma_{j,0}$ as indicated in Problem 8.17c. This yields an element $Z = Z_0\cdots Z_n$ ($Z_j \in \Gamma_{j,0}$) such that $\|V - Z_0\cdots Z_n\| \le \delta$. The complexity of this stage is also $\exp(O(M^2))(\log(1/\delta))^3$.
Expansion. Now we need to represent each $Z_j$ as a word in the alphabet "$A$". We have already computed the $Z_j$ as matrices, so we know the sequence of matrix multiplications and inversions that have led to $Z_j$. In other words, we have a classical circuit over the basis {multiplication, inversion}. This circuit will perform computation over the free group as well. Thus we plug symbols of the alphabet "$A$" into the inputs of the circuit, and get some words $w_j$ as the output; then we concatenate them to obtain the word $w$ representing $Z$. When computing $w_j$, we operate with exponentially increasing words; therefore the complexity is dominated by the last step. So, the number of operations is $O(|w_j|) = O(L_{j,0}) = O(j^{2+\nu})$. Summing over $j$, we conclude that $w$ is computed in $O(L)$ steps, where $L = |w| = O(n^{3+\nu})$ (recall that $n = O(\log(1/\delta))$).
9. Definition of Quantum Computation. Examples
9.1. Computation by quantum circuits. Until now, we have been describing the work of a quantum computer. Now it is time to define when this work leads to the solution of problems that are interesting to us. The definition will resemble the definition of probabilistic computation.
Consider a function $F : \mathbb{B}^n \to \mathbb{B}^m$. We examine a quantum circuit operating with $N$ qubits, $U = U_L\cdots U_2U_1 : \mathcal{B}^{\otimes N} \to \mathcal{B}^{\otimes N}$. Loosely speaking, this circuit computes $F$ if, after having applied $U$ to the initial state $|x, 0^{N-n}\rangle$ and "having looked" at the first $m$ bits, we "see" $F(x)$ with high probability. (The remaining qubits can contain arbitrary garbage.)
We only need to discuss the nature of that probability. The precise meaning of the words "having looked" and "see" is that a measurement of the values of the corresponding qubits is performed. Several different answers can be obtained as the result of this measurement, each with its own probability. Later (in Section 10) this question will be considered in more detail. To give a definition of quantum computation of a function $F$, it suffices (without injecting physical explanations of this fact) to accept the following: the probability of getting a basis state $x$ in the measurement of the state $|\psi\rangle = \sum_x c_x|x\rangle$ equals
$$\mathbf{P}(|\psi\rangle, x) = |c_x|^2. \tag{9.1}$$
We are interested in the probability that the computer will finish its work in a state of the form $(F(x), z)$, where $z$ is arbitrary.
Definition 9.1. The circuit $U = U_L\cdots U_2U_1$ computes $F$ if for any $x$ we have
$$\sum_z \bigl|\langle F(x), z|\,U\,|x, 0^{N-n}\rangle\bigr|^2 \ge 1 - \varepsilon,$$
where $\varepsilon$ is some fixed number smaller than $1/2$. (Note that $F(x)$ and $x$ consist of different numbers of bits, although the total lengths of $(F(x), z)$ and $(x, 0^{N-n})$ must be equal to $N$.)
Just as for probabilistic computation, the choice of $\varepsilon$ is unimportant, inasmuch as it is possible to run several copies of the circuit independently and to choose the result that is most frequently obtained. From the estimate (4.1) on p. 37 it follows that in order to decrease the probability of failure by a factor of $a$, we need to take $k = \Theta(\log a)$ copies of the circuit $U$. The choice of the most frequent result is realized by a classical circuit, using the majority function $\mathrm{MAJ}(x_1, \dots, x_k)$ (which takes value 1 when more than half of its arguments equal 1 and value 0 otherwise). The function $\mathrm{MAJ}(x_1, \dots, x_k)$ can be realized over a complete basis by a circuit of size $O(k)$ (see Problem 2.15). Therefore the $a$-fold reduction of the error probability is achieved at the cost of increasing the circuit size by the factor $O(\log a)$.
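The effect of majority voting can be made concrete by computing the failure probability exactly for a single output bit. The sketch below assumes that the $k$ runs err independently, each with probability exactly `eps`, and that $k$ is odd; the function name `majority_error` is illustrative.

```python
from math import comb


def majority_error(eps, k):
    """Probability that more than half of k independent runs are wrong,
    when each run errs with probability eps (k is assumed to be odd)."""
    return sum(
        comb(k, i) * eps**i * (1 - eps) ** (k - i)
        for i in range((k + 1) // 2, k + 1)
    )
```

For example, with `eps = 0.25` three copies already reduce the error to `10/64`, and the error keeps decreasing exponentially in `k`, which is why $k = \Theta(\log a)$ copies suffice for an $a$-fold reduction.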
[1] Problem 9.1. Prove that the above argument is actually correct if we incorporate the function MAJ into the quantum circuit. Specifically, show that we may use the function $\mathrm{MAJ}_\oplus$ realized by a reversible circuit, so that its input bits are the output qubits of $k$ copies of the circuit $U$.
[2!] Problem 9.2. Suppose that each gate $U_k$ of the circuit is approximated by $\tilde U_k$ with precision $\delta$. Prove that the resulting circuit $\tilde U = \tilde U_L\cdots\tilde U_2\tilde U_1$ satisfies the inequality from Definition 9.1, with $\varepsilon$ replaced by $\tilde\varepsilon = \varepsilon + 2L\delta$.
(The suggested solution is based on the general notion of quantum probability; see Section 10, especially Remark 10.1.)
Now that we have the definition of quantum computation, we can compare the effectiveness of classical and quantum computing. In the Introduction we mentioned three fundamental examples where quantum computation appears to be more effective than classical. We begin with the example where the greater effectiveness of quantum computation has been proved (although the increase in speed is only polynomial).
9.2. Quantum search: Grover's algorithm. We will give a definition of a general search problem in classical and quantum formulations. It belongs to a class of computational problems with an oracle, in which the input is given as a function (called a "black box", or an oracle) rather than a binary word.
[Diagram: a device with inputs $x$ and $y$ and output $A(x, y)$.]
Suppose we have a device (see the diagram) that receives inputs $x$ and $y$ and determines the value of some predicate $A(x, y)$. We are interested in the predicate $F(x) = \exists y\, A(x, y)$. This resembles the definition of the class NP, but now the internal structure of the device calculating the predicate $A$ is inaccessible to us. Under such conditions, it is not possible to complete the calculation faster than in $N = 2^n$ steps on a classical computer, where $n$ is the number of bits in the binary word $y$.
The problem can be formulated even without $x$: we need to compute the value of the "functional" $\mathcal{F}(A) = \exists y\, A(y)$. If $x$ is present, we can regard it as a part of the oracle, i.e., replace $A$ with the predicate $A_x$ such that $A_x(y) = A(x, y)$. Then $\mathcal{F}(A_x) = F(x) = \exists y\, A(x, y)$.
Remark 9.1. The version of the problem without $x$ has another interpretation, which is quite rigorous (unlike the analogy with NP). Let us think of $A$ as a bit string: $y$ is the index of a bit, whereas $A(y)$ is the value of that bit (cf. the paragraph preceding Definition 2.2 on page 26). Then $\mathcal{F}(A) = \bigvee_{y \in \mathbb{B}^n} A(y)$, the OR function with $N = 2^n$ inputs.
It turns out that a quantum computer can determine the value of $\mathcal{F}(A)$, and even find a $y$ for which $A(y)$ is satisfied, in time $O(\sqrt N)$. The lower bound $\Omega(\sqrt N)$ has also been obtained, showing that in this situation quantum computers give only a quadratic speed-up in comparison with classical ones.
[Diagram: the oracle $U$ takes $|y\rangle$ to $U|y\rangle$.]
In the quantum formulation, the problem looks as follows. A quantum computer can query the oracle by sending $y$ so that different values of $y$ may form superpositions, and the oracle will return superpositions accordingly. Interestingly enough, the oracle can encode the answer into a phase factor. Specifically, our oracle (or "black box") is defined as an operator $U$ acting by the rule
$$U|y\rangle = \begin{cases} |y\rangle & \text{if } A(y) = 0,\\ -|y\rangle & \text{if } A(y) = 1. \end{cases}$$
We assume that the computer can choose whether to query the oracle or not, which corresponds to applying the operator $\Lambda(U)$.
The goal is to compute the value $\mathcal{F}(A)$ and find an "answer" $y$ for which $A(y)$ is satisfied. This should be done by a quantum circuit, using $\Lambda(U)$ as a gate (in addition to the standard basis).
The results that we have already mentioned are formulated as follows (cf. [32, 75]): there exist two constants $C_1, C_2$ such that there is a circuit of size $\le C_1\sqrt N$ deciding the problem for an arbitrary predicate $A$; and, for an arbitrary circuit of size $\le C_2\sqrt N$, there exists a predicate $A$ for which the problem is not decided by this circuit (i.e., the circuit gives an incorrect answer with probability $> 1/3$).
We will construct a quantum circuit for a simplified version of the problem: we assume that the "answer" exists and is unique, and we denote it by $y_0$; we need to find $y_0$. The circuit will be described in terms of operator action on the basis vectors.
Consider two operators
$$U = I - 2|y_0\rangle\langle y_0|, \qquad V = I - 2|\xi\rangle\langle\xi|, \quad \text{where } |\xi\rangle = \frac{1}{\sqrt N}\sum_y |y\rangle.$$
The operator $U$ is given to us (it is the oracle). The operator $V$ is represented by the matrix
$$V = \begin{pmatrix} 1 - \frac{2}{N} & \cdots & -\frac{2}{N} \\ \vdots & \ddots & \vdots \\ -\frac{2}{N} & \cdots & 1 - \frac{2}{N} \end{pmatrix}$$
(recall that $N = 2^n$).
Let us realize $V$ by a quantum circuit. We will proceed as follows: we transform $|\xi\rangle$ to $|0^n\rangle$ by some operator $W$, then apply the operator $I - 2|0^n\rangle\langle 0^n|$, and finally apply $W^{-1}$.
It is easy to construct an operator $W$ that takes $|\xi\rangle$ to $|0^n\rangle$. The following will do: $W = H^{\otimes n}$, where $H$ is the Hadamard gate from the standard basis (see Definition 8.5). In fact, $|\xi\rangle = \frac{1}{\sqrt{2^n}}\bigl(|0\rangle + |1\rangle\bigr)^{\otimes n}$, and $H : \frac{1}{\sqrt 2}\bigl(|0\rangle + |1\rangle\bigr) \mapsto |0\rangle$.
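The identity $V = W^{-1}\bigl(I - 2|0^n\rangle\langle 0^n|\bigr)W$ with $W = H^{\otimes n}$ can be checked numerically for small $n$. The sketch below builds the matrices explicitly with plain Python lists; the helper names (`hadamard_n`, `matmul`, `reflection_about_zero`, `grover_v`) are illustrative, and the check uses the fact that $H^{\otimes n}$ is its own inverse.

```python
from math import sqrt


def hadamard_n(n):
    """Matrix of H tensored n times: entry (x, y) is (-1)^(x.y) / sqrt(2^n)."""
    size = 1 << n
    norm = 1 / sqrt(size)
    return [
        [norm * (-1) ** bin(x & y).count("1") for y in range(size)]
        for x in range(size)
    ]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def reflection_about_zero(n):
    """Matrix of I - 2|0^n><0^n|: the identity with entry (0, 0) set to -1."""
    size = 1 << n
    m = [[float(i == j) for j in range(size)] for i in range(size)]
    m[0][0] = -1.0
    return m


def grover_v(n):
    """W^{-1} (I - 2|0^n><0^n|) W with W = H tensored n times (W^{-1} = W)."""
    w = hadamard_n(n)
    return matmul(matmul(w, reflection_about_zero(n)), w)
```

The resulting matrix has $1 - 2/N$ on the diagonal and $-2/N$ elsewhere, matching the matrix of $V$ given above.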
Now we have to implement the operator $I - 2|0^n\rangle\langle 0^n|$. We will use a reversible classical circuit that realizes the operator $Z : \mathcal{B}^{\otimes(n+1)} \to \mathcal{B}^{\otimes(n+1)}$,
$$Z|a_0, \dots, a_n\rangle = |a_0 \oplus f(a_1, a_2, \dots, a_n),\ a_1, \dots, a_n\rangle, \qquad f(a_1, \dots, a_n) = \begin{cases} 1 & \text{if } a_1 = \dots = a_n = 0,\\ 0 & \text{if } \exists\, j : a_j \ne 0. \end{cases}$$
(Up to a permutation of the arguments, $Z = f_\oplus$.) Since $f$ has Boolean circuit complexity $O(n)$, $Z$ can be realized by a reversible circuit of size $O(n)$ (see Lemma 7.2).
The circuit that realizes the operator $V$ is shown in Figure 9.1. The central portion, incorporating $Z$, $\sigma^z$ and $Z$, realizes the operator $I - 2|0^n\rangle\langle 0^n|$. In this circuit, the operator $\sigma^z = K^2$ ($K$ from the standard basis) is used.
[Figure: circuit with gates $W$, $Z$, $\sigma^z$, $Z$, $W$ acting on the qubits $a_1, \dots, a_n$ and an auxiliary qubit $a_0 = 0$.]
Fig. 9.1. Implementation of the operator $V$.
We note that $W^2$ and $Z^2$ act trivially (as the identity operator) on vectors with zero-valued borrowed qubits. Therefore the decisive role is played by the operator $\sigma^z$ acting on an auxiliary qubit, which likewise returns to its initial value 0 in the end.
We must not be confused by the fact that although $\sigma^z$ acts only on an $f_\oplus$-controlled qubit, the whole vector changes as a result. In general, the distinction between "reading" and "writing" in the quantum case is not absolute and depends on the choice of basis. Let us give a relevant example.
[Figure: the gate $\Lambda(\sigma^x)$ with a Hadamard gate $H$ on each qubit before and after it.]
Fig. 9.2. $\Lambda(\sigma^x)$ in a different basis.
) in a di�erent basis.
Let us �nd the matrix of �(�
x
) : ja; bi 7! ja; a� bi relative to the basis
1
p
2
(j0i � j1i) for ea h of the qubits. In other words, we need to write the
matrix of the operator X = (H H) �(�
x
) (H H) relative to the lassi al
basis. The ir uit for this operator is shown in Figure 9.2. Using the equality
9. De�nition of Quantum Computation. Examples 87
�(�
x
)j ; di = j ; � di, we �nd the a tion of X on any basis ve tor:
Xja; bi =
1
2
(H H)�(�
x
)
X
;d
(�1)
a +bd
j ; di
=
1
2
(H H)
X
;d
(�1)
a +bd
j ; � di
=
1
4
X
a
0
;b
0
; ;d
(�1)
a
0
+b
0
( +d)
(�1)
a +bd
ja
0
; b
0
i
=
1
4
X
a
0
;b
0
2Æ
b;b
0
� 2Æ
a; a
0
�b
0
ja
0
; b
0
i = ja� b; bi:
Thus, in the basis $\frac{1}{\sqrt 2}\bigl(|0\rangle \pm |1\rangle\bigr)$, the controlling and the controlled qubits have changed places. Which bit is "controlling" (is "read") and which is "controlled" (is "modified") depends on the choice of basis. Of course, such a situation goes against our classical intuition. It is hard to imagine that by passing to a different basis, a quantum printer suddenly becomes a quantum scanner.
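The exchange of roles between the two qubits can be confirmed by multiplying out the $4 \times 4$ matrices. The following sketch uses plain Python lists and indexes the basis state $|a, b\rangle$ by $2a + b$; the helper names (`kron`, `matmul`, `permutation_matrix`) are illustrative.

```python
from math import sqrt


def kron(a, b):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def permutation_matrix(image):
    """4x4 matrix sending the basis vector |col> to |image(col)>."""
    return [[1.0 if image(col) == row else 0.0 for col in range(4)] for row in range(4)]


s = 1 / sqrt(2)
H = [[s, s], [s, -s]]
HH = kron(H, H)

# Lambda(sigma^x): |a, b> -> |a, a XOR b>, with a = i >> 1 and b = i & 1.
CNOT = permutation_matrix(lambda i: 2 * (i >> 1) + ((i >> 1) ^ (i & 1)))
# The same gate with control and target exchanged: |a, b> -> |a XOR b, b>.
CNOT_REVERSED = permutation_matrix(lambda i: 2 * ((i >> 1) ^ (i & 1)) + (i & 1))

X = matmul(matmul(HH, CNOT), HH)
```

Comparing `X` with `CNOT_REVERSED` entry by entry (up to rounding) reproduces the result $X|a, b\rangle = |a \oplus b, b\rangle$.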
[1] Problem 9.3. What will happen if we change the basis only in one of the qubits? For example, what will the matrix of the operator with the circuit shown in the diagram look like? Also try to change the basis in the other qubit.
[Diagram: the gate $\Lambda(\sigma^x)$ with a Hadamard gate $H$ before and after it on one of the two qubits.]
Let us return to the construction of a circuit for the general search problem. What follows is the main part of Grover's algorithm. The oracle $U = I - 2|y_0\rangle\langle y_0|$ was given to us, and we have realized the operator $V = I - 2|\xi\rangle\langle\xi|$. We start the computation with the vector $|\xi\rangle$, which can be obtained from $|0^n\rangle$ by applying the operator $W$. Now, with the aid of the operators $U$ and $V$, we are going to transform $|\xi\rangle$ to the solution vector $|y_0\rangle$. For this, we will apply alternately the operators $U$ and $V$:
$$\cdots VUVU|\xi\rangle = (VU)^s|\xi\rangle.$$
What do we get from this? Geometrically, both operators are reflections through hyperplanes. The subspace $\mathcal{L} = \mathbb{C}\bigl(|\xi\rangle, |y_0\rangle\bigr)$ is invariant under both operators, and thus, under $VU$. Since the initial vector $|\xi\rangle$ belongs to $\mathcal{L}$, it suffices to consider the action of $VU$ on this subspace.
[Diagram: the vectors $|\xi\rangle$ and $|y_0\rangle$ in the plane $\mathcal{L}$, with the angle $\varphi/2$ marked.]
The composition of two reflections with respect to two lines is a rotation by twice the angle between those lines. The angle is easy to calculate: $\langle\xi|y_0\rangle = \frac{1}{\sqrt N} = \sin\frac{\varphi}{2}$, i.e., the lines are almost perpendicular. Therefore we may write $VU = -R$, where $R$ is the rotation by the small angle $\varphi$. But then $(VU)^s = (-1)^sR^s$, where $R^s$ is the rotation through the angle $s\varphi$. The sign does not interest us (phase factors do not affect probabilities). For large $N$, we have $\varphi \approx 2/\sqrt N$. Then, after $s \approx (\pi/4)\sqrt N$ iterations, the initial vector is turned by an angle $s\varphi \approx \pi/2$ and becomes close to the solution vector. This also indicates that the system ends up in the state $|y_0\rangle$ with probability close to one.
To solve the search problem in the most general setting (when there may be several answers, or there may be none), additional technical devices are needed. Note that the number of steps for the rotation from the initial vector to some vector of the subspace spanned by the solution vectors is inversely proportional to the square root of the number of solutions.
Problem 9.4. For a given $n$, construct $\mathrm{poly}(n)$-size quantum circuits (over the basis of all two-qubit gates) which perform the following tasks.
[2] a) For a given number $q$, $1 \le q \le 2^n$, transform the state $|0^n\rangle$ into the state $|\eta_{n,q}\rangle = \frac{1}{\sqrt q}\sum_{j=0}^{q-1}|j\rangle$.
[2] b) Transform $|q-1, 0^n\rangle$ into $|q-1\rangle \otimes |\eta_{n,q}\rangle$ for all $q$, assuming that $q$ is expressed in $n$ binary digits.
[3!] c) Realize the Fourier transform operator $F_q$ over the group $\mathbb{Z}_q$:
$$F_q|x\rangle = \frac{1}{\sqrt q}\sum_{y=0}^{q-1}\exp\Bigl(2\pi i\,\frac{xy}{q}\Bigr)|y\rangle,$$
where $x$ and $y$ are expressed in $n$ binary digits. Consider the case $q = 2^n$. (The case of arbitrary $q$ requires some extra tools, so we will consider it later; see Problem 13.4.)
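To make the definition of $F_q$ concrete (this does not address the circuit construction asked for in the problem), one can write out its matrix and check that it is unitary. The sketch below uses only the standard library; the function names `fourier_matrix` and `is_unitary` are illustrative.

```python
import cmath
from math import sqrt


def fourier_matrix(q):
    """Matrix of F_q: the entry in row y, column x is exp(2*pi*i*x*y/q)/sqrt(q)."""
    return [
        [cmath.exp(2j * cmath.pi * x * y / q) / sqrt(q) for x in range(q)]
        for y in range(q)
    ]


def is_unitary(m, tol=1e-9):
    """Check that the columns of m are orthonormal."""
    q = len(m)
    for a in range(q):
        for b in range(q):
            inner = sum(m[k][a].conjugate() * m[k][b] for k in range(q))
            if abs(inner - (1 if a == b else 0)) > tol:
                return False
    return True
```

Column $x = 0$ of the matrix consists of equal entries $1/\sqrt q$, i.e., $F_q|0\rangle$ is the uniform superposition over $0, \dots, q-1$.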
9.3. A universal quantum circuit. The second of the examples mentioned in the Introduction was simulation of a quantum mechanical system. This is a vaguely posed problem since the choice of particular systems and distinguishing "essential" degrees of freedom play an important role. The problem has actually been solved in several settings. With high confidence, we may claim that every physical quantum system can be efficiently simulated on a quantum computer, but we can never prove this statement. The situation resembles that of Turing's thesis (see Section 1.3). Recall that the validity of Turing's thesis is partially justified by the existence of the universal Turing machine. In this vein, we may examine universality of our quantum computation model by purely mathematical means. Let us try to simulate many circuits by one.
We will not limit the type of gates we use to any particular basis. General quantum circuits have a manageable description if the gates are specified as matrices with entries given by binary fractions to a certain precision $\delta_1$. Then the inaccuracy of an $r$-qubit gate (in the operator norm) does not exceed $\delta = M\delta_1$, where $M = 2^r$ is the size of the matrix. Suppose we have a description $Z$ of a quantum circuit of size $\le L$ and precision $\delta$. Each gate of the circuit acts on at most $r$ qubits, so that the total length of the description does not exceed $\mathrm{poly}(L2^r\log(1/\delta))$. The operator realized by the circuit will be denoted by $\mathrm{Op}(Z)$. We will try to simulate all circuits with the given parameters $L, r, \delta$.
Using the algorithm from the proof of Theorem 8.1, we reduce the problem to the case $r = 2$. Then we apply Theorem 8.3. Thus we can realize each operator in $Z$ by a circuit of size $\mathrm{poly}(2^r\log(1/\delta))$ over the standard basis using $O(r)$ ancillas. This yields (a description of) a circuit $R(Z)$ over the standard basis, which has size $S = \mathrm{poly}(L2^r\log(1/\delta))$, operates on $N = L + O(r)$ qubits, and approximates $\mathrm{Op}(Z)$ with precision $O(L\delta)$. The transformation $Z \mapsto R(Z)$ is performed by a Boolean circuit of size $\mathrm{poly}(L2^r\log(1/\delta))$. Hence simulating a general circuit is not much harder than simulating circuits over the standard basis.
The result is as follows. There is a universal quantum circuit $U$ of size $\mathrm{poly}(L2^r\log(1/\delta))$ that simulates the work of an arbitrary quantum circuit in the following way: for any circuit description $Z$ and input vector $|\xi\rangle$, $U$ satisfies the condition
$$\Bigl\| U\bigl(|Z\rangle \otimes |\xi\rangle \otimes |0^k\rangle\bigr) - |Z\rangle \otimes \bigl(\mathrm{Op}(Z)|\xi\rangle\bigr) \otimes |0^k\rangle \Bigr\| = O(L\delta).$$
That is, $U$ works as a "programmable quantum computer", with $Z$ being the "program".
The qubits of the circuit $U$ include $N$ "controlled" qubits that correspond to the qubits of $R(Z)$. Another subset of qubits holds $|Z\rangle$. There is also a number of auxiliary qubits, some of which are called "controlling". The key component of the circuit $U$ is a circuit $V$, the product of the operators $V_j = \Lambda(X)[j, k_j]$ (or $V_j = \Lambda(X)[j, k_j, l_j]$, or $V_j = \Lambda(X)[j, k_j, l_j, m_j]$), with $X$ from the standard basis, applied to each one (or pair, or triple) of the controlled qubits in an arbitrary order. The controlling qubits $j$ are all different. If we set one controlling qubit to 1 and all the others to 0, then the circuit $V$ realizes an operator of the form $X[k]$ (or $X[k, l]$, or $X[k, l, m]$) on the controlled qubits. Hence the composition of $S$ copies of $V$ with different controlling qubits can simulate an arbitrary circuit of size $S$ over the standard basis, provided that the controlling qubits are set appropriately. To set the controlling qubits, we need to compute $R(Z)$ by a reversible circuit (with garbage) and arrange the output in a certain way. This computation should be reversed at the end.
9.4. Quantum algorithms and the class BQP. Up until now we have been studying nonuniform quantum computation (i.e., computation of Boolean functions). Algorithms compute functions on words of arbitrary length. A definition of a quantum algorithm can be given using the quantum circuits that have already been introduced. Roughly speaking, a classical Turing machine builds a quantum circuit that computes the value of the function on one or many inputs. Actually, there are several equivalent definitions, the following being the standard one. Let $F : \mathbb{B}^* \to \mathbb{B}^*$ be a function such that the length of the output is polynomial in the length of the input. It is composed of a sequence of Boolean functions $F_n : \mathbb{B}^n \to \mathbb{B}^{m(n)}$ (restrictions of $F$ to inputs of length $n = 0, 1, 2, \dots$). A quantum algorithm for the computation of $F$ is a uniform sequence of quantum circuits that compute $F_n$. Uniform means that the description $Z_n$ of the corresponding circuit is constructed by a classical Turing machine which takes $n$ as the input. We will say that the quantum algorithm computes $F$ in time $T(n)$ if building the circuit takes at most $T(n)$ steps. The size of the circuit $L$ is obviously not greater than $T(n)$.
A subtle point in this definition is what basis to use. It is safe to stick to the standard basis. Alternatively, the basis may consist of all unitary operators. In this case, each $r$-qubit gate should be specified as a list of all its matrix elements with precision $c\,2^{-r}L^{-1}$, so that the precision of the matrix (in the operator norm) is $cL^{-1}$, where $c$ is a small constant. If $\varepsilon + 2c < 1/2$ (see Definition 9.1 and Problem 9.2) then the approximate circuit works fine. Using the algorithm of Theorems 8.1 and 8.3, this circuit can be transformed to an equivalent circuit of size $\mathrm{poly}(T(n))$ over the standard basis (note that $T(n)$ includes the factor $2^r$). The converse is obvious.
Remark 9.2. The use of an arbitrary complete basis could lead to "pathologies". For example, let the basis contain the gate
$$X = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$$
where $\theta$ is a noncomputable number, e.g., the $n$-th digit of $\theta$ says whether the universal Turing machine terminates at input $n$ (cf. Problem 1.3). Then $p = \sin^2\theta$ is also noncomputable. If we apply $X$ to the state $|0\rangle$ and measure the qubit, we will get 1 with probability $p$ and 0 with probability $1 - p$. Repeating this procedure $\exp(\Theta(n))$ times and counting the number of 0s and 1s, we can find the $n$-th digit of $p$ with very small error probability. Thus the gate $X$ enables us to solve the halting problem! (This argument has nothing to do with quantum mechanics. A classical probabilistic computer could also get extra power if random numbers with arbitrary $p$ were allowed.) Of course, we want to avoid such things in our theory, so we must be careful about the choice of basis. However, in the real world "superpowerful" gates might exist. Experimentalists measure dimensionless physical constants (such as the fine structure constant) with increasingly high precision, getting new digits of a number that theoreticians cannot compute. If we ever learn where the fundamental physical constants come from, we will probably know whether they are computable, and if they are not, whether they carry some mathematically meaningful information, e.g., allow one to solve the halting problem.
Remark 9.3. It is possible to define a quantum Turing machine directly, through superpositions of various states of a classical Turing machine (the original definition of D. Deutsch [20] was just like this). The standard definition turns out to be equivalent but more convenient.
Using the universal quantum circuit, we can simplify the standard definition even further. It suffices to have a classical Turing machine that generates a description of a quantum circuit $Z(x)$ which is only good to compute $F(x)$ for a single value of $x$. In this case, $x$ is the input word for the TM whereas the circuit does not have input data at all (i.e., it operates on supplementary qubits initialized by the state $|0^N\rangle$). Indeed, if we have such a machine $M$, then we can build a machine $M'$ which receives $n$ and constructs a Boolean circuit which computes $Z(x)$ for all values of $x$, $|x| = n$ (see Theorem 2.3). By a certain polynomial algorithm, the Boolean circuit can be transformed into a reversible circuit (with garbage) and combined with the universal quantum circuit, so that the output of the former (i.e., $Z(x)$) becomes the "program" for the latter. This yields a quantum circuit that computes $F_n$.
Thus we will adopt the following definition.
Definition 9.2. A quantum algorithm for the computation of a function $F : \mathbb{B}^* \to \mathbb{B}^*$ is a classical algorithm (i.e., a Turing machine) that computes a function of the form $x \mapsto Z(x)$, where $Z(x)$ is a description of a quantum circuit which computes $F(x)$ on empty input. The function $F$ is said to belong to the class BQP if there is a quantum algorithm that computes $F$ in time $\mathrm{poly}(n)$.
How does the class BQP relate to the complexity classes introduced earlier?
[3!] Problem 9.5. Prove that
$$\mathrm{BPP} \subseteq \mathrm{BQP} \subseteq \mathrm{PP} \subseteq \mathrm{PSPACE}.$$
The class PP consists of predicates of the form
$$Q(x) = \Bigl(\bigl|\{y : R_0(x, y)\}\bigr| < \bigl|\{y : R_1(x, y)\}\bigr|\Bigr),$$
where $R_0, R_1 \in \mathrm{P}$, and $y$ runs through all words of length bounded by some polynomial $q(|x|)$.
This is almost all that is known about the correspondence between BQP and the other complexity classes. Indirect evidence in favor of the strict inclusion $\mathrm{BPP} \subset \mathrm{BQP}$ is given by the existence of effective quantum algorithms for some number-theoretic problems traditionally regarded as difficult (see Section 13).
We also remark that there have recently appeared interesting results concerning quantum analogs of some stronger complexity classes (not described in Part 1).
What's next? (A note to the impatient reader). We have spent four sections defining what quantum computation is, but have given only a few nontrivial examples so far. The reader may want to see more examples right now. If so, you may skip some material and read Section 13.1 (Simon's algorithm). There will be some references to "mixed states", but all calculations can be done with state vectors as well. However, most other results are based (not so much formally as conceptually) upon the general notion of quantum probability and measurement. We will now proceed to these topics.
10. Quantum probability
10.1. Probability for state vectors. Let us discuss several "physical" aspects of quantum computation. Let a system of $n$ qubits be in the state $|\psi\rangle = \sum_x c_x|x\rangle$. The coefficients $c_x$ of the expansion relative to the classical basis are called amplitudes. The square of the modulus of the amplitude, $|c_x|^2$, equals the probability of finding the system in a given state $x$ (compare with (9.1)). In other words, under a measurement of the state of this quantum system, a classical state will be obtained, according to the probability distribution $|c_x|^2$.
The quantity determined by formula (9.1) possesses the basic properties of ordinary probability. The fact that the square of the modulus of the amplitude is the probability of observing the system in state $x$ agrees with the fact that the physical states of quantum mechanics correspond to vectors of unit length, and transformations of these states do not change the length, i.e., they are unitary. Indeed, $\langle\psi|\psi\rangle = \sum_x |c_x|^2 = 1$ (the sum of probabilities equals 1), and the application of physically realizable operators must preserve this relation, i.e., the operator must be unitary.
Formula (9.1) is sufficient for the definition of quantum computation and the class BQP. There are, however, situations for which this definition turns out to be inconvenient or inapplicable. Two fundamental examples are measurement operators and algorithms that are based on them, and the problem of constructing reliable quantum circuits from unreliable gates (error correction).
We therefore give a definition of quantum probability which generalizes
both what we observe (the state of the system) and the result of the
observation. We will arrive at this general definition by analysing a series
of examples. To begin with, we rewrite the expression for the probability
already obtained in the form
\[ |c_x|^2 = |\langle\psi|x\rangle|^2 = \langle\psi|\,\overbrace{|x\rangle\langle x|}^{\Pi_x}\,|\psi\rangle, \]
where $\Pi_x$ denotes the projection onto the subspace spanned by $|x\rangle$.
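To make the next step toward the general definition of quantum probability,
we compute the probability that the first $m$ bits have a given value
$y = (y_1, \dots, y_m)$. Let us represent basis states in the form of two blocks:
$x = (y, z)$, where $y$ has length $m$ and $z$ has length $n-m$. We obtain
\[ \mathbf{P}(|\psi\rangle, y) = \sum_z \mathbf{P}(|\psi\rangle, (y,z)) = \sum_z \langle\psi|y,z\rangle\langle y,z|\psi\rangle
 = \langle\psi|\bigl(|y\rangle\langle y| \otimes I\bigr)|\psi\rangle = \langle\psi|\Pi_{\mathcal{M}}|\psi\rangle. \tag{10.1} \]
Here $\Pi_{\mathcal{M}}$ denotes the operator of orthogonal projection onto the subspace
$\mathcal{M} = |y\rangle \otimes \mathcal{B}^{\otimes(n-m)}$. Formula (10.1) gives the definition of quantum
probability also in the case where $\mathcal{M}$ is an arbitrary subspace. In this case the
projection onto the subspace $\mathcal{M} \subseteq \mathcal{N}$ is given by the formula
$\Pi_{\mathcal{M}} = \sum_j |e_j\rangle\langle e_j|$,
where $e_j$ runs over an arbitrary orthonormal basis for $\mathcal{M}$.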
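Formula (10.1) can be illustrated by computing the same probability in two ways: as a sum of $|c_{(y,z)}|^2$ over $z$, and as $\langle\psi|\Pi_{\mathcal{M}}|\psi\rangle$. The sketch below is plain Python under two assumptions that are not in the text: the first qubit is stored as the most significant bit of the index $x$, and the three-qubit state is a made-up example.

```python
def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def first_bits_match(x, n, y_bits):
    """True if the first m bits of the n-bit index x equal y."""
    m = len(y_bits)
    y = int("".join(str(b) for b in y_bits), 2)
    return (x >> (n - m)) == y

def prob_by_sum(c, n, y_bits):
    """Sum over z of |c_(y,z)|^2."""
    return sum(abs(a) ** 2 for x, a in enumerate(c)
               if first_bits_match(x, n, y_bits))

def project(c, n, y_bits):
    """Apply Pi_M for M = |y> (x) B^(n-m): zero out all other components."""
    return [a if first_bits_match(x, n, y_bits) else 0
            for x, a in enumerate(c)]

def prob_by_projection(c, n, y_bits):
    """<psi| Pi_M |psi>."""
    return inner(c, project(c, n, y_bits)).real

# Hypothetical normalized 3-qubit state (indices 000 ... 111).
psi = [0.5, 0, 0.5, 0, 0, 0.5j, 0, -0.5]

p_sum = prob_by_sum(psi, 3, [0, 1])          # first two bits equal 01
p_proj = prob_by_projection(psi, 3, [0, 1])
p_first_zero = prob_by_projection(psi, 3, [0])
```

The two computations agree because $\Pi_{\mathcal{M}}$ is diagonal in the classical basis for this particular kind of subspace.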
Remark 10.1. The quantity $\sum_z |\langle F(x), z|U|x, 0^{N-n}\rangle|^2$, which appears in
the definition of the evaluation of a function $F : \mathbb{B}^n \to \mathbb{B}^m$ by a quantum
circuit (Definition 9.1), equals $\mathbf{P}(U|x, 0^{N-n}\rangle, \mathcal{M})$, where
$\mathcal{M} = |F(x)\rangle \otimes \mathcal{B}^{\otimes(N-m)}$. Recall once again the meaning of this definition:
the circuit $U = U_L \cdots U_2 U_1$ computes $F$ if for each $x$ the probability to observe the correct
result for $F(x)$ after application of the circuit to the initial state $|x, 0^{N-n}\rangle$
is at least $1 - \varepsilon$.
Projections do not represent physically realizable operators; more precisely,
they do not describe the evolution of one state of a system to another
over a fixed time period. Such evolution is described by unitary operators.
Nonetheless, taking some liberty, it is possible to bestow physical meaning
on projection. A projection selects a portion of system states from among all
possible states. Imagine a filter, i.e., a physical device which passes systems
in states belonging to $\mathcal{M}$ but destroys the system if its state is orthogonal to
$\mathcal{M}$. (For example, a polarizer does this to photons.) If we submit a system
in state $|\psi\rangle$ to the input of such a filter, then the system at the output will
be in the state $|\xi\rangle = \Pi_{\mathcal{M}}|\psi\rangle$. The probability associated to this state is
generally smaller than one; it is $p = \langle\xi|\xi\rangle = \langle\psi|\Pi_{\mathcal{M}}|\psi\rangle$. The number $1 - p$
determines the probability that the system will not pass through the filter.
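Let us compare classical and quantum probability.

Definition

  Classical: An event is a subset $M$ of a fixed finite set $N$.
  Quantum: An event is a subspace $\mathcal{M}$ of some finite-dimensional Hilbert space $\mathcal{N}$.

  Classical: A probability distribution is given by a function $w : N \to \mathbb{R}$
  with the properties a) $\sum_j w_j = 1$; b) $w_j \ge 0$.
  Quantum: A probability distribution is given by a state vector $|\psi\rangle$,
  $\langle\psi|\psi\rangle = 1$.

  Classical: Probability: $\Pr(w, M) = \sum_{j \in M} w_j$.
  Quantum: Probability: $\mathbf{P}(|\psi\rangle, \mathcal{M}) = \langle\psi|\Pi_{\mathcal{M}}|\psi\rangle$.

Properties

  1 (classical). If $M_1 \cap M_2 = \emptyset$, then
  $\Pr(w, M_1 \cup M_2) = \Pr(w, M_1) + \Pr(w, M_2)$.
  $1^q$ (quantum). If $\mathcal{M}_1 \perp \mathcal{M}_2$, then
  $\mathbf{P}(|\psi\rangle, \mathcal{M}_1 \oplus \mathcal{M}_2) = \mathbf{P}(|\psi\rangle, \mathcal{M}_1) + \mathbf{P}(|\psi\rangle, \mathcal{M}_2)$.

  2 (classical, in the general case).
  $\Pr(w, M_1 \cup M_2) = \Pr(w, M_1) + \Pr(w, M_2) - \Pr(w, M_1 \cap M_2)$.
  $2^q$ (quantum). If $\Pi_{\mathcal{M}_1}\Pi_{\mathcal{M}_2} = \Pi_{\mathcal{M}_2}\Pi_{\mathcal{M}_1}$, then
  $\mathbf{P}(|\psi\rangle, \mathcal{M}_1 + \mathcal{M}_2) = \mathbf{P}(|\psi\rangle, \mathcal{M}_1) + \mathbf{P}(|\psi\rangle, \mathcal{M}_2) - \mathbf{P}(|\psi\rangle, \mathcal{M}_1 \cap \mathcal{M}_2)$.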
Note that the condition $\mathcal{M}_1 \perp \mathcal{M}_2$ (mutually exclusive events) is equivalent
to the condition $\Pi_{\mathcal{M}_1}\Pi_{\mathcal{M}_2} = \Pi_{\mathcal{M}_2}\Pi_{\mathcal{M}_1} = 0$.
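If we have two nonorthogonal subspaces with zero intersection, the quantum
probability is not necessarily additive. We give a simple example where
$\mathbf{P}(|\psi\rangle, \mathcal{M}_1 + \mathcal{M}_2) \ne \mathbf{P}(|\psi\rangle, \mathcal{M}_1) + \mathbf{P}(|\psi\rangle, \mathcal{M}_2)$.
Let $|\xi\rangle = |0\rangle$, $\mathcal{M}_1 = \mathbb{C}(|0\rangle)$ (the linear subspace generated by the vector
$|0\rangle$), $\mathcal{M}_2 = \mathbb{C}(|\eta\rangle)$, where $|\eta\rangle$ is a unit vector not proportional to $|0\rangle$
and $\langle\xi|\eta\rangle$ is close to 1. Then
\[ 1 = \mathbf{P}(|\xi\rangle, \mathcal{M}_1 + \mathcal{M}_2) \ne \mathbf{P}(|\xi\rangle, \mathcal{M}_1) + \mathbf{P}(|\xi\rangle, \mathcal{M}_2) \approx 1 + 1. \]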
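The failure of additivity is easy to reproduce for a single qubit. The following plain-Python sketch assumes a concrete choice $|\eta\rangle = \cos\theta\,|0\rangle + \sin\theta\,|1\rangle$ with a small angle $\theta$; the angle and the helper name are illustrative only.

```python
import math

def prob_line(psi, v):
    """P(|psi>, C(v)) = |<v|psi>|^2 / <v|v> for the one-dimensional subspace C(v)."""
    overlap = sum(a.conjugate() * b for a, b in zip(v, psi))
    norm_sq = sum(abs(a) ** 2 for a in v)
    return abs(overlap) ** 2 / norm_sq

theta = 0.1                                  # small angle: |eta> is close to |0>
xi = [1.0, 0.0]                              # |xi> = |0>
eta = [math.cos(theta), math.sin(theta)]     # spans M_2, not parallel to |0>

p1 = prob_line(xi, [1.0, 0.0])               # P(|xi>, M_1), M_1 = C(|0>)
p2 = prob_line(xi, eta)                      # P(|xi>, M_2), M_2 = C(|eta>)

# M_1 + M_2 is the whole space C^2, so its projection is the identity
# and the probability is just <xi|xi>.
p_sum_space = sum(abs(a) ** 2 for a in xi)
```

Here `p_sum_space` is 1 while `p1 + p2` is close to 2, so the two sides of the additivity formula differ.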
10.2. Mixed states (density matrices). Thus, we have defined, in the
most general way, what quantity we measure. Now we need to generalize
what object we perform the measurement on. Such objects will be something
more general than state vectors or probability distributions. This will give
us a definition of probability that generalizes both classical and quantum
probability.
Consider a probability distribution on a finite set of quantum states
$\{|\xi_1\rangle, \dots, |\xi_s\rangle\}$. The probability of the state $|\xi_j\rangle$ will be denoted by $p_j$;
clearly $\sum_j p_j = 1$. We will calculate the probability of observing a state in
the subspace $\mathcal{M}$:
\[ \sum_k p_k \mathbf{P}(|\xi_k\rangle, \mathcal{M}) = \sum_k p_k \langle\xi_k|\Pi_{\mathcal{M}}|\xi_k\rangle
 = \sum_k p_k \operatorname{Tr}\bigl(|\xi_k\rangle\langle\xi_k|\Pi_{\mathcal{M}}\bigr) = \operatorname{Tr}(\rho\,\Pi_{\mathcal{M}}), \tag{10.2} \]
where $\rho$ denotes the density matrix (see footnote 8) $\rho = \sum_k p_k |\xi_k\rangle\langle\xi_k|$. The final expression
in (10.2) is what we take as the general definition of probability.
[1!] Problem 10.1. Prove that the operators of the form $\rho = \sum_k p_k |\xi_k\rangle\langle\xi_k|$
are precisely the Hermitian nonnegative operators with trace 1, i.e., operators
that satisfy the conditions
\[ 1)\ \rho = \rho^\dagger; \qquad 2)\ \forall |\eta\rangle\ \ \langle\eta|\rho|\eta\rangle \ge 0; \qquad 3)\ \operatorname{Tr}\rho = 1. \]
From now on, by a density matrix we will mean an arbitrary operator
with these properties.
The arguments about the "probability distribution on quantum states"
were of an ancillary nature. The problem is how to generalize the notion
of a quantum state to include classical probability distributions. The result
we have obtained (the last expression in (10.2)) depends only on the density
matrix, so that we may postulate that generalized quantum states and density
matrices be the same. If a state is given by a density matrix of rank 1
(i.e., $\rho = |\xi\rangle\langle\xi|$), then it is said to be pure; if it is given by a general density
matrix, it is called mixed.
Definition 10.1. For a quantum state given by a density matrix $\rho$ and a
subspace $\mathcal{M}$, the probability of the "event" $\mathcal{M}$ equals
$\mathbf{P}(\rho, \mathcal{M}) = \operatorname{Tr}(\rho\,\Pi_{\mathcal{M}})$.
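As a small check of (10.2) and Definition 10.1, the sketch below builds the density matrix of a hypothetical ensemble ($|0\rangle$ and $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, each with probability 1/2) and compares the ensemble average of $\langle\xi_k|\Pi_{\mathcal{M}}|\xi_k\rangle$ with $\operatorname{Tr}(\rho\,\Pi_{\mathcal{M}})$. It is plain Python with nested lists as matrices; the ensemble and helper names are assumptions made for illustration.

```python
import math

def outer(u, v):
    """The operator |u><v| as a nested list."""
    return [[a * b.conjugate() for b in v] for a in u]

def matmul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def expectation(v, A):
    """<v|A|v> for a vector v and a matrix A."""
    n = len(v)
    return sum(v[i].conjugate() * A[i][j] * v[j]
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
ensemble = [(0.5, [1.0, 0.0]),   # |0> with probability 1/2
            (0.5, [s, s])]       # |+> with probability 1/2

# rho = sum_k p_k |xi_k><xi_k|
rho = [[sum(p * outer(v, v)[i][j] for p, v in ensemble) for j in range(2)]
       for i in range(2)]

proj_M = outer([1.0, 0.0], [1.0, 0.0])     # projection onto M = C(|0>)

p_average = sum(p * expectation(v, proj_M) for p, v in ensemble)
p_density = trace(matmul(rho, proj_M))     # Tr(rho Pi_M)
```

The two numbers coincide, and `rho` is Hermitian with trace 1, as required of a density matrix.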
Diagonal matrices correspond to classical probability distributions on
the set of basis vectors. Indeed, consider the quantum probability associated
with the diagonal matrix $\rho = \sum_j w_j |j\rangle\langle j|$ and the subspace $\mathcal{M}$ spanned by
a subset of basis vectors $M$. This probability can also be obtained by the
classical formula: $\mathbf{P}(\rho, \mathcal{M}) = \Pr(w, M)$. From the physical point of view,
a classical system is a quantum system that supports only diagonal density
matrices (see discussion of decoherence in the next section). A state of such
a system may be denoted as
\[ \rho = \sum_j w_j \cdot (j). \tag{10.3} \]
Footnote 8. Actually, this is an operator rather than a matrix, although the term "density matrix" is
traditional. In the sequel, we will often have in mind a matrix, i.e., an operator expressed in a
particular basis.
Mathematically, this is just a different notation for the probability distribution
$w$. It is convenient when we need to simultaneously deal with classical
and quantum systems.
Now we continue the comparison of the properties of classical and quantum
probability; for the latter we shall now understand the general definition
in terms of a density matrix. (Properties $1^q$ and $2^q$ remain valid.)
Classical probability versus quantum probability: properties (continued)

  3 (classical). Suppose a probability distribution of the form
  $w_{jk} = w^{(1)}_j w^{(2)}_k$ is specified on the set $N = N_1 \times N_2$. Consider two sets of
  outcomes, $M_1 \subseteq N_1$, $M_2 \subseteq N_2$. Then the probabilities multiply:
  $\Pr(w, M_1 \times M_2) = \Pr(w^{(1)}, M_1)\,\Pr(w^{(2)}, M_2)$.
  $3^q$ (quantum). Suppose a density matrix of the form $\rho_1 \otimes \rho_2$ is defined on the
  space $\mathcal{N} = \mathcal{N}_1 \otimes \mathcal{N}_2$. Consider two subspaces, $\mathcal{M}_1 \subseteq \mathcal{N}_1$, $\mathcal{M}_2 \subseteq \mathcal{N}_2$.
  Then the probabilities likewise multiply:
  $\mathbf{P}(\rho_1 \otimes \rho_2, \mathcal{M}_1 \otimes \mathcal{M}_2) = \mathbf{P}(\rho_1, \mathcal{M}_1)\,\mathbf{P}(\rho_2, \mathcal{M}_2)$.

  4 (classical). Consider a joint probability distribution on the set $N_1 \times N_2$.
  The event we are interested in does not depend on the outcome in the second set,
  i.e., $M = M_1 \times N_2$. The probability of such an event is expressed by
  a "projection" of the distribution onto the first set:
  $\Pr(w, M_1 \times N_2) = \Pr(w', M_1)$, where $w'_j = \sum_k w_{jk}$.
  $4^q$ (quantum). In the quantum case, the restriction to one of the subsystems is
  described by taking a partial trace (see below). Thus, even if the initial state
  was pure, the resulting state of the subsystem may turn out to be mixed:
  $\mathbf{P}(\rho, \mathcal{M}_1 \otimes \mathcal{N}_2) = \mathbf{P}(\operatorname{Tr}_{\mathcal{N}_2}\rho, \mathcal{M}_1)$.
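Definition 10.2. Let $X \in \mathbf{L}(\mathcal{N}_1 \otimes \mathcal{N}_2) = \mathbf{L}(\mathcal{N}_1) \otimes \mathbf{L}(\mathcal{N}_2)$. The partial
trace of the operator $X$ over the space $\mathcal{N}_2$ is defined as follows: if
$X = \sum_m A_m \otimes B_m$, then $\operatorname{Tr}_{\mathcal{N}_2} X = \sum_m A_m (\operatorname{Tr} B_m)$.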
Due to the universality property of the tensor product (see p. 55), the
partial trace does not depend on the choice of summands in the representation
$X = \sum_m A_m \otimes B_m$. This may seem somewhat obscure, so we will
give a direct proof. Let us choose orthonormal bases in the spaces $\mathcal{N}_1$, $\mathcal{N}_2$
and express the partial trace in terms of the matrix elements
$X_{jj'kk'} = \langle j, j'|X|k, k'\rangle$. Let
\[ A_m = \sum_{j,k} a^m_{jk}\,|j\rangle\langle k| \quad\text{and}\quad B_m = \sum_{j',k'} b^m_{j'k'}\,|j'\rangle\langle k'|. \]
Then
\[ X = \sum_{j,j',k,k'} X_{jj'kk'}\,|j,j'\rangle\langle k,k'| = \sum_m A_m \otimes B_m
 = \sum_{j,j',k,k',m} a^m_{jk}\, b^m_{j'k'}\,|j,j'\rangle\langle k,k'|, \]
so the partial trace equals
\[ \operatorname{Tr}_{\mathcal{N}_2} X = \sum_m \sum_{j,k} a^m_{jk} \Bigl(\sum_l b^m_{ll}\Bigr) |j\rangle\langle k| = \sum_{j,k} \sum_l X_{jlkl}\,|j\rangle\langle k|. \]
The last expression involves only the matrix elements of $X$, not the summands $A_m$, $B_m$.
Let us consider an example where taking the partial trace of the density
matrix corresponding to a pure state leads to the density matrix corresponding
to a mixed state.
Let $\mathcal{N}_1 = \mathcal{N}_2 = \mathcal{B}$ and $\rho = |\psi\rangle\langle\psi|$, where
$|\psi\rangle = \frac{1}{\sqrt{2}}(|0,0\rangle + |1,1\rangle)$. In
this case $\rho = \frac{1}{2}\sum_{a,b} |a,a\rangle\langle b,b|$, thus we obtain
\[ \operatorname{Tr}_{\mathcal{N}_2}\rho = \frac{1}{2}\sum_a |a\rangle\langle a| = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}. \]
This matrix corresponds to a mixed state (pure states correspond to matrices
of rank 1). Moreover, this mixed state is equivalent to a classical probability
distribution: 0 and 1 have probabilities 1/2. Thus, discarding the second
qubit yields a purely classical probability distribution on the first qubit.
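The coordinate formula $(\operatorname{Tr}_{\mathcal{N}_2} X)_{jk} = \sum_l X_{jlkl}$ translates directly into code. The plain-Python sketch below applies it to the example just given; the storage convention (the pair $(j, l)$ is kept at index $j \cdot d_2 + l$) and the function names are assumptions of the sketch.

```python
import math

def outer(u, v):
    """The operator |u><v| as a nested list."""
    return [[a * b.conjugate() for b in v] for a in u]

def partial_trace_second(X, d1, d2):
    """(Tr_{N2} X)_{jk} = sum_l X_{(j,l),(k,l)}, with (j,l) stored at index j*d2 + l."""
    return [[sum(X[j * d2 + l][k * d2 + l] for l in range(d2))
             for k in range(d1)] for j in range(d1)]

s = 1 / math.sqrt(2)
psi = [s, 0.0, 0.0, s]                  # (|0,0> + |1,1>)/sqrt(2)
rho = outer(psi, psi)                   # pure state of two qubits
reduced = partial_trace_second(rho, 2, 2)

# Purity Tr(reduced^2): equal to 1 for a pure state, smaller for a mixed one.
purity = sum(reduced[i][k] * reduced[k][i]
             for i in range(2) for k in range(2))
```

The reduced matrix is diagonal with both entries equal to 1/2, and its purity is 1/2, which confirms that the state of the first qubit is mixed.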
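Proposition 10.1. An arbitrary mixed state $\rho \in \mathbf{L}(\mathcal{N})$ can be represented
as the partial trace $\operatorname{Tr}_{\mathcal{F}}(|\psi\rangle\langle\psi|)$ of a pure state of a larger system,
$|\psi\rangle \in \mathcal{N} \otimes \mathcal{F}$. Such $|\psi\rangle$ is called a purification of $\rho$. (We may assume that
$\dim\mathcal{F} = \dim\mathcal{N}$.)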
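Proof. Set $\mathcal{F} = \mathcal{N}^*$. Since $\rho$ is a nonnegative (= positive semidefinite)
Hermitian operator, there exists $\sqrt{\rho} \in \mathbf{L}(\mathcal{N}) = \mathcal{N} \otimes \mathcal{N}^*$. More explicitly,
let us choose an orthonormal basis $\{|\xi_j\rangle\}$ in which $\rho$ is diagonal, i.e.,
$\rho = \sum_j p_j |\xi_j\rangle\langle\xi_j|$. Then $\sqrt{\rho} = \sum_j \sqrt{p_j}\,|\xi_j\rangle\langle\xi_j|$.
Let us regard $\sqrt{\rho}$ as a vector of the space $\mathcal{N} \otimes \mathcal{N}^*$:
\[ |\sqrt{\rho}\rangle = |\psi\rangle = \sum_j \sqrt{p_j}\,|\xi_j\rangle \otimes |\eta_j\rangle, \quad\text{where } |\eta_j\rangle = \langle\xi_j| \in \mathcal{N}^*. \]
This vector satisfies the desired requirements, i.e., $\operatorname{Tr}_{\mathcal{F}}(|\psi\rangle\langle\psi|) = \rho$. Indeed,
\[ |\psi\rangle\langle\psi| = \sum_{j,k} \sqrt{p_j p_k}\,\bigl(|\xi_j\rangle \otimes |\eta_j\rangle\bigr)\bigl(\langle\xi_k| \otimes \langle\eta_k|\bigr). \]
Only terms with $j = k$ contribute to the partial trace. Therefore
\[ \operatorname{Tr}_{\mathcal{N}^*}(|\psi\rangle\langle\psi|) = \sum_j p_j |\xi_j\rangle\langle\xi_j| = \rho. \qquad\Box \]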
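The construction in the proof can be tried on a concrete single-qubit example. The sketch below is plain Python and rests on assumptions of its own: the eigenvalues 0.7 and 0.3, the real eigenvectors $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$, and the identification of $\mathcal{N}^*$ with $\mathcal{N}$ by complex conjugation of coordinates (which is trivial for real vectors).

```python
import math

def outer(u, v):
    """The operator |u><v| as a nested list."""
    return [[a * b.conjugate() for b in v] for a in u]

def kron_vec(u, v):
    """Tensor product of two vectors; component (j, l) is stored at index j*len(v) + l."""
    return [a * b for a in u for b in v]

def partial_trace_second(X, d1, d2):
    """(Tr_F X)_{jk} = sum_l X_{(j,l),(k,l)}."""
    return [[sum(X[j * d2 + l][k * d2 + l] for l in range(d2))
             for k in range(d1)] for j in range(d1)]

s = 1 / math.sqrt(2)
plus, minus = [s, s], [s, -s]     # orthonormal eigenvectors |xi_j>
p = [0.7, 0.3]                    # eigenvalues p_j

# rho = sum_j p_j |xi_j><xi_j|
rho = [[p[0] * plus[i] * plus[j] + p[1] * minus[i] * minus[j]
        for j in range(2)] for i in range(2)]

# |psi> = sum_j sqrt(p_j) |xi_j> (x) |eta_j>, with |eta_j> the conjugate of |xi_j>
psi = [math.sqrt(p[0]) * a + math.sqrt(p[1]) * b
       for a, b in zip(kron_vec(plus, plus), kron_vec(minus, minus))]

reduced = partial_trace_second(outer(psi, psi), 2, 2)   # should reproduce rho
norm_sq = sum(abs(a) ** 2 for a in psi)                 # should equal 1
```

Tracing out the auxiliary qubit returns the original matrix, whose entries here are 0.5 on the diagonal and 0.2 off the diagonal.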
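[2!] Problem 10.2. Consider a pure state $|\psi\rangle \in \mathcal{N} \otimes \mathcal{F}$. Show that the
so-called Schmidt decomposition holds:
\[ |\psi\rangle = \sum_j \lambda_j\,|\xi_j\rangle \otimes |\eta_j\rangle, \]
where $0 < \lambda_j \le 1$, and the sets of vectors $\{|\xi_j\rangle\} \subset \mathcal{N}$ and $\{|\eta_j\rangle\} \subset \mathcal{F}$ are
orthonormal.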
Note that the numbers $\lambda_j^2$ are the nonzero eigenvalues of the partial
traces $\rho = \operatorname{Tr}_{\mathcal{F}}(|\psi\rangle\langle\psi|)$ and $\rho' = \operatorname{Tr}_{\mathcal{N}}(|\psi\rangle\langle\psi|)$. (Hence the nonzero eigenvalues
of $\rho$ and $\rho'$ coincide.) The number of such eigenvalues equals the rank of
$\rho$ and $\rho'$. For example, if $\operatorname{rank}(\rho) = 1$, the Schmidt decomposition consists
of one term, and vice versa. Thus the state $\rho = \operatorname{Tr}_{\mathcal{F}}(|\psi\rangle\langle\psi|)$ is pure if and
only if $|\psi\rangle$ is a product state, i.e., $|\psi\rangle = |\xi\rangle \otimes |\eta\rangle$. In general, $\operatorname{rank}(\rho)$ is the
smallest dimension of the auxiliary space $\mathcal{F}$ which allows a purification of $\rho$.
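[2!] Problem 10.3 ("Purification is unique up to unitary equivalence").
Let $|\psi_1\rangle, |\psi_2\rangle \in \mathcal{N} \otimes \mathcal{F}$ be two pure states such that
$\operatorname{Tr}_{\mathcal{F}}(|\psi_1\rangle\langle\psi_1|) = \operatorname{Tr}_{\mathcal{F}}(|\psi_2\rangle\langle\psi_2|)$.
Prove that $|\psi_2\rangle = (I_{\mathcal{N}} \otimes U)|\psi_1\rangle$ for some unitary operator
$U$ on the space $\mathcal{F}$.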
10.3. Distance functions for density matrices. In practice, various
mixed states are always specified with some precision, so we need to somehow
measure "distance" between density matrices. What would the most natural
definition of this distance be? To begin with, let us ask the same question
for probability distributions.
Let $w$ be the probability distribution of an outcome produced by some
device. Suppose that the device is faulty, i.e., with some probability $\varepsilon$ it goes
completely wrong, but with probability $1 - \varepsilon$ it works as expected. What
can one tell about the actual probability distribution $w'$ of the outcome of
such a device? The answer is
\[ \sum_j |w'_j - w_j| \le 2\varepsilon. \tag{10.4} \]
Conversely, if the inequality (10.4) is true, we can represent $w'$ as the probability
distribution produced by a pipeline of two processes: the first generates
$j$ according to the distribution $w$, whereas the second alters $j$ with total
probability $\le \varepsilon$. We conclude that the natural distance between probability
distributions is given by the $\ell_1$ norm, $\|w - w'\|_1 = \sum_j |w'_j - w_j|$. Now we
will generalize this definition to arbitrary density matrices.
Definition 10.3. The trace norm of an operator $A \in \mathbf{L}(\mathcal{N})$ is
\[ \|A\|_{\mathrm{tr}} = \operatorname{Tr}\bigl(\sqrt{A^\dagger A}\bigr). \tag{10.5} \]
For Hermitian operators, the trace norm is the sum of the moduli of the
eigenvalues.
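[2] Problem 10.4. Verify that (10.5) actually defines a norm. Prove that
\[ \|A\|_{\mathrm{tr}} = \sup_{B \ne 0} \frac{|\operatorname{Tr} AB|}{\|B\|} = \max_{U \in \mathbf{U}(\mathcal{N})} |\operatorname{Tr} AU|, \tag{10.6} \]
\[ \|A\|_{\mathrm{tr}} = \inf\Bigl\{ \sum_k \bigl\| |\xi_k\rangle \bigr\|\,\bigl\| |\eta_k\rangle \bigr\| \ :\ \sum_k |\xi_k\rangle\langle\eta_k| = A \Bigr\} \tag{10.7} \]
($\|X\|$ denotes the operator norm; cf. Definition 8.2 on p. 71).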
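[1] Problem 10.5. Verify that the trace norm has the following properties:
a) $\|AB\|_{\mathrm{tr}},\ \|BA\|_{\mathrm{tr}} \le \|A\|_{\mathrm{tr}}\,\|B\|$;
b) $|\operatorname{Tr} A| \le \|A\|_{\mathrm{tr}}$;
c) $\|\operatorname{Tr}_{\mathcal{M}} A\|_{\mathrm{tr}} \le \|A\|_{\mathrm{tr}}$;
d) $\|A \otimes B\|_{\mathrm{tr}} = \|A\|_{\mathrm{tr}}\,\|B\|_{\mathrm{tr}}$.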
The following lemma shows why the trace norm for density matrices can
be regarded as the analogue of the $\ell_1$-norm for probability distributions.
Lemma 10.2. Let $\mathcal{N} = \bigoplus_j \mathcal{N}_j$ be a decomposition of $\mathcal{N}$ into the direct sum
of mutually orthogonal subspaces. Then for any pair of density matrices $\rho$
and $\gamma$,
\[ \sum_j \bigl|\mathbf{P}(\rho, \mathcal{N}_j) - \mathbf{P}(\gamma, \mathcal{N}_j)\bigr| \le \|\rho - \gamma\|_{\mathrm{tr}}. \]
Proof. The left-hand side of the inequality can be represented in the form
$\operatorname{Tr}((\rho - \gamma)U)$, where $U = \sum_j (\pm\Pi_{\mathcal{N}_j})$. It is clear that $U$ is unitary. We then
apply the representation of the trace norm in the form (10.6). $\Box$
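For a single qubit the trace norm of a Hermitian operator can be computed in closed form, which makes it easy to check Lemma 10.2 and the reduction to the $\ell_1$ distance numerically. The following plain-Python sketch assumes $2 \times 2$ Hermitian matrices only; the states $|0\rangle\langle 0|$ and $|+\rangle\langle +|$ and the two classical distributions are example data, not taken from the text.

```python
import math

def trace_norm_hermitian_2x2(A):
    """Sum of |eigenvalues| of a Hermitian 2x2 matrix [[a, b], [conj(b), d]]."""
    a, d = A[0][0].real, A[1][1].real
    b = A[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    # The eigenvalues are mean + radius and mean - radius.
    return abs(mean + radius) + abs(mean - radius)

rho = [[1.0, 0.0], [0.0, 0.0]]       # |0><0|
gamma = [[0.5, 0.5], [0.5, 0.5]]     # |+><+|
diff = [[rho[i][j] - gamma[i][j] for j in range(2)] for i in range(2)]

distance = trace_norm_hermitian_2x2(diff)   # ||rho - gamma||_tr

# Left-hand side of Lemma 10.2 for the decomposition N = C(|0>) (+) C(|1>):
# P(rho, N_j) is the j-th diagonal entry of rho.
lhs = sum(abs(rho[j][j] - gamma[j][j]) for j in range(2))

# Diagonal (classical) case: the trace norm reduces to the l1 distance.
w, w_prime = [0.7, 0.3], [0.4, 0.6]
classical = trace_norm_hermitian_2x2(
    [[w[0] - w_prime[0], 0.0], [0.0, w[1] - w_prime[1]]])
l1 = sum(abs(a - b) for a, b in zip(w, w_prime))
```

In this example the inequality of the lemma is strict: measuring in the classical basis distinguishes the two states less well than the trace norm allows.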
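There is another commonly used distance function on density matrices,
called the fidelity distance. Let $\rho, \gamma \in \mathbf{L}(\mathcal{N})$. Consider all possible purifications
of $\rho$ and $\gamma$ over an auxiliary space $\mathcal{F}$ of dimension $\dim\mathcal{F} = \dim\mathcal{N}$;
these are pure states $|\xi\rangle, |\eta\rangle \in \mathcal{N} \otimes \mathcal{F}$. Then the fidelity distance between $\rho$
and $\gamma$ is
\[ d_F(\rho, \gamma) \stackrel{\mathrm{def}}{=} \min\Bigl\{ \bigl\| |\xi\rangle - |\eta\rangle \bigr\| \ :\ \operatorname{Tr}_{\mathcal{F}}(|\xi\rangle\langle\xi|) = \rho,\ \operatorname{Tr}_{\mathcal{F}}(|\eta\rangle\langle\eta|) = \gamma \Bigr\}. \]
It is related to a quantity called fidelity:
\[ F(\rho, \gamma) \stackrel{\mathrm{def}}{=} \max\Bigl\{ \bigl|\langle\xi|\eta\rangle\bigr|^2 \ :\ \operatorname{Tr}_{\mathcal{F}}(|\xi\rangle\langle\xi|) = \rho,\ \operatorname{Tr}_{\mathcal{F}}(|\eta\rangle\langle\eta|) = \gamma \Bigr\}. \tag{10.8} \]
(One can show that the condition $\dim\mathcal{F} = \dim\mathcal{N}$ in these definitions can
be relaxed: it is sufficient to require that $\dim\mathcal{F} \ge \max\{\operatorname{rank}(\rho), \operatorname{rank}(\gamma)\}$.
Thus, any auxiliary space $\mathcal{F}$ will do, as long as it allows purifications of $\rho$
and $\gamma$.)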
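Problem 10.6. Prove that
[1] a) $d_F(\rho, \gamma) = \sqrt{2\bigl(1 - \sqrt{F(\rho, \gamma)}\bigr)}$;
[2] b) $F(\rho, \gamma) = \bigl\|\sqrt{\rho}\,\sqrt{\gamma}\bigr\|_{\mathrm{tr}}^2$;
[3] c) $\Bigl(1 - \dfrac{\|\rho - \gamma\|_{\mathrm{tr}}}{2}\Bigr)^2 \le F(\rho, \gamma) \le 1 - \Bigl(\dfrac{\|\rho - \gamma\|_{\mathrm{tr}}}{2}\Bigr)^2$.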
11. Physically realizable transformations of
density matrices
In this section we introduce a formalism for the description of irreversible
quantum processes. We will not use it in full generality (so some of the
results are superfluous), but the basic concepts and examples will be helpful.
11.1. Physically realizable superoperators: characterization.
All transformations of density matrices we will encounter can be represented
by linear maps between operator spaces, $\mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$. A general linear
map of this type is called a superoperator. We now describe those superoperators
that are admissible from the physical point of view.
1. A unitary operator takes the density matrix of a pure state $\rho = |\xi\rangle\langle\xi|$
to the matrix $\rho' = U|\xi\rangle\langle\xi|U^\dagger$. It is natural to assume (by linearity) that
such a formula also yields the action of a unitary operator on an arbitrary
density matrix:
\[ \rho \overset{U}{\longmapsto} U\rho U^\dagger. \]
2. A second type of transformation is the operation of taking the partial
trace. If $\rho \in \mathbf{L}(\mathcal{N} \otimes \mathcal{F})$, then the operation of discarding the second subsystem
is described by the superoperator
\[ \operatorname{Tr}_{\mathcal{F}} : \rho \mapsto \operatorname{Tr}_{\mathcal{F}}\rho. \]
3. We recall that it has been useful to us to borrow qubits in the state $|0\rangle$.
Let the state $\rho \in \mathbf{L}(\mathcal{B}^{\otimes n})$. We consider the isometric (preserving the inner
product) embedding $V : \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes N}$ in a space of larger dimension, given
by the formula $|\xi\rangle \overset{V}{\longmapsto} |\xi\rangle \otimes |0^{N-n}\rangle$. The density matrix $\rho$ is transformed
thereby into $\rho \otimes |0^{N-n}\rangle\langle 0^{N-n}|$. For any isometric embedding $V$ we similarly
obtain a superoperator $V \cdot V^\dagger$ that acts as follows:
\[ V \cdot V^\dagger : \rho \mapsto V\rho V^\dagger. \]
We postulate that a physically realizable superoperator is a composition
of an arbitrary number of transformations of types 2 and 3 (type 1 is a
special case of 3).
[3!] Problem 11.1. Prove that a superoperator $T$ is physically realizable if
and only if it has the form
\[ T = \operatorname{Tr}_{\mathcal{F}}(V \cdot V^\dagger) : \rho \mapsto \operatorname{Tr}_{\mathcal{F}}(V\rho V^\dagger), \tag{11.1} \]
where $V : \mathcal{N} \to \mathcal{N} \otimes \mathcal{F}$ is an isometric embedding.
[2!] Problem 11.2 ("Operator sum decomposition"). Prove that a superoperator
$T$ is physically realizable if and only if it can be represented in the
form
\[ T = \sum_m A_m \cdot A_m^\dagger : \rho \mapsto \sum_m A_m \rho A_m^\dagger, \quad\text{where } \sum_m A_m^\dagger A_m = I. \tag{11.2} \]
The operation of taking the partial trace means forgetting (discarding)
one of the subsystems. We show that such an interpretation is reasonable,
specifically that the subsequent fate of the discarded system in no way
influences the quantities characterizing the remaining system. Let us take a
system consisting of two subsystems, which is in some state $\rho \in \mathbf{L}(\mathcal{N} \otimes \mathcal{F})$.
If we discard the second subsystem (to the trash), then it will be subjected
to uncontrollable influences. Suppose we apply some operator $U$ to the first
subsystem. We will then obtain a state $\gamma = (U \otimes Y)\rho(U \otimes Y)^\dagger$, where $Y$
is an arbitrary unitary operator (the action of the trash bin on the trash).
If we wish to find the probability for some subspace $\mathcal{M} \subseteq \mathcal{N}$ pertaining to
the first subsystem (the trash doesn't interest us), then the result does not
depend on $Y$ and equals
\[ \mathbf{P}(\gamma, \mathcal{M} \otimes \mathcal{F}) = \mathbf{P}(\operatorname{Tr}_{\mathcal{F}}\gamma, \mathcal{M}) = \mathbf{P}\bigl(U(\operatorname{Tr}_{\mathcal{F}}\rho)U^\dagger, \mathcal{M}\bigr). \]
Here the first equality is the property $4^q$ of quantum probability, whereas
the second equality represents a new property:
\[ \operatorname{Tr}_{\mathcal{F}}\bigl((U \otimes Y)\rho(U \otimes Y)^\dagger\bigr) = U(\operatorname{Tr}_{\mathcal{F}}\rho)U^\dagger. \tag{11.3} \]
[1] Problem 11.3. Prove the identity (11.3).
[2!] Problem 11.4. Let us write a superoperator $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ in the
coordinate form:
\[ T\bigl(|j\rangle\langle k|\bigr) = \sum_{j',k'} T_{(j'j)(k'k)}\,|j'\rangle\langle k'|. \]
Prove that the physical realizability of $T$ is equivalent to the set of three
conditions:
a) $\sum_l T_{(lj)(lk)} = \delta_{jk}$ (Kronecker symbol);
b) $T^*_{(j'j)(k'k)} = T_{(k'k)(j'j)}$;
c) The matrix $\bigl(T_{(j'j)(k'k)}\bigr)$ is nonnegative (each of the index pairs is regarded
as a single index).
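[3!] Problem 11.5. Prove that a superoperator $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ is
physically realizable if and only if it satisfies the following three conditions:
a) $\operatorname{Tr}(TX) = \operatorname{Tr} X$ for any $X \in \mathbf{L}(\mathcal{N})$;
b) $(TX)^\dagger = TX^\dagger$ for any $X \in \mathbf{L}(\mathcal{N})$;
c) $T$ is completely positive. Namely, for any additional space $\mathcal{G}$ the superoperator
$T \otimes I_{\mathbf{L}(\mathcal{G})} : \mathbf{L}(\mathcal{N} \otimes \mathcal{G}) \to \mathbf{L}(\mathcal{M} \otimes \mathcal{G})$ maps nonnegative operators
to nonnegative operators.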
11.2. Calculation of the probability for quantum computation.
Now, since we have the general definitions of quantum probability and of a
physically realizable transformation of density matrices, there are two ways
to calculate the probability that enters the definition of quantum computation.
Suppose we use a supplementary subsystem. After we no longer need
it, we can discard it to the trash and, in counting the probability, take the
partial trace over the state space of the supplementary subsystem. Or else
we may hold all the trash until the very end and consider the probability
of an event of the form $\mathcal{M}_1 \otimes \mathcal{N}_2$ (once we have stopped using the second
subsystem, no details of its existence are of any importance to us and we are
not interested in what precisely happens to it in the trash bin). As already
stated, these probabilities are equal:
$\mathbf{P}(\rho, \mathcal{M}_1 \otimes \mathcal{N}_2) = \mathbf{P}(\operatorname{Tr}_{\mathcal{N}_2}\rho, \mathcal{M}_1)$.
Remark 11.1. It is not difficult to define a more general model of quantum
computation in which suitable physically realizable superoperators (not necessarily
corresponding to unitary operators) serve as the elementary gates.
Such a model of computation with mixed states is more adequate in the physical
situation where the quantum computer interacts with the surrounding
environment. In particular, one can consider things like a combination of classical
and quantum computation. From the complexity point of view, the new
model is polynomially equivalent to the standard one, if a complete basis is
used in both cases. (Completeness in the new model is most comprehensively
defined as the possibility to effect arbitrary unitary operators on "encoded
qubits"; cf. Remark 8.2.) We also note that in the model of computation
with mixed states a more natural definition of a probabilistic subroutine is
possible. We will not give this definition here, but refer interested readers
to [4].
11.3. Decoherence. The term "decoherence" is generally used to denote
irreversible degradation of a quantum state caused by its interaction with the
environment. This could be an arbitrary physically realizable superoperator
that takes pure states to mixed states. For the purpose of our discussion,
decoherence means the specific superoperator $D$ that "forgets" off-diagonal
matrix elements:
\[ \rho = \sum_{j,k} \rho_{jk}\,|j\rangle\langle k| \ \overset{D}{\longmapsto}\ \sum_k \rho_{kk}\,|k\rangle\langle k|. \]
This superoperator is also known as an extreme case of a "phase damping
channel". We will show that it is physically realizable. For simplicity, let us
assume that $D$ acts on a single qubit.
The action of $D$ on a density matrix $\rho$ can be performed in three steps.
First, we append a null qubit:
\[ \rho \mapsto \rho \otimes |0\rangle\langle 0|. \]
Then we "copy" the original qubit into the ancilla. This can be achieved by
applying the operator $\Lambda(\sigma^x) : |a, b\rangle \mapsto |a, a \oplus b\rangle$. We get
\[ \rho \otimes |0\rangle\langle 0| \ \overset{\Lambda(\sigma^x)}{\longmapsto}\ \sum_{j,k} \rho_{jk}\,|j, j\rangle\langle k, k|. \]
Finally, we take the partial trace over the ancilla, which yields the diagonal
matrix
\[ \sum_k \rho_{kk}\,|k\rangle\langle k|. \]
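The three steps above can be sketched in plain Python for a single qubit. The conventions are assumptions of the sketch: the basis state $|a, b\rangle$ is stored at index $2a + b$, the controlled-NOT is applied as a permutation of basis indices, and the input density matrix is a made-up example.

```python
def append_zero_ancilla(rho):
    """Step 1: rho -> rho (x) |0><0|; the basis state |a,b> has index 2*a + b."""
    big = [[0j] * 4 for _ in range(4)]
    for j in range(2):
        for k in range(2):
            big[2 * j][2 * k] = rho[j][k]
    return big

def cnot_index(x):
    """Action of Lambda(sigma^x) on basis states: |a,b> -> |a, a xor b>."""
    a, b = x >> 1, x & 1
    return 2 * a + (a ^ b)

def apply_cnot(X):
    """Step 2: X -> U X U^dagger for the permutation matrix U = Lambda(sigma^x)."""
    out = [[0j] * 4 for _ in range(4)]
    for r in range(4):
        for c in range(4):
            out[cnot_index(r)][cnot_index(c)] = X[r][c]
    return out

def trace_out_ancilla(X):
    """Step 3: partial trace over the second qubit."""
    return [[X[2 * j][2 * k] + X[2 * j + 1][2 * k + 1] for k in range(2)]
            for j in range(2)]

# Hypothetical single-qubit density matrix with nonzero off-diagonal elements.
rho = [[0.6, 0.3 - 0.1j], [0.3 + 0.1j, 0.4]]
decohered = trace_out_ancilla(apply_cnot(append_zero_ancilla(rho)))
```

The diagonal entries of `rho` survive unchanged, while the off-diagonal entries end up in positions that the partial trace does not see.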
Warning. The "copying operation" we considered:
\[ |j\rangle \mapsto |j, j\rangle, \qquad \sum_{j,k} \rho_{jk}\,|j\rangle\langle k| \mapsto \sum_{j,k} \rho_{jk}\,|j, j\rangle\langle k, k| \]
(the composition of the first two transformations) in fact copies only the
basis states. We note that the copying of an arbitrary quantum state
$|\xi\rangle \mapsto |\xi\rangle \otimes |\xi\rangle$ is a nonlinear operator and so cannot be physically realized. (This
statement is called a "no-cloning theorem".) We will take the liberty of
calling the operator of the form
\[ \sum_j c_j\,|\xi_j\rangle \mapsto \sum_j c_j\,|\xi_j\rangle \otimes |\xi_j\rangle \]
copying relative to the orthonormal basis $\{|\xi_j\rangle\}$.
So, the decoherence superoperator $D$ translates any state into a classical
one (with diagonal density matrix) by copying qubits. This can be interpreted
as follows: if we constantly observe a quantum system (make copies),
then the system will behave like a classical one. Thus the copying operation,
together with "forgetting" about the copy (i.e., the partial trace), provides
a conceptual link between quantum mechanics and the classical picture of
the world.
Remark 11.2 (Decoherence in physics). In Nature, decoherence by
"copying to the environment" is very common and, of course, does not
require a human observer. Let us consider one famous example of a quantum
phenomenon: an interference pattern formed by a single photon. It is
known from classical optics that a light beam passing through two parallel
slits forms a pattern of bright and dark stripes on a screen placed behind
the slits. This pattern can be recorded if one uses a photographic film
as the screen. When the light is dim, the photographic image consists of
random dots produced by individual photons (i.e., quanta of light). The
probability for a dot to appear at a given position (see footnote 9) $x$ is the probability that
a photon hits the film at the point $x$. What will happen if the light is so
dim that only one photon reaches the film? Quantum mechanics predicts
that the photon arrives in a certain superposition $|\psi\rangle = \sum_x c_x |x\rangle$, so the
above probability equals $|c_x|^2$. Thus, the quantum state of the photon is
transformed into a classical object: a dot, located at a particular place
(although the appearance of the dot at a given position $x$ occurs with the
probability related to the corresponding amplitude $c_x$). When and how does
this transition happen?
When the photon hits the film, it breaks a chemical bond and generates a
defect in a light-sensitive grain (usually, a small crystal of silver compound).
The photon is delocalized in space, so a superposition of states with the defect
located at different places is initially created. Basically, this is the same
state $|\psi\rangle = \sum_x c_x |x\rangle$, but $x$ now indicates the position of the defect. The
transition from the quantum state $|\psi\rangle$ to the classical state $\sum_x |c_x|^2\,|x\rangle\langle x|$
is decoherence. It occurs long before anyone sees the image, even before the
film is developed. About every $10^{-12}$ seconds since the defect was created,
it scatters a phonon (a quantum of sonic vibrations). This has the effect of
"copying" the state of the defect to phonon degrees of freedom relative to
the position basis.
The above explanation is based on the assumption that the phonon
scattering (or whatever causes the decoherence) is irreversible. But what
does this assumption mean if the scattering is just a unitary process which
occurs in the film? In the preceding mathematical discussion, the irreversible
step was the partial trace; it was justified by the fact that the copy was
"discarded", i.e., never used again. On the contrary, the scattered phonon
stays around and can, in principle, scatter back to "undo the copying".
In reality, however, the scattering does not reverse by itself. One reason
is that the phonons interact with other phonons, causing the "copies" to
multiply. The quantum state quickly becomes so entangled that it cannot
be disentangled. (Well, this argument is more empirical than logical; it
basically says that things can be lost and never found, which is a true fact with
no "proof" whatsoever. For some particular classes of Hamiltonians, some
assertions about irreversibility, like "information escapes to infinity", can
be formulated mathematically. Proving statements of this kind is a difficult
and generally unsolved problem.)
Footnote 9. We think of the position on the film as a discrete variable; specifically, it refers to a grain of
light-sensitive substance. The whole grain either turns dark (if it caught a photon) or stays white
when the film is developed. Speaking about photons, we have oversimplified the situation for
illustrative purposes. In modern films, a single photon does not yet produce a sufficient change to
be developed, but several (3 or more) photons per grain do. For single photon detection, physicists
use other kinds of devices, e.g., ones based on semiconductors.
Some irreversibility postulate or assumption is necessary to give an
interpretation of quantum mechanics, i.e., to introduce a classical observer.
It seems that the exact nature of irreversibility is the real question behind
many philosophical debates surrounding quantum mechanics. Another thing
that is not fully understood is the meaning of probability in the physical
world. Both problems exist in classical physics as well; quantum mechanics
just makes them more evident.
Fortunately (especially for mathematicians), the theory of quantum computation
deals with an ideal world where nothing gets lost. If we observe
something, we can also "un-observe", unless we explicitly choose to discard
the result or to keep it as the computation output. As far as probabilities
are concerned, we deal with them formally rather than trying to explain
them by something else. [End of remark]
In the case of a single qubit, the decoherence superoperator (the off-diagonal
matrix elements set to zero) can also be obtained if we apply the
operator $\sigma^z$ with probability 1/2:
\[ \rho \mapsto \frac{1}{2}\rho + \frac{1}{2}\sigma^z \rho\,\sigma^z. \]
Such a process is called random dephasing: the state $|1\rangle$ is multiplied by the
phase factor $-1$ with probability 1/2. Thus, the dephasing leads likewise to
the situation that the system behaves classically.
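The random dephasing formula is short enough to verify directly. The sketch below, in plain Python, uses the fact that conjugation by $\sigma^z = \operatorname{diag}(1, -1)$ multiplies the entry $\rho_{jk}$ by $(-1)^{j+k}$; the input matrix is a hypothetical example.

```python
def conjugate_by_sigma_z(rho):
    """sigma^z rho sigma^z: flips the sign of the off-diagonal entries."""
    signs = [1, -1]
    return [[signs[j] * rho[j][k] * signs[k] for k in range(2)]
            for j in range(2)]

def random_dephasing(rho):
    """rho -> (1/2) rho + (1/2) sigma^z rho sigma^z."""
    flipped = conjugate_by_sigma_z(rho)
    return [[0.5 * rho[j][k] + 0.5 * flipped[j][k] for k in range(2)]
            for j in range(2)]

# Hypothetical single-qubit density matrix.
rho = [[0.6, 0.3 - 0.1j], [0.3 + 0.1j, 0.4]]
dephased = random_dephasing(rho)
```

The off-diagonal entries cancel and the diagonal ones are preserved, so the result is the same diagonal matrix that the decoherence superoperator $D$ produces.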
[2!] Problem 11.6. Suppose we have a physically realizable superoperator
$T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N} \otimes \mathcal{F})$ with the following property: $\operatorname{Tr}_{\mathcal{F}}(T\rho) = \rho$ for any
pure state $\rho$. Prove that $TX = X \otimes \gamma$ (for any operator $X$), where $\gamma$ is a
fixed density matrix on the space $\mathcal{F}$.
Think of $\mathcal{N}$ as a system one wants to observe, and $\mathcal{F}$ as an "observation
record". Then the condition $\operatorname{Tr}_{\mathcal{F}}(T\rho) = \rho$ indicates that the superoperator $T$
does not perturb the system, whereas $TX = X \otimes \gamma$ means that the obtained
record does not carry any information about $\rho$. Thus, it is impossible to
get any information about an unknown quantum state without perturbing the
state.
11.4. Measurements. In describing quantum algorithms, it is often natural
(though not mathematically necessary) to assume that, together with
a quantum computational system, a classical one might also be used. An
important type of interaction between quantum and classical parts is measurement
of a quantum register. It yields a classical "record" (outcome),
while the quantum register may remain in a modified state, or may be destroyed.
Consider a system consisting of two parts, a quantum part ($\mathcal{N}$) and
a classical part ($\mathcal{K}$). The density matrix is diagonal with respect to the
classical indices, i.e.,
\[ \rho = \sum_{j,k,l} \rho_{jkll}\,\bigl(|j\rangle\langle k|\bigr) \otimes \bigl(|l\rangle\langle l|\bigr) = \sum_l w_l\,\gamma^{(l)} \otimes |l\rangle\langle l|, \]
where $w_l = \sum_j \rho_{jjll}$ is the probability of having the classical state $l$, and
the operator $\gamma^{(l)} = w_l^{-1} \sum_{j,k} \rho_{jkll}\,|j\rangle\langle k|$ possesses all the properties of a density
matrix. In this manner, quantum-classical states are always decomposed
into "conditional" (by analogy with conditional probability) density matrices
$\gamma^{(l)}$. In such a case we will use a special notation similar to
that of equation (10.3): $\rho = \sum_l w_l \cdot (\gamma^{(l)}, l) = \sum_l (w_l \gamma^{(l)}, l)$. (Here the dot
does not have a special meaning as, e.g., in (11.1); it just indicates that $w_l$ is
a factor rather than a function.)
Suppose we have a set of mutually exclusive possibilities, which is expressed
as a decomposition of the state space into a direct sum of pairwise
orthogonal subspaces, $\mathcal{N} = \bigoplus_{j \in \Omega} \mathcal{L}_j$, where $\Omega = \{1, \dots, r\}$ is the
set of corresponding classical outcomes. (We say "mutually exclusive" because,
if a subspace $\mathcal{L}'$ is orthogonal to a subspace $\mathcal{L}''$, and $\rho \in \mathbf{L}(\mathcal{L}')$, then
$\mathbf{P}(\rho, \mathcal{L}'') = 0$.)
A transformation of density matrices, that we will call a projective measurement,
is such that for states of the subspace $\mathcal{L}_j$, a "measuring device"
puts the number of the state $j$ into the classical register:
\[ \text{if } |\xi\rangle \in \mathcal{L}_j, \text{ then } |\xi\rangle\langle\xi| \mapsto \bigl(|\xi\rangle\langle\xi|, j\bigr). \tag{11.4} \]
Although the measurement maps the space $\mathbf{L}(\mathcal{N})$ to $\mathbf{L}(\mathcal{N} \otimes \mathcal{K})$, the result is
always diagonal with respect to the classical basis in $\mathcal{K}$. Therefore we can assume
that the measurement is a linear map
$R : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N}) \times \{1, \dots, r\} = \bigoplus_{j=1}^{r} \mathbf{L}(\mathcal{N})$;
such linear maps will also be called superoperators.
By linearity, equation (11.4) implies that $R\rho = (\rho, j)$ for any $\rho \in \mathbf{L}(\mathcal{L}_j)$.
However, to define the action of $R$ on an arbitrary $\rho$, we have to use the
condition that $R$ is physically realizable.
[3] Problem 11.7. Prove that the superoperator
\[ R : \rho \mapsto \sum_j \bigl(\Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j},\ j\bigr) \tag{11.5} \]
is the only physically realizable superoperator of the type
$\mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N}) \times \{1, \dots, r\}$ that is consistent with (11.4).
Thus we arrive at our final definition.
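Definition 11.1. A projective measurement is a superoperator of the form
(11.5), which can also be written as follows:
\[ \rho \mapsto \sum_j \mathbf{P}(\rho, \mathcal{L}_j) \cdot \bigl(\gamma^{(j)}, j\bigr), \tag{11.6} \]
where $\gamma^{(j)} = \dfrac{\Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j}}{\mathbf{P}(\rho, \mathcal{L}_j)}$.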
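We may say that $\mathbf{P}(\rho, \mathcal{L}_j)$ is the probability of getting a specified outcome
$j$. If the outcome $j$ is actually obtained, the state of the quantum
system after the measurement is $\gamma^{(j)}$. If we measure pure states, i.e., if
$\rho = |\xi\rangle\langle\xi|$, then $\gamma^{(j)} = |\xi_j\rangle\langle\xi_j|$, where
\[ |\xi_j\rangle = \frac{\Pi_{\mathcal{L}_j}|\xi\rangle}{\sqrt{\mathbf{P}(|\xi\rangle, \mathcal{L}_j)}}. \]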
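Let us give a simple example of a measurement. We copy a qubit (relative to the classical basis) and apply the decoherence superoperator to the copy. In this example, $\Pi_{\mathcal{L}_0} = |0\rangle\langle 0|$, $\Pi_{\mathcal{L}_1} = |1\rangle\langle 1|$, and the measurement superoperator is
\[ \rho = \begin{pmatrix} \rho_{00} & \rho_{01} \\ \rho_{10} & \rho_{11} \end{pmatrix} \mapsto \rho_{00}\cdot\bigl(|0\rangle\langle 0|,\, 0\bigr) + \rho_{11}\cdot\bigl(|1\rangle\langle 1|,\, 1\bigr). \]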
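The formulas (11.5) and (11.6) can be tried out numerically. The following is a minimal sketch, assuming NumPy is available; the function name `projective_measurement` and the test state $(|0\rangle + i|1\rangle)/\sqrt{2}$ are illustrative choices, not notation from the text.

```python
import numpy as np

def projective_measurement(rho, projectors):
    """For each outcome j return (P(rho, L_j), gamma_j) as in (11.6).

    gamma_j is None when the outcome has (numerically) zero probability.
    """
    outcomes = []
    for proj in projectors:
        block = proj @ rho @ proj                 # Pi_j rho Pi_j, the j-th term of (11.5)
        prob = float(np.real(np.trace(block)))    # P(rho, L_j) = Tr(rho Pi_j)
        post = block / prob if prob > 1e-12 else None
        outcomes.append((prob, post))
    return outcomes

# Qubit example from the text: Pi_0 = |0><0|, Pi_1 = |1><1|.
P0 = np.array([[1, 0], [0, 0]], dtype=complex)
P1 = np.array([[0, 0], [0, 1]], dtype=complex)

# Illustrative pure state (an assumption of this sketch): (|0> + i|1>)/sqrt(2).
xi = np.array([1, 1j]) / np.sqrt(2)
rho = np.outer(xi, xi.conj())

result = projective_measurement(rho, [P0, P1])
for j, (prob, post) in enumerate(result):
    print(j, prob)
    print(post)
```

For this state both outcomes have probability $\rho_{00} = \rho_{11} = 1/2$, and the off-diagonal entries of $\rho$ do not survive in either post-measurement state.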
We have considered nondestructive measurements. A destructive measurement can be described as a nondestructive one after which the measured system is discarded. This corresponds to the transition from a quantum state $\rho$ to the classical state $\sum_j \mathbf{P}(\rho,\mathcal{L}_j)\cdot(j)$ (where $(j)$ is a short notation for $|j\rangle\langle j|$). A general physically realizable transformation of a quantum state to a classical state is given by the formula
\[ \rho \mapsto \sum_k \mathrm{Tr}(\rho X_k)\cdot(k), \tag{11.7} \]
where $X_k$ are nonnegative Hermitian operators satisfying $\sum_k X_k = I$. Such a set of operators $\{X_k\}$ is called a POVM (positive operator-valued measure), whereas the superoperator (11.7) is called a POVM measurement.
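As a small illustration of (11.7), the sketch below checks the two POVM conditions and computes the outcome distribution $\mathrm{Tr}(\rho X_k)$. It assumes NumPy; the helper names and the particular "noisy readout" operators with weights 0.9, 0.2, 0.1, 0.8 are hypothetical values chosen only for the example.

```python
import numpy as np

def is_povm(operators, tol=1e-9):
    """Check that all X_k are nonnegative Hermitian and that they sum to the identity."""
    dim = operators[0].shape[0]
    hermitian = all(np.allclose(X, X.conj().T, atol=tol) for X in operators)
    nonnegative = all(np.linalg.eigvalsh(X).min() > -tol for X in operators)
    complete = np.allclose(sum(operators), np.eye(dim), atol=tol)
    return hermitian and nonnegative and complete

def povm_distribution(rho, operators):
    """Classical distribution of (11.7): the k-th entry is Tr(rho X_k)."""
    return [float(np.real(np.trace(rho @ X))) for X in operators]

# Hypothetical example: an imperfect readout of a qubit in the classical basis.
P0 = np.diag([1.0, 0.0])
P1 = np.diag([0.0, 1.0])
X0 = 0.9 * P0 + 0.2 * P1
X1 = 0.1 * P0 + 0.8 * P1

rho = P0.copy()                     # the state |0><0|
probs = povm_distribution(rho, [X0, X1])
print(is_povm([X0, X1]), probs)
```

Here the operators commute and are combinations of the projections $|0\rangle\langle 0|$ and $|1\rangle\langle 1|$ with nonnegative coefficients, which is the situation singled out in Remark 11.3.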
Remark 11.3. Nondestructive POVM measurements could also be defined, but there is no such definition that would be natural enough. An exception is the case where the operators $X_k$ commute with each other. Then they can be represented as linear combinations of projections $\Pi_j$ with nonnegative coefficients, which can be interpreted as "conditional probabilities". Such measurements (in the nondestructive form) and their realizations will be studied in the next section.
[2!] Problem 11.8. Prove that any POVM measurement can be represented as an isometric embedding into a larger space, followed by a projective measurement.
[3!] Problem 11.9 ("Quantum teleportation"; cf. [4]). Suppose we have three qubits: the first is in an arbitrary state $\rho$ (not known in advance), whereas the second and third are in the state $|\xi_{00}\rangle = \frac{1}{\sqrt{2}}\bigl(|0,0\rangle + |1,1\rangle\bigr)$. On the first two qubits we perform the measurement corresponding to the orthogonal decomposition
\[ \mathcal{B}^{\otimes 2} = \mathbb{C}(|\xi_{00}\rangle) \oplus \mathbb{C}(|\xi_{01}\rangle) \oplus \mathbb{C}(|\xi_{10}\rangle) \oplus \mathbb{C}(|\xi_{11}\rangle), \qquad \text{where } |\xi_{ab}\rangle = \frac{1}{\sqrt{2}} \sum_{c} (-1)^{bc}\, |c,\, c\oplus a\rangle. \]
Find a way to restore the initial state $\rho$ from the remaining third qubit and the measurement outcome $(a,b)$. Write the whole sequence of actions (the measurement and the recovery) in the form of a quantum circuit.
(Informally, this procedure can be described as follows. Suppose that Alice wants to transmit to Bob$^{13}$ a quantum state $\rho$ by a classical communication channel, e.g., over the phone. It turns out that this is possible, provided Alice and Bob have prepared in advance the state $|\xi_{00}\rangle$ so that each of them keeps half of the state, i.e., a single qubit. Alice performs the measurement and tells the result to Bob. Then Bob translates his qubit to the state $\rho$. Thus the unknown state gets "teleported".)
11.5. The superoperator norm. At first sight, it seems natural to define a norm for superoperators of type $\mathbf{L}(\mathcal{N},\mathcal{M})$ by analogy with the operator norm,
\[ \|T\|_1 = \sup_{X\neq 0} \frac{\|TX\|_{\mathrm{tr}}}{\|X\|_{\mathrm{tr}}}, \tag{11.8} \]
and to use this norm to measure distance between physically realizable transformations of density matrices. (Of course, the norm is applied to the difference between two physically realizable superoperators, which is not physically realizable.) However, the use of the norm (11.8) turns out to be inconvenient because it is "unstable" with respect to the tensor product. Let us explain this in more detail.
Suppose we want to characterize the distance between physically realizable superoperators $P, R \in \mathbf{L}(\mathcal{N},\mathcal{M})$. From the physical point of view, both superoperators pertain to some quantum system, which is a part of the Universe. Certainly, we do not expect that the answer to our problem would depend on what happens in some other galaxy, and even on the existence of that galaxy. In other words, we expect that the distance between $P$ and $R$ is the same as the distance between $P\otimes I_{\mathbf{L}(\mathcal{G})}$ and $R\otimes I_{\mathbf{L}(\mathcal{G})}$, whatever additional space $\mathcal{G}$ we choose. But this is not always the case if we use $\|P-R\|_1$ as the distance function.
$^{13}$ These two characters are encountered in almost every paper on quantum information theory.
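Example 11.1. Consider the superoperator which transposes a matrix,
\[ T : |j\rangle\langle k| \mapsto |k\rangle\langle j| \qquad (j,k = 0,1). \]
It is obvious that $\|T\|_1 = 1$. However, $\|T\otimes I_{\mathbf{L}(\mathcal{B})}\|_1 = 2$. Indeed, let the superoperator $T\otimes I_{\mathbf{L}(\mathcal{B})}$ act on the operator $X = \sum_{j,k} |j,j\rangle\langle k,k|$; then $\|X\|_{\mathrm{tr}} = 2$ but $\|(T\otimes I_{\mathbf{L}(\mathcal{B})})X\|_{\mathrm{tr}} = 4$. Hence $\|T\otimes I_{\mathbf{L}(\mathcal{B})}\|_1 \ge 2$. The upper bound $\|T\otimes I_{\mathbf{L}(\mathcal{B})}\|_1 \le 2$ is also easily obtained.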
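The two trace norms quoted in Example 11.1 are easy to confirm numerically. The sketch below assumes NumPy; `trace_norm` and `partial_transpose_first` are helper names introduced here, and the trace norm is computed as the sum of singular values.

```python
import numpy as np

def trace_norm(X):
    """Trace norm = sum of the singular values of X."""
    return float(np.linalg.svd(X, compute_uv=False).sum())

def partial_transpose_first(X, d=2):
    """Apply T (x) I to an operator on C^d (x) C^d, i.e. transpose the first factor only."""
    X4 = X.reshape(d, d, d, d)               # X4[j1, j2, k1, k2] = <j1,j2| X |k1,k2>
    return X4.transpose(2, 1, 0, 3).reshape(d * d, d * d)   # swap j1 <-> k1

d = 2
# X = sum_{j,k} |j,j><k,k|
X = np.zeros((d * d, d * d))
for j in range(d):
    for k in range(d):
        X[j * d + j, k * d + k] = 1.0

norm_before = trace_norm(X)
norm_after = trace_norm(partial_transpose_first(X))
print(norm_before, norm_after)
```

The operator $(T\otimes I)X$ produced here is the swap $|j,k\rangle \mapsto |k,j\rangle$, a permutation matrix on a 4-dimensional space, which is why its trace norm equals 4 while $\|X\|_{\mathrm{tr}} = 2$.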
One may argue that this is not quite "bad" a counterexample yet, since $T$ does not have the form $P-R$, where $P$ and $R$ are physically realizable (for example, because $\mathrm{Tr}(T I_{\mathcal{B}}) \neq 0$). Consider, however, another superoperator,
\[ Q : \rho \mapsto (T\rho)\otimes\sigma^z. \]
It is easy to see that $\|Q\|_1 = \|T\|_1\,\|\sigma^z\|_{\mathrm{tr}} = 2$; similarly, $\|Q\otimes I_{\mathbf{L}(\mathcal{B})}\|_1 = 4$. The superoperator $Q$ satisfies the following two conditions:
a) $\mathrm{Tr}(QX) = 0$; and b) $(QX)^{\dagger} = QX^{\dagger}$ (for any $X$).
It is possible to show (using the result of Problem 11.4) that any superoperator with these properties can be represented as $c\,(P-R)$, where $P$ and $R$ are physically realizable, and $c$ is a positive real number.
Fortunately, it turns out (as we will prove below) that the pathology of Example 11.1 has a restriction by dimension. Specifically, if $\dim\mathcal{G} \ge \dim\mathcal{N}$, then $\|T\otimes I_{\mathcal{G}}\|_1 = \|T\|_{\diamond}$, where the quantity $\|T\|_{\diamond}$ does not depend on $\mathcal{G}$. Before proving this assertion, let us examine its consequences.
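First, it is clear that the quantity $\|T\|_{\diamond}$ defined in this manner is a norm.
Second, let us notice that $\|T\otimes R\|_1 \ge \|T\|_1\|R\|_1$, since the trace norm is multiplicative with respect to the tensor product. Substituting the identity operator for $R$, we obtain $\|T\|_{\diamond} \ge \|T\|_1$. Similarly, if we replace $T$ by $T\otimes I_{\mathbf{L}(\mathcal{G})}$, and $R$ by $R\otimes I_{\mathbf{L}(\mathcal{G})}$ (where the dimension of $\mathcal{G}$ is large enough), we get $\|T\otimes R\|_{\diamond} \ge \|T\|_{\diamond}\|R\|_{\diamond}$.
Third, it follows from the definition that $\|TR\|_1 \le \|T\|_1\|R\|_1$; therefore
\[ \|T\otimes R\|_1 = \|(T\otimes I)(I\otimes R)\|_1 \le \|T\otimes I\|_1\,\|I\otimes R\|_1. \]
Replacing $T$ and $R$ by $T\otimes I_{\mathbf{L}(\mathcal{G})}$ and $R\otimes I_{\mathbf{L}(\mathcal{G})}$ (resp.), we get $\|T\otimes R\|_{\diamond} \le \|T\|_{\diamond}\|R\|_{\diamond}$, which is the opposite of the previous inequality. Hence the norm $\|\cdot\|_{\diamond}$ is multiplicative with respect to the tensor product,
\[ \|T\otimes R\|_{\diamond} = \|T\|_{\diamond}\,\|R\|_{\diamond}. \tag{11.9} \]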
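In order to prove that the norms $\|T\otimes I_{\mathbf{L}(\mathcal{G})}\|_1$ stabilize when $\dim\mathcal{G} \ge \dim\mathcal{N}$, we give another definition of the stable superoperator norm $\|T\|_{\diamond}$.
First, we note that an arbitrary superoperator $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ can be represented in the form $T = \mathrm{Tr}_{\mathcal{F}}(A\cdot B^{\dagger})$, where $A, B \in \mathbf{L}(\mathcal{N}, \mathcal{M}\otimes\mathcal{F})$ (recall that $A\cdot B^{\dagger}$ denotes the superoperator $X \mapsto AXB^{\dagger}$; cf. Problem 11.1). Without loss of generality we may assume that $\dim\mathcal{F} = (\dim\mathcal{N})(\dim\mathcal{M})$. In fact, the dimension of $\mathcal{F}$ can be made as low as the rank of the matrix $\bigl(T_{(j'j)(k'k)}\bigr)$ defined in Problem 11.4.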
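Definition 11.2. Consider all representations of $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ in the form $T = \mathrm{Tr}_{\mathcal{F}}(A\cdot B^{\dagger})$. Then
\[ \|T\|_{\diamond} = \inf\bigl\{ \|A\|\,\|B\| : \mathrm{Tr}_{\mathcal{F}}(A\cdot B^{\dagger}) = T \bigr\}. \]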
It follows from Theorem 11.1 below that this quantity does not depend on the choice of the auxiliary space $\mathcal{F}$, provided at least one representation $T = \mathrm{Tr}_{\mathcal{F}}(A_0\cdot B_0^{\dagger})$ exists. For the minimization of $\|A\|\,\|B\|$, it suffices to consider operators with norms $\|A\| \le \|A_0\|$ and $\|B\| \le \|B_0\|$. The set of such pairs $(A,B)$ is compact, hence the infimum is achieved.
Theorem 11.1. If $\dim\mathcal{G} \ge \dim\mathcal{N}$, then $\|T\otimes I_{\mathcal{G}}\|_1 = \|T\|_{\diamond}$.
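Proof. Let $T = \mathrm{Tr}_{\mathcal{F}}(A\cdot B^{\dagger})$. Using the properties of the trace norm from Problem 10.5, we obtain
\[ \bigl\|(T\otimes I_{\mathbf{L}(\mathcal{G})})X\bigr\|_{\mathrm{tr}} = \bigl\|\mathrm{Tr}_{\mathcal{F}}\bigl((A\otimes I_{\mathcal{G}})X(B^{\dagger}\otimes I_{\mathcal{G}})\bigr)\bigr\|_{\mathrm{tr}} \le \bigl\|(A\otimes I_{\mathcal{G}})X(B^{\dagger}\otimes I_{\mathcal{G}})\bigr\|_{\mathrm{tr}} \le \|A\|\,\|B\|\,\|X\|_{\mathrm{tr}}. \]
Hence $\|T\|_{\diamond} \ge \|T\otimes I_{\mathbf{L}(\mathcal{G})}\|_1$.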
Proving the opposite inequality is somewhat more complicated. Without loss of generality we may assume that $\|T\|_{\diamond} = 1$, and the infimum in Definition 11.2 is achieved when $\|A\| = \|B\| = 1$.
We show at first that there exist three density matrices $\rho, \gamma \in \mathbf{L}(\mathcal{N})$ and $\tau \in \mathbf{L}(\mathcal{F})$ such that $\mathrm{Tr}_{\mathcal{M}}(A\rho A^{\dagger}) = \mathrm{Tr}_{\mathcal{M}}(B\gamma B^{\dagger}) = \tau$. Let
\[ \mathcal{K} = \mathrm{Ker}(A^{\dagger}A - I_{\mathcal{N}}), \qquad \mathcal{L} = \mathrm{Ker}(B^{\dagger}B - I_{\mathcal{N}}), \]
\[ E = \bigl\{ \mathrm{Tr}_{\mathcal{M}}(A\rho A^{\dagger}) : \rho \in \mathbf{D}(\mathcal{K}) \bigr\}, \qquad F = \bigl\{ \mathrm{Tr}_{\mathcal{M}}(B\gamma B^{\dagger}) : \gamma \in \mathbf{D}(\mathcal{L}) \bigr\}, \]
where $\mathbf{D}(\mathcal{L})$ denotes the set of density matrices on the subspace $\mathcal{L}$. Then $E, F \subseteq \mathbf{D}(\mathcal{F})$, so that in place of $\tau$ we can put any element of $E \cap F$.
We prove that $E \cap F \neq \emptyset$. Since $E$ and $F$ are compact convex sets, it suffices to prove that there is no hyperplane that would separate $E$ from $F$. In other words, there is no Hermitian operator $Z \in \mathbf{L}(\mathcal{F})$ such that $\mathrm{Tr}(XZ) > \mathrm{Tr}(YZ)$ for all pairs of $X \in E$ and $Y \in F$. This follows from the condition that the value of $\|A\|\,\|B\|$ is minimal; in particular, it cannot decrease under the transformation
\[ A \mapsto (I_{\mathcal{M}} \otimes e^{-tZ})A, \qquad B \mapsto (I_{\mathcal{M}} \otimes e^{tZ})B, \]
where $t$ is a small positive number.
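Thus, let $\mathrm{Tr}_{\mathcal{M}}(A\rho A^{\dagger}) = \mathrm{Tr}_{\mathcal{M}}(B\gamma B^{\dagger}) = \tau \in \mathbf{D}(\mathcal{F})$, where $\rho, \gamma \in \mathbf{D}(\mathcal{N})$. We can use the additional space $\mathcal{G}$ to construct purifications of $\rho$ and $\gamma$, i.e., to represent them in the form $\rho = \mathrm{Tr}_{\mathcal{G}}(|\xi\rangle\langle\xi|)$, $\gamma = \mathrm{Tr}_{\mathcal{G}}(|\eta\rangle\langle\eta|)$, where $|\xi\rangle, |\eta\rangle \in \mathcal{N}\otimes\mathcal{G}$ are unit vectors (see Proposition 10.1). This is possible due to the condition $\dim\mathcal{G} \ge \dim\mathcal{N}$. We set $X = |\xi\rangle\langle\eta|$. It is obvious that $\|X\|_{\mathrm{tr}} = 1$.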
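We prove that $\|(T\otimes I_{\mathbf{L}(\mathcal{G})})X\|_{\mathrm{tr}} \ge 1$. If we set
\[ X' = (T\otimes I_{\mathbf{L}(\mathcal{G})})X, \qquad |\xi'\rangle = (A\otimes I_{\mathcal{G}})|\xi\rangle, \qquad |\eta'\rangle = (B\otimes I_{\mathcal{G}})|\eta\rangle, \]
then
\[ X' = \mathrm{Tr}_{\mathcal{F}}(|\xi'\rangle\langle\eta'|), \qquad \mathrm{Tr}_{\mathcal{M}\otimes\mathcal{G}}(|\xi'\rangle\langle\xi'|) = \mathrm{Tr}_{\mathcal{M}\otimes\mathcal{G}}(|\eta'\rangle\langle\eta'|) = \tau. \]
From this it follows, first, that the vectors $|\xi'\rangle$ and $|\eta'\rangle$ have unit length. Second, there is a unitary operator $U$ acting on the space $\mathcal{M}\otimes\mathcal{G}$ such that $(U\otimes I_{\mathcal{F}})|\xi'\rangle = |\eta'\rangle$ (see Problem 10.3). Consequently,
\[ \|X'\|_{\mathrm{tr}} \ge |\mathrm{Tr}\, UX'| = \bigl|\mathrm{Tr}\bigl((U\otimes I_{\mathcal{F}})|\xi'\rangle\langle\eta'|\bigr)\bigr| = \bigl|\mathrm{Tr}(|\eta'\rangle\langle\eta'|)\bigr| = 1. \qquad \square \]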
Surprisingly enough, the superoperator norm is connected not only to the trace norm, but also to the fidelity (see (10.8)).
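[3] Problem 11.10. Let $T = \mathrm{Tr}_{\mathcal{F}}(A\cdot B^{\dagger})$, where $A, B : \mathcal{N} \to \mathcal{F}\otimes\mathcal{M}$. Prove that
\[ \|T\|_{\diamond}^{2} = \max\bigl\{ F\bigl(\mathrm{Tr}_{\mathcal{M}}(A\rho A^{\dagger}),\ \mathrm{Tr}_{\mathcal{M}}(B\gamma B^{\dagger})\bigr) : \rho, \gamma \in \mathbf{D}(\mathcal{N}) \bigr\}, \]
where $\mathbf{D}(\mathcal{N})$ denotes the set of density matrices on $\mathcal{N}$.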
Note that the operators $\rho' = \mathrm{Tr}_{\mathcal{M}}(A\rho A^{\dagger})$ and $\gamma' = \mathrm{Tr}_{\mathcal{M}}(B\gamma B^{\dagger})$ are not density matrices: the condition $\mathrm{Tr}\,\rho' = \mathrm{Tr}\,\gamma' = 1$ is not satisfied. However, the definition of fidelity and the result of Problem 10.6b (but not 10.6a or 10.6c) are valid for arbitrary nonnegative Hermitian operators.
The result of Problem 11.10 has been used in the study of the complexity class QIP [38].
12. Measuring operators
A measuring operator is a generalization of an operator with quantum control. Such operators are very useful in constructing quantum algorithms. After mastering this tool, we will be ready to tackle difficult computational problems.
12.1. Definition and examples. Consider a state space $\mathcal{N}\otimes\mathcal{K}$ and fix a decomposition of the first factor into a direct sum of pairwise orthogonal subspaces: $\mathcal{N} = \bigoplus_{j\in\Omega}\mathcal{L}_j$ ($\Omega = \{1,\dots,r\}$). Each operator of the form
\[ W = \sum_j \Pi_{\mathcal{L}_j} \otimes U_j \]
is called a measuring operator, where $\Pi_{\mathcal{L}_j}$ is the projection onto the subspace $\mathcal{L}_j$, and $U_j \in \mathbf{L}(\mathcal{K})$ is unitary.
In order to justify the name "measuring", we consider the following process. Let $\rho \in \mathbf{L}(\mathcal{N})$ be a quantum state we want to measure. We connect it to an instrument in the initial state $|0\rangle$ (we assume that in the state space of the instrument, $\mathcal{K}$, some fixed basis is chosen, e.g., $\mathcal{K} = \mathcal{B}^{\otimes n}$). Then the joint state of the system is described by the density matrix $\rho\otimes|0^m\rangle\langle 0^m|$.
Now we apply the measuring operator $W$. We obtain the state
\[ W\bigl(\rho\otimes|0^m\rangle\langle 0^m|\bigr)W^{\dagger} = \sum_j \bigl(\Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j}\bigr) \otimes \bigl(U_j|0\rangle\langle 0|U_j^{\dagger}\bigr) \]
(we have used the defining properties of a projection: $\Pi^{\dagger} = \Pi$, $\Pi^{2} = \Pi$).
Finally, we make the instrument classical by applying the decoherence transformation. This means that the matrix is diagonalized with respect to the second tensor factor. Let us see how the factor $U_j|0\rangle\langle 0|U_j^{\dagger}$ in the above sum changes:
\[ U_j|0\rangle\langle 0|U_j^{\dagger} \mapsto \sum_k \bigl|\langle k|U_j|0\rangle\bigr|^{2}\, |k\rangle\langle k|. \]
Thus we obtain a bipartite mixed state, which is classical in the second component,
\[ \sum_j\sum_k \Bigl(\Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j}\,\bigl|\langle k|U_j|0\rangle\bigr|^{2},\; k\Bigr) = \sum_j\sum_k \mathbf{P}(k|j)\cdot\bigl(\Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j},\; k\bigr). \]
Here we have introduced the conditional probabilities $\mathbf{P}(k|j) = \bigl|\langle k|U_j|0\rangle\bigr|^{2}$. We will see that they obey the usual rules of probability theory, provided all measuring operators we use are defined with respect to the same orthogonal decomposition of the space $\mathcal{N}$.
Summing up, the whole procedure corresponds to the transformation of density matrices
\[ T : \rho \mapsto \sum_{k,j} \mathbf{P}(k|j)\cdot\bigl(\Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j},\; k\bigr). \]
We note that a projective measurement (as defined in the preceding section) is a special case of this, $\mathbf{P}(k|j) = \delta_{kj}$. The more general process just described can be called a "probabilistic projective measurement." It can also be viewed as a nondestructive version of the POVM measurement
\[ \rho \mapsto \sum_k \mathrm{Tr}(\rho X_k)\cdot(k), \qquad \text{where } X_k = \sum_j \mathbf{P}(k|j)\,\Pi_{\mathcal{L}_j}. \]
Let us give a few examples of measuring operators.
1. The operator $\Lambda(U) = \Pi_0\otimes I + \Pi_1\otimes U$, acting on the space $\mathcal{B}\otimes\mathcal{N}$, is measuring.
1'. It is more interesting that $\Lambda(U)$ is measuring also with respect to the second subsystem, $\mathcal{N}$. Since $U$ is a unitary operator, it can be decomposed into the sum of projections onto the eigenspaces: $U = \sum_j \lambda_j \Pi_{\mathcal{L}_j}$, $|\lambda_j| = 1$. Then
\[ \Lambda(U) = \sum_j (\Pi_0 + \lambda_j\Pi_1)\otimes\Pi_{\mathcal{L}_j} = \sum_j \begin{pmatrix} 1 & 0 \\ 0 & \lambda_j \end{pmatrix} \otimes \Pi_{\mathcal{L}_j}. \]
However, the conditional probabilities are trivial: $\mathbf{P}(0|j) = 1$, $\mathbf{P}(1|j) = 0$. Thus such an operator, even though measuring according to the definition, does not actually measure anything.
Now we will try to modify the operator $\Lambda(U)$ to make the conditional probabilities nontrivial.
Physicist's approach. Let $U$ be the operator of phase shift for light as it passes through a glass plate. We can split the light beam into two parts by having it pass through a semitransparent mirror. Then one of the two beams passes through the glass plate, after which the beams merge at another semitransparent mirror (see the diagram). The resulting interference will allow us to determine the phase shift.
[Diagram: a circuit in which the gate $H$ is applied before and after the controlled operation.]
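A mathematical variant of the preceding example. The operator
\[ H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \]
serves as an analog of the semitransparent mirror. As is evident from the diagram above, we need to apply it at the beginning and at the end. The middle part is represented by the operator $\Lambda(U)$ (the controlling qubit corresponds to whether the photon passes through the plate or not). Thus we obtain the operator
\[ \Xi(U) = (H\otimes I)\,\Lambda(U)\,(H\otimes I) : \mathcal{B}\otimes\mathcal{N} \to \mathcal{B}\otimes\mathcal{N}. \tag{12.1} \]
If the initial vector has the form $|\psi\rangle = |\eta\rangle\otimes|\xi\rangle$ ($|\xi\rangle \in \mathcal{L}_j$), then $\Xi(U)|\psi\rangle = |\eta'\rangle\otimes|\xi\rangle$, where
\[ |\eta'\rangle = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \lambda_j \end{pmatrix} \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} |\eta\rangle = \frac{1}{2} \begin{pmatrix} 1+\lambda_j & 1-\lambda_j \\ 1-\lambda_j & 1+\lambda_j \end{pmatrix} |\eta\rangle. \]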
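Therefore
\[ \Xi(U) = \sum_j \overbrace{\frac{1}{2} \begin{pmatrix} 1+\lambda_j & 1-\lambda_j \\ 1-\lambda_j & 1+\lambda_j \end{pmatrix}}^{R_j} \otimes\, \Pi_{\mathcal{L}_j}. \]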
Now we calculate the conditional probabilities. The eigenvalues have modulus 1, so that we can write $\lambda_j = \exp(2\pi i\varphi_j)$. As a consequence, we have
\[ \mathbf{P}(0|j) = \bigl|\langle 0|R_j|0\rangle\bigr|^{2} = \left|\frac{1+\lambda_j}{2}\right|^{2} = \frac{1+\cos(2\pi\varphi_j)}{2}. \tag{12.2} \]
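Formula (12.2) can be checked by building $\Xi(U)$ explicitly for a small diagonal $U$. The sketch below assumes NumPy and orders the tensor factors as $\mathcal{B}\otimes\mathcal{N}$ (control qubit first); the function name `xi_operator` and the phases 0, 0.2, 0.5 are illustrative assumptions.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

def xi_operator(U):
    """Xi(U) = (H (x) I) Lambda(U) (H (x) I) on B (x) N, as in (12.1)."""
    n = U.shape[0]
    I = np.eye(n)
    Z = np.zeros((n, n))
    Lam = np.block([[I, Z], [Z, U]])      # Lambda(U) = Pi_0 (x) I + Pi_1 (x) U
    HI = np.kron(H, I)
    return HI @ Lam @ HI

# U is diagonal, so the basis vector |j> spans the eigenspace with lambda_j = exp(2 pi i phi_j).
phis = np.array([0.0, 0.2, 0.5])
U = np.diag(np.exp(2j * np.pi * phis))
Xi = xi_operator(U)

n = len(phis)
# Start in |0> (x) |j>; the amplitude of |0> (x) |j> afterwards is <0|R_j|0> = (1 + lambda_j)/2.
p0 = np.array([abs(Xi[j, j]) ** 2 for j in range(n)])
expected = (1 + np.cos(2 * np.pi * phis)) / 2
print(p0)
print(expected)
```

For $\varphi_j = 0$ the instrument always reads 0, and for $\varphi_j = 1/2$ it always reads 1; intermediate phases give intermediate probabilities, which is what makes $\Xi(U)$ useful for eigenvalue estimation.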
In the next section, the measuring operator (12.1) will be used to estimate the eigenvalues of unitary operators. For this, we will need to apply the operator $\Xi(U)$ several times to the same "object" (the space $\mathcal{N}$), but with different "instruments" (copies of the space $\mathcal{B}$). But first we must make sure that this is correct, in the sense that the probabilities multiply as they should.
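12.2. General properties. We will consider measuring operators that correspond to a fixed orthogonal decomposition $\mathcal{N} = \bigoplus_j \mathcal{L}_j$.
1. The product of measuring operators is a measuring operator. Indeed, let two such operators be
\[ W^{(1)} = \sum_j R_j^{(1)} \otimes \Pi_{\mathcal{L}_j} \quad\text{and}\quad W^{(2)} = \sum_j R_j^{(2)} \otimes \Pi_{\mathcal{L}_j}. \]
Inasmuch as $\Pi_{\mathcal{L}_j}\Pi_{\mathcal{L}_k} = \delta_{jk}\Pi_{\mathcal{L}_k}$, we have
\[ W^{(2)}W^{(1)} = \sum_j R_j^{(2)}R_j^{(1)} \otimes \Pi_{\mathcal{L}_j}. \]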
2. The conditional probabilities for products of measuring operators with "different instruments" are multiplicative. Specifically, if $R_j^{(1)} = \tilde{R}_j^{(1)} \otimes I_{\mathcal{K}_2}$, $R_j^{(2)} = I_{\mathcal{K}_1} \otimes \tilde{R}_j^{(2)}$ (both operators act on $\mathcal{K}_1\otimes\mathcal{K}_2$), then $\mathbf{P}(k_1,k_2|j) = \mathbf{P}(k_1|j)\,\mathbf{P}(k_2|j)$. This equality follows immediately from the definition of conditional probabilities and the obvious identity
\[ \bigl(\langle\xi_1|\otimes\langle\xi_2|\bigr)\bigl(U_1\otimes U_2\bigr)\bigl(|\eta_1\rangle\otimes|\eta_2\rangle\bigr) = \langle\xi_1|U_1|\eta_1\rangle\,\langle\xi_2|U_2|\eta_2\rangle. \]
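3. Formula of total probability. Let $W = \sum_j R_j \otimes \Pi_{\mathcal{L}_j}$ be a measuring operator. If we apply it to the state $|0\rangle\langle 0|\otimes\rho$, where $\rho \in \mathbf{L}(\mathcal{N})$, then the resulting probability of the state $k$ can be written in the form:
\[ \mathbf{P}\Bigl(W\bigl(|0\rangle\langle 0|\otimes\rho\bigr)W^{\dagger},\ \mathbb{C}(|k\rangle)\otimes\mathcal{N}\Bigr) = \sum_j \mathbf{P}(k|j)\,\mathbf{P}(\rho,\mathcal{L}_j). \]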
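Proof. We have
\[ W\bigl(|0\rangle\langle 0|\otimes\rho\bigr)W^{\dagger} = \gamma = \sum_j \bigl(R_j|0\rangle\langle 0|R_j^{\dagger}\bigr) \otimes \Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j}. \]
It was proved earlier that $\mathbf{P}\bigl(\gamma,\ \mathbb{C}(|k\rangle)\otimes\mathcal{N}\bigr) = \mathbf{P}\bigl(\mathrm{Tr}_{\mathcal{N}}(\gamma),\ \mathbb{C}(|k\rangle)\bigr)$. Further,
\[ \mathrm{Tr}_{\mathcal{N}}(\gamma) = \sum_j \bigl(R_j|0\rangle\langle 0|R_j^{\dagger}\bigr)\, \mathrm{Tr}\bigl(\Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j}\bigr). \]
Since
\[ \mathrm{Tr}(\Pi_{\mathcal{L}_j}\,\rho\,\Pi_{\mathcal{L}_j}) = \mathrm{Tr}(\Pi_{\mathcal{L}_j}^{2}\,\rho) = \mathrm{Tr}(\Pi_{\mathcal{L}_j}\,\rho) \stackrel{\mathrm{def}}{=} \mathbf{P}(\rho,\mathcal{L}_j), \]
we obtain the desired expression $\mathbf{P}\bigl(\gamma,\ \mathbb{C}(|k\rangle)\otimes\mathcal{N}\bigr) = \sum_j \mathbf{P}(k|j)\,\mathbf{P}(\rho,\mathcal{L}_j)$. $\square$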
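The formula of total probability can be verified numerically on a small instance. The sketch below assumes NumPy; the dimensions (a qubit instrument, a 3-dimensional object split into subspaces of dimensions 1 and 2), the random seed, and all helper names are choices made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_density_matrix(n):
    """A random density matrix: A A^dagger normalized to unit trace."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = A @ A.conj().T
    return rho / np.trace(rho)

def random_unitary(n):
    """A random unitary, taken as the Q factor of a complex Gaussian matrix."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(A)
    return Q

# Orthogonal decomposition N = L_1 (+) L_2 of a 3-dimensional space.
projectors = [np.diag([1.0, 0.0, 0.0]), np.diag([0.0, 1.0, 1.0])]
Rs = [random_unitary(2) for _ in projectors]          # instrument-side unitaries R_j

# W = sum_j R_j (x) Pi_{L_j}, acting on (instrument) (x) (object).
W = sum(np.kron(R, P) for R, P in zip(Rs, projectors))

rho = random_density_matrix(3)
zero = np.diag([1.0, 0.0])                            # |0><0| on the instrument
gamma = W @ np.kron(zero, rho) @ W.conj().T

# Left-hand side: probability of reading k on the instrument.
lhs = [float(np.real(np.trace(np.kron(np.diag(np.eye(2)[k]), np.eye(3)) @ gamma)))
       for k in range(2)]

# Right-hand side: sum_j P(k|j) P(rho, L_j) with P(k|j) = |<k|R_j|0>|^2.
rhs = [sum(abs(R[k, 0]) ** 2 * float(np.real(np.trace(P @ rho)))
           for R, P in zip(Rs, projectors))
       for k in range(2)]

print(lhs, rhs)
```

The state `gamma` computed here also contains terms with $j \neq j'$, but they do not contribute to the instrument probabilities because $\mathrm{Tr}(\Pi_{\mathcal{L}_j}\rho\,\Pi_{\mathcal{L}_{j'}}) = 0$ for $j \neq j'$.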
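[1!] Problem 12.1. Let $\mathcal{N} = \bigoplus_j \mathcal{L}_j$ be an orthogonal decomposition, $U_j \in \mathbf{L}(\mathcal{K})$ and $\tilde{U}_j \in \mathbf{L}(\mathcal{K}\otimes\mathcal{B}^{\otimes N})$ some unitary operators. Suppose that for each $j$ the operator $\tilde{U}_j$ approximates $U_j$ with precision $\delta$ using ancillas (namely, the space $\mathcal{B}^{\otimes N}$). Then the measuring operator $\tilde{W} = \sum_j \Pi_{\mathcal{L}_j}\otimes\tilde{U}_j$ approximates $W = \sum_j \Pi_{\mathcal{L}_j}\otimes U_j$ with the same precision $\delta$.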
12.3. Garbage removal and composition of measurements. Measurement operators are used to obtain some information about the value of the index $j$ in the decomposition $\mathcal{N} = \bigoplus_{j\in\Omega}\mathcal{L}_j$. From a computation perspective, only a part of this information may be useful. In this situation, the measurement operator can be written in the form
\[ W = \sum_{j\in\Omega} \Pi_{\mathcal{L}_j}\otimes R_j, \qquad R_j : \mathcal{B}^{\otimes N} \to \mathcal{B}^{\otimes N}, \qquad R_j|0\rangle = \sum_{y,z} c_{y,z}(j)\,|y,z\rangle, \tag{12.3} \]
where $y \in \mathbb{B}^{m}$ represents the "useful result" and $z \in \mathbb{B}^{N-m}$ is "garbage". Ignoring the garbage, we get the conditional probabilities
\[ \mathbf{P}(y|j) \stackrel{\mathrm{def}}{=} \bigl\langle 0\bigl|R_j^{\dagger}\,\Pi_{\mathcal{M}_y}R_j\bigr|0\bigr\rangle, \qquad \mathcal{M}_y = \mathbb{C}(|y\rangle)\otimes\mathcal{B}^{\otimes(N-m)}. \tag{12.4} \]
How can one construct another measuring operator $U$ that would produce $y$ with the same conditional probabilities, but without garbage? It seems that there is no general solution to this problem, except for the case where the result $y$ is deterministic, namely, $\mathbf{P}(y|j) = \delta_{y,f(j)}$ for some function $f : \Omega \to \mathbb{B}^{m}$ (so that we can say that $W$ actually measures the value of $f(j)$). Then we can use the same trick as in the proof of Lemma 7.2: we measure $f(j)$, copy the result, and "un-measure".
We are going to extend this simple result in three ways. First, we will assume that $W$ measures $f$ with some error probability $\varepsilon$; we will find out with what precision the above procedure corresponds to a garbage-free measurement (the answer is $\sqrt{2\varepsilon}$). Second, the formula for the conditional probabilities (12.4) makes sense for an arbitrary orthogonal decomposition $\mathcal{B}^{\otimes N} = \bigoplus_{y\in\Delta}\mathcal{M}_y$. Third, instead of copying the result, we can apply any operator $V$ that is measuring with respect to the indicated decomposition. The copying corresponds to $V : |y,z,v\rangle \mapsto |y,z,y\oplus v\rangle$.
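[2!] Problem 12.2. Let $\mathcal{N} = \bigoplus_{j\in\Omega}\mathcal{L}_j$ and $\mathcal{B}^{\otimes N} = \bigoplus_{y\in\Delta}\mathcal{M}_y$ be orthogonal decompositions. Consider two measuring operators on $\mathcal{N}\otimes\mathcal{K}\otimes\mathcal{B}^{\otimes N}$, such that $\mathcal{B}^{\otimes N}$ serves as the "instrument" for one and the "object" for the other:
\[ W = \sum_{j\in\Omega} \Pi_{\mathcal{L}_j}\otimes I_{\mathcal{K}}\otimes R_j, \qquad V = \sum_{y\in\Delta} I_{\mathcal{N}}\otimes Q_y\otimes\Pi_{\mathcal{M}_y}. \]
Suppose that $W$ measures a function $f : \Omega \to \Delta$ with error probability $\le \varepsilon$, i.e., the conditional probabilities $\mathbf{P}(y|j) = \langle 0^{N}|R_j^{\dagger}\,\Pi_{\mathcal{M}_y}R_j|0^{N}\rangle$ satisfy $\mathbf{P}(f(j)|j) \ge 1-\varepsilon$. Then the operator $\tilde{U} = W^{-1}VW$ approximates the operator
\[ U = \sum_{j\in\Omega} \Pi_{\mathcal{L}_j}\otimes Q_{f(j)} : \mathcal{N}\otimes\mathcal{K} \to \mathcal{N}\otimes\mathcal{K} \]
with precision $2\sqrt{\varepsilon}$, using $\mathcal{B}^{\otimes N}$ as the ancillary space. If $V$ is the copy operator, then the precision is $\sqrt{2\varepsilon}$.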
13. Quantum algorithms for Abelian groups
The only nontrivial quantum algorithm we have considered so far is Grover's algorithm for the solution of the universal search problem (see Section 9.2). Unfortunately, there we achieved only a polynomial increase in speed. For this reason Grover's algorithm does not yield any serious consequences (of type $\mathrm{BQP} \supset \mathrm{BPP}$) for complexity theory. At the present time, there is no proof that quantum computation is super-polynomially faster than classical probabilistic computation. But there are several pieces of indirect evidence in favor of such an assertion. The first of these is an example of a problem with oracle (cf. Definition 2.2 on page 26 and the beginning of Section 9.2), for which there exists a polynomial quantum algorithm, while any classical probabilistic algorithm is exponential.$^{10}$ This example, constructed by D. Simon [66], is called the hidden subgroup problem for $(\mathbb{Z}_2)^{k}$. The famous factoring algorithm by P. Shor [62] is based on a similar idea. After discussing these examples, we will solve the hidden subgroup problem for the group $\mathbb{Z}^{k}$, which generalizes both results.
The hidden subgroup problem. Let $G$ be a group with a specified representation of its elements by binary words. There is a device (an oracle) that computes some function $f : G \to \mathbb{B}^{n}$ with the following property:
\[ f(x) = f(y) \iff x - y \in D, \tag{13.1} \]
where $D \subseteq G$ is an initially unknown subgroup. It is required to find that subgroup. (The result should be presented in a specified way.)
13.1. The problem of hidden subgroup in $(\mathbb{Z}_2)^{k}$; Simon's algorithm. We consider the problem formulated above for the group $G = (\mathbb{Z}_2)^{k}$. The elements of this group can be represented by length $k$ words of zeros and ones; the group operation is bitwise addition modulo 2. We may regard $G$ as the $k$-dimensional linear space over the field $\mathbb{F}_2$. Any subgroup of $G$ is a linear subspace, so it can be represented by a basis.
It is easy to show that a "hidden subgroup" cannot be found quickly using a classical probabilistic machine. (A classical machine sends words $x_1,\dots,x_l$ to the input of the "black box" and receives answers $y_1,\dots,y_l$. Each subsequent query $x_j$ depends on the previous answers $y_1,\dots,y_{j-1}$ and some random number $r$ that is generated in advance.)
Proposition 13.1. Let $n \ge k$. For any classical probabilistic algorithm making no more than $2^{k/2}$ queries to the oracle, there exist a subgroup $D \subseteq (\mathbb{Z}_2)^{k}$ and a corresponding function $f : (\mathbb{Z}_2)^{k} \to \mathbb{B}^{n}$ for which the algorithm is wrong with probability $> 1/3$.
Proof. For the same subgroup $D$ there exist several different oracles $f$. We assume that one of them is chosen randomly and uniformly. (If the algorithm is wrong with probability $> 1/3$ for the randomized oracle, then it will be wrong with probability $> 1/3$ for some particular oracle.) The randomized oracle works as follows. If the present query is $x_j$, and $x_j - x_s \in D$ for some $s < j$, the answer $y_j$ coincides with the answer $y_s$ that was given before. Otherwise, $y_j$ is uniformly distributed over the set $\mathbb{B}^{n} \setminus \{y_1,\dots,y_{j-1}\}$. The randomized oracle is not an oracle in the proper sense, meaning that its answer may depend on the previous queries rather than only on the current one. In this manner, the randomized oracle is equivalent to a device with memory, which, being asked the question $x_j$, responds with the smallest number $s_j \le j$ such that $x_j - x_{s_j} \in D$. Indeed, if we have a classical probabilistic machine making queries to the randomized oracle, we can adapt it for use with the device just described. For this, the machine should be altered in such a way that it will transform each answer $s_j$ to $y_j$ and proceed as before. (That is, $y_j = y_{s_j}$ if $s_j < j$, or $y_j$ is uniformly distributed over $\mathbb{B}^{n} \setminus \{y_1,\dots,y_{j-1}\}$ if $s_j = j$.)
$^{10}$ One should keep in mind that the complexity of problems with oracle frequently differs from the complexity of ordinary computational problems. A classical example is the theorem stating that IP = PSPACE [58, 59]. The oracle analogue of this assertion is not true [28]!
Let the total number of queries be $l \le 2^{k/2}$. Without loss of generality all queries can be assumed different. In the case $D = \{0\}$, all answers are also different, i.e., $s_j = j$ for all $j$. Now we consider the case $D = \{0, z\}$, where $z$ is chosen randomly, with equal probabilities, from the set of all nonzero elements of the group $(\mathbb{Z}_2)^{k}$. Then, regardless of the algorithm that is used, $z \notin \{x_j - x_1,\dots,x_j - x_{j-1}\}$ with probability $\ge 1 - (j-1)/(2^{k}-1)$. The condition for $z$ implies that $s_j = j$. Summing the complementary probabilities over $j$, this is true for all $j = 1,\dots,l$ with probability $\ge 1 - l(l-1)/\bigl(2(2^{k}-1)\bigr) > 1/2$. Recall that we have two random parameters: $z$ and $r$. We can fix $z$ in such a way that the probability of obtaining the answers $s_j = j$ (for all $j$) will still be greater than $1/2$. Let us see what a classical machine would do in such a circumstance. If it gives the answer "$D = \{0\}$" with probability $\ge 2/3$, we set $D = \{0, z\}$, and then the resulting answer will be wrong with probability $> (2/3)\cdot(1/2) = 1/3$. If, however, the probability of the answer "$D = \{0\}$" is smaller than $2/3$, we set $D = \{0\}$. $\square$
Let us define a quantum analog of the oracle $f$. The corresponding quantum oracle is a unitary operator
\[ U : |x, y\rangle \mapsto |x,\, y\oplus f(x)\rangle \tag{13.2} \]
($\oplus$ denotes bitwise addition). We note that the quantum oracle allows linear combinations of the various queries; therefore it is possible to use it more efficiently than the classical oracle.
Now we describe Simon's algorithm for finding the hidden subgroup in $\mathbb{Z}_2^{k}$. Let $E = G/D$ and let $E^{*}$ be the group of characters on $E$. (By definition, a character is a homomorphism $E \to \mathbf{U}(1)$.) In the case $G = (\mathbb{Z}_2)^{k}$ we can characterize the group $E^{*}$ in the following manner:
\[ E^{*} = \bigl\{ h \in (\mathbb{Z}_2)^{k} \text{ such that } h\cdot z = 0 \text{ for all } z \in D \bigr\}, \]
where $h\cdot z$ denotes the inner product modulo 2. (The character corresponding to $h$ has the form $z \mapsto (-1)^{h\cdot z}$.) Let us show how one can generate a random element $h \in E^{*}$ using the operator $U$. After generating sufficiently many random elements, we will find the group $E^{*}$, hence the desired subgroup $D$.
We begin by preparing the state
\[ |\xi\rangle = 2^{-k/2} \sum_{x\in G} |x\rangle = H^{\otimes k}|0^{k}\rangle \]
in the first quantum register. In the second register we place the state $|0^{n}\rangle$ and apply the operator $U$. Then we discard the second register, i.e., we will no longer make use of its contents. Thus we obtain the mixed state
\[ \rho = \mathrm{Tr}_{2}\Bigl(U\bigl(|\xi\rangle\langle\xi|\otimes|0\rangle\langle 0|\bigr)U^{\dagger}\Bigr) = 2^{-k} \sum_{x,y:\ x-y\in D} |x\rangle\langle y|. \]
Now we apply the operator $H^{\otimes k}$ to the remaining first register. This yields a new mixed state
\[ \gamma = H^{\otimes k}\rho H^{\otimes k} = 2^{-2k} \sum_{a,b}\ \sum_{x,y:\ x-y\in D} (-1)^{a\cdot x - b\cdot y}\, |a\rangle\langle b|. \]
It is easy to see that $\sum_{x,y:\ x-y\in D} (-1)^{a\cdot x - b\cdot y}$ is different from zero only in the case where $a = b \in E^{*}$. Hence
\[ \gamma = \frac{1}{|E^{*}|} \sum_{a\in E^{*}} |a\rangle\langle a|. \]
This is precisely the density matrix corresponding to the uniform probability distribution on the group $E^{*}$. It remains to apply the following lemma, which we formulate as a problem.
[2!] Problem 13.1. Let $h_1,\dots,h_l$ be independent uniformly distributed random elements of an Abelian group $X$. Prove that they generate the entire group $X$ with probability $\ge 1 - |X|/2^{l}$.
Therefore, $2k$ random elements suffice to generate the entire group $E^{*}$ with probability of error $\le 2^{-k}$, where "error" refers to the case where $h_1,\dots,h_{2k}$ actually generate a proper subgroup of $E^{*}$. (Note that such a small probability of error, compared to $1/3$, is obtained without much additional expense. To make it still smaller, it is most efficient to use the standard procedure: repeat all calculations several times and choose the most frequent outcome.)
Summing up, to find the "hidden subgroup" $D$, we need $O(k)$ queries to the quantum oracle. The overall complexity of the algorithm is $O(k^{3})$.
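The classical post-processing step (recovering $D$ from sampled elements of $E^{*}$) amounts to linear algebra over $\mathbb{F}_2$. The sketch below is an illustration under stated assumptions: elements of $(\mathbb{Z}_2)^{k}$ are stored as Python integers used as bit masks, all function names are hypothetical, and the quantum sampling is replaced by a classical stand-in `random_element` that already knows a basis of $E^{*}$, so only the recovery of $D$ mirrors the algorithm in the text.

```python
import random

def rref(rows):
    """Reduced row echelon form over F_2; returns a list of (pivot_bit, row)."""
    pivots = []
    for r in rows:
        for bit, p in pivots:
            if (r >> bit) & 1:
                r ^= p                      # clear existing pivot bits from r
        if r:
            bit = r.bit_length() - 1        # new pivot position
            pivots = [(b, p ^ r if (p >> bit) & 1 else p) for b, p in pivots]
            pivots.append((bit, r))
    return pivots

def orthogonal_complement(rows, k):
    """Basis of {z in (Z_2)^k : h . z = 0 (mod 2) for every h in rows}."""
    pivots = rref(rows)
    pivot_bits = {b for b, _ in pivots}
    basis = []
    for f in range(k):                      # one basis vector per free coordinate
        if f in pivot_bits:
            continue
        z = 1 << f
        for b, p in pivots:
            if (p >> f) & 1:
                z |= 1 << b                 # pivot coordinate forced by the equation
        basis.append(z)
    return basis

def span(basis):
    """All elements of the subgroup generated by basis."""
    elems = {0}
    for b in basis:
        elems |= {e ^ b for e in elems}
    return elems

def random_element(basis, rng):
    """Uniform element of span(basis), assuming basis is linearly independent."""
    h = 0
    for b in basis:
        if rng.random() < 0.5:
            h ^= b
    return h

k = 4
D_basis = [0b0011, 0b0101]                       # hypothetical hidden subgroup
E_star_basis = orthogonal_complement(D_basis, k)

rng = random.Random(0)
samples = [random_element(E_star_basis, rng) for _ in range(2 * k)]
recovered = orthogonal_complement(samples, k)    # candidate basis for D
print(sorted(span(recovered)), sorted(span(D_basis)))
```

With $2k$ samples the recovered subgroup coincides with $D$ except with probability at most $2^{-k}$; when the samples happen to generate only a proper subgroup of $E^{*}$, the recovered subgroup is strictly larger than $D$.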
13.2. Factoring and finding the period for raising to a power. A second piece of evidence in favor of the hypothesis $\mathrm{BQP} \supset \mathrm{BPP}$ is provided by the fast quantum algorithms for factoring integers into primes and for another number-theoretic problem, finding the discrete logarithm. Both were found by P. Shor [62]. Let us discuss the first of these two problems.
Factoring (an integer into primes). Suppose we are given a positive integer $y$. It is required to find its decomposition into prime factors
\[ y = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}. \]
This problem is thought to be so complex that practical cryptographic algorithms are based on the hypothetical difficulty of its solution. From the theoretical viewpoint, the situation is somewhat worse: there is neither a reduction of problems of class NP to the factoring problem, nor any other "direct" evidence in favor of its complexity. (The word "direct" is put in quotation marks because at present the answer to the question $\mathrm{P} \stackrel{?}{=} \mathrm{NP}$ is unknown.) Therefore, the conjecture about the complexity of the factoring problem complements the abundant collection of unproved conjectures in computational complexity theory. It is desirable to decrease the number of such problems. Shor's result is a significant step in this direction: if we commit an "act of faith" and believe in the complexity of the factoring problem, then the need for yet another act of faith (regarding the greater computational power of the quantum computer) disappears.
We will construct a fast quantum algorithm for solving not the factoring problem, but another problem, called Period finding, to which the factoring problem is reduced with the aid of a classical probabilistic algorithm.
Period finding. Suppose we are given an integer $q > 1$ that can be written using at most $n$ binary digits (i.e., $q < 2^{n}$) and another integer $a$ such that $1 \le a < q$ and $\gcd(a,q) = 1$ (where $\gcd(a,q)$ denotes the greatest common divisor). It is required to find the period of $a$ with respect to $q$, i.e., the smallest positive number $t$ such that $a^{t} \equiv 1 \pmod{q}$.
In other words, the period is the order of the number $a$ in the multiplicative group of residues $(\mathbb{Z}/q\mathbb{Z})^{*}$. We will denote the period of $a$ with respect to $q$ by $\mathrm{per}_q(a)$.
Below we will examine a quantum algorithm for the solution of the period finding problem. But we will begin by describing the classical probabilistic reduction of the factoring problem to the problem of finding the period. We suggest that the reader review the probabilistic test for primality presented in Part 1 (see Section 4.2).
13.3. Reduction of factoring to period finding. Thus, let us assume that we know how to find the period. It is clear that we can factor the number $y$ by running $O(\log y)$ times a subprogram which, for any composite number, finds a nontrivial divisor with probability at least $1/2$. (Of course, it is also necessary to use the standard procedure for amplification of success probability; see formula (4.1) on p. 37 and the paragraph preceding it.)
Procedure for finding a nontrivial divisor.
Input. An integer $y$ ($y > 1$).
Step 1. Check $y$ for parity. If $y$ is even, then give the answer "2"; otherwise proceed to Step 2.
Step 2. Check whether $y$ is the $k$-th power of an integer for $k = 2,\dots,\log_2 y$. If $y = m^{k}$, then give the answer "$m$"; otherwise proceed to Step 3.
Step 3. Choose an integer $a$ randomly and uniformly between 1 and $y-1$. Compute $b = \gcd(a,y)$ (say, by Euclid's algorithm). If $b > 1$, then give the answer "$b$"; otherwise proceed to Step 4.
Step 4. Compute $r = \mathrm{per}_y(a)$ (using the period finding algorithm that we assume we have). If $r$ is odd, then the answer is "$y$ is prime" (which means that we give up finding a nontrivial divisor). Otherwise proceed to Step 5.
Step 5. Compute $d = \gcd(a^{r/2} - 1,\, y)$. If $d > 1$, then the answer is "$d$"; otherwise the answer is "$y$ is prime".
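The five steps above can be sketched in code as follows. This is an illustration only: the function `period` finds $\mathrm{per}_y(a)$ by brute force and stands in for the quantum period finding subroutine that the text assumes; the names `integer_root` and `find_divisor`, and the convention of returning `None` for the answer "$y$ is prime", are choices made here.

```python
import math
import random

def period(a, y):
    """Brute-force per_y(a): smallest t >= 1 with a^t = 1 (mod y).

    Classical stand-in for the period finding subroutine; requires gcd(a, y) = 1.
    """
    t, x = 1, a % y
    while x != 1:
        x = (x * a) % y
        t += 1
    return t

def integer_root(y, k):
    """Largest integer m with m**k <= y, found by binary search."""
    lo, hi = 1, y
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** k <= y:
            lo = mid
        else:
            hi = mid - 1
    return lo

def find_divisor(y, rng=random):
    """One run of the procedure; returns a divisor or None ("y is prime")."""
    if y % 2 == 0:                                  # Step 1
        return 2
    for k in range(2, y.bit_length() + 1):          # Step 2: is y a perfect k-th power?
        m = integer_root(y, k)
        if m > 1 and m ** k == y:
            return m
    a = rng.randrange(1, y)                         # Step 3: uniform in 1..y-1
    b = math.gcd(a, y)
    if b > 1:
        return b
    r = period(a, y)                                # Step 4
    if r % 2 == 1:
        return None
    d = math.gcd(pow(a, r // 2, y) - 1, y)          # Step 5
    return d if d > 1 else None

rng = random.Random(1)
for y in (15, 21, 91):
    d = None
    while d is None:                                # repeat until a run succeeds
        d = find_divisor(y, rng)
    print(y, d, y // d)
```

For a prime input every run returns `None`, in agreement with the remark that the procedure also works as a primality test; for an odd composite input that is not a prime power, each run succeeds with probability at least $1/2$, so the loop ends after a few iterations on average.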
Analysis of the divisor finding procedure. If the above procedure yields a number, it is a nontrivial divisor of $y$. The procedure fails and gives the answer "$y$ is prime" in two cases: 1) when $r = \mathrm{per}_y(a)$ is odd, or 2) when $r$ is even but $\gcd(a^{r/2}-1,\, y) = 1$, i.e., $a^{r/2}-1$ is invertible modulo $y$. However, $(a^{r/2}+1)(a^{r/2}-1) \equiv a^{r}-1 \equiv 0 \pmod{y}$, hence $a^{r/2}+1 \equiv 0 \pmod{y}$ in this case. The converse is also true: if $r$ is even and $a^{r/2}+1 \equiv 0 \pmod{y}$, then the answer is "$y$ is prime".
Let us prove that our procedure succeeds with probability at least $1 - 1/2^{k-1}$, where $k$ is the number of distinct prime divisors of $y$. (Note that this probability vanishes for prime $y$, so that the procedure also works as a primality test.) In the proof we will need the Chinese Remainder Theorem (Theorem A.5 on page 241) and the fact that the multiplicative group of residues modulo $p^{\alpha}$, $p$ an odd prime, is cyclic (see Theorem A.11); recall that $y$ is odd once Step 1 has been passed.
Let $y = \prod_{j=1}^{k} p_j^{\alpha_j}$ be the decomposition of $y$ into prime factors. We introduce the notation
\[ a_j \equiv a \pmod{p_j^{\alpha_j}}, \qquad r_j = \mathrm{per}_{(p_j^{\alpha_j})}\, a_j = 2^{s_j} r_j', \quad \text{where } r_j' \text{ is odd}. \]
By the Chinese Remainder Theorem, $r$ is the least common multiple of all the $r_j$. Hence $r = 2^{s} r'$, where $s = \max\{s_1,\dots,s_k\}$ and $r'$ is odd.
We now prove that the procedure yields the answer "$y$ is prime" if and only if $s_1 = s_2 = \dots = s_k$. Indeed, if $s_1 = \dots = s_k = 0$, then $r$ is odd. If $s_1 = \dots = s_k \ge 1$, then $r$ is even, but $a_j^{r_j/2} \equiv -1 \pmod{p_j^{\alpha_j}}$ (using the cyclicity of the group $(\mathbb{Z}/p_j^{\alpha_j}\mathbb{Z})^*$), hence $a^{r/2} \equiv -1 \pmod{y}$ (using the Chinese Remainder Theorem). Thus the procedure yields the answer "$y$ is prime" in both cases. Conversely, if not all the $s_j$ are equal, then $r$ is even and $s_m < s$ for some $m$, so that $a_m^{r/2} \equiv 1 \pmod{p_m^{\alpha_m}}$. Hence $a^{r/2} \not\equiv -1 \pmod{y}$, i.e., the procedure yields a nontrivial divisor.
To give a lower bound of the success probability, we may assume that the procedure has reached Step 4. Thus $a$ is chosen according to the uniform distribution over the group $(\mathbb{Z}/y\mathbb{Z})^*$. By the Chinese Remainder Theorem, the uniform random choice of $a$ is the same as the independent uniform random choice of $a_j \in (\mathbb{Z}/p_j^{\alpha_j}\mathbb{Z})^*$ for each $j$. Let us fix $j$, choose some $s \ge 0$ and estimate the probability of the event $s_j = s$ for the uniform distribution of $a_j$. Let $g_j$ be a generator of the cyclic group $(\mathbb{Z}/p_j^{\alpha_j}\mathbb{Z})^*$. The order of this group may be represented as $p_j^{\alpha_j} - p_j^{\alpha_j - 1} = 2^{t_j} q_j$, where $q_j$ is odd. Then
$$\bigl|\{a_j : s_j = s\}\bigr| = \bigl|\{g_j^{\,l} : l = 2^{t_j - s} m, \text{ where } m \text{ is odd}\}\bigr| = \begin{cases} q_j & \text{if } s = 0,\\ (2^{s} - 2^{s-1})\,q_j & \text{if } s = 1, \dots, t_j. \end{cases}$$
For any given $s$, the probability of the event $s_j = s$ does not exceed $1/2$. Now let $s = s_1$ be a random number (depending on $a_1$); then $\Pr[s_j = s] \le 1/2$ for $j = 2, \dots, k$. It follows that
$$\Pr[s_1 = s_2 = \dots = s_k] \le (1/2)^{k-1}.$$
This yields the desired estimate of the success probability for the entire procedure: with probability at least $1 - 1/2^{k-1}$ the procedure finds a nontrivial divisor of $y$.
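The criterion and the bound can be checked by brute force on a small example. The sketch below assumes $y = 105 = 3 \cdot 5 \cdot 7$ (so $k = 3$) as an arbitrary test case; the helper names are hypothetical, and the period is again computed classically.

```python
import math

def period(a, n):
    """Smallest r >= 1 with a**r = 1 (mod n); assumes gcd(a, n) == 1."""
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r

def two_adic(r):
    """Exponent s in the decomposition r = 2**s * (odd number)."""
    s = 0
    while r % 2 == 0:
        r //= 2
        s += 1
    return s

def fails(a, y):
    """True if Steps 4-5 answer "y is prime" for this a (gcd(a, y) == 1 assumed)."""
    r = period(a, y)
    return r % 2 == 1 or math.gcd(pow(a, r // 2, y) - 1, y) == 1

y, prime_powers = 105, [3, 5, 7]
units = [a for a in range(1, y) if math.gcd(a, y) == 1]
for a in units:
    s = [two_adic(period(a, pp)) for pp in prime_powers]
    # failure happens exactly when s_1 = ... = s_k
    assert fails(a, y) == (len(set(s)) == 1)
failure_rate = sum(fails(a, y) for a in units) / len(units)
```

For this example the failure rate stays below the bound $(1/2)^{k-1} = 1/4$.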
13.4. Quantum algorithm for finding the period: the basic idea.
Thus, the problem is this: given the numbers $q$ and $a$, construct a polynomial size quantum circuit that computes $\mathrm{per}_q(a)$ with error probability $\varepsilon \le 1/3$. The circuit will operate on a single $n$-qubit register, as well as on many other qubits, some of which may be considered classical. The $n$-qubit register is meant to represent residues modulo $q$ (recall that $q < 2^n$).
Let us examine the operator that multiplies the residues by $a$, acting by the rule
$$U_a : |x\rangle \mapsto |ax \bmod q\rangle.$$
(A more accurate notation would be $U_{q,a}$, indicating the dependence on $q$. However, $q$ is fixed throughout the computation, so we suppress it from the subscript. We keep $a$ because we will also use the operators $U_b$ for arbitrary $b$.) This operator permutes the basis vectors for $0 \le x < q$ (recall that $(a, q) = 1$). However, we represent $|x\rangle$ by $n$ qubits, so $x$ may take any value between $0$ and $2^n - 1$. We will assume that the operator $U_a$ acts trivially on the remaining basis vectors, i.e., $U_a|x\rangle = |x\rangle$ for $q \le x < 2^n$.
Since for the multiplication of the residues there is a Boolean circuit of polynomial size, namely $O(n^2)$, there is a quantum circuit (with ancillas) of about the same size.
The permutation given by the operator $U_a$ can be decomposed into cycles. The cycle containing $1$ is $(1, a, a^2, \dots, a^{\mathrm{per}_q(a)-1})$; it has length $\mathrm{per}_q(a)$. The algorithm we are discussing begins at the state $|1\rangle$, to which the operator $U_a$ gets applied many times. But such transformations do not take us beyond the orbit of $1$ (the set of elements which constitute the cycle described above). Therefore we consider the restriction of the operator $U_a$ to the subspace generated by the orbit of $1$.
Eigenvalues of $U_a$: $\lambda_k = e^{2\pi i \cdot k/t}$, where $t$ is the period;
Eigenvectors of $U_a$: $|\xi_k\rangle = \dfrac{1}{\sqrt{t}} \displaystyle\sum_{m=0}^{t-1} e^{-2\pi i \cdot km/t}\, |a^m\rangle$.
It is easy to verify that the vectors $|\xi_k\rangle$ are indeed eigenvectors. It suffices to note that the multiplication by $a$ leads to a shift of the indices in the sum. If we change the variable of summation in order to remove this shift, we get the factor $e^{2\pi i \cdot k/t}$.
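The eigenvector property can be checked numerically. The following sketch assumes the small example $q = 15$, $a = 2$ (chosen only for illustration) and represents a state as a dictionary from residues to amplitudes; all function names are hypothetical.

```python
import cmath

q, a = 15, 2                      # illustrative values with gcd(a, q) = 1
orbit = [1]
while (orbit[-1] * a) % q != 1:   # orbit[m] = a**m mod q
    orbit.append((orbit[-1] * a) % q)
t = len(orbit)                    # the period per_q(a)

def xi(k):
    """Amplitudes of |xi_k> as a dict {residue: amplitude}."""
    return {orbit[m]: cmath.exp(-2j * cmath.pi * k * m / t) / t ** 0.5
            for m in range(t)}

def apply_U(state):
    """U_a sends the basis vector |x> to |a*x mod q>."""
    return {(a * x) % q: amp for x, amp in state.items()}

def max_eigen_error(k):
    """Largest deviation between U_a|xi_k> and lambda_k |xi_k>."""
    lam = cmath.exp(2j * cmath.pi * k / t)
    v, Uv = xi(k), apply_U(xi(k))
    return max(abs(Uv[x] - lam * v[x]) for x in v)
```

Each `max_eigen_error(k)` should vanish up to floating-point rounding.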
If we are able to measure the eigenvalues of the operator $U_a$, then we can obtain the numbers $k/t$. First let us analyze how this will help us in determining the period.
Suppose we have a machine which in each run gives us the number $k/t$, where $t$ is the sought-for period and $k$ is a random number uniformly distributed over the set $\{0, \dots, t-1\}$. We suppose that $k/t$ is represented as an irreducible fraction $k'/t'$ (if the machine were able to give the number in the form $k/t$, there would be no problem at all).
Having obtained several fractions of the form $k'_1/t'_1,\ k'_2/t'_2,\ \dots,\ k'_l/t'_l$, we can, with high probability, find the number $t$ by reducing these fractions to a common denominator.
Lemma 13.2. If $l \ge 2$ fractions are obtained, then the probability that their least common denominator is different from $t$ is less than $3 \cdot 2^{-l}$.
Proof. The fractions $k'_1/t'_1,\ k'_2/t'_2,\ \dots,\ k'_l/t'_l$ can be obtained as reductions of fractions $k_1/t, \dots, k_l/t$ (i.e., $k'_j/t'_j = k_j/t$), where $k_1, \dots, k_l$ are independently distributed random numbers. The least common multiple of $t'_1, \dots, t'_l$ equals $t$ if and only if the greatest common divisor of $k_1, \dots, k_l$ and $t$ is equal to $1$.
The probability that $k_1, \dots, k_l$ have a common prime divisor $p$ does not exceed $1/p^{l}$. Therefore the probability of not getting $t$ after reducing to a common denominator does not exceed $\sum_{k=2}^{\infty} \frac{1}{k^{l}} < 3 \cdot 2^{-l}$ (the range of the index $k$ in this sum obviously includes all prime divisors of $t$). □
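The classical post-processing in Lemma 13.2 amounts to taking the least common multiple of the reduced denominators. The sketch below simulates the idealized machine with a pseudo-random generator; `recover_period` and `success_rate` are hypothetical names, and the sampled $k$ replaces the quantum measurement.

```python
from fractions import Fraction
from math import gcd
import random

def recover_period(samples):
    """Least common denominator of fractions k/t given in lowest terms."""
    t = 1
    for f in samples:
        t = t * f.denominator // gcd(t, f.denominator)
    return t

def success_rate(t, l, trials=2000, seed=0):
    """Empirical probability that l samples k/t (k uniform) determine t."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Fraction reduces k/t to lowest terms automatically
        samples = [Fraction(rng.randrange(t), t) for _ in range(l)]
        hits += recover_period(samples) == t
    return hits / trials
```

For instance, the samples $2/12 = 1/6$ and $3/12 = 1/4$ already give the denominator $12$, while $0/12$ and $6/12 = 1/2$ give only $2$.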
Now we construct the machine $M$ that generates the number $k/t$ (in the form of an irreducible fraction) for random uniformly distributed $k$. This will be a quantum circuit which realizes the measuring operator $W = \sum_{k=0}^{t-1} V_k \otimes \Pi_{\mathcal{L}_k}$, where $\mathcal{L}_k = \mathbb{C}(|\xi_k\rangle)$, the subspace generated by $|\xi_k\rangle$. The operators $V_k$ are of the form $|0\rangle \mapsto \sum_{y,z} |y, z\rangle$ (with coefficients depending on $k$), where $y$ is an irreducible fraction and $z$ is garbage. Here the conditional probabilities should satisfy the inequality
$$\mathbf{P}\bigl(\overline{k/t}\,\big|\,k\bigr) \;\overset{\mathrm{def}}{=}\; \sum_{z} \Bigl|\bigl\langle \overline{k/t}, z \bigm| V_k \bigm| 0 \bigr\rangle\Bigr|^{2} \;\ge\; 1 - \varepsilon, \tag{13.3}$$
where $\overline{k/t}$ denotes the irreducible fraction equal to the rational number $k/t$.
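The construction of such a measuring circuit is rather complex, so we first explain how it is used to generate the outcome $y$ with the desired probability $w_y = \sum_{k \in M_y} \frac{1}{t}$, where $M_y = \bigl\{k : \overline{k/t} = y\bigr\}$ and $\overline{k/t}$ is the rational number $k/t$ written as an irreducible fraction. Let us take the state $|1\rangle$ as the initial state. A direct computation (the reader is advised to carry it through) shows that
$$|1\rangle = \frac{1}{\sqrt{t}} \sum_{k=0}^{t-1} |\xi_k\rangle.$$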
If we perform the measurement on this state, then by the formula for total probability we obtain
$$\Pr[\text{outcome} = y] = \mathbf{P}\bigl(W(|0\rangle \otimes |1\rangle),\, y\bigr) = \sum_{k} \mathbf{P}(y|k)\, \mathbf{P}(|1\rangle, \mathcal{L}_k).$$
The probabilities of all $|\xi_k\rangle$ are equal: $\mathbf{P}(|1\rangle, \mathcal{L}_k) = |\langle \xi_k | 1 \rangle|^{2} = 1/t$, which corresponds to the uniform distribution of $k$. The property (13.3) guarantees that we obtain the outcome $k/t$ (as an irreducible fraction) with probability $\ge 1 - \varepsilon$. The reader may find this statement not rigorous because $k$ does not have a definite value. To be completely pedantic, we need to derive an inequality similar to (10.4), namely,
$$\sum_{y} \bigl|\Pr[\text{outcome} = y] - w_y\bigr| \le 2\varepsilon.$$
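Schematically, the machine $M$ functions as follows:
$$|1\rangle \;\xrightarrow{\ \text{random choice of } k \text{ (God playing dice)}\ }\; |\xi_k\rangle \;\xrightarrow{\ \text{measuring of } W\ }\; \begin{cases} y \ne k/t & \text{with probability} \le \varepsilon,\\ y = k/t & \text{with probability} \ge 1 - \varepsilon. \end{cases}$$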
The random choice of $k$ happens automatically, without applying any operator whatsoever. Indeed, the formula of total probability is arranged as if a random $k$ were generated before the measurement begins and then remained constant. (Of course, the formula is only true when the operator $W$ is measuring with respect to the given subspaces $\mathcal{L}_k$.)
13.5. The phase estimation procedure. Now we will construct the operator that measures the eigenvalues of $U_a$. The eigenvalues have the form
$$\lambda_k = e^{2\pi i \varphi_k}, \quad \text{where } \varphi_k = \frac{k}{t} \bmod 1.$$
The phase $\varphi_k$ is a real number modulo $1$, i.e., $\varphi_k \in \mathbb{R}/\mathbb{Z}$. (The set $\mathbb{R}/\mathbb{Z}$ can be conveniently represented as a circle of unit length.) The procedure for determining $\varphi_k$ is called phase estimation.
As we already mentioned, we can limit ourselves to the study of the action of the operator $U_a$ on the input vector $|\xi_k\rangle$. The construction is divided into four stages.
1. We construct a measuring operator such that the conditional probabilities depend on the value of $\varphi = \varphi_k$. Thus a single use of this operator will give us some information about $\varphi$ (just as flipping a biased coin tells something about the bias, though inconclusively) (see 13.5.1).
2. We localize the value of $\varphi$ with modest precision. This is the moment to emphasize that, in all the arguments, there are two parameters: the probability of error $\varepsilon$ and the precision $\delta$. As the result of a measurement, we obtain some number $y$, for which the condition $|y - \varphi|_{\bmod 1} < \delta$ must hold with probability at least $1 - \varepsilon$. (Here $|\cdot|_{\bmod 1}$ denotes the distance on the unit length circle, e.g., $|0.1 - 0.9|_{\bmod 1} = 0.2$.) For the time being, a modest precision will do, say $\delta = 1/16$ (see 13.5.2).
3. Now we must increase the precision. Specifically, we determine $\varphi$ with precision $1/2^{2n+2}$ (see 13.5.3).
4. We need to pass from the approximate value of $\varphi$ to the exact one, represented in the form of an irreducible fraction. It is essential to be able to distinguish between numbers of the form $\varphi = k/t$, where $0 \le k < t < 2^n$. Notice that if $k_1/t_1 \ne k_2/t_2$, then $|k_1/t_1 - k_2/t_2|_{\bmod 1} \ge 1/(t_1 t_2) > 1/2^{2n}$. Therefore, knowing $\varphi = k/t$ with precision $1/2^{2n+1}$, one can, in principle, determine its exact value. Moreover, this can be done efficiently by the use of continued fractions (see 13.5.4).
At stage 3, we will use the operator $U_b$ for arbitrary $b$ (not just for $b = a$, the number for which we seek the period). To this end, we introduce an operator $U$ that sends $|b, x\rangle$ to $|b, bx \bmod q\rangle$ whenever $\gcd(b, q) = 1$. How the operator $U$ acts in the remaining cases is not important; this action can be defined in an arbitrary computationally trivial way, so that $U$ is represented by a quantum circuit of size $O(n^2)$. In fact, all the earlier arguments about the simulation of Boolean circuits by quantum circuits hold true for the simulation of circuits that compute partially defined functions.
[1] Problem 13.2. Using the operator $U$, realize the operator $\Lambda(U_b)$ for arbitrary $b$ relatively prime to $q$.
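13.5.1. How to get some information about the phase. In Section 12 we introduced the operator $\Xi(U_a) = (H \otimes I)\,\Lambda(U_a)\,(H \otimes I)$, which measures the eigenvalues. In our case $\lambda_k = e^{2\pi i \varphi_k}$ and we can write the operator in the form
$$\Xi(U_a) = \sum_{k} V_k \otimes \Pi_{\mathcal{L}_k}, \qquad V_k = \frac{1}{2}\begin{pmatrix} 1 + e^{2\pi i \varphi_k} & 1 - e^{2\pi i \varphi_k} \\ 1 - e^{2\pi i \varphi_k} & 1 + e^{2\pi i \varphi_k} \end{pmatrix},$$
and its action in the form
$$|0\rangle \otimes |\xi_k\rangle \;\xrightarrow{\ \Xi(U_a)\ }\; \Bigl( \frac{1 + e^{2\pi i \varphi_k}}{2}\,|0\rangle + \frac{1 - e^{2\pi i \varphi_k}}{2}\,|1\rangle \Bigr) \otimes |\xi_k\rangle,$$
so that for the conditional probabilities we get the expressions
$$\mathbf{P}(0|k) = \Bigl|\frac{1 + e^{2\pi i \varphi_k}}{2}\Bigr|^{2} = \frac{1 + \cos(2\pi\varphi_k)}{2}, \qquad \mathbf{P}(1|k) = \frac{1 - \cos(2\pi\varphi_k)}{2}.$$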
Although the conditional probabilities depend on $\varphi_k$, they do not allow one to distinguish between $\varphi_k = \varphi$ and $\varphi_k = -\varphi$. That is why another type of measurement is needed. We will use the operator $\Xi(iU_a)$. Its realization is shown in the diagram below. It uses the operator $K = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$ from the standard basis.
[Diagram: circuit realizing $\Xi(iU_a)$ from the gates $H$, $K$, $H$ and the controlled operator $U_a$; the encircled part is labeled $\Lambda(iU_a)$.]
The encircled part of the diagram realizes the operator $\Lambda(iU_a)$. Indeed, $K$ multiplies only $|1\rangle$ by $i$, but this is just the case where the operator $U_a$ is applied (by the definition of $\Lambda(U_a)$). For the operator $\Xi(iU_a)$ the conditional probabilities are
$$\mathbf{P}(0|k) = \frac{1 - \sin(2\pi\varphi_k)}{2}, \qquad \mathbf{P}(1|k) = \frac{1 + \sin(2\pi\varphi_k)}{2}.$$
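Both pairs of conditional probabilities follow from the amplitudes $(1 \pm \lambda)/2$, with $\lambda = e^{2\pi i\varphi}$ for $\Xi(U_a)$ and $\lambda = i e^{2\pi i\varphi}$ for $\Xi(iU_a)$. A minimal numerical sketch (the function name and the flag `with_K` are hypothetical):

```python
import cmath
import math

def outcome_probs(phi, with_K=False):
    """Return (P(0), P(1)) for the control qubit when the eigenvalue is exp(2*pi*i*phi).

    with_K=True models Xi(iU_a): the eigenvalue is multiplied by i.
    """
    lam = cmath.exp(2j * math.pi * phi)
    if with_K:
        lam *= 1j
    return abs((1 + lam) / 2) ** 2, abs((1 - lam) / 2) ** 2
```

Without $K$ the outcome distribution is the same for $\varphi$ and $-\varphi$; with $K$ the two cases differ, which is the reason for using both operators.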
The complexity of the realization of the operators $\Xi(U_a)$ and $\Xi(iU_a)$ depends on the complexity of the operator $\Lambda(U_a)$, which is not much higher than the complexity of the operator $U$ (cf. Problem 13.2). Thus, $\Xi(U_a)$ and $\Xi(iU_a)$ can be realized by quantum circuits of size $O(n^2)$ in the standard basis.
Input: $a$ and $q$.
- Computation of the powers $a^{2^j} \bmod q$ for $j = 0, \dots, 2n-1$ (classical).
- Setting up $l$ quantum registers, containing the base state $|1\rangle$.
Phase estimation part (boxed), shown for one register; similar computations are done for the others:
    • $4ns$ auxiliary qubits ("measurement devices") in the state $0 \dots 0$, arranged in $2n$ groups of $2s$ qubits.
    • Quantum measurements with the aid of the operators $\Xi\bigl(U_{a^{2^j}}\bigr)$ and $\Xi\bigl(iU_{a^{2^j}}\bigr)$.
      Measurement outcomes (0s and 1s): $v^{(1)}_1 \dots v^{(1)}_s,\ \tilde v^{(1)}_1 \dots \tilde v^{(1)}_s,\ \dots,\ v^{(2n)}_1 \dots v^{(2n)}_s,\ \tilde v^{(2n)}_1 \dots \tilde v^{(2n)}_s$.
    • Counting the numbers of 0s and 1s (classical).
      Result: $\cos\varphi \pm \delta$, $\sin\varphi \pm \delta$, $\dots$, $\cos(2^{2n-1}\varphi) \pm \delta$, $\sin(2^{2n-1}\varphi) \pm \delta$.
    • Trigonometric calculations (classical).
      Result: $\varphi \pm \frac{1}{16}$, $\dots$, $2^{2n-1}\varphi \pm \frac{1}{16}$.
    • Sharpening the value of $\varphi$ using the set of numbers $\varphi, \dots, 2^{2n-1}\varphi$ (classical).
      Result: $\varphi \pm 2^{-(2n+2)}$.
    • Determining the exact value of $\varphi$ with the aid of continued fractions (classical).
      Result: $k'/t'$, some fraction equal to $\varphi$.
The $l$ registers yield the fractions $k'_1/t'_1,\ k'_2/t'_2,\ \dots,\ k'_l/t'_l$.
- Calculation of the least common denominator (classical).
Answer: $t$ (with probability of error $< 3 \cdot 2^{-l} + nl\,e^{-\Omega(s)}$).
Table 13.1. General scheme of the period finding algorithm. Shown in a box is the phase estimation part.
13.5.2. Determining the phase with constant precision. We want to localize the value of $\varphi = \varphi_k$, i.e., to infer the inequality $|\varphi - y|_{\bmod 1} < \delta$ for some (initially unknown) $y$ and a given precision $\delta$. To get such an estimate, we apply the operators $\Xi(U_a)$ and $\Xi(iU_a)$ to the same "object of measurement" but different "instruments" (auxiliary qubits). The reasoning is the same for both operators, so we limit ourselves to the case $\Xi(U_a)$.
We have the quantum register $A$ that contains $|\xi_k\rangle$. Actually, this register initially contains $|1\rangle = \frac{1}{\sqrt{t}} \sum_{k=0}^{t-1} |\xi_k\rangle$, but we consider each $|\xi_k\rangle$ separately. (We can do this because we apply only operators that are measuring with respect to the orthogonal decomposition $\bigoplus_{k} \mathbb{C}(|\xi_k\rangle)$, so that different eigenvectors do not mix.) Let us introduce a large number $s$ of auxiliary qubits. Each of them will be used in applying the operator $\Xi(U_a)$.
As was proved in Section 12 (see p. 114), the conditional probabilities in such a case multiply. For the operators $\prod_{r=1}^{s} \Xi(U_a)[r, A]$, the conditional probabilities are equal to $\mathbf{P}(v_1, \dots, v_s | k) = \prod_{r=1}^{s} \mathbf{P}(v_r | k)$ (here $v_r$ denotes the value of the $r$-th auxiliary qubit).
From this point on, the qubits containing the results of the "experiments" will only be operated upon classically. Since the conditional probabilities multiply, we can assume that we are estimating the probability $p_* = \mathbf{P}(1|k)$ of the outcome 1 ("head" in a coin flip) by performing a series of Bernoulli trials.
If the coin is tossed $s$ times (where $s$ is large), then the observed frequency $\bigl(\sum v_r\bigr)/s$ of the outcome 1 is close to its probability $p_*$. What is the accuracy of this estimate? The exact question: with what probability does the number $\bigl(\sum v_r\bigr)/s$ fail to approximate $p_*$ with a given precision $\delta$? The answer is given by Chernoff's bound:
$$\Pr\Bigl[\,\Bigl| s^{-1} \sum_{r=1}^{s} v_r - p_* \Bigr| \ge \delta \Bigr] \le 2e^{-2\delta^{2} s}. \tag{13.4}$$
(This inequality is a generalization of the inequality (4.1) which was used to prove the amplification of success probability in the definitions of BPP and BQP.) Thus, for a fixed $\delta$ we can find a suitable constant $c = c(\delta)$ such that the error is smaller than $\varepsilon$ when $s = \lceil c \log(1/\varepsilon) \rceil = \Theta(\log(1/\varepsilon))$ trials are made.
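Solving $2e^{-2\delta^2 s} \le \varepsilon$ for $s$ gives $s \ge \ln(2/\varepsilon)/(2\delta^2)$, which makes the dependence $s = \Theta(\log(1/\varepsilon))$ explicit. A small sketch (the function name is hypothetical):

```python
import math

def trials_needed(delta, eps):
    """Smallest s for which the bound 2*exp(-2*delta**2*s) is at most eps."""
    return math.ceil(math.log(2 / eps) / (2 * delta ** 2))

def chernoff_bound(delta, s):
    """Right-hand side of inequality (13.4)."""
    return 2 * math.exp(-2 * delta ** 2 * s)
```

For example, with $\delta = 0.1$ and $\varepsilon = 0.01$ this gives $s = 265$ trials.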
So, we have learned how to find $\cos(2\pi\varphi)$ and $\sin(2\pi\varphi)$ with any given precision $\delta$. Now we choose $\delta$ so that the value $\varphi$ can be determined from the values of the sine and the cosine with precision $1/16$. This still takes $\Theta(\log(1/\varepsilon))$ trials. The second stage is completed.
[3] Problem 13.3. Prove the inequality (13.4).
13.5.3. Determining the phase with exponential precision. To increase the precision, we will use, along with $\Lambda(U_a)$, the operators $\Lambda\bigl((U_a)^{2^j}\bigr)$ for all $j \le 2n-1$. We can quickly raise numbers to a power, but, in general, computing a power of an operator is difficult. However, the operation $U_a$ of $(\bmod\ q)$-multiplication by $a$ possesses the following remarkable property:
$$(U_a)^{p} = U_{a^{p}} = U_{(a^{p} \bmod q)}.$$
Consequently, $\Lambda\bigl((U_a)^{2^j}\bigr) = \Lambda(U_b)$, where $b \equiv a^{2^j} \pmod{q}$. The required values for the parameter $b$ can be calculated using a circuit of polynomial size; then we can apply the result of Problem 13.2.
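The values $b = a^{2^j} \bmod q$ for $j = 0, \dots, 2n-1$ are obtained classically by repeated squaring, one modular multiplication per entry. A minimal sketch (the function name `power_table` is hypothetical):

```python
def power_table(a, q, n):
    """Return [a**(2**j) mod q for j = 0, ..., 2n-1] by repeated squaring."""
    table = [a % q]
    for _ in range(2 * n - 1):
        table.append(table[-1] * table[-1] % q)   # square the previous entry
    return table
```

Entry `j` of the table is the parameter $b$ for which $\Lambda(U_b)$ equals $\Lambda\bigl((U_a)^{2^j}\bigr)$.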
Let us return to the circuit described in 13.5.1. We found the eigenvalue $\lambda_k = \lambda = e^{2\pi i \varphi}$ for some eigenvector $|\xi_k\rangle$. This same vector is an eigenvector for any power of the operator $U_a$, so that in the same quantum register we can look for an eigenvalue of $U_a^{2} = U_{a^2}$ (it equals $\lambda^{2} = e^{2\pi i \cdot 2\varphi}$), of $U_a^{4} = U_{a^4}$ (it equals $\lambda^{4} = e^{2\pi i \cdot 4\varphi}$), etc.
In other words, we can determine with precision $1/16$ the values of $\varphi, 2\varphi, \dots, 2^{2n-1}\varphi$ modulo $1$. But this allows us to determine $\varphi$ with precision $1/2^{2n+2}$ efficiently (in linear time with constant memory). The idea is based on the following obvious fact: if $|y - 2\varphi|_{\bmod 1} < \delta < 1/2$, then
$$\text{either } |y'_0 - \varphi|_{\bmod 1} < \delta/2 \ \text{ or } \ |y'_1 - \varphi|_{\bmod 1} < \delta/2,$$
where $y'_0, y'_1$ are the solutions to the equation $2y' \equiv y \pmod{1}$. Thus we can start from $2^{2n-1}\varphi$ and increase the precision as we proceed toward $\varphi$. The approximate values of $2^{j-1}\varphi$ ($j = 2n, 2n-1, \dots, 1$) will allow us to make the correct choices.
Let $m = 2n$. For $j = 1, \dots, m$ we replace the known approximate value of $2^{j-1}\varphi$ by $\beta_j$, the closest number from the set $\bigl\{\frac{0}{8}, \frac{1}{8}, \frac{2}{8}, \frac{3}{8}, \frac{4}{8}, \frac{5}{8}, \frac{6}{8}, \frac{7}{8}\bigr\}$. This guarantees that
$$|2^{j-1}\varphi - \beta_j|_{\bmod 1} < 1/16 + 1/16 = 1/8.$$
Let us introduce a notation for binary fractions: $\overline{.\alpha_1 \cdots \alpha_p} = \sum_{j=1}^{p} 2^{-j}\alpha_j$ ($\alpha_j \in \{0, 1\}$). Our algorithm is as follows.
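Algorithm for sharpening the value of $\varphi$. Set $\overline{.\alpha_m \alpha_{m+1} \alpha_{m+2}} = \beta_m$ and proceed by iteration:
$$\alpha_j = \begin{cases} 0 & \text{if } \bigl|\overline{.0\alpha_{j+1}\alpha_{j+2}} - \beta_j\bigr|_{\bmod 1} < 1/4,\\ 1 & \text{if } \bigl|\overline{.1\alpha_{j+1}\alpha_{j+2}} - \beta_j\bigr|_{\bmod 1} < 1/4 \end{cases} \qquad \text{for } j = m-1, \dots, 1$$
(each time, exactly one of the two cases holds). The result satisfies the inequality
$$\bigl|\overline{.\alpha_1 \alpha_2 \cdots \alpha_{m+2}} - \varphi\bigr|_{\bmod 1} < 2^{-(m+2)}.$$
The proof is a simple induction: $\bigl|\overline{.\alpha_j \cdots \alpha_{m+2}} - 2^{j-1}\varphi\bigr|_{\bmod 1} < 2^{-(m+3-j)}$ at each step.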
This procedure is an example of computation by a finite-state automaton (see Problem 2.11). The state of the automaton is the pair $(\alpha_{j+1}, \alpha_{j+2})$, whereas the input symbols are $\beta_j$. It follows that the computation can be represented by a Boolean circuit of size $O(m)$ and depth $O(\log m)$.
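The sharpening algorithm can be written out as a short sequential program. The sketch below uses exact rational arithmetic; the function names are hypothetical, and in the usage example the inputs $\beta_j$ are produced by rounding the exact values of $2^{j-1}\varphi$ to eighths, which is an idealization of the measured estimates.

```python
from fractions import Fraction

def circ_dist(x, y):
    """Distance |x - y| mod 1 on the circle of unit length."""
    d = (x - y) % 1
    return min(d, 1 - d)

def round_to_eighths(x):
    """Closest element of {0/8, ..., 7/8} to x modulo 1."""
    return Fraction(round(x * 8) % 8, 8)

def sharpen(betas):
    """betas[j-1] = beta_j for j = 1..m; returns .alpha_1 ... alpha_{m+2}."""
    m = len(betas)
    # Initial bits alpha_m, alpha_{m+1}, alpha_{m+2} are read off beta_m.
    bits = [int(b) for b in format(int(betas[-1] * 8), "03b")]
    for j in range(m - 1, 0, -1):
        # Candidate .0 alpha_{j+1} alpha_{j+2}; the other candidate differs by 1/2.
        tail = Fraction(bits[0], 4) + Fraction(bits[1], 8)
        alpha = 0 if circ_dist(tail, betas[j - 1]) < Fraction(1, 4) else 1
        bits.insert(0, alpha)
    return sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))

# Usage: idealized inputs for phi = 5/13 and m = 10.
phi, m = Fraction(5, 13), 10
betas = [round_to_eighths((2 ** (j - 1) * phi) % 1) for j in range(1, m + 1)]
estimate = sharpen(betas)
```

Only the two leading bits found so far are consulted at each step, which is the finite-state behavior mentioned above.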
13.5.4. Determining the exact value of the phase. We have found a number $y$ satisfying $|y - k/t| < 1/2^{2n+1}$. We represent it as a continued fraction (see Section A.7) and try all convergents of $y$ until we find a fraction $k'/t'$ such that $|y - k'/t'| < 1/2^{2n+1}$. The second part of Theorem A.13 guarantees that the number $k/t$ is contained among the convergents, and therefore will be found unless the algorithm stops earlier. But it cannot stop earlier because there is at most one fraction with denominator $\le 2^n$ that approximates a given number with precision $1/2^{2n+1}$. The running time of this algorithm is $O(n^3)$.
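A possible classical implementation of this step is sketched below. The function names are hypothetical, and $y$ is assumed to be given as an exact rational number (for instance a binary fraction with $2n+2$ digits).

```python
from fractions import Fraction

def convergents(y):
    """Yield the continued-fraction convergents of a rational number y."""
    h_prev, k_prev = 1, 0
    a = y.numerator // y.denominator
    h, k = a, 1
    yield Fraction(h, k)
    rest = y - a
    while rest != 0:
        x = 1 / rest
        a = x.numerator // x.denominator
        h, h_prev = a * h + h_prev, h      # standard recurrence for numerators
        k, k_prev = a * k + k_prev, k      # and for denominators
        yield Fraction(h, k)
        rest = x - a

def exact_phase(y, n):
    """First convergent k'/t' of y with |y - k'/t'| < 2**-(2n+1)."""
    bound = Fraction(1, 2 ** (2 * n + 1))
    for c in convergents(y):
        if abs(y - c) < bound:
            return c
    return None
```

For example, with $n = 4$ and the 10-bit approximation $439/1024$ of $3/7$, the convergents are $0$, $1/2$, $3/7$, and the search stops at $3/7$.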
Important observations. 1. It is essential that the vector $|\xi_k\rangle$ does not deteriorate during the computation.
2. The entire period finding procedure depends on the parameters $l$ and $s$; they should be adjusted so that the error probability is small enough. The error can occur in determining the period $t$ as the least common denominator (see Lemma 13.2) or in estimating the cosine and the sine of $\varphi_k$ with constant precision $\delta$ (see inequality (13.4)). The total probability of error does not exceed $3 \cdot 2^{-l} + nl\,e^{-\Omega(s)}$. If it is required to get the result with probability of error $\le 1/3$, then we must set $l = 4$, $s = \Theta(\log n)$. In this way we get a quantum circuit of size $O(n^3 \log n)$. (In fact, there is some room for optimization; see Corollary 13.3.1 below.)
13.6. Discussion of the algorithm. We discuss two questions that arise naturally with regard to the algorithm that has been set forth.
- Which eigenvalues do we find? We find a randomly chosen eigenvalue. The distribution over the set of all eigenvalues can be controlled by appropriately choosing the initial state. In our period finding algorithm, it was $|1\rangle = \frac{1}{\sqrt{t}} \sum_{k=0}^{t-1} |\xi_k\rangle$, which corresponded to the uniform distribution on the set of eigenvalues associated with the orbit of $1$.
- Is it possible to find eigenvalues of other operators in the same way as in the algorithm for determining the period? Let us be accurate: by finding an eigenvalue with precision $\delta$ and error probability $\le \varepsilon$ we mean constructing a measuring operator with garbage, as in equation (12.3), where $j$ (ranging over the index set of the decomposition) is an index of an eigenvalue $\lambda_j = e^{2\pi i \varphi_j}$, $\mathcal{L}_j$ is the corresponding eigenspace, and $y = \overline{.\alpha_1 \cdots \alpha_n}$ ($n = \lceil \log_2(1/\delta) \rceil$). The conditional probabilities (12.4) should satisfy
$$\Pr\bigl[\,|y - \varphi_j|_{\bmod 1} < \delta\,\bigr] = \sum_{y:\ |y - \varphi_j|_{\bmod 1} < \delta} \mathbf{P}(y|j) \ \ge\ 1 - \varepsilon. \tag{13.5}$$
The answer to the question is "yes": it is only necessary to implement $\Lambda(U)$, which is usually easy. (For example, if $U|0\rangle = |0\rangle$, we can use the result of Problem 8.4.) However, in general, the attainable precision is not great and depends polynomially on the number of times the operator $\Lambda(U)$ is used. If one can efficiently compute the powers of $U$, e.g., if one can implement the operator
$$\Upsilon_m(U) : |p\rangle \otimes |\xi\rangle \mapsto |p\rangle \otimes U^{p}|\xi\rangle \quad (0 \le p < 2^{m}), \tag{13.6}$$
then the precision can be made exponential, $\delta = \exp(-\Omega(m))$.
13.7. Parallelized version of phase estimation. Applications. Remarkably, the phase estimation procedure (except for its last part, the continued fraction algorithm) can be realized by a quantum circuit of small depth. This result is due to R. Cleve and J. Watrous [18], but our proof is different from theirs.
Theorem 13.3. Eigenvalues of a unitary operator $U$ can be determined with precision $\delta = 2^{-n}$ and error probability $\le \varepsilon = 2^{-l}$ by an $O(n(l + \log n))$-size, $O(\log n + \log l)$-depth quantum circuit over the standard basis, with the additional gate $\Upsilon_m(U)$, $m = n + \log(l + \log n) + O(1)$. This gate is used in the circuit only once.
Proof. At the core of the usual phase estimation procedure is a sequence of operators $\Lambda(U^{2^k})$, $k = 1, \dots, n-1$, applied to the same main register $A$ with distinct control qubits $1, \dots, t$. (Here $t = 2ns$, which corresponds to $2n$ series of Bernoulli trials, each consisting of $s = \Theta(l + \log n)$ coin tosses. Each series is made to determine a single number $\cos(2^{k}\varphi_j)$ or $\sin(2^{k}\varphi_j)$ with the suitable constant precision and error probability $2^{-l}/(2n)$.) We need to parallelize these sequences. This can be done as follows: instead of applying the circuit
$$\Lambda(U^{p_t})[t, A] \cdots \Lambda(U^{p_1})[1, A],$$
we compute $p = p(u_1, \dots, u_t) = u_1 p_1 + \dots + u_t p_t$ (where $u_1, \dots, u_t \in \mathbb{B}$ are the values of the control qubits), use $p$ as the control parameter for the operator $\Upsilon_m(U)$, and un-compute $p$.
To optimize the computation of $p$, we notice that each $p_r$ is of the form $p_r = 2^{k_r}$. The terms in the sum $\sum_{r=1}^{t} u_r p_r$ can be divided into $2s$ groups in such a way that the numbers $k_r$ are distinct within each group. Therefore, each group corresponds to an $n$-digit integer, and there are $2s = O(l + \log n)$ such integers. The sum can be computed by a circuit of size $O(n(l + \log n))$ and depth $O(\log n + \log l)$ (see Problem 2.13a).
Let us estimate the complexity of the remaining part of the procedure. Each gate $\Lambda(U^{2^k})$ is accompanied by two $H$ gates and possibly by one $K$ gate, which contribute $O(t)$ to the size and $O(1)$ to the depth. Further, one needs to count the number of "heads" in each of the $2n$ series of coin flips. This is done by circuits of size $O(s)$ and depth $O(\log s)$. The subsequent trigonometric calculations are performed with constant precision, so each instance of such calculation is done by a circuit of constant size. Finally, sharpening of the value of $\varphi_j$ is carried out by a circuit of size $O(n)$ and depth $O(\log n)$. All these numbers stay within the required bounds. □
Unfortunately, Theorem 13.3 does not imply that the algorithms for period finding and factoring can be fully parallelized. However, one can derive the following corollary.
Corollary 13.3.1. Period finding and factoring can be performed by a uniform sequence of $O(n^3)$-size, $O((\log n)^2)$-depth quantum circuits, with some classical pre-processing and post-processing. The pre-processing and post-processing are realized by uniform sequences of $O(n^3)$-size Boolean circuits.
Note that if we use Definition 9.2, classical pre-processing does not count, since it can be included into the machine that generates the circuit. (However, the post-processing does count.) The division into three stages is clear from Table 13.1. The pre-processing amounts to modular exponentiation, $(q, a, p) \mapsto a^{p} \bmod q$. No small depth circuit is known for the solution of this problem. Thus we must compute the numbers $a^{2^j} \bmod q$ ($j = 0, \dots, 2n-1$) in advance. The post-processing includes finding the exact value of $\varphi$ (by the continued fraction algorithm) and the calculation of the least common denominator (by Euclid's algorithm).
Proof of Corollary 13.3.1. The operator
$$\Upsilon_m(U_a) : |p, x\rangle \mapsto |p, (a^{p} x \bmod q)\rangle$$
is realized using the construction from the proof of Theorem 7.3. We need to compute $(a^{p} x \bmod q)$ and $(a^{-p} x \bmod q)$ by circuits of small depth. With pre-computed values of $(a^{2^j} \bmod q)$ and $(a^{-2^j} \bmod q)$, the computation of $(a^{p} x \bmod q)$ or $(a^{-p} x \bmod q)$ amounts to multiplying $O(n)$ numbers and calculating the residue $\bmod\ q$, which is done by a circuit of size $O(n^3)$ and depth $O((\log n)^2)$. □
Remark 13.1. R. Cleve and J. Watrous [18] also noticed that the depth can be decreased at the cost of an increase in size. Indeed, the multiplication of $O(n)$ $n$-digit numbers can be performed with depth $O(\log n)$ and size $O(n^5 (\log n)^2)$ (see [9]); therefore the same bound applies to period finding and factoring.
Now we can also prove Theorem 8.3. We will actually consider a more general situation: instead of realizing a single operator, we will try to simulate a circuit (see Theorem 13.5 below). Let us begin with a lemma.
Lemma 13.4. The operator $\Upsilon_n(e^{2\pi i/2^n}) : |l\rangle \mapsto e^{2\pi i l/2^n}|l\rangle$ ($0 \le l < 2^n$) can be realized with precision $\delta = 2^{-n}$ by an $O(n^2 \log n)$-size $O((\log n)^2)$-depth circuit $C_n$ over the standard basis, using ancillas. The circuit $C_n$ can be constructed algorithmically in time $\mathrm{poly}(n)$.
Proof. Let us assume that we have at our disposal an $n$-qubit register in the state
$$|\psi_{n,k}\rangle = \frac{1}{\sqrt{2^n}} \sum_{j=0}^{2^n - 1} \exp\Bigl(-2\pi i\, \frac{kj}{2^n}\Bigr) |j\rangle, \tag{13.7}$$
where $k$ is odd. We will now see how it helps to achieve our goal of realizing the operator $\Upsilon_n(e^{2\pi i/2^n})$.
The vector $|\psi_{n,k}\rangle$ is an eigenvector of the permutation operator $X : |j\rangle \mapsto |(j + 1) \bmod 2^n\rangle$,
$$X|\psi_{n,k}\rangle = e^{2\pi i \varphi_k}|\psi_{n,k}\rangle, \qquad \varphi_k = k/2^n.$$
Application of a power of $X$ to the target state $|\psi_{n,k}\rangle$ results in multiplication by a phase factor,
$$X^{p}|\psi_{n,k}\rangle = e^{2\pi i (kp/2^n)}|\psi_{n,k}\rangle.$$
If $k$ is odd, we can choose $p$ to satisfy $kp \equiv l \pmod{2^n}$, which will provide the required phase factor $e^{2\pi i l/2^n}$. Thus, for the realization of the operator $|l\rangle \mapsto e^{2\pi i l/2^n}|l\rangle$ we use the value of $l$ to compute $p$, apply the operator
$$\Upsilon_n(X) : |p, j\rangle \mapsto |p, (j + p) \bmod 2^n\rangle, \qquad p, j \in \{0, \dots, 2^n - 1\}, \tag{13.8}$$
controlled by this $p$, and "un-compute" $p$. The operator $\Upsilon_n(X)$ can be realized by a circuit of size $O(n)$ and depth $O(\log n)$ over the standard basis.
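The phase factor produced by a cyclic shift can be verified numerically for small $n$. In the sketch below a state is a list of $2^n$ amplitudes, and the function names (`psi`, `shift`, `phase_error`) are hypothetical.

```python
import cmath

def psi(n, k):
    """Amplitudes of the state (13.7): exp(-2*pi*i*k*j/2**n) / sqrt(2**n)."""
    N = 2 ** n
    return [cmath.exp(-2j * cmath.pi * k * j / N) / N ** 0.5 for j in range(N)]

def shift(state, p):
    """X**p: the amplitude of |j> moves to |(j + p) mod 2**n>."""
    N = len(state)
    out = [0j] * N
    for j, amp in enumerate(state):
        out[(j + p) % N] = amp
    return out

def phase_error(n, k, p):
    """Largest deviation between X**p |psi> and exp(2*pi*i*k*p/2**n) |psi>."""
    N = 2 ** n
    v = psi(n, k)
    target = cmath.exp(2j * cmath.pi * k * p / N)
    return max(abs(x - target * y) for x, y in zip(shift(v, p), v))
```

The returned error should be zero up to floating-point rounding for every $k$ and $p$.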
Ideally, we would want to use the vector $|\psi_{n,k}\rangle$ for some particular $k$, say, $k = 1$. But constructing such a vector is not easy, so we will start from a superposition of all odd values of $k$, namely,
$$|\eta\rangle = \frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|2^{n-1}\rangle = \frac{1}{\sqrt{2^{n-1}}} \sum_{s=1}^{2^{n-1}} |\psi_{n,2s-1}\rangle.$$
Then we will measure $k = 2s - 1$ and solve the equation for $p$. We now describe the required actions.
1. Create the vector $|\eta\rangle = \sigma^{z}[1]\,H[1]\,|0^n\rangle$.
2. Measure $k$ with error probability $\le \varepsilon = \delta^{2}/4$. To find $k$, it suffices to determine the phase $\varphi_k = k/2^n$ with precision $\delta = 2^{-n}$. By Theorem 13.3, such phase estimation is realized by a circuit of size $O(n^2)$ and depth $O(\log n)$. The measured value should be odd, $k = 2s - 1$. (If it happens to be even, set $k = 1$.)
3. Find $p = p(s, l)$ satisfying the equation $(2s - 1)p \equiv l \pmod{2^n}$ (see below).
4. Apply $X^{p}$ to the $n$-bit register (which presumably contains $|\psi_{n,2s-1}\rangle$). This will effect the desired phase shift.
5. Reverse the computation done at Steps 1-3.
Apart from Step 1 and its reverse, the above procedure can be described symbolically as $W^{-1}VW$, where $W$ represents Steps 2 and 3, and $V$ represents Step 4. Hence the result of Problem 12.2 applies: the procedure realizes the operator $U = \Upsilon_n(e^{2\pi i/2^n})$ with precision $2\sqrt{\varepsilon} = \delta$.
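Step 3 is the most demanding for resources. The solution to the equation $(2s - 1)p \equiv l \pmod{2^n}$ can be obtained as follows:
$$p \equiv -l \sum_{j=0}^{m-1} (2s)^{j} \equiv -l \prod_{r=0}^{t-1} \bigl(1 + (2s)^{2^{r}}\bigr) \pmod{2^n}, \qquad m = 2^{t},\quad t = \lceil \log_2 n \rceil.$$
(Indeed, $(2s - 1)\sum_{j=0}^{m-1}(2s)^{j} = (2s)^{m} - 1 \equiv -1 \pmod{2^n}$, since $m \ge n$.) This calculation is done by a circuit of size $O(n^2 \log n)$ and depth $O((\log n)^2)$ (cf. solution to Problem 2.14a). □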
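The formula for $p$ can be checked with ordinary integer arithmetic. The sketch below relies on the identity $\sum_{j=0}^{2^t - 1} x^{j} = \prod_{r=0}^{t-1} (1 + x^{2^r})$ with $x = 2s$; the function name `solve_for_p` is hypothetical.

```python
def solve_for_p(s, l, n):
    """Return p with (2s - 1) * p = l (mod 2**n), using t = ceil(log2 n) squarings."""
    t = (n - 1).bit_length()          # equals ceil(log2 n) for n >= 1
    mod = 1 << n
    x = (2 * s) % mod
    prod = 1
    for _ in range(t):                # multiply by 1 + (2s)**(2**r), r = 0, ..., t-1
        prod = prod * (1 + x) % mod
        x = x * x % mod
    return (-l * prod) % mod
```

Each loop iteration is one squaring and one multiplication modulo $2^n$, so there are $O(\log n)$ multiplications of $n$-bit numbers in total.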
Theorem 13.5. Any circuit $C$ of size $L$ and depth $d$ over a fixed finite basis $\mathcal{C}$ can be simulated with precision $\delta$ by an $O(Ln + n^2 \log n)$-size $O(d \log n + (\log n)^2)$-depth circuit $\tilde{C}$ over the standard basis (using ancillas), where $n = O(\log(L/\delta))$.
Proof. Due to the results of Problems 8.1 and 8.2, each gate of the original basis $\mathcal{C}$ can be replaced by a constant size circuit over the basis $\mathcal{Q} \cup \{\Lambda(e^{i\varphi}) : \varphi \in \mathbb{R}\}$. Thus the circuit $C$ is transformed into a circuit $C'$ of size $L' = O(L)$ and depth $d' = O(d)$ over the new basis. Each gate $\Lambda(e^{i\varphi})$ can be approximated with precision $\delta' = \delta/(3L')$ by a gate of the form $\Lambda(e^{2\pi i l/2^n})$, where $n = \lceil \log_2(1/\delta') \rceil$, and $l$ is an integer. The operator $\Lambda(e^{2\pi i l/2^n})$ is a special case of $\Upsilon_n(e^{2\pi i/2^n})$, hence we can use Lemma 13.4. However, the resulting circuit is somewhat larger than required, although it will suffice for the proof of Theorem 8.3 (which corresponds to the case $L = d = 1$).
To optimize the above simulation procedure, let us examine the proof of Lemma 13.4. Most of the resource usage can be attributed to solving the equation $kp \equiv l \pmod{2^n}$. But this step is redundant if $k = 1$. In fact, the operator $\Upsilon_n(e^{2\pi i/2^n})$ can be realized by applying $\Upsilon_n(X)$ (see (13.8)) to the target state $|\psi_{n,1}\rangle$; this is done by a circuit of size $O(n)$ and depth $O(\log n)$. Thus we need to create $L'$ copies of the state $|\psi_{n,1}\rangle$ and use one copy per gate in the simulation of the circuit $C'$. The exact sequence of actions is as follows.
1. Create the state $|\psi_{n,0}\rangle = H^{\otimes n}|0^n\rangle$.
2. Turn it into $|\psi_{n,1}\rangle = \Upsilon_n(e^{-2\pi i/2^n})|\psi_{n,0}\rangle$ by the procedure of Lemma 13.4. This is done with precision $\delta' = 2^{-n} \le \delta/3$. The corresponding circuit has size $O(n^2 \log n)$ and depth $O((\log n)^2)$.
3. Make $L'$ copies of the state $|\psi_{n,1}\rangle$ out of one copy (see below).
4. Simulate the circuit $C'$ with precision $\delta/3$, using one copy of $|\psi_{n,1}\rangle$ per gate.
5. Reverse Steps 1-3.
To produce multiple copies of the state $|\psi_{n,1}\rangle$, we can use the equation
$$|\psi_{n,k}\rangle^{\otimes m} = W^{-1}\bigl(|\psi_{n,0}\rangle^{\otimes (m-1)} \otimes |\psi_{n,k}\rangle\bigr),$$
$$W : |x_1, \dots, x_{m-1}, x_m\rangle \mapsto |x_1, \dots, x_{m-1}, x_1 + \dots + x_m\rangle.$$
The operators $W$ and $W^{-1}$ are realized using the construction from the proof of Theorem 7.3. This involves the addition of $m$ $n$-digit numbers, which is done by a Boolean circuit of size $O(nm)$ and depth $O(\log n + \log m)$ (see Problem 2.13a). In our case $m = L' = O(L)$. The overall size and depth of the resulting quantum circuit are as required. □
[3!] Problem 13.4. Let $q \ge 1$, $n = \lceil \log_2 q \rceil$, $\delta = 2^{-l}$. Realize the Fourier transform on the group $\mathbb{Z}_q$ with precision $\delta$ by a quantum circuit of size $\mathrm{poly}(n, l)$ and depth $\mathrm{poly}(\log n, \log l)$ over the standard basis. Estimate the size and the depth of the circuit more accurately. (For the definition of the quantum Fourier transform, see Problem 9.4c.)
13.8. The hidden subgroup problem for $\mathbb{Z}^k$. The algorithms discovered by Simon and Shor can be generalized to a rather broad class of problems connected with Abelian groups. The most general of these is the hidden subgroup problem for $\mathbb{Z}^k$ [12], to which the hidden subgroup problem in an arbitrary finitely generated Abelian group $G$ can be reduced. (Indeed, $G$ can be represented as a quotient group of $\mathbb{Z}^k$ for some $k$.)
A \hidden subgroup" D � Z
k
(as de�ned on page 117) has �nite index:
the order of the group E = Z
k
=D does not ex eed 2
n
. Therefore D
�
=
Z
k
. From the omputational viewpoint, D is given by a basis (g
1
; : : : ; g
k
)
whose binary representation has length poly(k; n). Any su h basis gives a
solution to the problem. (The equivalen e of two bases an be veri�ed by a
polynomial algorithm.)
The problem of computing the period is a special case of the hidden
subgroup problem in $\mathbb{Z}$. Recall that $\mathrm{per}_q(a) = \min\{t \ge 1 : a^t \equiv 1 \pmod q\}$.
The function $f : x \mapsto a^x \pmod q$ satisfies condition (13.1), where
$D = \{m \, \mathrm{per}_q(a) : m \in \mathbb{Z}\}$. This function is polynomially computable, hence
an arbitrary polynomial algorithm for finding a hidden subgroup can be
transformed into a polynomial algorithm for calculating the period.
The well-known problem of calculating the discrete logarithm can be
reduced to the hidden subgroup problem for $\mathbb{Z}^2$. The smallest positive
integer $s$ such that $\zeta^s = a$, where $\zeta$ is a generator of the group $(\mathbb{Z}/q\mathbb{Z})^*$, is
called the discrete logarithm of a number $a$ at base $\zeta$. Consider the function
$f : (x_1, x_2) \mapsto \zeta^{x_1} a^{x_2} \bmod q$. This function also satisfies condition (13.1),
where $D = \{(x_1, x_2) \in \mathbb{Z}^2 : \zeta^{x_1} a^{x_2} \equiv 1 \bmod q\}$. If we know a basis of the
subgroup $D \subseteq \mathbb{Z}^2$, it is easy to find an element of the form $(s, -1) \in D$.
Then $\zeta^s = a$, i.e., $s$ is the discrete logarithm of $a$ at base $\zeta$.
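The last step, finding an element of the form $(s, -1)$ from a basis of $D$, is purely classical. A minimal sketch is given below, assuming the basis is supplied as two integer pairs and that the order of the generator is known (for prime $q$ it is $q - 1$); the helper names `ext_gcd` and `dlog_from_basis` are illustrative and not taken from the text.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g and |g| == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y


def dlog_from_basis(b1, b2, order):
    """Given a basis (b1, b2) of D, find s with (s, -1) in D, reduced mod `order`.

    `order` is the order of the generator (an assumption of this sketch).
    """
    g, u, v = ext_gcd(b1[1], b2[1])
    if abs(g) != 1:
        raise ValueError("second coordinates must generate Z for a basis of D")
    # Rescale so that u*b1[1] + v*b2[1] == -1 (uses g*g == 1).
    u, v = -g * u, -g * v
    s = u * b1[0] + v * b2[0]
    return s % order


# Toy instance: q = 11, generator 2, a = 7 = 2^7 mod 11.
# Both basis vectors satisfy 2^{x1} * 7^{x2} = 1 (mod 11).
example_s = dlog_from_basis((13, 1), (16, 2), order=10)
```

If $a = 1$ the reduction returns 0, in which case the smallest positive solution is the order itself.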
Let us describe a quantum algorithm that solves the hidden subgroup
problem for $G = \mathbb{Z}^k$. It is analogous to the algorithm for the case $G = (\mathbb{Z}_2)^k$,
but instead of the operator $H^{\otimes k}$ we use the procedure for measuring the
eigenvalues. Instead of a basis for the group $D$ we will look for a system
of generators of the character group $E^* = \mathrm{Hom}(E, \mathbf{U}(1))$ (the transition
from $E^*$ to $D$ is realized by a polynomial algorithm; see, for example, [54,
Volume 1]). The character
$$(g_1, \dots, g_k) \mapsto \exp\Bigl(2\pi i \sum_j \varphi_j g_j\Bigr)$$
is determined by the set $\varphi_1, \dots, \varphi_k$ of numbers modulo 1. These are rational
numbers with denominators not exceeding $|E^*| \le 2^n$.
If we produce $l = n + 3$ uniformly distributed random characters
$(\varphi^{(1)}_1, \dots, \varphi^{(1)}_k)$, ..., $(\varphi^{(l)}_1, \dots, \varphi^{(l)}_k)$, then they will generate the entire group $E^*$
with probability $\ge 1 - 1/2^{l-n} = 1 - 1/8$ (see Problem 13.1). It suffices to
know each $\varphi^{(r)}_j$ with precision $\delta$ and the probability of error $\le \varepsilon$, where
$$\delta \le \frac{1}{2^{2n+1}}, \qquad \varepsilon \le \frac{1}{5kl}. \tag{13.9}$$
The last condition guarantees that the total probability of error does not
exceed $1/8 + 1/5 < 1/3$.
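The precision requirement $\delta \le 2^{-(2n+1)}$ is what allows a rational number with denominator at most $2^n$ to be recovered exactly from an approximation: two distinct such fractions differ by at least $2^{-2n}$. A short sketch of that recovery step follows, using the continued-fraction routine `Fraction.limit_denominator` from the Python standard library; the function name `recover_phase` is hypothetical.

```python
from fractions import Fraction


def recover_phase(estimate, n):
    """Closest fraction to `estimate` with denominator at most 2^n.

    If `estimate` is within 2^-(2n+1) of a true phase p/q with q <= 2^n,
    the true phase is returned (phases are understood modulo 1).
    """
    return Fraction(estimate).limit_denominator(2 ** n)


# Toy instance with n = 4: true phase 3/13, error 0.0015 < 2^-9.
true_phase = Fraction(3, 13)
noisy = true_phase + Fraction(3, 2000)
recovered = recover_phase(noisy, 4)
```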
Let us choose a sufficiently large number $M = 2^m$ (a concrete estimate
can be obtained by analyzing the algorithm). We will work with integers
between 0 and $M - 1$.
Let us prepare, in one quantum register of length $km$, the state
$$|\xi\rangle = M^{-k/2} \sum_{g \in \Gamma} |g\rangle, \quad \text{where } \Gamma = \{0, \dots, M-1\}^k.$$
In another register we put $|0^n\rangle$. We apply the quantum oracle (13.2) and
then discard the second register. We obtain the mixed state
$$\rho = \mathrm{Tr}_{[km+1,\dots,km+n]}\Bigl(U\bigl(|\xi\rangle\langle\xi| \otimes |0^n\rangle\langle 0^n|\bigr)U^\dagger\Bigr) = M^{-k} \sum_{g,h \in \Gamma :\, g-h \in D} |g\rangle\langle h|.$$
Now we are going to measure the eigenvalues of the shift (mod $M$) operators
$$V_j : (g_1, \dots, g_j, \dots, g_k) \mapsto (g_1, \dots, (g_j + 1) \bmod M, \dots, g_k)$$
(only the $j$-th component changes). These operators commute, so that they
have a common basis of eigenvectors, and therefore we can determine their
eigenvalues simultaneously. The eigenvalues have the form $e^{2\pi i s_j/M}$. The
corresponding eigenvectors are
$$|\xi_{s_1,\dots,s_k}\rangle = M^{-k/2} \sum_{(g_1,\dots,g_k) \in \Gamma} \exp\Bigl(-2\pi i \sum_{j=1}^{k} \frac{g_j s_j}{M}\Bigr) |g_1, \dots, g_k\rangle.$$
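For a single component ($k = 1$) the eigenvector claim is easy to confirm numerically. The sketch below builds the cyclic shift $V : |g\rangle \mapsto |(g+1) \bmod M\rangle$ as a matrix and the vector $|\xi_s\rangle = M^{-1/2}\sum_g e^{-2\pi i g s/M}|g\rangle$; the function names are illustrative only.

```python
import numpy as np


def shift_operator(M):
    """Matrix of V: |g> -> |(g + 1) mod M>."""
    # Rolling the identity's rows down by one puts a 1 at position (g+1, g).
    return np.roll(np.eye(M), 1, axis=0)


def xi(M, s):
    """Candidate eigenvector M^{-1/2} * sum_g exp(-2 pi i g s / M) |g>."""
    g = np.arange(M)
    return np.exp(-2j * np.pi * g * s / M) / np.sqrt(M)


def eigen_residual(M, s):
    """Norm of V|xi_s> - exp(2 pi i s / M)|xi_s>; should vanish."""
    V = shift_operator(M)
    return np.linalg.norm(V @ xi(M, s) - np.exp(2j * np.pi * s / M) * xi(M, s))
```

For $k > 1$ the operators $V_j$ act on different tensor factors, so the common eigenvectors are tensor products of the single-component ones.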
The probability that a given set $(s_1, \dots, s_k)$ will be realized equals
$$P(\rho, \mathcal{L}_{s_1,\dots,s_k}) = \langle \xi_{s_1,\dots,s_k} | \rho | \xi_{s_1,\dots,s_k} \rangle = M^{-2k} \sum_{g,h \in \mathbb{Z}^k} \chi_D(g-h)\, \chi_\Gamma(g)\, \chi_\Gamma(h) \exp\Bigl(2\pi i \sum_{j=1}^{k} \frac{(g_j - h_j) s_j}{M}\Bigr),$$
where $\chi_A(\cdot)$ denotes the characteristic function of the set $A$. The Fourier
transform of the product is the convolution of the Fourier transforms of the
factors. Therefore,
$$P(\rho, \mathcal{L}_{s_1,\dots,s_k}) = \frac{1}{|E^*|} \sum_{(\varphi_1,\dots,\varphi_k) \in E^*} p_{\varphi_1,\dots,\varphi_k}(s_1, \dots, s_k), \tag{13.10}$$
where
$$p_{\varphi_1,\dots,\varphi_k}(s_1, \dots, s_k) = \prod_{j=1}^{k} \left( \frac{\sin\bigl(M\pi (s_j/M - \varphi_j)\bigr)}{M \sin\bigl(\pi (s_j/M - \varphi_j)\bigr)} \right)^2.$$
For a given element $(\varphi_1, \dots, \varphi_k) \in E^*$ the function $p_{\varphi_1,\dots,\varphi_k}$ is a probability
distribution. Therefore equation (13.10) can be modeled by the following
process: first, a random uniformly distributed element $(\varphi_1, \dots, \varphi_k) \in E^*$ is
generated; second, the parameters $s_1, \dots, s_k$ are set according to the conditional
probabilities $p_{\varphi_1,\dots,\varphi_k}(s_1, \dots, s_k)$. The conditional probabilities have
the following property:
$$\Pr\bigl[\, |s_j/M - \varphi_j| > \beta \,\bigr] \le \frac{1}{M\beta}$$
for any $\beta > 0$. If we estimate the quantities $s_j/M$ with precision $\beta$ and
error probability $\le 1/(M\beta)$, we obtain the values of $\varphi_1, \dots, \varphi_k$ with precision
$\delta = 2\beta$ and error probability $\le \varepsilon = 2/(M\beta)$. It remains to choose the numbers
$M$ and $\beta$ so that inequality (13.9) be satisfied.
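The two properties of the single-component kernel used here (it is a probability distribution in $s$, and its tail is at most $1/(M\beta)$) can be checked numerically. The sketch below evaluates the kernel as $|M^{-1}\sum_{g=0}^{M-1} e^{2\pi i g(\varphi - s/M)}|^2$, which equals the squared sine ratio above, and measures $|s/M - \varphi|$ modulo 1 (an assumption about how the distance is to be read); the function names are hypothetical.

```python
import numpy as np


def kernel(M, phi):
    """p_phi(s) for s = 0..M-1, computed as |M^-1 sum_g exp(2 pi i g (phi - s/M))|^2."""
    s = np.arange(M)
    g = np.arange(M)
    amp = np.exp(2j * np.pi * np.outer(phi - s / M, g)).sum(axis=1) / M
    return np.abs(amp) ** 2


def closed_form(M, phi):
    """The same kernel via (sin(M pi x) / (M sin(pi x)))^2; requires sin(pi x) != 0."""
    x = np.arange(M) / M - phi
    return (np.sin(M * np.pi * x) / (M * np.sin(np.pi * x))) ** 2


def tail(M, phi, beta):
    """Pr[|s/M - phi| > beta], with the distance taken modulo 1."""
    x = (np.arange(M) / M - phi + 0.5) % 1.0 - 0.5
    return kernel(M, phi)[np.abs(x) > beta].sum()
```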
Complexity of the algorithm. We need $O(n)$ queries to the oracle, each
query being of length $O(k(n + \log k))$. The size of the quantum circuit is
estimated as $O(kn^3)\,\mathrm{poly}(\log k, \log n)$.
14. The quantum analogue of NP: the class BQNP
It is possible to construct quantum analogues not only for the class P, but
also for other classical complexity classes. This is not a routine process, but
suitable generalizations often come up naturally. We will consider the class
NP as an example. (For another example, the class IP and its quantum
analogue QIP, see [72, 38]. We also mention that the quantum analogue
of PSPACE equals PSPACE [71].)
14.1. Modification of classical definitions. Quantum computation, as
well as probabilistic computation, is more naturally described using partially
defined functions. Earlier we made do without this concept so as not to
complicate matters by the inclusion of extraneous detail, but now we need
it.
A partially defined Boolean function is a function
$$F : \mathbb{B}^n \to \{0, 1, \text{"undefined"}\}.$$
In this section it will be tacitly understood that by Boolean function we
mean partially defined Boolean function.
One more comment regarding notation: we have used the symbol P
both for the class of polynomially computable functions and for the class
of polynomially decidable predicates; now we act analogously, using the
notations P, NP, etc. for classes of partially defined functions.
P, of course, denotes the class of polynomially computable partially
defined functions. We introduce a modified definition of the class NP.
Definition 14.1. A function $F : \mathbb{B}^n \to \{0, 1, \text{"undefined"}\}$ belongs to the
class NP if there is a partially defined function $R \in \mathrm{P}$ in two variables such
that
$$F(x) = 1 \implies \exists\, y \,\bigl( (|y| < q(|x|)) \wedge (R(x, y) = 1) \bigr),$$
$$F(x) = 0 \implies \forall\, y \,\bigl( (|y| < q(|x|)) \Rightarrow (R(x, y) = 0) \bigr).$$
As before, $q(\cdot)$ is a polynomial.
What would change if in Definition 14.1 we replaced the condition $R \in \mathrm{P}$
by the condition $R \in \mathrm{BPP}$? First of all, we would get a different, broader,
class, which we could denote by BNP. However, for this class there is another,
standard, notation, MA, indicating that it falls into a hierarchy
of classes defined by Arthur-Merlin games. We have mentioned Arthur and
Merlin in connection with the definition of NP. We have also discussed
games corresponding to other complexity classes (see Section 5.1). Traditionally,
the term "Arthur-Merlin games" is used for probabilistic games in
which Arthur is a polynomial Turing machine whereas Merlin is all-powerful;
before each move Arthur flips coins so that both players see them. The order
of the letters in the symbol MA indicates the order of the moves: at
first Merlin communicates $y$, then Arthur checks the truth of the predicate
$R(x, y)$ by a polynomial probabilistic computation. The message $y$ is
sometimes called a "proof"; it may be hard to find but easy to check.
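14.2. Quantum definition by analogy.
Definition 14.2. A function $F : \mathbb{B}^n \to \{0, 1, \text{"undefined"}\}$ belongs to the
class BQNP if there exists a polynomial classical algorithm that computes
a function $x \mapsto Z(x)$, where $Z(x)$ is a description of a quantum circuit
realizing an operator $U_x : \mathcal{B}^{\otimes N_x} \to \mathcal{B}^{\otimes N_x}$ such that
$$F(x) = 1 \implies \exists\, |\xi\rangle \in \mathcal{B}^{\otimes m_x} : \ \mathbf{P}\bigl( U_x |\xi\rangle \otimes |0^{N_x - m_x}\rangle,\ \mathcal{M} \bigr) \ge p_1,$$
$$F(x) = 0 \implies \forall\, |\xi\rangle \in \mathcal{B}^{\otimes m_x} : \ \mathbf{P}\bigl( U_x |\xi\rangle \otimes |0^{N_x - m_x}\rangle,\ \mathcal{M} \bigr) \le p_0.$$
Here $\mathcal{M} = \mathbb{C}(|1\rangle) \otimes \mathcal{B}^{\otimes(N_x - 1)}$, and $p_0$, $p_1$ satisfy the condition
$p_1 - p_0 \ge \Omega(n^{-\alpha})$ for some constant $\alpha \ge 0$. The quantifiers of $|\xi\rangle$ include only vectors
of unit length. (We will use an analogous convention further on in this
section, pushing numeric factors outside the $|\cdot\rangle$ sign.)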
The vector $|\xi\rangle$ plays the role of $y$ in the previous definition. Note that
$m_x \le N_x \le |Z(x)| = \mathrm{poly}(|x|)$ since the algorithm is polynomial.
In effect, the very same game of Merlin with Arthur is taking place,
but now it is governed by the laws of quantum mechanics. Merlin sends a
quantum message (the state $|\xi\rangle$) to Arthur, who checks it by applying the
operator $U_x$. A suitable message will convince Arthur that $F(x) = 1$ (if
this is actually so) with probability $\ge p_1$. But if $F(x) = 0$, Merlin cannot
succeed in convincing Arthur to the contrary with probability higher than $p_0$,
whatever message he sends. Instead of a pure state $|\xi\rangle$, we can allow Merlin
to send an arbitrary density matrix; the maximum of the probability is
achieved on a pure state anyway.
In Definition 14.2, we have the same flexibility in choosing the threshold
probabilities $p_0$ and $p_1$ as in the definitions of BPP and BQP.
Lemma 14.1 (amplification of probabilities). If $F \in \mathrm{BQNP}$, then it
likewise satisfies a variant of Definition 14.2 in which the numbers $p_0$, $p_1$
($p_1 - p_0 = \Omega(n^{-\alpha})$) are replaced by
$$p_1' = 1 - \varepsilon, \qquad p_0' = \varepsilon, \qquad \varepsilon = \exp(-\Omega(n^{\beta})),$$
where $\beta$ is an arbitrary positive constant.
Proof. The general idea of amplifying the probabilities remains as before:
we consider $k = \mathrm{poly}(n)$ copies of the circuit realizing the operator $U = U_x$.
To the results of their work we apply a variant of the majority function,
with the threshold value adjusted so as to separate $p_0$ from $p_1$:
$$G(z_1, \dots, z_k) = \begin{cases} 1 & \text{if } \sum_{j=1}^{k} z_j \ge pk, \\ 0 & \text{if } \sum_{j=1}^{k} z_j < pk, \end{cases} \tag{14.1}$$
where $p = (p_0 + p_1)/2$. But now there appears an additional difficulty:
Merlin may attempt to deceive Arthur by sending him a message that is not
factorable into the tensor product.
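Let us grant Merlin greater freedom, allowing him to submit any density
matrix $\rho \in \mathbf{L}(\mathcal{B}^{\otimes km})$. The probability of obtaining outcomes $z_1, \dots, z_k$ by
applying $k$ copies of $U$ to the message $\rho$ is as follows:
$$\mathbf{P}(z_1, \dots, z_k \,|\, \rho) = \mathrm{Tr}\Bigl( \bigl( X^{(z_1)} \otimes \dots \otimes X^{(z_k)} \bigr) \rho \Bigr), \tag{14.2}$$
where
$$X^{(a)} = \mathrm{Tr}_{[m+1,\dots,N]}\Bigl( U^\dagger \Pi^{(a)}_1 U \bigl( I_{\mathcal{B}^{\otimes m}} \otimes |0^{N-m}\rangle\langle 0^{N-m}| \bigr) \Bigr). \tag{14.3}$$
Here $\Pi^{(a)}_1$ is the projection onto the subspace of states having $a$ in the first
qubit (i.e., $\mathbb{C}(|a\rangle) \otimes \mathcal{B}^{\otimes(N-1)}$).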
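The operators $X^{(0)}$, $X^{(1)}$ can be computed explicitly for a small circuit. The sketch below is an illustration under stated assumptions: $U$ is a Haar-like random unitary obtained from a QR decomposition, qubit 1 is the most significant bit of the basis index, and the ancillas are the last $N - m$ qubits. In that layout $X^{(a)} = V^\dagger \Pi^{(a)}_1 V$, where $V$ consists of the columns of $U$ with all ancilla bits equal to 0. The function names are hypothetical.

```python
import numpy as np


def random_unitary(dim, rng):
    """A random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q


def acceptance_operators(U, N, m):
    """Return (X0, X1) acting on the m message qubits, as in formula (14.3)."""
    stride = 2 ** (N - m)
    V = U[:, ::stride]          # columns |message, 0...0>
    half = 2 ** (N - 1)         # rows with first qubit 0 come first
    X0 = V[:half].conj().T @ V[:half]
    X1 = V[half:].conj().T @ V[half:]
    return X0, X1


def direct_probability(U, N, m, xi, outcome):
    """Probability of `outcome` on the first qubit for the input |xi> tensor |0...0>."""
    anc = np.zeros(2 ** (N - m))
    anc[0] = 1.0
    out = U @ np.kron(xi, anc)
    half = 2 ** (N - 1)
    part = out[half:] if outcome == 1 else out[:half]
    return float(np.vdot(part, part).real)
```

The largest eigenvalue of $X^{(1)}$ is then the best acceptance probability Merlin can achieve with a single copy.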
If $F(x) = 1$, Merlin can simply send the state $\rho = \rho_x^{\otimes k}$, where
$\rho_x = |\xi_x\rangle\langle\xi_x|$ is the message that would convince Arthur with probability $\ge p_1$ in
the original version of the game (with a single copy of $U$). By the general
properties of quantum probability, formula (14.2) takes the form
$$\mathbf{P}(z_1, \dots, z_k \,|\, \rho) = \prod_{j=1}^{k} \mathrm{Tr}\bigl( X^{(z_j)} \rho_x \bigr) = \prod_{j=1}^{k} \mathbf{P}(z_j \,|\, \rho_x).$$
We will derive a lower bound for this quantity from a more general analysis
given below.
In the opposite case, $F(x) = 0$, we will obtain an upper bound for the
probability $\mathbf{P}(z_1, \dots, z_k \,|\, \rho)$ over all density matrices $\rho$.
Let us select an orthonormal basis in the space $\mathcal{B}^{\otimes m}$ in which the operator
$X^{(1)}$ is diagonalized (this operator is clearly Hermitian). The operator
$X^{(0)} = I - X^{(1)}$ is diagonal in the same basis. We define a set of "conditional
probabilities" $p(z|d) = \mathbf{P}\bigl(z \,\big|\, |d\rangle\langle d|\bigr) = \langle d|X^{(z)}|d\rangle$, where $|d\rangle$ is one of
the basis vectors. It is obvious that $p(z|d) \ge 0$ and $p(0|d) + p(1|d) = 1$. In
this notation, the quantity $\mathbf{P}(z_1, \dots, z_k \,|\, \rho)$ becomes
$$\mathbf{P}(z_1, \dots, z_k \,|\, \rho) = \sum_{d_1,\dots,d_k} p_{d_1,\dots,d_k}\, p(z_1|d_1) \cdots p(z_k|d_k), \qquad \sum_{d_1,\dots,d_k} p_{d_1,\dots,d_k} = 1,$$
where $p_{d_1,\dots,d_k} = \langle d_1, \dots, d_k | \rho | d_1, \dots, d_k \rangle$.
This formula has the following interpretation. Consider the set of probabilities
$\mathbf{P}(z_1, \dots, z_k \,|\, \rho)$ for all sequences $(z_1, \dots, z_k) \in \mathbb{B}^k$ as a vector in
a $2^k$-dimensional real space; we denote this vector by $\vec{\mathbf{P}}(\rho) \in \mathbb{R}^{\mathbb{B}^k}$. We
have shown that for a general density matrix $\rho$ the vector $\vec{\mathbf{P}}(\rho)$ belongs
to the convex hull of such vectors corresponding to product states, namely,
$|d_1\rangle\langle d_1| \otimes \dots \otimes |d_k\rangle\langle d_k|$. Therefore the probability of the event $G(z_1, \dots, z_k) = 1$,
$$\Pr\bigl[ G(z_1, \dots, z_k) = 1 \,\big|\, \rho \bigr] = \sum_{z \in \mathbb{B}^k} G(z)\, \mathbf{P}(z \,|\, \rho) = \bigl( \vec{G}, \vec{\mathbf{P}}(\rho) \bigr),$$
achieves its maximum at a density matrix of this special type.
In the case where $G$ is the threshold function (14.1),
$$p_{\max} \stackrel{\mathrm{def}}{=} \max_{\rho} \Pr\bigl[ G(z_1, \dots, z_k) = 1 \,\big|\, \rho \bigr] = \sum_{j \ge pk} \binom{k}{j} p_*^{\,j} (1 - p_*)^{k-j},$$
where $p_* = \max_{|\xi\rangle} \langle \xi | X^{(1)} | \xi \rangle$. The number $p_{\max}$ equals the probability of
getting $\ge pk$ "heads" for $k$ coins tossed, $\Pr\bigl[ k^{-1} \sum_{j=1}^{k} v_j \ge p \bigr]$, where
$\Pr[v_j = 1] = p_*$. This probability can be estimated using Chernoff's inequality
(13.4). Thus we obtain (see footnote 11)
$$p_{\max} \le \exp\bigl( -2 (p - p_*)^2 k \bigr) \quad \text{if } p \ge p_*,$$
$$p_{\max} \ge 1 - \exp\bigl( -2 (p - p_*)^2 k \bigr) \quad \text{if } p \le p_*.$$
According to the assumptions of the lemma, $p_* \le p_0$ if $F(x) = 0$, and
$p_* \ge p_1$ if $F(x) = 1$. We have chosen $p$ so that $p - p_0 \ge \Omega(n^{-\alpha})$ and
$p_1 - p \ge \Omega(n^{-\alpha})$. Choosing $k = c\, n^{2\alpha + \beta}$ (for a suitable constant $c$), we get
exactly the estimate which is required,
$$p_{\max} \le \exp(-\Omega(n^{\beta})) \quad \text{if } F(x) = 0,$$
$$p_{\max} \ge 1 - \exp(-\Omega(n^{\beta})) \quad \text{if } F(x) = 1.$$
□
Remark 14.1. An important point in the proof was the fact that $X^{(0)}$ and
$X^{(1)}$ are diagonalized over the same basis. In general, the amplification of
probability for nontrivial complexity classes (both classical and quantum) is
a rather subtle thing.
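Once the maximization over Merlin's messages has been reduced to product states, the amplified acceptance probability is an ordinary binomial tail, and the Chernoff-type estimates from the proof of Lemma 14.1 can be compared with its exact value. A small sketch follows, with illustrative parameter values that are not taken from the text.

```python
import math


def p_max(k, p_star, p):
    """Probability of at least p*k heads among k coins with Pr[head] = p_star."""
    threshold = math.ceil(p * k - 1e-12)  # guard against floating-point round-up
    return sum(
        math.comb(k, j) * p_star ** j * (1 - p_star) ** (k - j)
        for j in range(threshold, k + 1)
    )


def chernoff_bound(k, p_star, p):
    """The quantity exp(-2 (p - p_star)^2 k) used in the proof."""
    return math.exp(-2 * (p - p_star) ** 2 * k)


# Illustrative values: p0 = 0.4, p1 = 0.6, threshold p = 0.5, k = 200 copies.
reject_case = p_max(200, 0.4, 0.5)   # F(x) = 0: should be small
accept_case = p_max(200, 0.6, 0.5)   # F(x) = 1: should be close to 1
```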
14.3. Complete problems. Similarly to the class NP, the class BQNP
has complete problems. Completeness is understood with respect to the
same polynomial reduction that we considered earlier (i.e., Karp reduction;
see Definition 3.4). Here is the simplest example.
Footnote 11: We have omitted the factor 2 from (13.4) because it includes both cases that are now
considered separately (see the proof of Chernoff's inequality in the solution to Problem 13.3).
This is an unimportant factor anyway.
Problem 0. Consider a function $F$ which is defined on a subset of the
words of this form:
$$z = \bigl( \text{description of a quantum circuit } U,\ p_0,\ p_1 \bigr),$$
where by description of a circuit we mean its approximate realization in the
standard basis, and $p_0$, $p_1$ are such that $p_1 - p_0 \ge \Omega(n^{-\alpha})$ ($n$ is the size of
the circuit, $\alpha > 0$ is a constant). The function $F$ is defined as follows:
$F(z) = 1 \iff$ there exists a vector $|\xi\rangle$ on which we get 1 in the first
bit with probability greater than $p_1$;
$F(z) = 0 \iff$ for all $|\xi\rangle$ the probability of getting 1 in the first bit is
smaller than $p_0$.
The completeness of Problem 0 is obvious: by saying that the problem
is complete we just rephrase Definition 14.2.
We now consider more interesting examples. To start, we define a quantum
analog of 3-CNF, the local Hamiltonian (locality is the analogue of
the condition that the number of variables in each clause is bounded).
Definition 14.3. An operator $H : \mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n}$ is called a $k$-local Hamiltonian
if it is expressible in the form
$$H = \sum_j H_j[S_j],$$
where each term $H_j \in \mathbf{L}(\mathcal{B}^{\otimes |S_j|})$ is a Hermitian operator acting on a set of
qubits $S_j$, $|S_j| \le k$.
In addition, we put a normalization condition, namely, $0 \le H_j \le 1$,
meaning that both $H_j$ and $I - H_j$ are nonnegative.
Problem 1: the local Hamiltonian. Let
$$z = \bigl( \text{description of a } k\text{-local Hamiltonian } H,\ a,\ b \bigr),$$
where $k = O(1)$, $0 \le a < b$, $b - a = \Omega(n^{-\alpha})$ ($\alpha > 0$ is a constant). Then
$F(z) = 1 \iff H$ has an eigenvalue not exceeding $a$;
$F(z) = 0 \iff$ all eigenvalues of $H$ are greater than $b$.
Proposition 14.2. The problem local Hamiltonian belongs to BQNP.
Proof. At the outset we describe the general idea. We construct a circuit
$W$ that can be applied to a state $|\eta\rangle \in \mathcal{B}^{\otimes n}$ so as to produce a result 1 or 0
("yes" or "no"): it says whether Arthur accepts the submitted state or not.
The answer "yes" will occur with probability $p = 1 - r^{-1}\langle\eta|H|\eta\rangle$, where
$r$ is the number of terms in the Hamiltonian $H$. If $|\eta\rangle$ is an eigenvector
corresponding to an eigenvalue $\lambda \le a$, then the probability of the answer
"yes" is
$$p = 1 - r^{-1}\langle\eta|H|\eta\rangle = 1 - r^{-1}\lambda \ge 1 - r^{-1}a,$$
and if every eigenvalue of $H$ exceeds $b$, then
$$p = 1 - r^{-1}\langle\eta|H|\eta\rangle \le 1 - r^{-1}b.$$
At first we construct such a circuit for a single term. This will be just a
realization of the POVM measurement corresponding to the decomposition
$I = H_j + (I - H_j)$. We could use the general result about POVM measurements
(see Problem 11.8), but let us give an explicit construction from
scratch.
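Let $H_j = \sum_s \lambda_s |\psi_s\rangle\langle\psi_s|$. This operator acts on a bounded number of
qubits, $|S_j| \le k$. Therefore we can realize the operator
$$W_j : |\psi_s, 0\rangle \mapsto |\psi_s\rangle \otimes \bigl( \sqrt{\lambda_s}\,|0\rangle + \sqrt{1 - \lambda_s}\,|1\rangle \bigr)$$
by a circuit of constant size. It acts on the set of qubits $S_j \cup \{\text{"answer"}\}$,
where "answer" denotes the qubit that will contain the measurement outcome.
We compute the probability that the outcome is 1. Let $|\eta\rangle = \sum_s y_s |\psi_s\rangle$
be the expansion of $|\eta\rangle$ in the orthogonal system of eigenvectors of $H_j$. We
have, by definition of the probability,
$$\begin{aligned}
\mathbf{P}_j(1) &= \langle \eta, 0 | W_j^\dagger \bigl( I \otimes \underbrace{|1\rangle\langle 1|}_{\text{answer}} \bigr) W_j | \eta, 0 \rangle \\
&= \Bigl( \sum_s y_s^* \langle \psi_s, 0 | \Bigr) W_j^\dagger \bigl( I \otimes \underbrace{|1\rangle\langle 1|}_{\text{answer}} \bigr) W_j \Bigl( \sum_t y_t | \psi_t, 0 \rangle \Bigr) \\
&= \sum_{s,t} \sqrt{1 - \lambda_s}\, y_s^* \sqrt{1 - \lambda_t}\, y_t \langle \psi_s | \psi_t \rangle = \sum_s (1 - \lambda_s)\, y_s^* y_s = 1 - \langle \eta | H_j | \eta \rangle.
\end{aligned}$$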
The general circuit $W$ chooses the integer $j$ randomly and uniformly,
after which it applies the corresponding operator $W_j$. This procedure can
be realized by the measuring operator $\sum_j |j\rangle\langle j| \otimes W_j$, applied to the initial
vector $\bigl( \tfrac{1}{\sqrt{r}} \sum_j |j\rangle \bigr) \otimes |\eta, 0\rangle$. (Here $|j\rangle$ denotes a basis vector in an auxiliary
$r$-dimensional space.) The probability of getting the outcome 1 is
$$\mathbf{P}(1) = \sum_j \frac{1}{r} \mathbf{P}_j(1) = \sum_j \frac{1}{r} \bigl( 1 - \langle\eta|H_j|\eta\rangle \bigr) = 1 - r^{-1}\langle\eta|H|\eta\rangle.$$
□
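The single-term measurement from the proof of Proposition 14.2 can be simulated directly. The sketch below assumes, for simplicity, that every term acts on the whole (small) state space rather than on a subset of qubits; it expands the state in the eigenbasis of $H_j$, attaches the answer-qubit amplitudes $\sqrt{\lambda_s}$ and $\sqrt{1-\lambda_s}$, and returns the probability of the outcome 1. The function names are hypothetical.

```python
import numpy as np


def random_term(dim, rng):
    """A random Hermitian operator with eigenvalues in [0, 1] (so 0 <= H_j <= 1)."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    lam = rng.uniform(0.0, 1.0, size=dim)
    return (q * lam) @ q.conj().T


def prob_answer_one(H_j, eta):
    """Probability that the answer qubit reads 1 after W_j is applied to |eta, 0>."""
    lam, psi = np.linalg.eigh(H_j)
    lam = np.clip(lam, 0.0, 1.0)          # remove tiny numerical overshoots
    y = psi.conj().T @ eta                # coefficients y_s of |eta>
    amp_one = psi @ (np.sqrt(1.0 - lam) * y)
    return float(np.vdot(amp_one, amp_one).real)


def acceptance_probability(terms, eta):
    """Uniformly random choice of a term followed by the single-term measurement."""
    return sum(prob_answer_one(H_j, eta) for H_j in terms) / len(terms)
```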
14.4. Local Hamiltonian is BQNP-complete.
Theorem 14.3. The problem local Hamiltonian is BQNP-complete
with respect to the Karp reduction.
The rest of this section constitutes a proof of this theorem. The main
idea goes back to Feynman [24]: replacing a unitary evolution by a time
independent Hamiltonian (i.e., transition from the circuit to a local Hamiltonian).
Thus, suppose we have a circuit $U = U_L \cdots U_1$ of size $L$. We will assume
that $U$ acts on $N$ qubits, the first $m$ of which initially contain Merlin's
message $|\xi\rangle$, the rest being initialized by 0. The gates $U_j$ act on pairs of
qubits.
14.4.1. The Hamiltonian associated with the circuit. It acts on the
space
$$\mathcal{L} = \mathcal{B}^{\otimes N} \otimes \mathbb{C}^{L+1},$$
where the first factor is the space on which the circuit acts, whereas the
second factor is the space of a step counter (clock). The Hamiltonian consists
of three terms which will be defined later,
$$H = H_{\mathrm{in}} + H_{\mathrm{prop}} + H_{\mathrm{out}}.$$
We are interested in the minimum eigenvalue of this Hamiltonian, or the
minimum of the cost function $f(|\eta\rangle) = \langle\eta|H|\eta\rangle$ over all vectors $|\eta\rangle$ of unit
length. We will try to arrange that the Hamiltonian has a small eigenvalue
if and only if there exists a quantum state $|\xi\rangle \in \mathcal{B}^{\otimes m}$ causing $U$ to output
1 with high probability. In such a case, the minimizing vector $|\eta\rangle$ will be
related to that $|\xi\rangle$ in the following way:
$$|\eta\rangle = \frac{1}{\sqrt{L+1}} \sum_{j=0}^{L} U_j \cdots U_1 |\xi, 0\rangle \otimes |j\rangle.$$
In constructing the terms of the Hamiltonian, we will try to "enforce" this
structure of the vector $|\eta\rangle$ by imposing "penalties" that increase the cost
function whenever $|\eta\rangle$ deviates from the indicated form.
The term $H_{\mathrm{in}}$ corresponds to the condition that, at step 0, all the qubits
but $m$ are in state $|0\rangle$. Specifically,
$$H_{\mathrm{in}} = \Bigl( \sum_{s=m+1}^{N} \Pi^{(1)}_s \Bigr) \otimes |0\rangle\langle 0|, \tag{14.4}$$
where $\Pi^{(\alpha)}_s$ is the projection onto the subspace of vectors for which the $s$-th
qubit equals $\alpha$. The second factor in this formula acts on the space of the
counter. (Informally speaking, the term $\Pi^{(1)}_s \otimes |0\rangle\langle 0|$ "collects a penalty" by
adding 1 to the cost function whenever the $s$-th qubit is in state $|1\rangle$ while
the counter is in state $|0\rangle$.)
The term $H_{\mathrm{out}}$ corresponds to the final state and equals
$$H_{\mathrm{out}} = \Pi^{(0)}_1 \otimes |L\rangle\langle L|. \tag{14.5}$$
Here we assume that the bit of the result has number 1. (That is, at step $L$
the first qubit should be in state $|1\rangle$, or a penalty will be imposed.)
And, finally, the term $H_{\mathrm{prop}}$ describes the propagation of a quantum
state through the circuit. It consists of $L$ terms, each of which corresponds
to the transition from $j - 1$ to $j$:
$$H_{\mathrm{prop}} = \sum_{j=1}^{L} H_j, \tag{14.6}$$
$$H_j = -\frac{1}{2}\, U_j \otimes |j\rangle\langle j{-}1| - \frac{1}{2}\, U_j^\dagger \otimes |j{-}1\rangle\langle j| + \frac{1}{2}\, I \otimes \bigl( |j\rangle\langle j| + |j{-}1\rangle\langle j{-}1| \bigr).$$
Each term $H_j$ acts on two qubits of the space $\mathcal{B}^{\otimes N}$, as well as on the space
of the counter (the latter is not represented by qubits yet).
14.4.2. Change of basis. We effect the change of basis given by the operator
$$W = \sum_{j=0}^{L} U_j \cdots U_1 \otimes |j\rangle\langle j|.$$
(It is worth mentioning that $W$ is a measuring operator with respect to
the value of the counter $j$.) The change of basis means that we represent
the vector $|\eta\rangle$ in the form $|\eta\rangle = W|\tilde\eta\rangle$; from now on, we are dealing with $|\tilde\eta\rangle$
instead of $|\eta\rangle$. Under such a change, the Hamiltonian is transformed into its
conjugate, $\tilde H = W^\dagger H W$. We consider how the conjugation by the operator
$W$ acts on the terms of $H$.
On the term $H_{\mathrm{in}}$ the conjugation has no effect:
$$\tilde H_{\mathrm{in}} = W^\dagger H_{\mathrm{in}} W = H_{\mathrm{in}}. \tag{14.7}$$
The action on the term $H_{\mathrm{out}}$ is:
$$\tilde H_{\mathrm{out}} = W^\dagger H_{\mathrm{out}} W = \bigl( U^\dagger \Pi^{(0)}_1 U \bigr) \otimes |L\rangle\langle L|. \tag{14.8}$$
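Each operator $H_j$ in (14.6) is the sum of three terms. Let us write the
action of the conjugation on the first of them:
$$\begin{aligned}
W^\dagger \bigl( U_j \otimes |j\rangle\langle j{-}1| \bigr) W
&= \sum_{p,t} \bigl( U_p \cdots U_1 \otimes |p\rangle\langle p| \bigr)^\dagger \bigl( U_j \otimes |j\rangle\langle j{-}1| \bigr) \bigl( U_t \cdots U_1 \otimes |t\rangle\langle t| \bigr) \\
&= \bigl( (U_j \cdots U_1)^\dagger\, U_j\, (U_{j-1} \cdots U_1) \bigr) \otimes \bigl( (|j\rangle\langle j|)^\dagger\, |j\rangle\langle j{-}1|\, (|j{-}1\rangle\langle j{-}1|) \bigr) \\
&= I \otimes |j\rangle\langle j{-}1|.
\end{aligned}$$
Conjugation of the two other terms proceeds analogously, so that we obtain
$$\tilde H_j = W^\dagger H_j W = I \otimes \frac{1}{2} \bigl( |j{-}1\rangle\langle j{-}1| - |j{-}1\rangle\langle j| - |j\rangle\langle j{-}1| + |j\rangle\langle j| \bigr) = I \otimes E_j,$$
$$\tilde H_{\mathrm{prop}} = W^\dagger H_{\mathrm{prop}} W = I \otimes E, \tag{14.9}$$
where
$$E = \sum_{j=1}^{L} E_j = \begin{pmatrix}
\tfrac12 & -\tfrac12 & & & 0 \\
-\tfrac12 & 1 & -\tfrac12 & & \\
 & -\tfrac12 & 1 & -\tfrac12 & \\
 & & \ddots & \ddots & \ddots \\
0 & & & \ddots & \ddots
\end{pmatrix}.$$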
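The effect of the change of basis on the propagation term can be verified numerically for a tiny circuit. The sketch below assumes a circuit of $L$ random two-qubit gates acting on $N = 2$ qubits (so every gate acts on the whole register) and the tensor ordering "circuit space first, counter second"; it builds $H_{\mathrm{prop}}$ and $W$ as matrices and the tridiagonal matrix $E$ with diagonal $(\tfrac12, 1, \dots, 1, \tfrac12)$ and off-diagonal entries $-\tfrac12$. All function names are hypothetical.

```python
import numpy as np


def random_unitary(dim, rng):
    """A random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q


def clock_op(i, j, L):
    """The counter operator |i><j| on C^{L+1}."""
    op = np.zeros((L + 1, L + 1))
    op[i, j] = 1.0
    return op


def build_prop_and_W(gates):
    """Return (H_prop, W) for the circuit U_L ... U_1 given as gates = [U_1, ..., U_L]."""
    L, d = len(gates), gates[0].shape[0]
    I = np.eye(d)
    H_prop = np.zeros((d * (L + 1), d * (L + 1)), dtype=complex)
    for j in range(1, L + 1):
        U = gates[j - 1]
        H_prop += (-0.5 * np.kron(U, clock_op(j, j - 1, L))
                   - 0.5 * np.kron(U.conj().T, clock_op(j - 1, j, L))
                   + 0.5 * np.kron(I, clock_op(j, j, L) + clock_op(j - 1, j - 1, L)))
    W = np.zeros_like(H_prop)
    prefix = I.astype(complex)
    for j in range(L + 1):
        if j > 0:
            prefix = gates[j - 1] @ prefix   # U_j ... U_1
        W += np.kron(prefix, clock_op(j, j, L))
    return H_prop, W


def E_matrix(L):
    """Sum of the blocks E_j = (1/2)(|j-1> - |j>)(<j-1| - <j|)."""
    E = np.zeros((L + 1, L + 1))
    for j in range(1, L + 1):
        E[j - 1, j - 1] += 0.5
        E[j, j] += 0.5
        E[j - 1, j] -= 0.5
        E[j, j - 1] -= 0.5
    return E
```

The spectrum of `E_matrix(L)` can also be compared with the values $1 - \cos(\pi k/(L+1))$, $k = 0, \dots, L$.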
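14.4.3. Existence of a small eigenvalue if the answer is "yes".
Suppose that the circuit $U$ gives the outcome 1 ("yes") with probability
$\ge 1 - \varepsilon$ on some input vector $|\xi\rangle$. This, by definition, means that
$$\mathbf{P}(0) = \langle \xi, 0 | U^\dagger \Pi^{(0)}_1 U | \xi, 0 \rangle \le \varepsilon.$$
We want to prove that in this case $\tilde H$ (and so also $H$) has a small
eigenvalue. For this, it is sufficient to find a vector $|\tilde\eta\rangle$ such that $\langle\tilde\eta|\tilde H|\tilde\eta\rangle$ is
small enough (the minimum of this expression as a function of $|\tilde\eta\rangle$ is attained
at an eigenvector).
In the space of the counter we choose the vector
$$|\psi\rangle = \frac{1}{\sqrt{L+1}} \sum_{j=0}^{L} |j\rangle. \tag{14.10}$$
We set $|\tilde\eta\rangle = |\xi, 0\rangle \otimes |\psi\rangle$ and estimate $\langle\tilde\eta|\tilde H|\tilde\eta\rangle$.
It is clear that $E|\psi\rangle = 0$. Therefore
$$\langle\tilde\eta|\tilde H_{\mathrm{prop}}|\tilde\eta\rangle = 0 = \langle\tilde\eta|\tilde H_j|\tilde\eta\rangle.$$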
Since all auxiliary qubits are initially set to 0, we immediately obtain from
the defining formula (14.4) that
$$\langle\tilde\eta|\tilde H_{\mathrm{in}}|\tilde\eta\rangle = 0.$$
It remains to estimate the last term
$$\langle\tilde\eta|\tilde H_{\mathrm{out}}|\tilde\eta\rangle = \langle\tilde\eta| \bigl( U^\dagger \Pi^{(0)}_1 U \otimes |L\rangle\langle L| \bigr) |\tilde\eta\rangle = \mathbf{P}(0)\, \frac{1}{L+1} \le \frac{\varepsilon}{L+1}.$$
Thus we have proved that
$$\langle\tilde\eta|\tilde H|\tilde\eta\rangle \le \frac{\varepsilon}{L+1},$$
so that $H$ itself has an eigenvalue with the very same upper bound.
14.4.4. Lower bound for the eigenvalues if the answer is "no".
Suppose that for any vector $|\xi\rangle$ the probability of the outcome 1 does not
exceed $\varepsilon$, i.e.,
$$\langle \xi, 0 | U^\dagger \Pi^{(0)}_1 U | \xi, 0 \rangle \ge 1 - \varepsilon.$$
We will prove that, in this case, all eigenvalues of $H$ are $\ge c\,(1 - \sqrt{\varepsilon})\,L^{-3}$,
where $c$ is some constant.
The proof is rather long, so we will outline it first. We represent the
Hamiltonian in the form $\tilde H = A_1 + A_2$, where $A_1 = \tilde H_{\mathrm{in}} + \tilde H_{\mathrm{out}}$, and
$A_2 = \tilde H_{\mathrm{prop}}$. Both terms are nonnegative. It is easy to show that the null subspaces
of $A_1$ and $A_2$ have trivial intersection (i.e., $\mathcal{L}_1 \cap \mathcal{L}_2 = \{0\}$), hence the operator
$A_1 + A_2$ is positive definite. But this is not enough for our purpose, so we
obtain lower bounds for nonzero eigenvalues of $A_1$ and $A_2$, namely, 1 for $A_1$,
and $c' L^{-2}$ for $A_2$. In order to estimate the smallest eigenvalue of $A_1 + A_2$, we
also need to know the angle between the null subspaces. The angle $\vartheta(\mathcal{L}_1, \mathcal{L}_2)$
between subspaces $\mathcal{L}_1$ and $\mathcal{L}_2$ with trivial intersection is given by
$$\cos\vartheta(\mathcal{L}_1, \mathcal{L}_2) = \max_{\substack{|\eta_1\rangle \in \mathcal{L}_1 \\ |\eta_2\rangle \in \mathcal{L}_2}} \bigl| \langle \eta_1 | \eta_2 \rangle \bigr|, \qquad 0 < \vartheta(\mathcal{L}_1, \mathcal{L}_2) < \frac{\pi}{2}, \tag{14.11}$$
where the maximum is taken over unit vectors.
Lemma 14.4. Let $A_1$, $A_2$ be nonnegative operators, and $\mathcal{L}_1$, $\mathcal{L}_2$ their null
subspaces, where $\mathcal{L}_1 \cap \mathcal{L}_2 = \{0\}$. Suppose further that no nonzero eigenvalue
of $A_1$ or $A_2$ is smaller than $v$. Then
$$A_1 + A_2 \ge v \cdot 2 \sin^2\frac{\vartheta}{2},$$
where $\vartheta = \vartheta(\mathcal{L}_1, \mathcal{L}_2)$ is the angle between $\mathcal{L}_1$ and $\mathcal{L}_2$.
The notation $A \ge a$ ($A$ an operator, $a$ a number) must be understood as
an abbreviation for $A - aI \ge 0$. In other words, if $A \ge a$, then all eigenvalues
of $A$ are greater than, or equal to, $a$.
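Lemma 14.4 lends itself to a quick numerical sanity check on random instances. The sketch below assumes generic random null subspaces (which intersect trivially with probability 1 when their dimensions sum to at most the dimension of the space), draws the nonzero eigenvalues from $[v, 3]$, and computes $\cos\vartheta$ as the largest singular value of $Q_1^\dagger Q_2$, where the columns of $Q_i$ form an orthonormal basis of $\mathcal{L}_i$. The function names are illustrative.

```python
import numpy as np


def random_operator_with_null_space(n, null_dim, v, rng):
    """Nonnegative operator on C^n with a null space of dimension null_dim
    and all nonzero eigenvalues in [v, 3]. Returns (A, orthonormal null basis)."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    eig = np.concatenate([np.zeros(null_dim), rng.uniform(v, 3.0, size=n - null_dim)])
    A = (q * eig) @ q.conj().T
    return A, q[:, :null_dim]


def subspace_angle(Q1, Q2):
    """Angle theta with cos(theta) = max |<eta1|eta2>| over unit vectors."""
    c = np.linalg.svd(Q1.conj().T @ Q2, compute_uv=False).max()
    return float(np.arccos(min(c, 1.0)))


def lemma_gap(n, d1, d2, v, rng):
    """Return (smallest eigenvalue of A1 + A2, lower bound v * 2 sin^2(theta/2))."""
    A1, Q1 = random_operator_with_null_space(n, d1, v, rng)
    A2, Q2 = random_operator_with_null_space(n, d2, v, rng)
    theta = subspace_angle(Q1, Q2)
    smallest = np.linalg.eigvalsh(A1 + A2).min()
    return float(smallest), float(v * 2 * np.sin(theta / 2) ** 2)
```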
In our case we will get the estimates 1 and $c' L^{-2}$ for the nonzero eigenvalues
of $A_1$ and $A_2$ (as already mentioned), and $\sin^2\vartheta \ge (1 - \sqrt{\varepsilon})/(L+1)$
for the angle. From this we derive the desired inequality
$$H \ge c\,\bigl(1 - \sqrt{\varepsilon}\bigr)\,L^{-3}.$$
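Proof of Lemma 14.4. It is obvious that $A_1 \ge v(I - \Pi_{\mathcal{L}_1})$ and
$A_2 \ge v(I - \Pi_{\mathcal{L}_2})$, so it is sufficient to prove the inequality
$(I - \Pi_{\mathcal{L}_1}) + (I - \Pi_{\mathcal{L}_2}) \ge 2\sin^2(\vartheta/2)$. This, in turn, is equivalent to
$$\Pi_{\mathcal{L}_1} + \Pi_{\mathcal{L}_2} \le 1 + \cos\vartheta. \tag{14.12}$$
Let $|\xi\rangle$ be an eigenvector of the operator $\Pi_{\mathcal{L}_1} + \Pi_{\mathcal{L}_2}$ corresponding to an
eigenvalue $\lambda > 0$. Then
$$\Pi_{\mathcal{L}_1}|\xi\rangle = u_1|\eta_1\rangle, \qquad \Pi_{\mathcal{L}_2}|\xi\rangle = u_2|\eta_2\rangle, \qquad u_1|\eta_1\rangle + u_2|\eta_2\rangle = \lambda|\xi\rangle,$$
where $|\eta_1\rangle \in \mathcal{L}_1$ and $|\eta_2\rangle \in \mathcal{L}_2$ are unit vectors, and $u_1$, $u_2$ are nonnegative
real numbers. From this we find
$$\lambda = \langle\xi| \bigl( \Pi_{\mathcal{L}_1} + \Pi_{\mathcal{L}_2} \bigr) |\xi\rangle = u_1^2 + u_2^2,$$
$$\lambda^2 = \bigl( u_1\langle\eta_1| + u_2\langle\eta_2| \bigr) \bigl( u_1|\eta_1\rangle + u_2|\eta_2\rangle \bigr) = u_1^2 + u_2^2 + 2 u_1 u_2\, \mathrm{Re}\langle\eta_1|\eta_2\rangle.$$
Consequently,
$$(1 + x)\lambda - \lambda^2 = x\,(u_1 \mp u_2)^2 \ge 0, \qquad \text{where } x = \bigl| \mathrm{Re}\langle\eta_1|\eta_2\rangle \bigr|$$
(the upper sign applies when $\mathrm{Re}\langle\eta_1|\eta_2\rangle \ge 0$, the lower sign otherwise).
Thus $\lambda \le 1 + x \le 1 + \cos\vartheta$. □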
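We will now obtain the above-mentioned estimates. The null subspaces $\mathcal{L}_1$
and $\mathcal{L}_2$ of the operators $A_1$ and $A_2$ can be represented in the form
$$\mathcal{L}_1 = \bigl( \mathcal{B}^{\otimes m} \otimes |0^{N-m}\rangle \otimes |0\rangle \bigr) \oplus \bigl( \mathcal{B}^{\otimes N} \otimes \mathbb{C}\bigl( |1\rangle, \dots, |L-1\rangle \bigr) \bigr) \oplus \bigl( U^\dagger \bigl( |1\rangle \otimes \mathcal{B}^{\otimes(N-1)} \bigr) \otimes |L\rangle \bigr) \tag{14.13}$$
(the last factor in all three terms pertains to the counter), and
$$\mathcal{L}_2 = \mathcal{B}^{\otimes N} \otimes |\psi\rangle, \tag{14.14}$$
where the vector $|\psi\rangle$ was defined by formula (14.10).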
For the estimate
$$A_1\big|_{\mathcal{L}_1^{\perp}} \ge 1 \tag{14.15}$$
it suffices to note that $A_1$ is the sum of commuting projections, so that all
eigenvalues of this operator are nonnegative integers.
For the estimate of $A_2\big|_{\mathcal{L}_2^{\perp}}$ we need to find the smallest positive eigenvalue
of the matrix $E$. The eigenvectors and eigenvalues of $E$ are
$$|\psi_k\rangle = \alpha_k \sum_{j=0}^{L} \cos\Bigl( q_k \Bigl( j + \frac{1}{2} \Bigr) \Bigr) |j\rangle, \qquad \lambda_k = 1 - \cos q_k,$$
where $q_k = \pi k/(L+1)$ ($k = 0, \dots, L$). From this it follows that
$$A_2\big|_{\mathcal{L}_2^{\perp}} \ge 1 - \cos\Bigl( \frac{\pi}{L+1} \Bigr) \ge c' L^{-2}. \tag{14.16}$$
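Finally, we need to estimate the angle between the subspaces $\mathcal{L}_1$ and $\mathcal{L}_2$.
We will estimate the square of the cosine of this angle,
$$\cos^2\vartheta = \max_{\substack{|\eta_1\rangle \in \mathcal{L}_1 \\ |\eta_2\rangle \in \mathcal{L}_2}} \bigl| \langle\eta_1|\eta_2\rangle \bigr|^2 = \max_{|\eta_2\rangle \in \mathcal{L}_2} \langle\eta_2| \Pi_{\mathcal{L}_1} |\eta_2\rangle. \tag{14.17}$$
Since the vector $|\eta_2\rangle$ belongs to $\mathcal{L}_2$, it can be represented in the form
$|\eta_2\rangle = |\xi\rangle \otimes |\psi\rangle$ (cf. (14.14)). According to formula (14.13), the projection onto
$\mathcal{L}_1$ breaks into the sum of three projections. It is easy to calculate the
contribution of the second term; it equals $(L-1)/(L+1)$. The first and
third terms add up to
$$\frac{1}{L+1} \langle\xi| \bigl( \Pi_{\mathcal{K}_1} + \Pi_{\mathcal{K}_2} \bigr) |\xi\rangle \le \frac{1 + \cos\varphi}{L+1},$$
where $\mathcal{K}_1 = \mathcal{B}^{\otimes m} \otimes |0^{N-m}\rangle$, $\mathcal{K}_2 = U^\dagger \bigl( |1\rangle \otimes \mathcal{B}^{\otimes(N-1)} \bigr)$ and $\varphi$ is the angle between
these two subspaces. (Here we have used inequality (14.12), obtained in the
course of the proof of Lemma 14.4.)
The quantity $\cos^2\varphi$ equals the maximum probability for the initial circuit
to produce the outcome 1. By hypothesis this probability is not greater
than $\varepsilon$, so that $\cos\varphi \le \sqrt{\varepsilon}$. So we can continue the estimate (14.17):
$$\langle\eta_2| \Pi_{\mathcal{L}_1} |\eta_2\rangle \le \frac{L-1}{L+1} + \frac{1 + \sqrt{\varepsilon}}{L+1} = 1 - \frac{1 - \sqrt{\varepsilon}}{L+1}.$$
Consequently, $\sin^2\vartheta = 1 - \cos^2\vartheta \ge (1 - \sqrt{\varepsilon})/(L+1)$ as asserted above.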
14.4.5. Realization of the counter. We wrote a nice Hamiltonian almost
satisfying the required properties. It has but one shortcoming: the
counter is not a qubit. We could, of course, represent it by $O(\log L)$ qubits,
but then the Hamiltonian would be only $O(\log L)$-local, not $O(1)$-local.
This shortcoming can be removed if we embed the counter space in a
larger space. We take $L$ qubits, enumerated from 1 to $L$. The suitable
embedding $\mathbb{C}^{L+1} \to \mathcal{B}^{\otimes L}$ is
$$|j\rangle \mapsto |\underbrace{1, \dots, 1}_{j}, \underbrace{0, \dots, 0}_{L-j}\rangle.$$
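The operators on the space $\mathbb{C}^{L+1}$ used in the construction of the Hamiltonian
$H$ are replaced in accordance with the following scheme:
$$\begin{aligned}
|0\rangle\langle 0| &\mapsto \Pi^{(0)}_1, & |0\rangle\langle 1| &\mapsto \bigl( |0\rangle\langle 1| \bigr)_1 \Pi^{(0)}_2, \\
|j\rangle\langle j| &\mapsto \Pi^{(1)}_j \Pi^{(0)}_{j+1}, & |j{-}1\rangle\langle j| &\mapsto \Pi^{(1)}_{j-1} \bigl( |0\rangle\langle 1| \bigr)_j \Pi^{(0)}_{j+1}, \\
|L\rangle\langle L| &\mapsto \Pi^{(1)}_L, & |L{-}1\rangle\langle L| &\mapsto \Pi^{(1)}_{L-1} \bigl( |0\rangle\langle 1| \bigr)_L.
\end{aligned} \tag{13.23}$$
Now they are 3-local (and the Hamiltonian itself, acting also on the qubits
of the initial circuit, is 5-local).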
To be more precise, we have replaced the Hamiltonian $H$, acting on the
space $\mathcal{L} = \mathcal{B}^{\otimes N} \otimes \mathbb{C}^{L+1}$, by a new Hamiltonian $H_{\mathrm{ext}}$, defined on the larger
space $\mathcal{L}_{\mathrm{ext}} = \mathcal{B}^{\otimes N} \otimes \mathcal{B}^{\otimes L}$. The operator $H_{\mathrm{ext}}$ maps the subspace $\mathcal{L} \subseteq \mathcal{L}_{\mathrm{ext}}$
into itself and acts on it just as $H$ does.
Now a new problem arises: what to do with the extra states in the
extended space of the counter? We will cope with this problem by adding
still another term to the Hamiltonian $H_{\mathrm{ext}}$:
$$H_{\mathrm{stab}} = I_{\mathcal{B}^{\otimes N}} \otimes \sum_{j=1}^{L-1} \Pi^{(0)}_j \Pi^{(1)}_{j+1}.$$
The null subspace of the operator $H_{\mathrm{stab}}$ coincides with the old working space
$\mathcal{L}$, so that the supplementary term does not change the upper bound for the
minimum eigenvalue for the answer "yes".
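The counter part of $H_{\mathrm{stab}}$ is diagonal in the computational basis: on a bit string it counts the positions $j$ where qubit $j$ is 0 and qubit $j+1$ is 1. The short sketch below (with hypothetical helper names) enumerates all $L$-bit strings and confirms that the strings with zero penalty are exactly the $L + 1$ unary codewords $1^j 0^{L-j}$, while every other string is penalized by at least 1.

```python
from itertools import product


def unary(j, L):
    """Counter state |j> embedded as j ones followed by L - j zeros."""
    return (1,) * j + (0,) * (L - j)


def stab_penalty(bits):
    """Eigenvalue of sum_j Pi^(0)_j Pi^(1)_{j+1} on a computational basis state."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a == 0 and b == 1)


def null_states(L):
    """All L-bit strings on which the stabilizing term vanishes."""
    return {bits for bits in product((0, 1), repeat=L) if stab_penalty(bits) == 0}
```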
For the answer "no" the required lower bound for the eigenvalues of the
operator $H_{\mathrm{ext}} + H_{\mathrm{stab}}$ can be recovered in the following way. Both terms
leave the subspace $\mathcal{L}$ invariant, so that we can examine the action of
$H_{\mathrm{ext}} + H_{\mathrm{stab}}$ on $\mathcal{L}$ and on its orthogonal complement $\mathcal{L}^{\perp}$ independently.
On $\mathcal{L}$ we have $H_{\mathrm{ext}} \ge c\,(1 - \sqrt{\varepsilon})\,L^{-3}$ and $H_{\mathrm{stab}} = 0$, and on $\mathcal{L}^{\perp}$ we have
$H_{\mathrm{ext}} \ge 0$ and $H_{\mathrm{stab}} \ge 1$. (Here we use the fact that each of the terms
of the Hamiltonian, (14.4), (14.5) and (14.6), remains nonnegative for the
change (13.23).) In both cases
$$H_{\mathrm{ext}} + H_{\mathrm{stab}} \ge c\,(1 - \sqrt{\varepsilon})\,L^{-3}.$$
This completes the proof of Theorem 14.3.
14.5. The place of BQNP among other complexity classes. It follows
directly from the definition that the class BQNP contains the class MA
(and so also BPP and NP). Nothing more definitive can be said at present
about the strength of "nondeterministic quantum algorithms" (see footnote 12).
Nor can much more be said about its "weakness".
Footnote 12: Caveat: in the literature there is also a different definition of "quantum NP", for which a
complete characterization in terms of classical complexity classes can be obtained (cf. [2, 73]).
Proposition 14.5. $\mathrm{BQNP} \subseteq \mathrm{PSPACE}$.
Proof. The maximum probability that Merlin's message will be accepted
by Arthur is equal to the maximum eigenvalue of the operator $X = X^{(1)}$
(cf. formula (14.3)). We will need to compute this quantity with precision
$O(n^{-\alpha})$, $\alpha > 0$.
We note that $0 \le X \le 1$. For the estimate of the maximum eigenvalue
we will use the following asymptotic equality:
$$\ln\lambda_{\max} = \lim_{d \to \infty} \frac{\ln \mathrm{Tr}\, X^d}{d}.$$
Let $\lambda_{\max} = \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{2^m}$ be the eigenvalues of the operator $X$ (here
$m = \mathrm{poly}(n)$ is the length of the message). We have the inequality
$$\ln\lambda_{\max} \le \frac{\ln \mathrm{Tr}\, X^d}{d} = \frac{\ln \sum_{j=1}^{2^m} \lambda_j^d}{d} \le \ln\lambda_{\max} + \frac{m}{d}\ln 2.$$
Therefore, in order to estimate $\lambda_{\max}$ with polynomial precision, it suffices
to compute the trace of the $d$-th power of $X$, with $d$ polynomial in $m$.
The computation of the quantity $\mathrm{Tr}\, X^d$ is achieved with polynomial
memory by the same means that was used to simulate a quantum circuit. □
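The two-sided estimate $\ln\lambda_{\max} \le d^{-1}\ln\mathrm{Tr}\,X^d \le \ln\lambda_{\max} + (m/d)\ln 2$ is easy to observe numerically. The sketch below uses dense matrices, which is of course not the polynomial-memory procedure referred to in the proof; it only illustrates how quickly the trace-power estimate converges. The random operator and the function names are assumptions of the example.

```python
import numpy as np


def random_acceptance_like_operator(m, rng):
    """A random Hermitian operator on m qubits with eigenvalues in [0.1, 1]."""
    dim = 2 ** m
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    lam = rng.uniform(0.1, 1.0, size=dim)
    return (q * lam) @ q.conj().T


def trace_power_estimate(X, d):
    """The quantity ln(Tr X^d) / d, an upper estimate of ln(lambda_max)."""
    return float(np.log(np.trace(np.linalg.matrix_power(X, d)).real) / d)


def estimate_gap(X, d):
    """Difference between the trace-power estimate and the true ln(lambda_max)."""
    lam_max = np.linalg.eigvalsh(X).max()
    return trace_power_estimate(X, d) - float(np.log(lam_max))
```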
Remark 14.2. The result obtained can be strengthened: $\mathrm{BQNP} \subseteq \mathrm{PP}$.
The proof is completely analogous to the solution of Problem 9.5.
Remark 14.3. We have limited ourselves to the case of Arthur-Merlin
games in which only one message is communicated. In general, Arthur and
Merlin can play several rounds, sending messages back and forth. Recently
it has been shown [72, 38] that such a quantum game with three messages
(i.e., 1.5 rounds) has the same complexity as the game with polynomially
many rounds. The corresponding complexity class is called QIP; it contains
PSPACE. This contrasts with the properties of classical Arthur-Merlin
games. In the classical case, the game with polynomially many rounds yields
PSPACE [58, 59]. But in wide circles of narrow specialists the opinion prevails
that no fixed number of rounds would suffice (see footnote 13).
Footnote 13: The game with an arbitrary constant number of rounds corresponds to a complexity class
$\mathrm{AM} \subseteq \Pi_2$ [6]. It is widely believed that "the polynomial hierarchy does not collapse", i.e.,
$\text{co-NP} = \Pi_1 \subset \Pi_2 \subset \Pi_3 \subset \dots \subset \mathrm{PSPACE}$ (the inclusions are strict).
15. Classical and quantum codes
In this section we explain the concept of error-correcting code, in its classical
and quantum formulations. Our exposition does not go beyond definitions
and basic examples; we do not address the problem of finding codes with
optimal parameters. The interested reader is referred to [47] (classical codes)
and [16] (quantum codes).
First, a bit of motivation. As discussed earlier, quantum computation is "not too" sensitive to errors in the realization of unitary operators: errors accumulate linearly (see, e.g., the result of Problem 9.2). Therefore, a physical implementation of elementary gates with precision $\delta$ will allow one to use circuits of size $L \sim \delta^{-1}$. But this is not enough to make quantum computation practical. Therefore a question arises: is it possible to avoid accumulation of errors by using circuits of some special type? More specifically, is it possible to replace an arbitrary quantum circuit by another circuit that would realize the same unitary operator (or compute the same Boolean function), but in an error-resistant fashion?

The answer to this question is affirmative. In fact, the new circuit will resist not only inaccurate realization of unitary gates but also some interaction with the environment and stochastic errors (provided that they occur with small probability). The rough idea is to encode (replace) each qubit used in the computation (logical qubit) by several physical qubits; see Remark 8.3 on page 74. The essential fact is that errors usually affect only a few qubits at a time, so that encoding increases the stability of a quantum state.

Organization of computation in a way that prevents accumulation of errors is called fault-tolerant computation. In the classical case, fault-tolerance can be achieved by the use of the repetition code: 0 is encoded by $(0,\dots,0)$ ($n$ times), and 1 is encoded by $(1,\dots,1)$. Such a simple code does not work in the quantum case, but more complicated codes do. The first method of fault-tolerant quantum computation was invented by P. Shor [65] and improved independently by several authors [42, 36]. Alternative approaches were suggested by D. Aharonov and M. Ben-Or [3] and A. Kitaev [35].

Fault-tolerance is a rather difficult subject, but our goal here is more modest. Suppose we have a quantum state of $n$ qubits that is subjected to an error. Under what condition is it possible to recover the original state, assuming that the execution of the recovery procedure is error-free? (The fault-tolerant computation deals with the more realistic situation where errors occur constantly, though at a small rate.) Of course, error recovery is not possible for a general state $|\xi\rangle \in \mathcal{B}^{\otimes n}$. However, it can be possible for states $|\xi\rangle \in \mathcal{M}$, where $\mathcal{M} \subseteq \mathcal{B}^{\otimes n}$ is a suitable fixed subspace. Likewise, in the classical case we should consider states that belong to a fixed subset $M \subseteq \mathbb{B}^{n}$.

Definition 15.1. A classical code of type $(n, m)$ is a subset $M \subseteq \mathbb{B}^{n}$ which consists of $2^{m}$ elements (where $m$, the number of encoded bits, is not necessarily integral). Elements of $M$ are called codewords.

A quantum code of type $(n, m)$ is a subspace $\mathcal{M} \subseteq \mathcal{B}^{\otimes n}$ of dimension $2^{m}$. Elements of $\mathcal{M}$ are called codevectors.
Remark 15.1. In the theory of fault-tolerant quantum computation, a slightly different kind of codes is used. Firstly, an encoding must be specified, i.e., the subspace $\mathcal{M}$ must be identified with a fixed space $\mathcal{L}$; usually, $\mathcal{L} = \mathcal{B}$. In other words, an encoding is an isometric embedding $V : \mathcal{L} \to \mathcal{B}^{\otimes n}$ such that $\mathcal{M} = \operatorname{Im} V$. Secondly, sometimes one needs to consider one-to-many encodings (because errors happen and get corrected constantly, so at any moment there are some errors that have not been corrected yet). A one-to-many encoding is an isometric embedding $V : \mathcal{L} \otimes \mathcal{F} \to \mathcal{B}^{\otimes n}$, where $\mathcal{F}$ is some auxiliary space.

Besides the code, we need to define an error model. It is also called a communication channel: one may think that errors occur when a state (classical or quantum) is transferred from one location to another. Intuitively, this should be something like a multivalued map $\mathbb{B}^{n} \to \mathbb{B}^{n'}$ or $\mathcal{B}^{\otimes n} \to \mathcal{B}^{\otimes n'}$ (where $n'$ is the number of bits at the output of the channel; usually, $n' = n$). We begin with the classical case and then construct the quantum definition by analogy.
15.1. Classical codes. There are two models of errors: a more realistic probabilistic model and a simplified set-theoretic version. According to the probabilistic model, a communication channel is given by a set of conditional probabilities $p(y|x)$ for receiving the word $y$ upon transmission of the word $x$. We will consider the case of independently distributed errors, where $n' = n$, and the conditional probabilities are determined by the probability $p_1$ of an error (bit flip) in the transmission of a single bit:
\[ p(y|x) = p_1^{d(x,y)} (1-p_1)^{n-d(x,y)}. \tag{15.1} \]
Here $d(x, y)$ is the Hamming distance, i.e., the number of positions in which $x$ and $y$ differ.

There is a standard method for simplifying a probabilistic error model by classifying errors as "likely" and "unlikely". Let us estimate the probability that in the model defined above, more than $k$ bit flips occur (as is clear from formula (15.1), this probability does not depend on $x$). Suppose that $n$ and $k$ are fixed, whereas $p_1 \to 0$. Then
\[ \Pr\bigl[\text{number of bit flips} > k\bigr] = \sum_{j>k} \binom{n}{j} p_1^{j} (1-p_1)^{n-j} = O(p_1^{k+1}). \tag{15.2} \]
Thus the probability that more than $k$ bits flip is small. So we say that this event is unlikely; we can greatly simplify the model by assuming that such an event never happens. We will suppose that, upon transmission of the word $x$, some word $y$ is received such that $d(x, y) \le k$. This simplified model only defines a set of possible (or "likely") outcomes but says nothing about their probabilities.
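The estimate (15.2) can be checked numerically. The sketch below, with illustrative values $n = 7$ and $k = 1$ chosen only for this example, evaluates the binomial tail and divides it by $p_1^{k+1}$; the helper name `tail_probability` is introduced here and is not from the text.

```python
from math import comb

def tail_probability(n, k, p1):
    """Probability that more than k of n independent bits flip, as in (15.2)."""
    return sum(comb(n, j) * p1**j * (1 - p1)**(n - j) for j in range(k + 1, n + 1))

n, k = 7, 1
for p1 in (1e-1, 1e-2, 1e-3):
    ratio = tail_probability(n, k, p1) / p1 ** (k + 1)
    # The ratio stays bounded (by the binomial coefficient C(n, k+1), via a union
    # bound), which is what O(p1^(k+1)) means for fixed n and k.
    print(p1, ratio, comb(n, k + 1))
```

As $p_1 \to 0$ the ratio approaches $\binom{n}{k+1}$, the coefficient of the leading term of the sum.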
We introduce the notation:

$N = \mathbb{B}^{n}$: set of inputs,
$N' = \mathbb{B}^{n'}$: set of outputs,
$E \subseteq N \times N'$: set of transitions (i.e., set of errors),
$E(n, k) = \bigl\{ (x, y) : d(x, y) \le k \bigr\}$.

Definition 15.2. A code $M$ corrects errors from a set $E \subseteq N \times N'$ if for any $x_1, x_2 \in M$, $(x_1, y_1) \in E$, $(x_2, y_2) \in E$, the condition $x_1 \ne x_2$ implies that $y_1 \ne y_2$.

In the particular case $E = E(n, k)$, we say that the code corrects $k$ errors.
Remark 15.2. The term "error correcting code" is imprecise. It would be more accurate to say that the code offers the possibility for correcting errors. An error-correcting transformation is a map $P : N' \to N$ such that, if $(x, y) \in E$ and $x \in M$, then $P(y) = x$.

Example 15.1. The repetition code of type $(3, 1)$:
\[ M_3 = \bigl\{ (0, 0, 0),\ (1, 1, 1) \bigr\} \subseteq \mathbb{B}^{3}. \]
Such a code will correct a single error.

An obvious generalization of this example leads to classical codes $M_n$ of type $(n, 1)$ which correct $k = \lfloor (n-1)/2 \rfloor$ errors (see below). We will also construct more interesting examples of classical codes. To start, we give yet another standard definition.
Definition 15.3. The code distance is
\[ d(M) = \min\bigl\{ d(x_1, x_2) : x_1, x_2 \in M,\ x_1 \ne x_2 \bigr\}. \]
For the code $M_3$ of Example 15.1 the code distance is 3.

Proposition 15.1. A code $M$ corrects $k$ errors if and only if $d(M) > 2k$.

Proof. According to Definition 15.2, the code does not correct $k$ errors if and only if there exist $x_1, x_2 \in M$ ($x_1 \ne x_2$) and $y \in \mathbb{B}^{n}$ such that $d(x_1, y) \le k$ and $d(x_2, y) \le k$. For fixed $x_1, x_2$, such a $y$ exists if and only if $d(x_1, x_2) \le 2k$. $\square$
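Proposition 15.1 can be confirmed by brute force on small codes. The following sketch applies Definition 15.2 literally for $E = E(n,k)$: it searches for a received word $y$ lying within distance $k$ of two distinct codewords. The function names are chosen for this example, and the two codes tested (a 5-bit repetition code and the 3-bit even-weight code) are merely convenient instances.

```python
from itertools import combinations, product

def hamming(x, y):
    """Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y))

def code_distance(code):
    """Minimum distance between distinct codewords (Definition 15.3)."""
    return min(hamming(x1, x2) for x1, x2 in combinations(code, 2))

def corrects(code, n, k):
    """Definition 15.2 for E(n, k): no y is within distance k of two codewords."""
    for y in product((0, 1), repeat=n):
        if sum(hamming(x, y) <= k for x in code) > 1:
            return False
    return True

repetition = [(0,) * 5, (1,) * 5]
parity = [x for x in product((0, 1), repeat=3) if sum(x) % 2 == 0]
for name, code, n in (("repetition", repetition, 5), ("parity", parity, 3)):
    for k in range(n + 1):
        print(name, k, corrects(code, n, k), code_distance(code) > 2 * k)
```

In every printed row the two Boolean values agree, as the proposition asserts.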
15.2. Examples of classical codes.

1. The repetition code $M_n$ of type $(n, 1)$ and distance $n$:
\[ M_n = \bigl\{ (\underbrace{0, \dots, 0}_{n}),\ (\underbrace{1, \dots, 1}_{n}) \bigr\}. \]
This code can be used with the obvious encoding: we repeat a single bit $n$ times. To restore the codeword after an error, we replace the value of the bits with the value that occurs most frequently. This series of codes, as will be shown later, does not generalize to the quantum case.

2. Parity check: this is a code of type $(n, n-1)$ and distance 2. It consists of all even words, i.e., of words containing an even number of 1s.

3. The Hamming code $H_r$. This code is of type $(n, n-r)$, where $n = 2^{r} - 1$. It is defined as follows.

Elements of $\mathbb{B}^{n}$ are sequences of bits $x = (x_\alpha : \alpha = 1, \dots, n)$. In turn, the index of each bit can be represented in binary as $\alpha = (\alpha_1, \dots, \alpha_r)$. We introduce a set of check sums $\mu_j : \mathbb{B}^{n} \to \mathbb{B}$ ($j = 1, \dots, r$) and define the Hamming code by the condition that all the check sums are equal to 0:
\[ \mu_j(x) = \sum_{\alpha:\ \alpha_j = 1} x_\alpha \bmod 2, \qquad H_r = \bigl\{ x \in \mathbb{B}^{2^{r}-1} : \mu_1(x) = \cdots = \mu_r(x) = 0 \bigr\}. \]
For example, the Hamming code $H_3$ is defined by the system of equations
\begin{align*}
x_{100} + x_{101} + x_{110} + x_{111} &= \mu_1(x) = 0,\\
x_{010} + x_{011} + x_{110} + x_{111} &= \mu_2(x) = 0,\\
x_{001} + x_{011} + x_{101} + x_{111} &= \mu_3(x) = 0,
\end{align*}
where (mod 2) arithmetic is assumed.

We will see that the Hamming code has distance $d(H_r) = 3$ for any $r \ge 2$.
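For small $r$ the Hamming code can be enumerated directly. The sketch below is a brute-force illustration, not an efficient construction: it lists all words of length $n = 2^{r}-1$ whose check sums vanish and computes the minimum distance. It numbers the binary digits of the index $\alpha$ from the least significant bit, which only permutes the check sums relative to the convention in the text; `hamming_code` is a name introduced here.

```python
from itertools import product

def hamming_code(r):
    """All words x of length n = 2^r - 1 (indexed by alpha = 1..n) with zero check sums."""
    n = 2 ** r - 1
    code = []
    for x in product((0, 1), repeat=n):
        # Check sum j adds the bits x_alpha whose index alpha has binary digit j equal to 1.
        sums = [sum(x[alpha - 1] for alpha in range(1, n + 1) if (alpha >> j) & 1) % 2
                for j in range(r)]
        if not any(sums):
            code.append(x)
    return code

H3 = hamming_code(3)
distance = min(sum(a != b for a, b in zip(x, y))
               for i, x in enumerate(H3) for y in H3[i + 1:])
print(len(H3), distance)   # 16 codewords, i.e. type (7, 4), and distance 3
```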
15.3. Linear codes. The set $N = \mathbb{B}^{n}$ can be regarded as the $n$-dimensional linear space over the two-element field $\mathbb{F}_2$. A linear code is a linear subspace $M \subseteq \mathbb{F}_2^{n}$. All the examples given above are of this kind. A linear code of type $(n, m)$ can be defined by a dual basis, i.e., a set of $n - m$ linearly independent linear forms (called check sums) which vanish on $M$. The coefficients of the check sums constitute rows of the check matrix. For example, the check matrix of the Hamming code $H_3$ is
\[ \begin{bmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{bmatrix}. \]

Proposition 15.2. The code distance of a linear code equals the minimum number of distinct columns of the check matrix that are linearly dependent.

Proof. A linear dependency between the columns of the check matrix is a nonzero codeword. If a subset $S \subseteq \{1, \dots, n\}$ of columns is dependent, the corresponding word $x \in M$ has nonzero symbols only at positions $\alpha \in S$ (and vice versa). Thus, if $k$ columns are dependent, then there is $x \in M$, $x \ne 0$, such that $d(x, 0) \le k$. Therefore $d(M) \le k$. Conversely, if $d(x_1, x_2) \le k$ for some $x_1, x_2 \in M$ ($x_1 \ne x_2$), then $x = x_2 - x_1 \in M$, $x \ne 0$, $d(x, 0) \le k$, hence $k$ (or fewer) columns are linearly dependent. $\square$

The columns of the check matrix of the Hamming code $H_r$ correspond to the nonzero elements of $\mathbb{F}_2^{r}$. Any two columns are different, hence they are linearly independent. On the other hand, the sum of the first three columns is 0 (for $r \ge 2$). Therefore the code distance is 3.
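Proposition 15.2 suggests a direct way to read the distance off the check matrix. The sketch below searches for the smallest set of distinct columns whose sum over $\mathbb{F}_2$ is zero (over $\mathbb{F}_2$ a minimal dependent set always has this form). It is an exhaustive search intended only for small matrices, and `min_dependent_columns` is a name made up for this example.

```python
from itertools import combinations

def min_dependent_columns(check_matrix):
    """Smallest number of distinct columns of a binary matrix that sum to zero mod 2."""
    n = len(check_matrix[0])
    columns = [tuple(row[c] for row in check_matrix) for c in range(n)]
    for size in range(1, n + 1):
        for subset in combinations(columns, size):
            if all(sum(bits) % 2 == 0 for bits in zip(*subset)):
                return size
    return None   # all columns are independent; the code is {0}

H3_check = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
parity_check = [[1, 1, 1, 1]]           # single check sum of the (4, 3) parity code
print(min_dependent_columns(H3_check), min_dependent_columns(parity_check))
```

The two values agree with the distances 3 and 2 stated for the Hamming and parity-check codes.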
15.4. Error models for quantum codes. Firstly, we will define a quantum analogue of the transition set $E \subseteq N \times N'$. This is an arbitrary linear subspace $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$, called an error space. There is also an analogue of the set $E(n, k)$. Let us assume that $\mathcal{N} = \mathcal{N}' = \mathcal{B}^{\otimes n}$. For each subset of qubits $A \subseteq \{1, \dots, n\}$, let $\mathcal{E}[A]$ be the set of linear operators that act only on those qubits and do not affect the remaining qubits (such an operator has the form $X[A]$). Then we define the space
\[ \mathcal{E}(n, k) = \sum_{A:\ |A| \le k} \mathcal{E}[A], \]
where we take the sum of linear subspaces: $\sum_j \mathcal{L}_j = \bigl\{ \sum_j X_j : X_j \in \mathcal{L}_j \bigr\}$. In the sequel we will be interested in the possibility of correcting errors from the space $\mathcal{E}(n, k)$.
Next, we will introduce a physical model of quantum errors, which has no classical analogue. Consider a system of $n$ qubits which interact with the environment (characterized by a Hilbert space $\mathcal{F}$). We assume that the interaction is described by the Hamiltonian
\[ H = H_0 + V, \qquad H_0 = I_{\mathcal{B}^{\otimes n}} \otimes Z, \qquad V = \sum_{j=1}^{n} \sum_{\alpha \in \{x, y, z\}} \sigma_j^{\alpha} \otimes B_{j\alpha}. \tag{15.3} \]
Here $\sigma_j^{\alpha} = \sigma^{\alpha}[j]$ denotes the application of the Pauli matrix $\sigma^{\alpha}$ (see (8.4)) to the $j$-th qubit; $Z, B_{j\alpha} \in \mathbf{L}(\mathcal{F})$ are Hermitian operators acting on the environment: $B_{j\alpha}$ describes interaction of the environment with the $j$-th qubit, whereas $Z$ is the Hamiltonian of the environment itself. It is an important assumption that qubits interact with the environment by small groups. The Hamiltonian (15.3) contains only one-qubit terms ($\sigma_j^{\alpha} \otimes B_{j\alpha}$), but we could also include two-qubit ($\sigma_j^{\alpha} \sigma_l^{\beta} \otimes B^{(2)}_{j\alpha l\beta}$) and higher-order terms, up to some constant order.
If the interaction lasts for some time $\tau$, it results in the evolution of the quantum state by the unitary operator
\begin{align*}
U = \exp(-i\tau H) = e^{-i\tau(H_0 + V)}
  &= \lim_{N\to\infty} \Bigl( e^{-i\frac{\tau}{N} H_0} \bigl( 1 - i\tfrac{\tau}{N} V \bigr) \Bigr)^{N} \\
  &= \lim_{N\to\infty} e^{-i\tau H_0} \Bigl( 1 - i\tfrac{\tau}{N} V\bigl(\tfrac{N-1}{N}\tau\bigr) \Bigr) \cdots \Bigl( 1 - i\tfrac{\tau}{N} V\bigl(\tfrac{1}{N}\tau\bigr) \Bigr) \Bigl( 1 - i\tfrac{\tau}{N} V(0) \Bigr),
\end{align*}
where $V(t) = e^{itH_0} V e^{-itH_0}$. We can expand this expression in powers of $V$, replacing sums by integrals in the $N \to \infty$ limit. Thus we obtain the following result:
\begin{align*}
U = \exp(-i\tau H) &= \sum_{k=0}^{\infty} X_k, \qquad X_k \in \mathcal{E}(n, k) \otimes \mathbf{L}(\mathcal{F}), \tag{15.4}\\
X_k &= e^{-i\tau H_0} \left( (-i)^{k} \int\!\cdots\!\int_{0 < t_1 < \cdots < t_k < \tau} V(t_k) \cdots V(t_1)\, dt_1 \cdots dt_k \right),
\end{align*}
where
\[ V(t) = e^{itH_0} V e^{-itH_0} = \sum_{j,\alpha} \sigma_j^{\alpha} \otimes B_{j\alpha}(t), \qquad B_{j\alpha}(t) = e^{itZ} B_{j\alpha} e^{-itZ}. \]

Suppose that the interaction of the qubits with the environment is small,
\[ \|B_{j\alpha}\| \le \frac{\delta}{3\tau}. \tag{15.5} \]
Then we can obtain an upper bound for the norm of each term $X_k$ in (15.4), $\|X_k\| \le \frac{n^{k}}{k!} \delta^{k}$. Therefore $U$ is approximated by an operator $U^{(k)}$ such that
\[ \|U - U^{(k)}\| \le O(\delta^{k+1}), \qquad U^{(k)} \in \mathcal{E}(n, k) \otimes \mathbf{L}(\mathcal{F}). \tag{15.6} \]
Namely, $U^{(k)} = \sum_{l=0}^{k} X_l$ (note that $U^{(k)}$ is not unitary). If errors from the space $\mathcal{E}(n, k)$ are recoverable (assuming that the initial state belongs to $\mathcal{M} \otimes \mathcal{F}$, where $\mathcal{M}$ is a suitable code), then the error-correcting procedure will cancel the effect of $U$ with precision $O(\delta^{k+1})$.
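The norm bound on $X_k$ and the estimate (15.6) are stated without derivation. A short sketch of the missing steps follows, assuming the operator norm, the fact that conjugation by a unitary and tensoring with a Pauli matrix do not change the norm, and that the simplex $0 < t_1 < \cdots < t_k < \tau$ has volume $\tau^{k}/k!$; the final tail bound treats $n$ and $k$ as fixed.

```latex
\begin{align*}
\|V(t)\| &\le \sum_{j=1}^{n} \sum_{\alpha \in \{x,y,z\}} \|\sigma_j^{\alpha} \otimes B_{j\alpha}(t)\|
          = \sum_{j,\alpha} \|B_{j\alpha}\|
          \le 3n \cdot \frac{\delta}{3\tau} = \frac{n\delta}{\tau}, \\
\|X_k\| &\le \int\!\cdots\!\int_{0 < t_1 < \cdots < t_k < \tau}
             \|V(t_k)\| \cdots \|V(t_1)\| \, dt_1 \cdots dt_k
          \le \Bigl( \frac{n\delta}{\tau} \Bigr)^{k} \frac{\tau^{k}}{k!}
          = \frac{n^{k} \delta^{k}}{k!}, \\
\|U - U^{(k)}\| &\le \sum_{l > k} \frac{(n\delta)^{l}}{l!}
          \le \frac{(n\delta)^{k+1}}{(k+1)!} \, e^{n\delta}
          = O(\delta^{k+1}).
\end{align*}
```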
Finally, we will define a quantum version of the model of independent errors. Let us assume that the quantum state of $n$ qubits undergoes the transformation that is described by the physically realizable superoperator $T = T_1^{\otimes n}$, where $\|T_1 - I\|_{\diamond} \le \delta$ (see Section 11.5 for the definition of the superoperator norm $\|\cdot\|_{\diamond}$). Let $T_1 - I = R$; then $T = (I + R)^{\otimes n}$. This is essentially a special case of the model described by formulas (15.3)-(15.4): each qubit interacts with its own piece of environment, which is initially not entangled with the rest of the system and is discarded after the action of the operator $U$. However, the condition $\|T_1 - I\|_{\diamond} \le \delta$ is different from condition (15.5), so we need to consider this model separately.

One can obtain an estimate that is analogous to (15.2) or (15.6). Let us write the decomposition
\[ T = (I + R)^{\otimes n} = \underbrace{\sum_{A:\ |A| \le k} R^{\otimes A} \otimes I^{\otimes(\{1,\dots,n\} \setminus A)}}_{T^{(k)}} + \underbrace{\sum_{A:\ |A| > k} R^{\otimes A} \otimes I^{\otimes(\{1,\dots,n\} \setminus A)}}_{P}. \]
The first term $T^{(k)}$ can be represented as $\sum_p X_p \cdot Y_p^{\dagger}$, where $X_p, Y_p \in \mathcal{E}(n, k)$. So we may write symbolically $T^{(k)} \in \mathcal{E}(n, k) \cdot \mathcal{E}(n, k)^{\dagger}$. Using the properties of the superoperator norm, we estimate the norm of the remaining term,
\[ \|P\|_{\diamond} \le \sum_{j > k} \binom{n}{j} \|R\|_{\diamond}^{j} = O(\delta^{k+1}). \]

The model of independent errors includes two extreme cases. If $T = U \cdot U^{\dagger}$ (where $U$ is unitary, $\|U - I\| \le \delta/2$), the errors are called coherent. In the case where
\[ T = (1 - p)\, I \cdot I + \sum_j p_j\, U_j \cdot U_j^{\dagger}, \qquad U_j^{\dagger} U_j = I, \qquad p = \sum_j p_j \le \delta/2, \]
the errors are called stochastic, indicating that they can be described in terms of probability rather than operator or superoperator norms.
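15.5. Definition of quantum error correction. Following the classical analogy, we would like to say that quantum errors are recoverable if they take distinct codevectors to distinct codevectors. However, the general philosophy of quantum mechanics suggests that we replace "distinct" by "orthogonal".

Definition 15.4. A quantum code (a subspace $\mathcal{M} \subseteq \mathcal{N}$) corrects errors from $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$ if
\[ \forall\, |\xi_1\rangle, |\xi_2\rangle \in \mathcal{M}\ \ \forall\, X, Y \in \mathcal{E} \quad \bigl( \langle \xi_2 | \xi_1 \rangle = 0 \bigr) \Rightarrow \bigl( \langle \xi_2 | Y^{\dagger} X | \xi_1 \rangle = 0 \bigr). \tag{15.7} \]
In the case where $\mathcal{E} = \mathcal{E}(n, k)$, one says that the code corrects $k$ errors.

Definition 15.5. Let $\mathcal{M} \subseteq \mathcal{N}$ and $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$. A physically realizable superoperator $P : \mathbf{L}(\mathcal{N}') \to \mathbf{L}(\mathcal{M})$ is called an error-correcting transformation for the code $\mathcal{M}$ and the error space $\mathcal{E}$ if
\[ \forall\, T \in \mathcal{E} \cdot \mathcal{E}^{\dagger}\ \ \exists\, c = c(T)\ \ \forall\, \rho \in \mathbf{L}(\mathcal{M}) \quad PT\rho = c(T)\rho. \]
Note that if $T$ is trace-preserving, then $c(T) = 1$.

Theorem 15.3. If the code $\mathcal{M}$ corrects errors from $\mathcal{E}$, then an error-correcting transformation exists.

A proof will be given below. The converse assertion is proved in [36].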
Example 15.2. Trivial code of type $(n, m)$: let $\mathcal{M} = \mathcal{B}^{\otimes m} \otimes |0^{n-m}\rangle$ and $\mathcal{E} = \mathcal{E}[m+1, \dots, n]$, i.e., the first $m$ qubits are used for the coding whereas the errors act on the other qubits. Condition (15.7) is clearly satisfied. For the role of error-correcting transformation we can take $P = I_{\mathbf{L}(\mathcal{B}^{\otimes m})} \otimes R$, where $R : X \mapsto (\operatorname{Tr} X)\, |0^{n-m}\rangle\langle 0^{n-m}|$. The transformation $P$ is realized very simply: we discard the last $n - m$ qubits and replace them by new qubits in the state $|0\rangle$. There is, of course, little practical use for such a code. It is interesting, however, that any error-correcting quantum code has, in a certain sense, the same structure as the trivial one (cf. Lemma 15.5 below).

Example 15.3. We examine a quantum analog of the repetition code: $\mathcal{M}_n^{z} = \mathbb{C}\bigl( |0, \dots, 0\rangle, |1, \dots, 1\rangle \bigr)$. It comes with the standard encoding $V_n^{z} : |a\rangle \mapsto |a, \dots, a\rangle$. (The index $z$ in the notation indicates that we copy a qubit relative to the basis which consists of the eigenvectors of $\sigma^{z}$, i.e., the standard basis.) Consider two vectors, $|\xi_1\rangle = |0, \dots, 0\rangle + |1, \dots, 1\rangle$ and $|\xi_2\rangle = |0, \dots, 0\rangle - |1, \dots, 1\rangle$, and two errors $X, Y \in \mathcal{E}(n, 1)$, namely, $X = I$, $Y = \sigma^{z}[j]$ (where $j$ is arbitrary). It is obvious that $Y|\xi_2\rangle = X|\xi_1\rangle = |\xi_1\rangle \ne 0$, which contradicts condition (15.7): we have $\langle \xi_2 | \xi_1 \rangle = 0$, whereas $\langle \xi_2 | Y^{\dagger} X | \xi_1 \rangle = \langle \xi_1 | \xi_1 \rangle \ne 0$. We see that the repetition code of any size does not protect against a one-qubit error.

Thus the existence of nontrivial quantum codes is far from obvious. See formula (15.11) for the simplest example.
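The failure described in Example 15.3 is easy to reproduce numerically. The sketch below uses $n = 3$ (an arbitrary small choice) and real NumPy vectors; `sigma_z_on` is a helper introduced for this illustration, with qubit 1 taken as the leftmost bit of the basis label.

```python
import numpy as np

n = 3
dim = 2 ** n
zero = np.zeros(dim); zero[0] = 1.0           # |0,...,0>
one = np.zeros(dim); one[dim - 1] = 1.0       # |1,...,1>
xi1 = zero + one
xi2 = zero - one

def sigma_z_on(j, n):
    """Matrix of sigma^z acting on qubit j (1-based, qubit 1 = leftmost bit)."""
    signs = [(-1) ** ((index >> (n - j)) & 1) for index in range(2 ** n)]
    return np.diag(np.array(signs, dtype=float))

X = np.eye(dim)
for j in range(1, n + 1):
    Y = sigma_z_on(j, n)
    overlap = xi2 @ xi1                   # <xi2|xi1>
    element = xi2 @ Y.T @ X @ xi1         # <xi2|Y^dagger X|xi1>, Y is real here
    print(j, overlap, element)            # overlap is 0 but the matrix element is 2
```

For every $j$ the two codevectors are orthogonal while the matrix element is nonzero, so condition (15.7) is violated.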
In Definition 15.4 the statement was only about pairs of orthogonal states. However, we can infer a consequence regarding an arbitrary pair of states. Let us fix $X, Y \in \mathcal{E}$ and set $Z = Y^{\dagger} X$. Then
\[ \forall\, |\xi_1\rangle, |\xi_2\rangle \in \mathcal{M} \quad \langle \xi_2 | Z | \xi_1 \rangle = c(Z) \langle \xi_2 | \xi_1 \rangle, \tag{15.8} \]
where $c(Z)$ is some complex number, independent of $|\xi_1\rangle, |\xi_2\rangle$. Indeed, let $|\xi_1\rangle, \dots, |\xi_m\rangle$ be an orthonormal basis for the subspace $\mathcal{M}$. By Definition 15.4, $\langle \xi_j | Z | \xi_k \rangle = 0$ when $j \ne k$. It also follows that $\langle \xi_j | Z | \xi_j \rangle$ does not depend on $j$, since
\[ \langle \xi_j | Z | \xi_j \rangle - \langle \xi_k | Z | \xi_k \rangle = \langle \xi_j - \xi_k | Z | \xi_j + \xi_k \rangle + \langle \xi_k | Z | \xi_j \rangle - \langle \xi_j | Z | \xi_k \rangle = 0. \]
(All three terms on the right-hand side of the equality are equal to zero, since the vectors that enter into them are orthogonal.)

Remark 15.3. To understand the meaning of condition (15.8), let us put it into this form:
\[ \forall\, |\xi\rangle \in \mathcal{M} \quad \Pi_{\mathcal{M}} Z |\xi\rangle = c(Z) |\xi\rangle. \tag{15.9} \]
We may replace the projector $\Pi_{\mathcal{M}}$ by a measurement that distinguishes "good" states ($|\psi\rangle \in \mathcal{M}$) from "bad" states ($|\psi\rangle \perp \mathcal{M}$). Loosely speaking, formula (15.9) says that if the codevector $|\xi\rangle$ is acted upon by $Z$ and still accepted as "good" (which happens with probability $|c(Z)|^{2}$), it remains intact. Thus any possible damage to the codevector caused by the error $Z$ is being detected.

We summarize the above discussion as follows.
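Definition 15.6. A quantum code $\mathcal{M} \subseteq \mathcal{N}$ detects an error^14 $Z \in \mathbf{L}(\mathcal{N})$ if there exists some $c = c(Z) \in \mathbb{C}$ such that
\[ \forall\, |\xi_1\rangle, |\xi_2\rangle \in \mathcal{M} \quad \langle \xi_2 | Z | \xi_1 \rangle = c(Z) \langle \xi_2 | \xi_1 \rangle. \]
The code distance is the smallest number $d = d(\mathcal{M})$ for which the code does not detect errors from the space $\mathcal{E}(n, d)$.

Proposition 15.4. A code $\mathcal{M} \subseteq \mathcal{N}$ corrects errors from $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$ if and only if it detects errors from the space
\[ \mathcal{E}^{\dagger} \mathcal{E} = \Bigl\{ \sum_p Y_p^{\dagger} X_p : X_p, Y_p \in \mathcal{E} \Bigr\}. \]
In particular, a code $\mathcal{M} \subseteq \mathcal{B}^{\otimes n}$ corrects $k$ errors if and only if $d(\mathcal{M}) > 2k$.

The first part of the proposition has been already proved. The second part follows from the fact that $\mathcal{E}(n, k)^{\dagger} \mathcal{E}(n, k) = \mathcal{E}(n, 2k)$.

We now proceed to the proof of Theorem 15.3.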
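Lemma 15.5. Let a quantum code $\mathcal{M} \subseteq \mathcal{N}$ correct errors from a subspace $\mathcal{E} \subseteq \mathbf{L}(\mathcal{N}, \mathcal{N}')$. Then there exist a Hilbert space $\mathcal{F}$, an isometric embedding $V : \mathcal{M} \otimes \mathcal{F} \to \mathcal{N}'$ and a linear map $f : \mathcal{E} \to \mathcal{F}$ such that
\[ \forall\, X \in \mathcal{E}\ \ \forall\, |\xi\rangle \in \mathcal{M} \quad X |\xi\rangle = V \bigl( |\xi\rangle \otimes |f(X)\rangle \bigr). \tag{15.10} \]

Proof. Let $\mathcal{E}_0 = \bigl\{ X \in \mathcal{E} : \forall\, |\xi\rangle \in \mathcal{M}\ \ X|\xi\rangle = 0 \bigr\}$. Consider the quotient space $\mathcal{F} = \mathcal{E} / \mathcal{E}_0$ together with the natural map $f : \mathcal{E} \to \mathcal{F}$. The very definition of $\mathcal{F}$ and $f$ implies the existence of a linear map $V : \mathcal{M} \otimes \mathcal{F} \to \mathcal{N}'$ that satisfies (15.10). It is only necessary to check that $V$ is an isometry.

An inner product on the space $\mathcal{F}$ can be defined with the aid of the function $c$ from property (15.8) of the code: if $|\eta_1\rangle = |f(X)\rangle$ and $|\eta_2\rangle = |f(Y)\rangle$, then $\langle \eta_2 | \eta_1 \rangle = c(Y^{\dagger} X)$. It is clear that this quantity depends only on $|\eta_1\rangle$ and $|\eta_2\rangle$ rather than on the particular choice of $X$ and $Y$. It is also clear that $\langle \eta | \eta \rangle > 0$ if $|\eta\rangle \ne 0$. Formula (15.8) shows at once that the map $V$ is an isometry. $\square$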
Proof of Theorem 15.3. We decompose the space $\mathcal{N}'$ into the sum of two orthogonal subspaces: $\mathcal{N}' = (\operatorname{Im} V) \oplus \mathcal{K}$, where $V$ is the map of the preceding lemma. Let $W : \mathcal{K} \to \mathcal{N}'$ be the inclusion map, and $R : \mathbf{L}(\mathcal{K}) \to \mathbf{L}(\mathcal{M})$ an arbitrary physically realizable superoperator. Then we define
\[ P : \rho \mapsto \operatorname{Tr}_{\mathcal{F}} (V^{\dagger} \rho V) + R(W^{\dagger} \rho W), \qquad c : X \cdot Y^{\dagger} \mapsto \langle f(Y) | f(X) \rangle. \]
The function $c$ extends to the space $\mathcal{E} \cdot \mathcal{E}^{\dagger}$ by linearity. $\square$

^14 Admittedly, this terminology may be confusing. For example, the identity operator is "detected" by any code. As explained above, it is more adequate to say that the code detects the error-induced damage, if any. But this sounds too awkward.
Lemma 15.5 and the proof of Theorem 15.3 can be explained in the following way. An error-correcting code is characterized by the property that the error does not mix with the encoded state, i.e., it remains in the form of a separate tensor factor $|\eta\rangle = |f(X)\rangle \in \mathcal{F}$. (Using the terminology of Remark 15.1, we may say that the original state $|\xi\rangle \in \mathcal{M}$ gets encoded with the one-to-many encoding $V$.) The correcting transformation extracts the "built-in" error $|\eta\rangle$ and deposits it in the trash bin.

[1!] Problem 15.1. Let the code $\mathcal{M} \subseteq \mathcal{B}^{\otimes n}$ detect errors from $\mathcal{E}[A]$. Prove that the state $\rho \in \mathbf{L}(\mathcal{M})$ can be restored without using qubits from the set $A$.
15.6. Shor's code. Following Shor [64], we define a series of quantum codes with arbitrarily large distance. The $r$-th member of this series encodes one logical qubit into $r^{2}$ physical qubits; the distance of this code equals $r$.

The idea is to fix the repetition code $\mathcal{M}_n^{z}$ defined above. We have seen that it fails to correct a one-qubit error of the form $\sigma_j^{z} = \sigma^{z}[j]$. Operators generated by $\sigma_1^{z}, \dots, \sigma_n^{z}$ (i.e., ones that preserve the basis vectors up to a phase) are called phase errors. Still, the repetition code protects against classical errors: operators of the form $\sigma_j^{x}$ and their products (as well as linear combinations of such products). Specifically, the $n$-qubit repetition code $\mathcal{M}_n^{z}$ corrects classical errors that affect at most $\lfloor (n-1)/2 \rfloor$ qubits. However, phase errors and classical errors are conjugate to each other, since $\sigma^{x} = H \sigma^{z} H$. Therefore the "dual repetition code" $\mathcal{M}_n^{x}$, defined by the encoding
\begin{align*}
V_n^{x} &: |\eta_a\rangle \mapsto |\eta_a\rangle \otimes \cdots \otimes |\eta_a\rangle, \qquad |\eta_a\rangle = H|a\rangle, \\
 &\phantom{:}\ |a\rangle \mapsto 2^{-(n-1)/2} \sum_{\substack{y_1, \dots, y_n \\ y_1 + \cdots + y_n \equiv a \ (\mathrm{mod}\ 2)}} |y_1, \dots, y_n\rangle \qquad (a = 0, 1),
\end{align*}
will protect against phase errors (but not classical errors).
We may try to combine the two codes: first we encode the logical qubit with $V_r^{x}$; then we encode each of the resulting qubits with $V_r^{z}$. (Such a composition of two codes is called a concatenated code.) Thus we obtain the encoding $V = (V_r^{z})^{\otimes r} V_r^{x} : \mathcal{B} \to \mathcal{B}^{\otimes r^{2}}$. Inasmuch as the number of physical qubits is an exact square, it is convenient to write basis vectors of the corresponding space as matrices, e.g.,
\[ \left| \begin{matrix} x_{11} & \cdots & x_{1r} \\ \vdots & & \vdots \\ x_{r1} & \cdots & x_{rr} \end{matrix} \right\rangle. \]
In this notation the encoding $V$ assumes the form $|a\rangle \mapsto |\xi_a\rangle$, where
\[ |\xi_a\rangle = 2^{-(r-1)/2} \sum_{\substack{y_1, \dots, y_r \in \mathbb{F}_2 \\ y_1 + \cdots + y_r = a}} \left| \begin{matrix} y_1 & \cdots & y_1 \\ y_2 & \cdots & y_2 \\ \vdots & & \vdots \\ y_r & \cdots & y_r \end{matrix} \right\rangle \qquad (a = 0, 1). \tag{15.11} \]
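To analyze the Shor code, we will decompose errors over a basis of operators built on Pauli matrices. More precisely, the basis of the space $\mathbf{L}(\mathcal{B})$ consists of the identity operator $I$ and the three Pauli matrices. We introduce nonstandard notation for these matrices:
\begin{align*}
\sigma_{00} &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I, &
\sigma_{01} &= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \sigma^{z}, \\
\sigma_{10} &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \sigma^{x}, &
\sigma_{11} &= \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \sigma^{y}.
\end{align*}
These operators are remarkable in that they are unitary and Hermitian at the same time. The indexing we have introduced allows one to conveniently express the multiplication rules and commutation relations between the basis operators:
\[ \sigma_{\alpha\beta}\, \sigma_{\alpha'\beta'} = i^{\,\tilde\omega(\alpha, \beta;\, \alpha', \beta')}\, \sigma_{\alpha \oplus \alpha',\, \beta \oplus \beta'}, \qquad \sigma_{\alpha\beta}\, \sigma_{\alpha'\beta'} = (-1)^{\alpha\beta' - \alpha'\beta}\, \sigma_{\alpha'\beta'}\, \sigma_{\alpha\beta}, \]
where $\tilde\omega(\alpha, \beta; \alpha', \beta') \in \mathbb{Z}_4$ (see (15.16) below for an explicit formula). The set of indices forms the Abelian group $G = \mathbb{Z}_2 \oplus \mathbb{Z}_2$, which can also be considered as a 2-dimensional linear space over the field $\mathbb{F}_2$.

The basis for $\mathbf{L}(\mathcal{B}^{\otimes n})$ consists of $4^{n}$ operators,
\[ \sigma(f) = \sigma(\alpha_1, \beta_1, \alpha_2, \beta_2, \dots, \alpha_n, \beta_n) \stackrel{\mathrm{def}}{=} \sigma_{\alpha_1 \beta_1} \otimes \sigma_{\alpha_2 \beta_2} \otimes \cdots \otimes \sigma_{\alpha_n \beta_n}. \tag{15.12} \]
Here $f \in G^{n} = \mathbb{F}_2^{2n}$.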
We now examine the error-correcting properties of the Shor code. Our goal is to prove that its distance is at least $r$, i.e., the code detects $r - 1$ errors (in fact, the distance is precisely $r$). By the linearity of the definition it suffices to study errors of the form $\sigma(f)$. Such an error can be decomposed into a classical component and a phase component,
\[ \sigma(f) = c\, \sigma(f^{(x)})\, \sigma(f^{(z)}), \qquad f^{(x)} = (\alpha_1, 0, \alpha_2, 0, \dots), \qquad f^{(z)} = (0, \beta_1, 0, \beta_2, \dots), \]
where $c$ is a phase factor. Since we assume that $|f| < r$ (where $|f|$ denotes the number of nonzero pairs $(\alpha_j, \beta_j)$), we have $|f^{(x)}|, |f^{(z)}| < r$. It suffices to show that for $Z = \sigma(f)$
\[ \langle \xi_1 | Z | \xi_0 \rangle = 0, \qquad \langle \xi_1 | Z | \xi_1 \rangle = \langle \xi_0 | Z | \xi_0 \rangle. \tag{15.13} \]
Let us consider two cases.

1. $f^{(x)} \ne 0$. The error $Z = c\, \sigma(f^{(x)})\, \sigma(f^{(z)})$ takes a basis vector to a basis vector (up to a phase factor). It flips $s = |f^{(x)}|$ bits; in our case $0 < s < r$. The codevectors of the Shor code are linear combinations of special basis vectors: all bits in each row are equal. Flipping $s$ bits breaks this special form; therefore $\langle \xi_a | Z | \xi_b \rangle = 0$.

2. $f^{(x)} = 0$. The error $Z = \sigma(f^{(z)}) = \prod_{j,k} (\sigma_{jk}^{z})^{\beta_{jk}}$ multiplies each basis vector by $\pm 1$. The special basis vectors in (15.11) are transformed as follows:
\[ Z \left| \begin{matrix} y_1 & \cdots & y_1 \\ y_2 & \cdots & y_2 \\ \vdots & & \vdots \\ y_r & \cdots & y_r \end{matrix} \right\rangle = (-1)^{\sum_j \lambda_j y_j} \left| \begin{matrix} y_1 & \cdots & y_1 \\ y_2 & \cdots & y_2 \\ \vdots & & \vdots \\ y_r & \cdots & y_r \end{matrix} \right\rangle, \]
where $\lambda_j = \sum_k \beta_{jk} \in \mathbb{F}_2$. Let $|\lambda|$ denote the number of nonzero components in the vector $\lambda = (\lambda_1, \dots, \lambda_r) \in \mathbb{F}_2^{r}$. Then $|\lambda| \le |f^{(z)}| < r$. There are three possibilities:

a) $\lambda = (0, \dots, 0)$. In this case $Z|\xi_b\rangle = |\xi_b\rangle$, i.e., the error does not affect codevectors. Therefore $\langle \xi_a | Z | \xi_b \rangle = \delta_{ab}$.

b) $\lambda = (1, \dots, 1)$. This is actually impossible since $|\lambda| < r$.

c) $\lambda \ne (0, \dots, 0), (1, \dots, 1)$. Then
\[ \langle \xi_a | Z | \xi_b \rangle = \Biggl( 2^{-(r-1)} \sum_{\substack{y_1, \dots, y_r \in \mathbb{F}_2 \\ y_1 + \cdots + y_r = a}} (-1)^{\sum_j \lambda_j y_j} \Biggr) \delta_{ab} = 0. \]
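The case analysis above can be double-checked by exhaustive search for $r = 3$ (the nine-qubit code). The sketch below builds $|\xi_0\rangle, |\xi_1\rangle$ from (15.11) and tests condition (15.13), together with the mirrored matrix element, for every $\sigma(f)$ with $|f| \le r - 1$. The phase factor $c$ is dropped, which does not affect the conditions; all function names and the bit-mask representation of $f^{(x)}$ and $f^{(z)}$ are choices made for this example.

```python
import numpy as np
from itertools import combinations, product

r = 3
n = r * r

def shor_codevector(a):
    """|xi_a> from (15.11): rows (y_j, ..., y_j) with y_1 + ... + y_r = a mod 2."""
    vec = np.zeros(2 ** n)
    for y in product((0, 1), repeat=r):
        if sum(y) % 2 == a:
            bits = [y[j] for j in range(r) for _ in range(r)]
            vec[int("".join(map(str, bits)), 2)] = 1.0
    return vec * 2 ** (-(r - 1) / 2)

def apply_pauli(x_mask, z_mask, vec):
    """Apply sigma(f^(x)) sigma(f^(z)), global phase dropped, to a state vector."""
    indices = np.arange(vec.size)
    parity = np.array([bin(i & z_mask).count("1") % 2 for i in indices])
    out = np.zeros_like(vec)
    out[indices ^ x_mask] = (1 - 2 * parity) * vec   # sign from z part, then bit flips
    return out

def detected(x_mask, z_mask, xi0, xi1):
    """Condition (15.13) plus the mirrored off-diagonal element."""
    z0 = apply_pauli(x_mask, z_mask, xi0)
    z1 = apply_pauli(x_mask, z_mask, xi1)
    return bool(np.isclose(xi1 @ z0, 0) and np.isclose(xi0 @ z1, 0)
                and np.isclose(xi1 @ z1, xi0 @ z0))

xi0, xi1 = shor_codevector(0), shor_codevector(1)
undetected = 0
for weight in range(r):                               # |f| = 0, 1, ..., r - 1
    for qubits in combinations(range(n), weight):
        for paulis in product(((1, 0), (0, 1), (1, 1)), repeat=weight):
            x_mask = sum(a << q for q, (a, _) in zip(qubits, paulis))
            z_mask = sum(b << q for q, (_, b) in zip(qubits, paulis))
            undetected += not detected(x_mask, z_mask, xi0, xi1)
print(undetected)                                     # no undetected error of weight < r
print(detected(0, 0b001001001, xi0, xi1))             # one sigma^z per row: not detected
```

The last line applies $\sigma^{z}$ to one qubit in each row, i.e., $\lambda = (1, \dots, 1)$, which is the weight-$r$ error excluded in case b).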
15.7. The Pauli operators and symplectic transformations. The construction of the Shor code uses the symmetry between $\sigma^{x}$ and $\sigma^{z}$. We will now study symmetries between $\sigma$-operators in more detail.

As already mentioned, the Pauli matrices are conveniently indexed by elements of the group $G = (\mathbb{Z}_2)^{2}$. General $\sigma$-operators (see (15.12)) are indexed by $\gamma = (\alpha_1, \beta_1, \dots, \alpha_n, \beta_n) \in G^{n}$. The $\sigma$-operators form a basis in $\mathbf{L}(\mathcal{B}^{\otimes n})$. Moreover, $\mathbf{L}(\mathcal{B}^{\otimes n})$ becomes a $G^{n}$-graded algebra,
\[ \mathbf{L}(\mathcal{B}^{\otimes n}) = \bigoplus_{\gamma \in G^{n}} \mathbb{C}\bigl( \sigma(\gamma) \bigr); \]
the operation $\dagger : X \mapsto X^{\dagger}$ preserves the grading.

The commutation rules for the $\sigma$-operators are as follows:
\begin{gather*}
\sigma(\gamma_1)\, \sigma(\gamma_2) = (-1)^{\omega(\gamma_1, \gamma_2)}\, \sigma(\gamma_2)\, \sigma(\gamma_1), \tag{15.14}\\
\omega(\alpha_1, \beta_1, \dots, \alpha_n, \beta_n;\ \alpha'_1, \beta'_1, \dots, \alpha'_n, \beta'_n) = \sum_{j=1}^{n} (\alpha_j \beta'_j - \alpha'_j \beta_j) \bmod 2.
\end{gather*}
(15.15) �(
1
)�(
2
) = i
e!(
1
;
2
)
�(
1
+
2
); e! : G
n
�G
n
! Z
4
:
164 2. Quantum Computation
Obviously, !(
1
;
2
) = e!(
1
;
2
) mod 2.
To obtain an expli it formula for the one-qubit version of the fun tion
e!, we use the equation �
��
= i
��
�
�0
�
0�
. We express �
��
and �
�
0
�
0
this
way, and ommute �
0�
through �
�
0
0
. The result is as follows:
(15.16) e!(�; �;�
0
; �
0
) = �� + �
0
�
0
� (� � �
0
)(� � �
0
) + 2�
0
� mod 4:
Note that the inner sums in (15.16) are taken modulo 2, whereas the outer
sum is taken modulo 4. (For this reason, one annot expand the produ t
(� � �
0
)(� � �
0
) and an el the terms �� and �
0
�
0
.) Su h a mixture of
Z
2
-operations and Z
4
-operations is rather onfusing, so we prefer to write
the formula (15.16) in a di�erent form:
e!(�; �;�
0
; �
0
) = �
2
�
2
+ (�
0
)
2
(�
0
)
2
� (�+ �
0
)
2
(� + �
0
)
2
+ 2�
0
�:
In this ase, the value 0 2 Z
2
an be represented by either 0 2 Z
4
or 2 2 Z
4
,
whereas 1 an be represented by either 1 or 3 | the result will be the same.
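Formula (15.16) and its squared variant can be checked against explicit $2 \times 2$ matrices. The sketch below runs over all sixteen index combinations; the dictionary `SIGMA` and the two function names are introduced for this example only, and the integers 0 and 1 are used as representatives of $\mathbb{Z}_2$.

```python
import numpy as np
from itertools import product

SIGMA = {
    (0, 0): np.eye(2, dtype=complex),                     # I
    (0, 1): np.array([[1, 0], [0, -1]], dtype=complex),   # sigma^z
    (1, 0): np.array([[0, 1], [1, 0]], dtype=complex),    # sigma^x
    (1, 1): np.array([[0, -1j], [1j, 0]], dtype=complex), # sigma^y
}

def omega_tilde(a, b, a2, b2):
    """Formula (15.16): XOR for the inner sums mod 2, outer sum mod 4."""
    return (a * b + a2 * b2 - (a ^ a2) * (b ^ b2) + 2 * a2 * b) % 4

def omega_tilde_squares(a, b, a2, b2):
    """The variant with squares, using ordinary integer addition throughout."""
    return (a**2 * b**2 + a2**2 * b2**2 - (a + a2)**2 * (b + b2)**2 + 2 * a2 * b) % 4

mismatches = 0
for a, b, a2, b2 in product((0, 1), repeat=4):
    lhs = SIGMA[(a, b)] @ SIGMA[(a2, b2)]
    rhs = 1j ** omega_tilde(a, b, a2, b2) * SIGMA[(a ^ a2, b ^ b2)]
    if not np.allclose(lhs, rhs):
        mismatches += 1
    if omega_tilde(a, b, a2, b2) != omega_tilde_squares(a, b, a2, b2):
        mismatches += 1
print(mismatches)   # 0: both forms reproduce the multiplication table
```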
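Finally, we write down the general formula for the function $\tilde\omega$:
\begin{gather*}
\tilde\omega(\gamma, \gamma') = \tau(\gamma) + \tau(\gamma') - \tau(\gamma + \gamma') + 2\varkappa(\gamma, \gamma'), \tag{15.17}\\
\tau(\alpha_1, \beta_1, \dots, \alpha_n, \beta_n) = \sum_{j=1}^{n} \alpha_j^{2} \beta_j^{2} \in \mathbb{Z}_4, \\
\varkappa(\alpha_1, \beta_1, \dots, \alpha_n, \beta_n;\ \alpha'_1, \beta'_1, \dots, \alpha'_n, \beta'_n) = \sum_{j=1}^{n} \alpha'_j \beta_j \in \mathbb{Z}_2.
\end{gather*}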
A unitary transformation is a superoperator of the form $T = U \cdot U^{\dagger} : X \mapsto U X U^{\dagger}$. It is clear that $U$ can be reconstructed from $T$ up to a phase factor, so the group of unitary transformations on $n$ qubits is $\mathbf{U}(\mathcal{B}^{\otimes n}) / \mathbf{U}(1)$. We note that the unitary transformations are precisely the automorphisms of the $*$-algebra $\mathbf{L}(\mathcal{B}^{\otimes n})$ (i.e., the linear maps $\mathbf{L}(\mathcal{B}^{\otimes n}) \to \mathbf{L}(\mathcal{B}^{\otimes n})$ that commute with the operator multiplication and the operation $\dagger$).

We are interested in those transformations which preserve the grading of $\mathbf{L}(\mathcal{B}^{\otimes n})$ by the $\sigma$-operators, i.e., $U \sigma(\gamma) U^{\dagger} = c(\gamma)\, \sigma(u(\gamma))$, where $u(\gamma) \in G^{n}$ and $c(\gamma)$ is a phase factor. Since both $U \sigma(\gamma) U^{\dagger}$ and $\sigma(u(\gamma))$ are Hermitian, $c(\gamma) = \pm 1$. Thus we may write
\[ U \sigma(\gamma) U^{\dagger} = (-1)^{v(\gamma)}\, \sigma(u(\gamma)), \qquad u : G^{n} \to G^{n}, \qquad v : G^{n} \to \mathbb{Z}_2. \tag{15.18} \]
The group of such transformations is called the extended symplectic group and is denoted by $\mathrm{ESp}_2(n)$. The operators in this group will be called symplectic. We give some examples.

1. $\sigma$-operators: $\sigma(f)\, \sigma(\gamma)\, \sigma(f)^{\dagger} = (-1)^{\omega(f, \gamma)}\, \sigma(\gamma)$. In the case at hand, $u(\gamma) = \gamma$.
2. The operators $H$ and $K$ from the standard basis. We immediately verify that
\begin{align*}
H \sigma^{x} H^{\dagger} &= \sigma^{z}, & H \sigma^{y} H^{\dagger} &= -\sigma^{y}, & H \sigma^{z} H^{\dagger} &= \sigma^{x}, \\
K \sigma^{x} K^{\dagger} &= \sigma^{y}, & K \sigma^{y} K^{\dagger} &= -\sigma^{x}, & K \sigma^{z} K^{\dagger} &= \sigma^{z}.
\end{align*}
Therefore the transformations $H \cdot H^{\dagger}$ and $K \cdot K^{\dagger}$ belong to $\mathrm{ESp}_2(1)$. It is easy to see that $\mathrm{ESp}_2(1) \subset \mathbf{U}(2)/\mathbf{U}(1) \cong \mathbf{SO}(3)$ is the group of rotational symmetries of a cube; $|\mathrm{ESp}_2(1)| = 24$. This group is generated by the above transformations. The operators $H$ and $K$ themselves generate the Clifford group of $24 \cdot 8 = 192$ elements. Thus $\mathrm{ESp}_2(1)$ is the quotient of the Clifford group by the subgroup $\bigl\{ i^{k/2} I_{\mathcal{B}} : k = 0, \dots, 7 \bigr\} \cong \mathbb{Z}_8$.

3. The controlled NOT: the operator $U = \Lambda(\sigma^{x})[1, 2]$. By definition we have $U|a, b\rangle = |a, a \oplus b\rangle$. The action of $U$ on generators of the algebra $\mathbf{L}(\mathcal{B}^{\otimes 2})$ is as follows:
\begin{align*}
U \sigma_1^{z} U^{\dagger} &= \sigma_1^{z}, & U \sigma_1^{x} U^{\dagger} &= \sigma_1^{x} \sigma_2^{x}, \\
U \sigma_2^{z} U^{\dagger} &= \sigma_1^{z} \sigma_2^{z}, & U \sigma_2^{x} U^{\dagger} &= \sigma_2^{x}.
\end{align*}
These equations can be verified without difficulty by direct calculation. However, it is useful to bring an explanation to the fore. The operator $\sigma_j^{z}$ is a phase shift that depends on the value of the corresponding qubit. The equations in the left column show how these values change under the action of $U$. The first equation in the right column indicates that flipping the first qubit before applying $U$ has the same effect as flipping both qubits after the action of $U$. The last equation indicates that flipping the second qubit commutes with $U$.
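The conjugation relations in examples 2 and 3 can be verified by direct matrix multiplication. The sketch below assumes the usual matrices for the standard basis, $H = \frac{1}{\sqrt2}\begin{pmatrix}1&1\\1&-1\end{pmatrix}$ and $K = \mathrm{diag}(1, i)$, and takes the first qubit as the control of the controlled NOT; the helper `conj` and the dictionary of checks are illustrative names.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
K = np.array([[1, 0], [0, 1j]], dtype=complex)            # assumed form of K
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)            # |a,b> -> |a, a xor b>

def conj(U, A):
    """The unitary transformation A -> U A U^dagger."""
    return U @ A @ U.conj().T

checks = {
    "H x": (conj(H, SX), SZ), "H y": (conj(H, SY), -SY), "H z": (conj(H, SZ), SX),
    "K x": (conj(K, SX), SY), "K y": (conj(K, SY), -SX), "K z": (conj(K, SZ), SZ),
    "U z1": (conj(CNOT, np.kron(SZ, I2)), np.kron(SZ, I2)),
    "U x1": (conj(CNOT, np.kron(SX, I2)), np.kron(SX, SX)),
    "U z2": (conj(CNOT, np.kron(I2, SZ)), np.kron(SZ, SZ)),
    "U x2": (conj(CNOT, np.kron(I2, SX)), np.kron(I2, SX)),
}
results = {name: bool(np.allclose(lhs, rhs)) for name, (lhs, rhs) in checks.items()}
print(results)
```

Every entry of `results` is `True`, so each of the three operators maps $\sigma$-operators to $\sigma$-operators up to sign, as required by (15.18).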
Let $T = U\cdot U^\dagger$ be an arbitrary symplectic transformation. The associated function $u : G_n \to G_n$ (see (15.18)) has the following properties:
1. $u$ is linear.
2. $u$ preserves the form $\omega$, i.e., $\omega(u(f), u(g)) = \omega(f, g)$.
Maps with such properties are called symplectic; they form the symplectic group $\mathrm{Sp}_2(n)$. It is clear that the correspondence $\theta : T \mapsto u$, $\theta : \mathrm{ESp}_2(n) \to \mathrm{Sp}_2(n)$, is a homomorphism of groups.
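As an illustration of properties 1 and 2, the map $u$ associated with the controlled NOT can be read off from the relations of example 3 and tested exhaustively on $G_2$. The sketch below assumes the coordinates $f = (\alpha_1,\beta_1,\alpha_2,\beta_2)$ with $\alpha_j$ marking $\sigma^x_j$ and $\beta_j$ marking $\sigma^z_j$, and the form $\omega(f,g) = \sum_j (\alpha_j\beta'_j + \beta_j\alpha'_j) \bmod 2$; both conventions are assumptions made for this example.

```python
from itertools import product

def omega(f, g):
    """Assumed symplectic form on G_n; f, g are bit tuples (a1, b1, ..., an, bn)."""
    return sum(f[2 * j] * g[2 * j + 1] + f[2 * j + 1] * g[2 * j]
               for j in range(len(f) // 2)) % 2

def u_cnot(f):
    """Map u for the controlled NOT on qubits 1, 2:
    X1 -> X1 X2, Z1 -> Z1, X2 -> X2, Z2 -> Z1 Z2."""
    a1, b1, a2, b2 = f
    return (a1, (b1 + b2) % 2, (a1 + a2) % 2, b2)

def add(f, g):
    return tuple((x + y) % 2 for x, y in zip(f, g))

G2 = list(product((0, 1), repeat=4))

# Property 1: linearity; property 2: preservation of omega.
is_linear = all(u_cnot(add(f, g)) == add(u_cnot(f), u_cnot(g)) for f in G2 for g in G2)
preserves_omega = all(omega(u_cnot(f), u_cnot(g)) == omega(f, g) for f in G2 for g in G2)
```

Both flags are computed over all 256 pairs of vectors, so the check is exhaustive for $n = 2$.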
Theorem 15.6. $\operatorname{Im}\theta = \mathrm{Sp}_2(n)$, $\operatorname{Ker}\theta = G_n$ (the kernel is the set of $\sigma$-operators). Therefore, $\mathrm{ESp}_2(n)/G_n \cong \mathrm{Sp}_2(n)$.
To understand the proof, it helps to know something about cohomology and extensions of groups [57]. For the reader unacquainted with these concepts we have prepared a "roundabout way" (see below).
Proof. The transformation (15.18) must be an automorphism of the $*$-algebra $\mathbf{L}(\mathcal{B}^{\otimes n})$. This is the case if and only if the multiplication rules (15.15) are preserved by the action of $T = U\cdot U^\dagger$ (the operation of taking the Hermitian adjoint commutes with $T$ automatically). This means that the function $u$ has the properties indicated, and $v$ satisfies the equation
\[ v(x+y) - v(x) - v(y) = w(x,y), \tag{15.19} \]
where $w(x,y) = \dfrac{\tilde\omega(u(x),u(y)) - \tilde\omega(x,y)}{2} \in \mathbb{Z}_2$.
In the case where $u$ is the identity map, the right-hand side of (15.19) equals zero. The solutions are all linear functions; this proves that $\operatorname{Ker}\theta = G_n$.
The assertion that $\operatorname{Im}\theta = \mathrm{Sp}_2(n)$ is equivalent to equation (15.19) having a solution for any $u \in \mathrm{Sp}_2(n)$. To prove the existence of the solution, we note that the function $w$ has the following properties:
\[ w(y,z) - w(x+y,z) + w(x,y+z) - w(x,y) = 0, \tag{15.20} \]
\[ w(x,y) = w(y,x), \tag{15.21} \]
\[ w(x,x) = 0. \tag{15.22} \]
Formula (15.20) is the cocycle equation. It indicates that the function $w$ yields a group structure on the Cartesian product $G_n \times \mathbb{Z}_2$ with the multiplication rule $(x,p)\cdot(y,q) = (x+y,\ p+q+w(x,y))$. The group we obtain (we denote it by $E$) is an extension of $G_n$ with $\mathbb{Z}_2$, i.e., there is a homomorphism $\lambda : E \to G_n$ with kernel $\mathbb{Z}_2$; the homomorphism is defined by $\lambda : (x,p) \mapsto x$. Equation (15.21) indicates that the group $E$ is Abelian. Finally, (15.22) means that each element of the group $E$ has order 2 or 1. Consequently, $E \cong (\mathbb{Z}_2)^{2n+1}$.
It follows that the extension $E \to G_n$ is trivial: there exists a homomorphism $\mu : G_n \to E$ such that $\lambda\mu = \mathrm{id}_{G_n}$. Writing this homomorphism in the form $\mu : x \mapsto (x, v(x))$, we obtain a solution to equation (15.19). $\square$
There is another, somewhat ad hoc way of proving that $\operatorname{Im}\theta = \mathrm{Sp}_2(n)$. Consider the following symplectic transformations: $(H\cdot H^\dagger)[j]$, $(K\cdot K^\dagger)[j]$ and $(\Lambda(\sigma^x)\cdot\Lambda(\sigma^x)^\dagger)[j,k]$. Their images under the homomorphism $\theta$ generate the whole group $\mathrm{Sp}_2(n)$. Idea of the proof: it is possible, using these transformations, to take an arbitrary pair of vectors $\gamma_1, \gamma_2 \in G_n$ such that $\omega(\gamma_1,\gamma_2) = 1$ into $(1,0,0,0,\dots)$ and $(0,1,0,0,\dots)$. (See the proof of Lemma 15.9 for an implementation of a similar argument.)
In fact, in this way we can obtain another interesting result. The specified elements of the group $\mathrm{ESp}_2(n)$ generate all the $\sigma$-operators, i.e., the kernel of the homomorphism $\theta$. Consequently, the following statement is true:
Proposition 15.7. The group $\mathrm{ESp}_2(n)$ is generated by the elements
\[ (H\cdot H^\dagger)[j], \quad (K\cdot K^\dagger)[j], \quad (\Lambda(\sigma^x)\cdot\Lambda(\sigma^x)^\dagger)[j,k]. \]
15.8. Symplectic (stabilizer) codes. These are analogous to the classical linear codes. The role of check sums is played by the $\sigma$-operators $\sigma(\alpha_1,\beta_1,\dots,\alpha_n,\beta_n)$.
For example, the Shor code can be represented in this way. Recall that this is a two-dimensional subspace $\mathcal{M} \subseteq \mathcal{B}^{\otimes r^2}$ spanned by the vectors (15.11).
What equations do vectors of $\mathcal{M}$ satisfy?
1. For each $j = 1,\dots,r$ and $k = 1,\dots,r-1$ we have $\sigma^z_{jk}\sigma^z_{j(k+1)}|\xi\rangle = |\xi\rangle$. That is, $|\xi\rangle$ is a linear combination of special basis vectors: each row consists of the repetition of a single bit.
2. For each $j = 1,\dots,r-1$ we have $\bigl(\prod_{k=1}^{r} \sigma^x_{jk}\sigma^x_{(j+1)k}\bigr)|\xi\rangle = |\xi\rangle$. What does this rule mean? The operator $\prod_{k=1}^{r} \sigma^x_{jk}\sigma^x_{(j+1)k}$ flips all the bits in the $j$-th and $(j+1)$-th rows,
\[ \left| \begin{matrix} \cdots\cdots\cdots \\ y_j \ \dots\ y_j \\ y_{j+1} \ \dots\ y_{j+1} \\ \cdots\cdots\cdots \end{matrix} \right\rangle \ \mapsto\ \left| \begin{matrix} \cdots\cdots\cdots \\ y_j{+}1 \ \dots\ y_j{+}1 \\ y_{j+1}{+}1 \ \dots\ y_{j+1}{+}1 \\ \cdots\cdots\cdots \end{matrix} \right\rangle \]
(the bits in the other rows do not change). These two basis vectors must enter $|\xi\rangle$ with the same coefficients.
It is clear that if $|\xi\rangle$ satisfies conditions 1 and 2, then $|\xi\rangle = c_0|\xi_0\rangle + c_1|\xi_1\rangle$, where $|\xi_a\rangle$ ($a = 0,1$) are defined by (15.11).
We now give the general definition of symplectic codes, also called stabilizer codes. They were introduced (without name) in [17]. We will first give a noninvariant definition, which is easier to understand.
A symplectic quantum code is a subspace of the form
\[ \mathcal{M} = \bigl\{ |\xi\rangle \in \mathcal{B}^{\otimes n} : \forall j \ \ X_j|\xi\rangle = |\xi\rangle \bigr\}, \quad \text{where} \]
\[ X_j = (-1)^{\mu_j}\sigma(f_j), \qquad f_j \in G_n, \quad \mu_j \in \mathbb{Z}_2. \tag{15.23} \]
The operators $X_j$ must commute with each other. They are called check operators.
The requirement that the check operators commute is equivalent to the condition $\omega(f_j,f_k) = 0$. Indeed, the general commutation relation for operators of the form (15.23) is $X_jX_k = (-1)^{\omega(f_j,f_k)}X_kX_j$. (Note that if $\omega(f_j,f_k) \ne 0$ for some $j$ and $k$, the subspace $\mathcal{M}$ contains only the zero vector; this is why we have excluded this case from consideration.) Without loss of generality we may assume that the $f_j$ are linearly independent.
Note that different choices of check operators may correspond to the same code. In fact, the code depends only on the subspace $F \subseteq G_n$ spanned by the vectors $f_j$, and on the function $\mu : F \to \mathbb{Z}_2$ interpolating the values $\mu(f_j) = \mu_j$ so that the operators $X_\mu(f) = (-1)^{\mu(f)}\sigma(f)$ satisfy the condition
\[ X_\mu(f+g) = X_\mu(f)X_\mu(g) = X_\mu(g)X_\mu(f). \]
Therefore it is preferable to use the following invariant definition.
Definition 15.7. Let $F \subseteq G_n$ be an isotropic subspace, i.e., $\omega(f,g) = 0$ for any $f,g \in F$. Also let a function $\mu : F \to \mathbb{Z}_2$ satisfy the equation
\[ \mu(f+g) - \mu(f) - \mu(g) = \nu(f,g), \quad \text{where } \nu(f,g) = \frac{\tilde\omega(f,g)}{2} \in \mathbb{Z}_2. \tag{15.24} \]
(The function $\nu$ is defined on pairs $(f,g)$ for which $\omega(f,g) = 0$.) Then the corresponding symplectic code is
\[ \mathrm{SympCode}(F,\mu) \stackrel{\mathrm{def}}{=} \bigl\{ |\xi\rangle \in \mathcal{B}^{\otimes n} : \forall f \in F \ \ \sigma(f)|\xi\rangle = (-1)^{\mu(f)}|\xi\rangle \bigr\}. \tag{15.25} \]
Note that the restriction of $\nu$ to the subspace $F$ satisfies equations analogous to (15.20)–(15.22); therefore equation (15.24) has a solution. In fact, there are $2^{\dim F}$ solutions; any two solutions differ by a linear function. We call the corresponding codes congruent.
Theorem 15.8. $\dim(\mathrm{SympCode}(F,\mu)) = 2^{n-\dim F}$. The congruent codes form an orthogonal decomposition of $\mathcal{B}^{\otimes n}$.
Lemma 15.9. By symplectic transformations, an arbitrary symplectic code $\mathrm{SympCode}(F,\mu)$ can be reduced to a trivial one, for which the check operators are $\sigma^z[1],\dots,\sigma^z[s]$ ($s = \dim F$).
Proof. Let $f_1 \in F$ be a nonzero vector. The group $\mathrm{Sp}_2(n)$ acts transitively on nonzero vectors, so there is an element $S_1 \in \mathrm{Sp}_2(n)$ such that $S_1f_1 = (0,1,0,0,\dots) = g_1$. The isotropic space $S_1F$ consists of vectors of the form $(0,\beta_1,\alpha_2,\beta_2,\dots)$. Let $F_1 \subseteq S_1F$ consist of those vectors for which $\beta_1 = 0$. Then $S_1F = \mathbb{F}_2(g_1) \oplus F_1$ and $F_1 \subseteq G_{n-1}$.
Iterating this argument, we find an $S \in \mathrm{Sp}_2(n)$ such that
\[ SF = \bigl\{ (0,\beta_1,0,\beta_2,\dots,0,\beta_s,0,0,\dots) : \beta_1,\dots,\beta_s \in \mathbb{F}_2 \bigr\}. \]
By Theorem 15.6, $S$ corresponds to some symplectic transformation $U\cdot U^\dagger$. It takes the code $\mathrm{SympCode}(F,\mu)$ to a symplectic code given by the check operators $\pm\sigma^z[j]$ ($j = 1,\dots,s$). All the signs can be changed to "+" by applying a supplementary transformation of the form $\sigma(f)\cdot\sigma(f)^\dagger$. $\square$
We now examine whether a symplectic code $\mathrm{SympCode}(F,\mu)$ is capable of detecting $k$-qubit errors. Recall that the property of a code to detect an error $Z$ is given by formula (15.9). By linearity, it is sufficient to consider errors of the form $Z = \sigma(g)$, $|g| \le k$.
Let $|\xi\rangle \in \mathrm{SympCode}(F,\mu)$, i.e., $\sigma(f)|\xi\rangle = (-1)^{\mu(f)}|\xi\rangle$ for any $f \in F$. We denote $|\psi\rangle = \sigma(g)|\xi\rangle$. As we will now show, the vector $|\psi\rangle$ belongs to one of the congruent codes $\mathrm{SympCode}(F,\mu')$. Indeed,
\[ \sigma(f)|\psi\rangle = \sigma(f)\sigma(g)|\xi\rangle = (-1)^{\omega(f,g)}\sigma(g)\sigma(f)|\xi\rangle = (-1)^{\omega(f,g)+\mu(f)}\sigma(g)|\xi\rangle = (-1)^{\mu'(f)}|\psi\rangle, \]
where $\mu'(f) = \mu(f) + \omega(f,g)$. We note that $\mu' = \mu$ if and only if $g \in F_+$, where
\[ F_+ = \bigl\{ g \in G_n : \forall f \in F \ \ \omega(f,g) = 0 \bigr\}. \]
Obviously, $F \subseteq F_+$.
For the error $Z = \sigma(g)$ there are three possibilities:
1. $g \notin F_+$. Then $\mu' \ne \mu$, hence $|\psi\rangle \perp \mathrm{SympCode}(F,\mu)$. The code detects such an error.
2. $g \in F$. In this case $|\psi\rangle = \sigma(g)|\xi\rangle = (-1)^{\mu(g)}|\xi\rangle$. Such an error is indistinguishable from the identity operator, since it does not alter the codevector (up to the constant phase factor $(-1)^{\mu(g)}$). Condition (15.9) is fulfilled.
3. $g \in F_+ \setminus F$. As in the previous case, $\mu' = \mu$, hence $\mathcal{M} = \mathrm{SympCode}(F,\mu)$ is an invariant subspace for the operator $Z = \sigma(g)$. However, the action of $Z$ on $\mathcal{M}$ is not multiplication by a constant. (Otherwise $Z$ or $-Z$ could be added to the set of check operators without reducing the code subspace, which is impossible.) The code does not detect such an error.
These considerations prove the following theorem.
Theorem 15.10. The code $\mathcal{M} = \mathrm{SympCode}(F,\mu)$ has distance
\[ d(\mathcal{M}) = \min\bigl\{ |f| : f \in F_+ \setminus F \bigr\}. \]
We observe a difference from classical linear codes. There the minimum is taken over a subspace with 0 excluded, whereas for symplectic codes 0 is replaced by the nontrivial subspace $F$.
Example 15.4. A symplectic code of type $(5,1)$ that corrects 1 error: the subspace $F$ is generated by the rows of the matrix
\[ \begin{bmatrix} 1&0&1&0&0&1&0&1&0&0\\ 1&0&0&1&1&0&0&0&0&1\\ 0&1&0&0&0&1&1&0&0&1\\ 0&1&0&1&0&0&0&1&1&0 \end{bmatrix}. \]
It can be verified that $\omega(f_j,f_k) = 0$ for any two rows $f_j$, $f_k$. We note that the columns of the matrix come in pairs. If we take any two pairs, then the corresponding four columns are linearly independent. Consequently, for any $g \ne 0$ supported by these columns, there is a row $f_j$ such that $\omega(f_j,g) \ne 0$. Thus the code distance is greater than 2. (In fact, the distance is 3 since the first 6 columns are linearly dependent.)
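The claims of Example 15.4 are small enough to confirm by exhaustive search over $G_5$. The sketch below is a brute-force check, not an efficient algorithm; it assumes the form $\omega(f,g) = \sum_j (\alpha_j\beta'_j + \beta_j\alpha'_j) \bmod 2$ and takes $|g|$ to be the number of qubits $j$ with $(\alpha_j,\beta_j) \ne (0,0)$. The function names are illustrative.

```python
from itertools import product

ROWS = [
    (1, 0, 1, 0, 0, 1, 0, 1, 0, 0),
    (1, 0, 0, 1, 1, 0, 0, 0, 0, 1),
    (0, 1, 0, 0, 0, 1, 1, 0, 0, 1),
    (0, 1, 0, 1, 0, 0, 0, 1, 1, 0),
]

def omega(f, g):
    """Assumed symplectic form on G_n for vectors (a1, b1, ..., an, bn)."""
    return sum(f[2 * j] * g[2 * j + 1] + f[2 * j + 1] * g[2 * j]
               for j in range(len(f) // 2)) % 2

def weight(g):
    """Number of qubits on which sigma(g) acts nontrivially."""
    return sum(1 for j in range(len(g) // 2) if g[2 * j] or g[2 * j + 1])

def span(rows):
    """All Z_2-linear combinations of the rows."""
    vecs = {tuple([0] * len(rows[0]))}
    for r in rows:
        vecs |= {tuple((x + y) % 2 for x, y in zip(v, r)) for v in vecs}
    return vecs

def is_isotropic(rows):
    return all(omega(f, g) == 0 for f in rows for g in rows)

def distance(rows):
    """min |g| over g in F_+ \\ F, as in Theorem 15.10 (exhaustive search)."""
    F = span(rows)
    best = None
    for g in product((0, 1), repeat=len(rows[0])):
        if g not in F and all(omega(f, g) == 0 for f in rows):
            w = weight(g)
            if best is None or w < best:
                best = w
    return best
```

For the matrix above, `is_isotropic(ROWS)` holds, `span(ROWS)` has 16 elements (so the rows are independent), and `distance(ROWS)` returns 3.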
[3!] Problem 15.2. Prove that no quantum code of type $(4,1)$ is capable of correcting a single error.
15.9. Toric code. We introduce an important example of a symplectic code. It is constructed as follows. Consider an $r \times r$ lattice on the torus. We put a qubit on each of its edges. In this way we have $2r^2$ qubits. The check operators will be of two types.
Fig. 15.1. Stabilizer operators for the toric code.
Type I operators are given by vertices. We choose some vertex $s$ and associate with it the check operator
\[ A^x_s = \sigma(f^x_s) = \prod_{j \in \mathrm{star}(s)} \sigma^x_j. \]
Type II operators are given by faces. We choose some face $u$ and associate with it the check operator
\[ A^z_u = \sigma(f^z_u) = \prod_{j \in \mathrm{boundary}(u)} \sigma^z_j. \]
The operators $A^x_s$ and $A^z_u$ commute, since the number of common edges between a vertex star and a face boundary is either 0 or 2. (The interchangeability of operators within each type is obvious.)
Although we have indicated $r^2 + r^2 = 2r^2$ check operators (one per face or vertex), there are relations between them:
\[ \prod_s A^x_s = \prod_u A^z_u = I. \]
(It can be shown that these are the only relations.) Therefore $\dim F = 2r^2 - 2$ and $\dim\mathcal{M} = 2^2$, so two logical qubits can be encoded.
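The counting above can be checked mechanically for small lattices. The following sketch builds the stars and face boundaries of an $r \times r$ torus (with $r \ge 2$) as sets of edge indices, verifies that every star meets every boundary in an even number of edges, and computes $\dim F$ as the sum of the $\mathbb{Z}_2$-ranks of the two families. The edge labelling is an assumption made for this example.

```python
def toric_checks(r):
    """Return (stars, faces): lists of frozensets of edge indices on an r x r torus."""
    def h(i, j):   # horizontal edge from vertex (i, j) to (i, j + 1)
        return 2 * ((i % r) * r + (j % r))
    def v(i, j):   # vertical edge from vertex (i, j) to (i + 1, j)
        return 2 * ((i % r) * r + (j % r)) + 1
    stars = [frozenset({h(i, j), h(i, j - 1), v(i, j), v(i - 1, j)})
             for i in range(r) for j in range(r)]
    faces = [frozenset({h(i, j), h(i + 1, j), v(i, j), v(i, j + 1)})
             for i in range(r) for j in range(r)]
    return stars, faces

def rank_gf2(sets):
    """Rank over Z_2 of the indicator vectors of the given edge sets."""
    basis = {}                       # leading bit -> reduced vector (bitmask)
    for s in sets:
        x = sum(1 << e for e in s)
        while x:
            lead = x.bit_length() - 1
            if lead not in basis:
                basis[lead] = x
                break
            x ^= basis[lead]
    return len(basis)

def checks_commute(r):
    """A star and a face boundary commute iff they share an even number of edges."""
    stars, faces = toric_checks(r)
    return all(len(s & f) % 2 == 0 for s in stars for f in faces)

def dim_F(r):
    """dim F = rank of the sigma^x-type checks + rank of the sigma^z-type checks."""
    stars, faces = toric_checks(r)
    return rank_gf2(stars) + rank_gf2(faces)
```

With $2r^2$ qubits, the number of encoded qubits is `2 * r * r - dim_F(r)`, which should equal 2 for every $r \ge 2$.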
We will now determine the code distance.
For the toric code we have the natural decompositions
\[ F = F^{(x)} \oplus F^{(z)}, \qquad F_+ = F^{(x)}_+ \oplus F^{(z)}_+ \]
into subspaces of operators that consist either only of $\sigma^x_j$ or only of $\sigma^z_j$. Such codes are called CSS codes (by the names of their inventors, Calderbank, Shor [16] and Steane [68]).
In the case of the toric code, the subspaces $F^{(z)}$, $F^{(x)}$, $F^{(z)}_+$, $F^{(x)}_+$ have a topological interpretation. Vectors of the form $(0,\beta_1,\dots,0,\beta_n) \in G_n$ can be regarded as 1-chains, i.e., formal linear combinations of edges with coefficients $\beta_1,\dots,\beta_n \in \mathbb{Z}_2$. The basis elements of $F^{(z)}$ are the boundaries of 2-cells. Therefore $F^{(z)}$ is the space of 1-boundaries. Likewise, vectors of the form $(\alpha_1,0,\dots,\alpha_n,0)$ are regarded as 1-cochains; $F^{(x)}$ is the space of 1-coboundaries.
Let us take an arbitrary element $g \in F_+$, $g = g^{(x)} + g^{(z)}$. The commutativity between $g$ and $F$ can be written as follows:
\[ \omega(f^x_s, g^{(z)}) = 0, \quad \omega(f^z_u, g^{(x)}) = 0 \quad \text{for each vertex $s$ and face $u$.} \]
To satisfy $\omega(f^x_s, g^{(z)}) = 0$ it is necessary that each star contain an even number of edges from $g^{(z)}$. In other words, $g^{(z)}$ is a 1-cycle. Analogously, $g^{(x)}$ must be a 1-cocycle.
Thus, the spaces $F^{(z)}_+$, $F^{(x)}_+$ consist of 1-cycles and 1-cocycles (with $\mathbb{Z}_2$ coefficients), and the spaces $F^{(z)}$, $F^{(x)}$ consist of 1-boundaries and 1-coboundaries. The sets $F^{(z)}_+ \setminus F^{(z)}$ and $F^{(x)}_+ \setminus F^{(x)}$ are formed by cycles and cocycles that are not homologous to 0. Consequently the code distance is the minimum size (the number of nonzero coefficients) of such a cycle or cocycle. It is easy to see that this minimum equals $r$. This shows that the toric code corrects $\lfloor (r-1)/2 \rfloor$ errors.
Remark 15.4. The family of toric codes (with $r = 1,2,\dots$) provides an example of local check codes. Specifically, the following conditions are satisfied:
– each check operator acts on a uniformly bounded number of qubits;
– each qubit enters a uniformly bounded number of check operators;
– the family contains codes with arbitrarily large distance.
Such codes are interesting in that syndrome measurement (an important part of error correction; see below) can be realized by a constant depth circuit. Therefore an error in the execution of this circuit will affect only a bounded number of qubits, a useful property for fault-tolerant computation.
[Figure 15.2: the codevector $|\xi\rangle$ is turned into $|\psi\rangle$ by the error $X = \sigma(g)$; the syndrome is measured; an error $g'$ with $\mathrm{syndrome}(g') = \mathrm{syndrome}(g)$, $g' = g + f$, $f \in F$, is chosen; applying $\sigma(g')^\dagger$ leaves $\sigma(f)|\xi\rangle$, which is proportional to $|\xi\rangle$.]
Fig. 15.2. Error correction procedure for symplectic codes.
15.10. Error correction for symplectic codes. Definition 15.4 and Theorem 15.3 indicate only an abstract possibility for restoring the initial codevector. We will show how to realize an error correction procedure for a symplectic code $\mathcal{M} = \mathrm{SympCode}(F,\mu)$.
We examine a special case where the error is a $\sigma$-operator, $W = \sigma(g)$. Let $X_j = (-1)^{\mu_j}\sigma(f_j)$ be the check operators, and $F$ the corresponding isotropic subspace. The sequence of bits $\mathrm{syndrome}(g) = (\omega(f_1,g),\dots,\omega(f_s,g))$ is called the syndrome of $g$. Each of these bits can be measured by measuring the eigenvalue of $X_j$ on the quantum state $|\psi\rangle = W|\xi\rangle$ (in fact, $X_j|\psi\rangle = (-1)^{\omega(f_j,g)}|\psi\rangle$). The measurement of one bit does not change the values of the others, because the check operators commute.
Suppose that $|g| \le k$ and the code corrects $k$ errors (i.e., $d(\mathcal{M}) > 2k$). Is it possible to reconstruct the error from its syndrome? Two errors $g, g' \in G_n$ have the same syndrome if and only if $g' - g \in F_+$. The error correction property of the code implies that
\[ \forall\, g, g' \quad \bigl( |g| \le k,\ |g'| \le k \bigr) \Longrightarrow \bigl( (g' - g \in F) \lor (g' - g \notin F_+) \bigr). \]
Therefore we may take a different error $g'$ for $g$, but only if $g' - g \in F$.
It is now clear how we should correct errors. After the syndrome is determined, we reconstruct the error (up to an element $f = g' - g \in F$) and apply the operator which is inverse to the operator of the supposed error. Thus we obtain a state that differs from the initial one by a phase factor, $\sigma(g')^{-1}\sigma(g)|\xi\rangle = c|\xi\rangle$.
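For a small code the reconstruction step can be done with a lookup table from syndromes to low-weight errors. The sketch below does this for the $(5,1)$ code of Example 15.4 with $k = 1$; it is a classical bookkeeping illustration only (no quantum state is simulated), it assumes the form $\omega(f,g) = \sum_j (\alpha_j\beta'_j + \beta_j\alpha'_j) \bmod 2$, and the function names are hypothetical.

```python
ROWS = [
    (1, 0, 1, 0, 0, 1, 0, 1, 0, 0),
    (1, 0, 0, 1, 1, 0, 0, 0, 0, 1),
    (0, 1, 0, 0, 0, 1, 1, 0, 0, 1),
    (0, 1, 0, 1, 0, 0, 0, 1, 1, 0),
]

def omega(f, g):
    """Assumed symplectic form on G_n for vectors (a1, b1, ..., an, bn)."""
    return sum(f[2 * j] * g[2 * j + 1] + f[2 * j + 1] * g[2 * j]
               for j in range(len(f) // 2)) % 2

def syndrome(g):
    """The bits omega(f_1, g), ..., omega(f_s, g)."""
    return tuple(omega(f, g) for f in ROWS)

def errors_of_weight_at_most_one(n=5):
    """The zero vector plus sigma^x, sigma^z, sigma^y on each single qubit."""
    errs = [tuple([0] * (2 * n))]
    for j in range(n):
        for a, b in ((1, 0), (0, 1), (1, 1)):
            g = [0] * (2 * n)
            g[2 * j], g[2 * j + 1] = a, b
            errs.append(tuple(g))
    return errs

# Syndrome -> supposed error g'. Distinct keys mean the errors are distinguishable.
TABLE = {syndrome(g): g for g in errors_of_weight_at_most_one()}

def residual(g):
    """g + g', where g' is the table entry with the same syndrome as g."""
    g_prime = TABLE[syndrome(g)]
    return tuple((x + y) % 2 for x, y in zip(g, g_prime))
```

Here all 16 errors of weight at most 1 have different syndromes, so `residual(g)` is the zero vector for each of them; in general the residual only needs to lie in $F$.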
We have considered the case of an error of type $\sigma(g)$. But, actually, it is required that an error-correcting transformation protect against all superoperators of the form
\[ T = \sum_{|h| \le k,\ |h'| \le k} b_{h,h'}\, \sigma(h) \cdot \sigma(h')^\dagger. \]
As an exercise the reader is encouraged to verify how the above procedure works in this general case.
[3] Problem 15.3. Construct a polynomial algorithm for reconstructing an error from its syndrome for the toric code.
15.11. Anyons (an example based on the toric code). Using the construction of the toric code, we will try to give a better idea of Abelian anyons mentioned in the Introduction (non-Abelian anyons are considerably more complicated).
Once again, we consider a square lattice on the torus (or on the plane; now we are only interested in a region with trivial topology). As earlier, associated to each vertex $s$ and each face $u$ are the operators
\[ A^x_s = \prod_{j \in \mathrm{star}(s)} \sigma^x_j, \qquad A^z_u = \prod_{j \in \mathrm{boundary}(u)} \sigma^z_j. \]
The codevectors are characterized by the conditions $A^x_s|\xi\rangle = |\xi\rangle$, $A^z_u|\xi\rangle = |\xi\rangle$. There is a different way to impose the same conditions. Consider the following Hamiltonian, i.e., the Hermitian operator
\[ H = \sum_s (I - A^x_s) + \sum_u (I - A^z_u). \tag{15.26} \]
This operator is nonnegative, and its null space coincides with the code subspace of the toric code. Thus, the vectors of the code subspace are the ones that possess the minimal energy (i.e., they are the eigenvectors corresponding to the smallest eigenvalue of the Hamiltonian). In physics, such states are called ground states, and vectors in the orthogonal complement are called excited states.
Excited states can be classified by the set of conditions they violate. Specifically, the states violating a particular set of conditions form a subspace; such subspaces form an orthogonal decomposition of the total state space. Note that the number of violated conditions of each type is even since $\prod_s A^x_s = \prod_u A^z_u = I$.
Consider an excited state $|\eta\rangle$ with the smallest nonzero energy. Such a state violates precisely two conditions, for instance, at two vertices, $s$ and $p$. (In fact, the states violating different pairs of conditions may form linear combinations, but we will assume that $s$ and $p$ are fixed.) Then for these particular vertices
\[ A^x_s|\eta\rangle = -|\eta\rangle, \qquad A^x_p|\eta\rangle = -|\eta\rangle, \]
whereas the conditions with the "+" sign for the other vertices remain in force. We say that in the state $|\eta\rangle$ there are two quasiparticles (elementary excitations) located at the vertices $s$ and $p$. Thus a quasiparticle is a mental device for classifying excited states. It is a special property of Hamiltonian (15.26) that states with certain quasiparticle positions are also eigenstates. However, the classification of low energy excited states by quasiparticle positions, though approximate, works amazingly well for most physical media.$^{15}$
How can we get the state $|\eta\rangle$ from the ground state $|\xi\rangle$? We join $p$ and $s$ by a lattice path $C_1$ (see Figure 15.3a) and act on $|\xi\rangle$ by the operator $W = \prod_{j \in C_1} \sigma^z_j$. This operator commutes with $A^x_k$ for all internal vertices of the path $C_1$, but at the ends they anticommute: $WA^x_s = -A^x_sW$, $WA^x_p = -A^x_pW$. We set $|\eta\rangle = W|\xi\rangle$ and show that $|\eta\rangle$ satisfies the required properties. For the vertex $s$ (and analogously for $p$) we have
\[ A^x_s|\eta\rangle = A^x_sW|\xi\rangle = -WA^x_s|\xi\rangle = -W|\xi\rangle = -|\eta\rangle. \]
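The commutation pattern of $W$ is purely combinatorial: $W$ anticommutes with $A^x_k$ exactly when the star of $k$ contains an odd number of edges of the path. The sketch below lists those vertices for a given path on an $r \times r$ torus; the edge labels `('h', i, j)` and `('v', i, j)` are an assumed convention for this example.

```python
def violated_vertices(r, path_edges):
    """Vertices whose star shares an odd number of edges with the path.

    Edges are labelled ('h', i, j) for the edge from vertex (i, j) to (i, j + 1)
    and ('v', i, j) for the edge from (i, j) to (i + 1, j), indices taken mod r.
    """
    def h(i, j):
        return ('h', i % r, j % r)
    def v(i, j):
        return ('v', i % r, j % r)
    path = set(path_edges)
    result = set()
    for i in range(r):
        for j in range(r):
            star = {h(i, j), h(i, j - 1), v(i, j), v(i - 1, j)}
            if len(star & path) % 2 == 1:
                result.add((i, j))
    return result

# An open path (0,0) -> (0,1) -> (0,2) -> (1,2) on a 4 x 4 torus.
open_path = [('h', 0, 0), ('h', 0, 1), ('v', 0, 2)]
# The boundary of the face with corners (0,0), (0,1), (1,0), (1,1): a closed path.
closed_path = [('h', 0, 0), ('h', 1, 0), ('v', 0, 0), ('v', 0, 1)]
```

For the open path only the two endpoints are reported, while a closed path violates no vertex condition.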
Fig. 15.3. Creating pairs of quasiparticles (a) and moving them around (b).
An arbitrary state of the system can be described as a set of quasiparticles of two types, one of which "lives" on vertices, the other on faces. Mathematically, a quasiparticle is simply a violated code condition, but now we think of it as a physical object. Particles-excitations can move, be created and annihilated. A pair of vertex quasiparticles is created by the action of the operator $W$; a pair of face quasiparticles is created by the operator $V = \prod_{j \in C_2} \sigma^x_j$, where $C_2$ is a path on the dual lattice.
$^{15}$ The position uncertainty is much larger than the lattice spacing, but much smaller than the distance between the particles. We cannot go into more details here. Thus, the reader's acquaintance with conventional quasiparticles (such as electrons and holes in a semiconductor, or spin waves) would be very helpful.
What will happen if we move a face quasiparticle around a vertex quasiparticle (see Figure 15.3b)? The initial state $|\psi\rangle$ contains two quasiparticles of each type. The movement of the face quasiparticle around a closed path $C'$ is expressed by the operator $U = \prod_{j \in C'} \sigma^x_j = \prod_s A^x_s$, where $s$ runs over all vertices inside $C'$ (these are the faces of the dual lattice enclosed by $C'$). It is obvious that $A^x_k|\psi\rangle = |\psi\rangle$ for all $k \ne p$. As a result we get
\[ U|\psi\rangle = \prod_{j \in C'} \sigma^x_j |\psi\rangle = A^x_p|\psi\rangle = -|\psi\rangle. \]
Thus the state vector gets multiplied by $-1$. This indicates some sort of long range interaction between the particles: the moving particle somehow "knows" about the second particle without ever touching it! However, the interaction is purely topological: the state evolution depends only on the isotopy class of the braid the particle world lines form in space-time. In the case at hand, the evolution is just the multiplication by a phase factor; such particles are called Abelian anyons.
On the torus we can move particles over two different cycles that form a basis of the homology group. For instance, create a pair of particles from the ground state, move one of them around a cycle, and annihilate it with the second one. Now it becomes important that the ground state is not unique. Recall that there is a 4-dimensional space of ground states, namely the code subspace. The process we have just described implements an operator acting on this subspace. We can think of four different operators of this kind: $Z_1$, $Z_2$ are products of $\sigma^z_j$ over the basis cycles (they correspond to moving a vertex quasiparticle), whereas $X_1$, $X_2$ are products of $\sigma^x_j$ over the homologous cycles on the dual lattice. The commutation relations between these operators are as follows:
\[ \begin{aligned} Z_jZ_k &= Z_kZ_j, & X_jX_k &= X_kX_j \quad (j,k = 1,2),\\ X_1Z_1 &= Z_1X_1, & X_2Z_2 &= Z_2X_2,\\ X_1Z_2 &= -Z_2X_1, & X_2Z_1 &= -Z_1X_2. \end{aligned} \tag{15.27} \]
Thus we can identify the operators $Z_1$ and $X_2$ with $\sigma^z$ and $\sigma^x$ acting on one encoded qubit. Correspondingly, $Z_2$ and $X_1$ act on the second encoded qubit.
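The relations (15.27) reduce to counting common edges: a product of $\sigma^x_j$ and a product of $\sigma^z_j$ anticommute exactly when their supports share an odd number of edges. The sketch below makes one concrete choice of the four cycles on an $r \times r$ torus; the edge labels and the particular cycles are assumptions for illustration.

```python
def logical_supports(r):
    """Edge supports of Z1, Z2 (lattice cycles) and X1, X2 (dual-lattice cycles).

    ('h', i, j) is the edge from vertex (i, j) to (i, j + 1),
    ('v', i, j) the edge from (i, j) to (i + 1, j).
    """
    Z1 = {('h', 0, j) for j in range(r)}   # cycle running along row 0
    Z2 = {('v', i, 0) for i in range(r)}   # cycle running along column 0
    X1 = {('v', 0, j) for j in range(r)}   # dual cycle homologous to Z1
    X2 = {('h', i, 0) for i in range(r)}   # dual cycle homologous to Z2
    return Z1, Z2, X1, X2

def anticommute(x_support, z_support):
    """True if the sigma^x product and the sigma^z product anticommute."""
    return len(x_support & z_support) % 2 == 1
```

With these supports, `X1` anticommutes with `Z2` and `X2` with `Z1`, while the pairs `(X1, Z1)` and `(X2, Z2)` commute, in agreement with (15.27).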
Part 3
Solutions
In this part we offer either complete solutions to problems or hints which the interested reader can use to work out a rigorous solution.
S1. Problems of Section 1
1.1. The idea is simple: the machine moves symbols alternately from left to right and from right to left until it reaches the center of the input string, at which point it stops.
Now we give a formal description of this machine.
We assume that the external alphabet $A$ is $\{0,1\}$. The alphabet $S = \{\sqcup, 0, 1, *, 0', 1'\}$ consists of the symbols of the external alphabet, the empty symbol $\sqcup$, and three auxiliary marks used to indicate the positions from which a symbol is taken and a new one should be dropped.
The set of states is
\[ Q = \{ q_0, q_f, r_0, r_1, l_0, l_1, l'_0, l'_1 \}. \]
The letters $r$ and $l$ indicate the direction of motion, and the subscripts at these letters refer to symbols being transferred.
Now we describe the transition function.
Beginning of work:
\[ (q_0, 0) \mapsto (r_0, *, +1), \qquad (q_0, 1) \mapsto (r_1, *, +1), \]
\[ (q_0, \sqcup) \mapsto (q_0, \sqcup, -1). \]
The first line indicates that the machine places a mark in the first position and moves the symbol that was there to the right. The second line indicates that the machine stops immediately at the empty symbol.
Transfer to the right:
\[ (r_0, 0) \mapsto (r_0, 0, +1), \qquad (r_1, 0) \mapsto (r_1, 0, +1), \]
\[ (r_0, 1) \mapsto (r_0, 1, +1), \qquad (r_1, 1) \mapsto (r_1, 1, +1). \]
The machine moves to the right until it encounters the end of the input string or a mark.
A change in the direction of motion from right to left consists of two actions: remove the mark (provided this is not the empty symbol)
\[ (r_0, 0') \mapsto (l'_0, 0, -1), \qquad (r_1, 0') \mapsto (l'_1, 0, -1), \]
\[ (r_0, 1') \mapsto (l'_0, 1, -1), \qquad (r_1, 1') \mapsto (l'_1, 1, -1), \]
\[ (r_0, \sqcup) \mapsto (l'_0, \sqcup, -1), \qquad (r_1, \sqcup) \mapsto (l'_1, \sqcup, -1) \]
and place it in the left adjacent position
\[ (l'_0, 0) \mapsto (l_0, 0', -1), \qquad (l'_1, 0) \mapsto (l_0, 1', -1), \]
\[ (l'_0, 1) \mapsto (l_1, 0', -1), \qquad (l'_1, 1) \mapsto (l_1, 1', -1). \]
Transfer to the left:
\[ (l_0, 0) \mapsto (l_0, 0, -1), \qquad (l_1, 0) \mapsto (l_1, 0, -1), \]
\[ (l_0, 1) \mapsto (l_0, 1, -1), \qquad (l_1, 1) \mapsto (l_1, 1, -1). \]
Change of direction from left to right:
\[ (l_0, *) \mapsto (q_0, 0, +1), \qquad (l_1, *) \mapsto (q_0, 1, +1). \]
The completion of work depends on the parity of the word length: for even length, the machine stops at the beginning of the motion to the right
\[ (q_0, 0') \mapsto (q_f, 0, -1), \qquad (q_0, 1') \mapsto (q_f, 1, -1), \]
and for odd length, at the beginning of the motion to the left
\[ (l'_0, *) \mapsto (q_f, 0, -1), \qquad (l'_1, *) \mapsto (q_f, 1, -1). \]
The transition function is undefined for the state $q_f$; therefore the machine stops after switching to this state.
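The transition table above is small enough to run directly. The following sketch encodes it as a Python dictionary and simulates the machine on a one-sided tape, under two assumptions: the machine halts when no transition is defined, and it also halts if the head would leave the tape on the left. The blank is written `_`, the marked symbols are the strings `0'` and `1'`, and the state $l'_a$ is written `l'a`; the function names are illustrative.

```python
BLANK, MARK = '_', '*'

def build_delta():
    """Transition function: (state, symbol) -> (new state, written symbol, head shift)."""
    d = {}
    for a in '01':
        d[('q0', a)] = ('r' + a, MARK, +1)             # beginning of work
        d[('l' + a, MARK)] = ('q0', a, +1)             # change of direction, left to right
        d[('q0', a + "'")] = ('qf', a, -1)             # completion, even length
        d[("l'" + a, MARK)] = ('qf', a, -1)            # completion, odd length
        d[('r' + a, BLANK)] = ("l'" + a, BLANK, -1)    # end of the input reached
        for b in '01':
            d[('r' + a, b)] = ('r' + a, b, +1)         # transfer to the right
            d[('l' + a, b)] = ('l' + a, b, -1)         # transfer to the left
            d[('r' + a, b + "'")] = ("l'" + a, b, -1)  # remove the mark
            d[("l'" + a, b)] = ('l' + b, a + "'", -1)  # drop a (marked), pick up b
    d[('q0', BLANK)] = ('q0', BLANK, -1)               # empty input
    return d

def run(word, max_steps=10_000):
    """Simulate the machine on a binary word and return the final tape contents."""
    delta = build_delta()
    tape = list(word) or [BLANK]
    state, pos = 'q0', 0
    for _ in range(max_steps):
        key = (state, tape[pos])
        if key not in delta:
            break                      # transition undefined: the machine stops
        state, tape[pos], shift = delta[key]
        pos += shift
        if pos < 0:
            break                      # assumed convention: leaving the tape halts
        if pos == len(tape):
            tape.append(BLANK)
    return ''.join(tape).rstrip(BLANK)
```

For example, `run('0011')` returns `'1100'`: the outer symbols are exchanged first, then the next pair, until the marks meet in the middle.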
1.2. Schematically, this is done as follows: to the second summand we add one by one the bits of the first summand, the added bit being erased. Adding one bit takes time that does not exceed the binary length of the second summand, so that the total working time of the machine depends quadratically on the length of the input.
1.3. The proof is by contradiction. Suppose that there is such an algorithm, i.e., that there exists a machine $B$ which, for the input $([M], x)$, gives the answer "yes" if the machine $M$ stops at input $x$ and gives the answer "no" otherwise. (Recall that $[M]$ denotes a description of the machine $M$.)
Let us define another machine $B'$ that, given an input $y$, simulates the work of $B$ for the input $(y, y)$. If the answer of the machine $B$ is "yes", then $B'$ begins moving the head to the right and does not stop. If the answer of $B$ is "no", then $B'$ stops.
Does $B'$ stop for the input $[B']$?
If we suppose that it stops, then $B$ gives the answer "yes" for the input $([B'], [B'])$. Then, by definition of the machine $B'$, it does not stop for the input $[B']$. This is exactly the opposite of our assumption.
If $B'$ does not stop for the input $[B']$, then $B$ gives the answer "no" for the input $([B'], [B'])$. But this implies that $B'$ stops for the input $[B']$, a contradiction.
Remark S1.1. This kind of proof is fairly common in mathematical logic; it is often called diagonalization. The idea was first used by Cantor to show that the set of real numbers (or infinite 0-1 sequences) is uncountable. We remind the reader of Cantor's argument to explain the name "diagonalization". Suppose that all 0-1 sequences are counted (i.e., assigned numbers $0, 1, 2, \dots$), so that we can think of them as rows of an infinite table,
\[ \begin{matrix} x_{00} & x_{01} & x_{02} & \dots \\ x_{10} & x_{11} & x_{12} & \dots \\ x_{20} & x_{21} & x_{22} & \dots \\ \vdots & \vdots & \vdots & \ddots \end{matrix} \]
Let us look at the diagonal of this table and build the sequence
\[ y = (1 - x_{00},\ 1 - x_{11},\ 1 - x_{22},\ \dots). \]
It is clear that this sequence cannot be a row of the original table, which proves that all 0-1 sequences cannot be counted.
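A finite version of the diagonal construction is easy to state in code. The sketch below takes the first $n$ entries of the first $n$ rows and returns a prefix of $y$ that differs from row $k$ in position $k$; the function name is illustrative.

```python
def diagonal_sequence(table):
    """Given n rows with at least n bits each, return [1 - x_00, 1 - x_11, ...]."""
    return [1 - table[k][k] for k in range(len(table))]

table = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
]
y = diagonal_sequence(table)
# y disagrees with row k at position k, so it equals none of the rows.
differs_from_every_row = all(y[k] != table[k][k] for k in range(len(table)))
```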
Note, however, that the unsolvability of the halting problem is a bit more subtle: the proof is based on the existence of a universal Turing machine (we used it implicitly when constructing $B'$).
1.4. First of all, we show that the elements of an enumerable set can actually be produced one by one by an algorithmic process. Suppose that $X$ is the set of all possible outputs of a Turing machine $E$. Let us try all pairs of the form $(x, n)$ (where $x$ is a string, and $n$ is a natural number) and simulate the first $n$ steps of $E$ for the input $x$. If $E$ terminates during the first $n$ steps, we include its output in the list; otherwise we proceed to the next pair $(x, n)$. This way, all elements of $X$ are included (possibly, with repetitions).
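The search over pairs $(x, n)$ can be sketched as a Python generator. Here `run_bounded(x, n)` is a hypothetical bounded simulator of $E$: it returns the output if $E$ halts on $x$ within $n$ steps and `None` otherwise. At stage $t$ the generator tries the first $t$ strings with the step bound $t$, which is one possible ordering of the pairs (an assumption of this sketch); the toy machine below is invented purely to exercise the code.

```python
from itertools import count, islice

def strings():
    """All binary strings, ordered by length and then lexicographically."""
    yield ''
    for n in count(1):
        for k in range(2 ** n):
            yield format(k, '0{}b'.format(n))

def enumerate_outputs(run_bounded):
    """Yield outputs of E (with repetitions) by trying pairs (x, n) stage by stage."""
    for stage in count(1):
        for x in islice(strings(), stage):
            out = run_bounded(x, stage)
            if out is not None:
                yield out

def toy_run_bounded(x, n):
    """Toy stand-in for E: halts only on even-length x, after len(x) + 1 steps,
    and outputs x written twice."""
    if len(x) % 2 == 0 and n >= len(x) + 1:
        return x + x
    return None

first_outputs = list(islice(enumerate_outputs(toy_run_bounded), 5))
```

Any input on which the machine halts is reached at some finite stage, so every element of $X$ eventually appears in the stream.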
Theorem S1.1. A partial function $F : A^* \to \{0,1\}$ is computable if and only if the sets $X_0 = \{x : F(x) = 0\}$ and $X_1 = \{x : F(x) = 1\}$ are both enumerable.
Proof. Suppose that $X_0$ and $X_1$ are enumerable. Given an input string $y \in X_0 \cup X_1$, we run the enumerating processes for $X_0$ and $X_1$ in parallel. Sooner or later, $y$ will be produced by one of the processes. If it is the process for $X_0$, we announce 0 as the result, otherwise the result is 1.
Conversely, if $F$ is computable, then there is a TM that presents its input $x$ as the output if $F(x) = 0$, and runs forever if $F(x)$ is 1 or undefined. Therefore $X_0$ is enumerable. (Similarly, $X_1$ is enumerable.) $\square$
Now, let us turn to the original problem. We are interested in the first set of these two:
\[ X_0 = \bigl\{ [M] : M \text{ does not halt for the empty input} \bigr\}, \]
\[ X_1 = \bigl\{ [M] : M \text{ halts for the empty input} \bigr\}. \]
Note that the second set, $X_1$, is enumerable. Indeed, we can construct a machine $E$ that, given an input $x = [M]$, simulates $M$ and outputs the same $x$ when the simulation ends. (If $M$ does not stop, or if the input string is not a valid description of a TM, then $E$ runs forever.) Therefore, if $X_0$ were also enumerable, there would exist an algorithm for determining whether a given Turing machine $M$ halts for the empty input.
But then the halting problem (for a Turing machine $T$ and an arbitrary input $x$) would also be solvable. Indeed, for each pair $([T], x)$ a machine $M$ that first writes $x$ on the tape and then simulates the work of $T$ is easily constructed. We have arrived at a contradiction: according to Problem 1.3, there is no algorithm for the solution of the halting problem.
1.5. The idea of a solution is as follows. Let $b$ be an arbitrary computable function. For any $n$ there exists a machine $M_n$ that writes $n$ on the tape, then computes $nb(n)$ and counts down from $nb(n)$ to zero. This machine has $O(\log n)$ states and a constant number of symbols. (It is easy to see that $O(\log n)$ states are enough to write $n$ in binary form on the tape.)
1.6. We describe briefly a single-tape machine $M_1$ that simulates a two-tape machine $M_2$. The alphabet of $M_1$ is rather large: one symbol encodes four symbols of $M_2$, as well as four additional bits. The symbol in the $k$-th cell of $M_1$ represents the symbols in the $k$-th cells on the input tape, the output tape and both work tapes of $M_2$; the additional bits are used to mark the places where the heads of $M_2$ are located. The control device of $M_1$ keeps the state of the control device of $M_2$ and some additional information.
The machine $M_1$ works in cycles. Each cycle imitates a single step of $M_2$. At the beginning of each cycle the head of $M_1$ is located above the leftmost cell. Each cycle consists of two passes. First $M_1$ moves from left to right until it finds all of the four marks (i.e., the heads of $M_2$); the contents of the corresponding cells are stored in the control device. On the way back actions imitating one step of $M_2$ are carried out. Each such action requires $O(1)$ steps (and a finite amount of memory in the control device).
Each cycle takes $O(s)$ steps of the machine $M_1$, where $s$ is the length of the used portion of the tape. Since $s \le T(n) + 1$, the machine $M_1$ works in time $O(sT(n)) = O(T^2(n))$.
1.7. The main problem in efficient simulation of a multitape TM on an ordinary TM is that the heads of the simulated machine may be far from each other. Therefore the simulating head must move back and forth between them to imitate a single step of the multitape TM.
However, if we have a second work tape, it can be used to move blocks of size $n$ by distance $m$ along the first tape in $O(n + m)$ steps. Indeed, we can copy the block onto the second tape, then move the head on the first tape and then copy the block back. Therefore we can build a "cache" that contains neighborhoods of the heads aligned according to head positions:
[Diagram: two tapes with cells $a_1, \dots, a_9$ and $b_1, \dots, b_9$ and their head positions are mapped to the aligned fragments $a_1, \dots, a_5$ and $b_5, \dots, b_9$.]
After simulating several computation steps in the cache, we can copy the result (changed cache contents) back. Specifically, to simulate $t$ steps, we need to copy the $t$-neighborhoods of the heads (of size $2t + 1$). We call it a $t$-cache.
To get a bound T (n) log T (n), we need to use re ursion and multilevel
a hing (by powers of 2). Suppose we have already simulated T = 2
k
steps
of the 3-tape ma hineM
3
on a 2-tape ma hineM
2
and want to ontinue the
simulation for another T steps. The urrent state of M
3
is represented by
the �rst T +1 ells on the �rst tape of M
2
. We extend this \main memory"
to size 2T+1 and perform the simulation in two portions, using a T=2- a he.
To arry out the omputation in this a he, we use a T=4 a he, and so on.
(All a hes are allo ated on the �rst tape.)
[Layout of the first tape: Main memory ($T$-cache) | $T/2$-cache | $T/4$-cache | $\cdots$ | 1-cache]
Thus, each level of recursion consists in simulating $t$ steps of $M_3$ in the $t$-cache. This is done by a procedure $F$ that consists of the following operations:
1. copy the $t/2$-neighborhoods of the heads into the $t/2$-cache;
2. simulate $t/2$ steps of $M_3$ recursively (by applying the same procedure $F$ to the $t/2$-cache);
3. copy the result back to the $t$-cache;
4. copy the $t/2$-neighborhoods of the new head positions into the $t/2$-cache;
5. simulate the remaining $t/2$ steps;
6. write the result back.
To implement the recursion, we augment each cache by a special cell that indicates the operation being done at the current level. Caches are allocated as they are needed, and freed (and filled with blanks) when returning to the previous recursion level. (This is a standard implementation of recursion using a stack.)
The recurrence relation $T(t) \le 2T(t/2) + O(t)$ implies $T(t) = O(t \log t)$.
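The recurrence can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the name `simulation_cost`, the unit charge for one step in the 1-cache, and the constant `copy_cost` standing in for the $O(t)$ copying term are not taken from the text.

```python
def simulation_cost(t, copy_cost=1):
    """Cost model T(t) = 2*T(t/2) + copy_cost*t for a power of two t.

    Assumption: simulating one step in the 1-cache costs one unit.
    """
    if t == 1:
        return 1
    return 2 * simulation_cost(t // 2, copy_cost) + copy_cost * t


if __name__ == "__main__":
    for k in (4, 8, 12):
        t = 2 ** k
        # With copy_cost = 1 the solution is exactly t * (log2(t) + 1).
        print(t, simulation_cost(t), t * (k + 1))
```

With `copy_cost = 1` the model has the closed form $t(\log_2 t + 1)$, which matches the $O(t \log t)$ bound.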
1.8. Loosely speaking, to copy a string of length $n$ we need to move $n$ bits by distance $n$. Since a TM has a finite number of states, it carries only $O(1)$ bits by distance 1 at each step; therefore $\Omega(n^2)$ steps are needed.
Here is a more formal argument. For each $k$ we consider a crossing sequence at boundary $k$. It is the sequence of states of the TM during its moves from the $k$-th cell to the $(k+1)$-th cell. Note that the behavior of the TM in the zone to the right of the boundary (cells $k+1$, $k+2$, etc.) is determined by the initial contents of that zone and the crossing sequence. (The only information that flows into the zone is carried by the head and is recorded in the crossing sequence. Note also that we do not worry how long the head stays outside the zone and do not include this information in the crossing sequence.)
Different input strings should generate different crossing sequences. Therefore, most crossing sequences are long (of size $\Omega(n)$). Since there are $\Omega(n)$ possible values of $k$, we have $\Omega(n)$ crossing sequences of length $\Omega(n)$. Thus the sum of their lengths is $\Omega(n^2)$, which is a lower bound for the computation time.
Here are the details. For simplicity we assume that $n$ is even ($n = 2m$) and consider inputs of the form $x = v0^m$, where $v$ is a binary string of length $m$. Let $k$ be a number between $m$ and $2m$, and let $Q(v, k)$ be the crossing sequence for the computation on $v0^m$ at boundary $k$. As we have said, different strings $v$ lead to different crossing sequences (otherwise different strings would have identical copies); therefore there are at least $2^m$ crossing sequences. Since the number of states is $O(1)$, some crossing sequences have length $\Omega(m)$. Moreover, it is easy to see that the average length of the crossing sequence $Q(v, k)$ (taken over all strings $v \in \{0,1\}^m$ for a fixed $k$) is $\Omega(m)$. Therefore, the average value of
$$\sum_{m \le k \le 2m} \bigl(\text{length of } Q(v, k)\bigr)$$
is $\Omega(m^2)$. But for each $v$ this sum does not exceed the computation time; therefore the average is also a lower bound for the computation time.
On the other hand, it is easy to construct a Turing machine $M$ which will duplicate some strings (e.g., $0^n$) in time $T'(n) = O(n \log n)$. First $M$ checks whether the input consists only of zeros. If this is the case, $M$ counts them and then produces the same number of zeros. (To count zeros, we need a portable counter of size $O(\log n)$, which is carried along the tape.) If the input has nonzero symbols, then $M$ just copies it in the usual way (in $O(n^2)$ steps). But the minimal time is still $O(n \log n)$. One can check that this bound is tight (i.e., $\Omega(n \log n)$ is a lower bound).
1.9. Hint. We need to simulate an arbitrary TM. Since the values of the variables can be arbitrarily large, we can store the entire tape of the machine in one variable (tape content is the $|S|$-ary representation of some number, where $S$ is the machine alphabet). The head position is another integer variable, and the state of the control device is yet another.
Changes in these variables after one computation step are described in terms of simple arithmetic operations (addition, multiplication, exponentiation, division, remainder) and comparison of numbers. All these operations can be reduced to the increment and decrement statements and the comparison with 0 (using several auxiliary variables).
The transition table becomes a nested if-then-else construct.
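To make the encoding concrete, here is a minimal Python sketch of the idea in the hint. Everything in it is an illustrative assumption rather than part of the original solution: the function names, the convention that the blank symbol is the digit 0, a tape that is infinite in one direction only (so head positions are nonnegative), and the transition table `delta` given as a dictionary. The further reduction of the arithmetic to increments and decrements is not shown.

```python
def read_cell(tape, pos, base):
    """Digit number `pos` of the base-`base` representation of `tape`."""
    return (tape // base ** pos) % base


def write_cell(tape, pos, symbol, base):
    """Replace digit number `pos` of `tape` by `symbol`."""
    return tape + (symbol - read_cell(tape, pos, base)) * base ** pos


def step(tape, head, state, delta, base):
    """One TM step; delta maps (state, symbol) to (new_state, new_symbol, move)."""
    symbol = read_cell(tape, head, base)
    new_state, new_symbol, move = delta[(state, symbol)]
    return write_cell(tape, head, new_symbol, base), head + move, new_state
```

For instance, with a three-letter alphabet the tape holding the symbols 2, 0, 1 in cells 0, 1, 2 is the number $2 + 0 \cdot 3 + 1 \cdot 9 = 11$.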
S2. Problems of Section 2
2.1. Let us find all functions in two variables that can be expressed by formulas in the basis $A$. We begin with two projections, namely, the functions $p_1(x, y) = x$ and $p_2(x, y) = y$. Then the following procedure is applied to the set $F$ of already constructed functions. We add to the set $F$ all functions of the form $f\bigl(g_1(x_1, x_2), g_2(x_3, x_4), \ldots, g_k(x_{2k-1}, x_{2k})\bigr)$, where $x_j \in \{x, y\}$, $g_j \in F$, and $f \in A$ is a basis function. If the set $F$ increases, we repeat the procedure. Otherwise there are two possibilities: either we have obtained all functions in two variables (then the basis is complete), or not (then the basis is not complete).
We estimate the working time of this algorithm. Only 16 Boolean functions in two variables exist; therefore the set $F$ can be enlarged at most 14 times. At each step we must check at most $16^m \cdot |A|$ possibilities, where $m$ is the maximum number of arguments of a basis function. Indeed, for each basis function $f$ each of (at most) $m$ positions can be occupied by any function from $F$, and $|F| \le 16$. The length of the input (encoded basis) is at least $2^m$ (because the table for a function in $m$ Boolean variables has $2^m$ entries). So, the working time of the algorithm is polynomially bounded in the length of the input.
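The closure procedure can be sketched in Python as follows. This is an illustration under assumptions, not the book's algorithm verbatim: functions of two variables are stored as truth tables over the four points of `POINTS`, a basis is a list of hypothetical `(arity, function)` pairs, and the inner functions are always applied to $(x, y)$ rather than to arbitrary arguments from $\{x, y\}$ (the resulting closed set is the same, since it is already closed under swapping or identifying the two variables).

```python
from itertools import product

POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def closure(basis):
    """All two-variable functions (as 4-entry truth tables) expressible over `basis`.

    basis: list of (arity, f) pairs, where f maps a tuple of bits to a bit.
    """
    found = {tuple(x for x, _ in POINTS), tuple(y for _, y in POINTS)}
    while True:
        new = set()
        for arity, f in basis:
            for gs in product(found, repeat=arity):
                table = tuple(f(tuple(g[i] for g in gs)) for i in range(4))
                if table not in found:
                    new.add(table)
        if not new:
            return found
        found |= new


def is_complete(basis):
    """Complete means that all 16 functions of two variables are obtained."""
    return len(closure(basis)) == 16


NOT = (1, lambda a: 1 - a[0])
AND = (2, lambda a: a[0] & a[1])
OR = (2, lambda a: a[0] | a[1])
NAND = (2, lambda a: 1 - (a[0] & a[1]))
```

For example, `is_complete([NOT, AND])` holds, whereas the basis `[AND, OR]` generates only the four functions $x$, $y$, $x \wedge y$, $x \vee y$.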
2.2. An upper bound $O(n2^n) < 2.01^n$ (for large $n$) follows immediately from the representation of the function in disjunctive normal form (see formula (2.1) on page 19).
To obtain a lower bound we compare the number of Boolean functions in $n$ variables (i.e., $2^{2^n}$) and the number of all circuits of a given size. Assume that the standard complete basis is used. For the $k$-th assignment of the circuit there are at most $O((n + k)^2)$ possibilities (two arguments can be chosen among $n$ input and $k - 1$ auxiliary variables). Therefore, the number $N_s$ of different circuits of size $s$ does not exceed
$$\bigl(O((n + s)^2)\bigr)^s = 2^{2s(\log(n+s) + O(1))}.$$
But the number of Boolean functions in $n$ variables equals $2^{2^n}$. If
$$2^n > 2s\bigl(\log(n + s) + O(1)\bigr), \tag{S2.1}$$
there are more functions than circuits, so that $c_n > s$. If $s = 1.99^n$, then inequality (S2.1) is satisfied for sufficiently large $n$.
2.3. We recall the construction of the undecidable predicate belonging to P/poly (see Remark 2.1 on page 22).
For any function $\varphi : \mathbb{N} \to \{0, 1\}$ the predicate $f_\varphi(x) = \varphi(\mathrm{length}(x))$ belongs to P/poly. Now let $\varphi$ be a computable function that is difficult to compute: no TM can produce output $\varphi(n)$ in polynomial (in $n$) time. More precisely, we use a computable function $\varphi$ such that for any TM $M$ and any polynomial $p$ with integer coefficients there exists $n$ such that $M(1^n)$ does not produce $\varphi(n)$ after $p(n)$ steps.
It remains to construct such a function $\varphi$. This can be done by "diagonalization" (cf. Remark S1.1): we consider pairs $(M, p)$ one by one; for each pair we select some $n$ for which $\varphi(n)$ is not defined yet and define $\varphi(n)$ to be different from the result of $p(n)$ computation steps of $M$ on input $1^n$. (If computation does not halt after $p(n)$ steps, the value $\varphi(n)$ can be arbitrary.)
2.4. Each output depends on $O(1)$ wires; each of them depends on $O(1)$ other wires, etc. Therefore, the total number of used wires and gates is $O(1)^{\mathrm{depth}} = 2^{O(\log(m+n))} = \mathrm{poly}(m + n)$.
2.5. A circuit consists of assignments of the form $y_j := f_j(u_1, \ldots, u_r)$. We can perform these assignments symbolically, by substituting formulas for $u_1, \ldots, u_r$. If these formulas have depth $\le h$, then $y_j$ becomes a formula of depth $\le h + 1$.
2.6. A formula can be represented by a tree (input variables are leaves, internal vertices correspond to subformulas, whereas the root is the formula itself).
It is easy to see that any formula $X$ of size $L$ has a subformula $Z$ of size between $M$ and $2M$ (inclusive), where $M = \lfloor L/3 \rfloor$. Indeed, we start looking for such a subformula at the root of the tree and each time choose the largest branch (of one or two). When the size of the subformula becomes $\le 2M$, we stop.
Replacing the subformula $Z$ by a new variable $z$, we obtain a formula $Y(z)$ (of size from $L - 2M$ to $L - M$) such that $X = Y(Z)$. Note that both $Z$ and $Y$ have size $\le \lceil 2L/3 \rceil$.
Suppose that $Z$ and $Y$ can be converted into equivalent formulas $Z'$ and $Y'$ of depth $\le h$. Then the following formula of depth $\le h + 3$ will compute the same function as $X$ does:
$$X' = (Y'(0) \wedge \neg Z') \vee (Y'(1) \wedge Z').$$
Thus, we have the recurrence relation
$$h(L) \le h\bigl(\lceil 2L/3 \rceil\bigr) + 3,$$
where $h(L)$ is the maximum (over all formulas $X$ of size $L$) of the minimal depth of a formula $X'$ that is equivalent to $X$. It follows that $h(L) = O(\log L)$.
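Both ingredients of this argument are easy to check mechanically. The Python sketch below is an illustration only: `via_split` encodes the case split $(Y(0) \wedge \neg Z) \vee (Y(1) \wedge Z)$ on bits, and `depth_bound` unrolls the recurrence under the assumed base case that formulas of size at most 2 have depth at most their size.

```python
import math


def via_split(y0, y1, z):
    """Value of (Y(0) AND NOT Z) OR (Y(1) AND Z), given y0 = Y(0), y1 = Y(1), z = Z."""
    return (y0 & (1 - z)) | (y1 & z)


def depth_bound(size):
    """Unroll h(L) <= h(ceil(2L/3)) + 3 (assumed base case: h(L) <= L for L <= 2)."""
    if size <= 2:
        return size
    return depth_bound(-(-2 * size // 3)) + 3  # -(-a // b) is ceil(a / b)


if __name__ == "__main__":
    for size in (10, 1000, 10 ** 6):
        print(size, depth_bound(size), round(3 * math.log(size, 1.5) + 5, 1))
```

The printed values stay below $3 \log_{3/2} L + 5$, in line with $h(L) = O(\log L)$.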
2.7. Let us use the disjunctive normal form. If we construct a circuit by formula (2.1) using AND gates and OR gates with arbitrary fan-in (the number of inputs), only three layers are needed: one layer for negations, one for conjunctions, and one for the disjunction.
2.8. Let $L$ be the size of a circuit computing the function $f = \mathrm{PARITY}$. First, we convert the circuit into a formula of size $L' \le L^3$ (see Problem 2.5). Using De Morgan's identities
$$\neg \bigvee x_j = \bigwedge \neg x_j, \qquad \neg \bigwedge x_j = \bigvee \neg x_j,$$
we can ensure that negations are applied only to the input variables.
Without loss of generality, we may assume that the output is produced by an OR gate. (Otherwise, we apply De Morgan's identities again and obtain a circuit for the function $\neg\mathrm{PARITY} = \mathrm{PARITY} \oplus 1$; the following arguments work for this function as well.) We may also assume that the inputs to the final OR gate are produced by AND gates. (If some input is actually produced by an OR gate, this gate can be merged with the final one. If it is produced by a NOT gate, we can insert a dummy AND gate and still have depth 3.)
Now we have
$$f(x_1, \ldots, x_n) = t_1 \vee \cdots \vee t_m,$$
where each $t_i$ has the form $t_i = u_1 \wedge \cdots \wedge u_k$, and each $u_k$ is either a disjunction (a single variable is a special case of that) or the negation of a variable. Note that a variable cannot appear in a negation and a disjunction at the same time, or else the formula can be simplified. For example, if
$$t_i = (x_1 \vee \cdots \vee x_k) \wedge \cdots \wedge \neg x_1 \wedge \cdots,$$
then $x_1$ can be deleted from the disjunction $(x_1 \vee \cdots \vee x_k)$. Therefore, each $t_i$ has the form
$$t_i = \neg x_{j_1} \wedge \cdots \wedge \neg x_{j_p} \wedge (\text{monotone function in the other variables}).$$
Now we use a special property of the function $f = \mathrm{PARITY}$: if $f(x) = 1$, and $x'$ differs from $x$ in just one bit (say, $x_j$), then $f(x') = 0$. This condition should be true for all subformulas $t_i$. It follows that each $t_i$ is the conjunction of $n$ literals (i.e., input variables or their negations). Therefore, $t_i(x) = 1$ for exactly one value of $x$. Hence the number of the subformulas $t_i$ is not less than the number of points $x$ where $f(x) = 1$, i.e., $2^{n-1}$.
Remark. It can be shown that circuits of fixed depth computing the function PARITY and made of gates NOT, OR and AND with arbitrary fan-in, always have exponential size. The proof (quite nontrivial!) is by induction, starting with circuits of depth 2 and 3 (the case discussed above).
The proof of this assertion can be found in [14]. We give a short exposition of the main idea. Let us note that OR-gates and AND-gates in a circuit of minimal size must alternate. One can try to switch two layers by replacing each disjunction of conjunctions by a conjunction of disjunctions; this will reduce the circuit depth by 2. However, this transformation increases the size of the circuit, and we do not get any satisfactory bound for the new size. Still, a reasonable bound can be obtained if we allow further transformation of the resulting circuit. We first assign random values to some of the input variables. Thus we obtain a function of a smaller number of variables, which is either PARITY or its negation. As far as the circuit is concerned, some of its auxiliary variables can be evaluated using only the values of the input variables we have just fixed. And with nonnegligible probability our circuit becomes simpler, so that the transposition of conjunctions and disjunctions does not lead to a large increase in the circuit size.
2.9. There are three possible results of the comparison of two numbers $x$ and $y$: $x > y$, $x = y$, or $x < y$. We construct a circuit which yields a two-bit answer encoding these three possibilities.
We may assume without loss of generality that $n$ is a power of two. (It is possible to add several zeros on the left so that the total number of bits becomes a power of two. This at most doubles the number of input bits.)
[Figure: the high halves $x_{n-1} \ldots x_{n/2}$ and $y_{n-1} \ldots y_{n/2}$ are fed to a circuit $\mathrm{Cmp}_{n/2}$ with outputs $e'$ ($e' = 1$ if the numbers are equal) and $g'$ ($g' = 1$ if $x > y$); the low halves $x_{n/2-1} \ldots x_0$ and $y_{n/2-1} \ldots y_0$ are fed to a second copy of $\mathrm{Cmp}_{n/2}$ with outputs $e''$ and $g''$. The results are combined as $e := e' \wedge e''$, $g := g' \vee (e' \wedge g'')$, so that $e = 1$ if the numbers are equal and $g = 1$ if $x > y$.]
Fig. S2.1. Circuit $\mathrm{Cmp}_n$ for comparison of $n$-bit numbers (size of circuit = $O(n)$, depth = $O(\log n)$).
A circuit for the comparison of $n$-bit numbers is constructed recursively. We compare the first $n/2$ (high) bits and the last $n/2$ (low) bits separately, and then combine the results, see Figure S2.1. (This "divide and conquer" method will be used for the solution of many other problems.)
We estimate the size $L_n$ and depth $d_n$ of this circuit. It is easy to see that
$$L_n = 2L_{n/2} + 3, \qquad d_n = d_{n/2} + 2.$$
Therefore, $L_n = O(n)$ and $d_n = O(\log n)$.
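The recursive construction can be mirrored by a short Python function that evaluates the circuit and tracks its size and depth at the same time. This is a sketch under assumptions: the names `to_bits` and `cmp_circuit` are illustrative, bits are listed most significant first, and the one-bit comparator is charged 3 gates and depth 2 (the text does not fix these base values).

```python
def to_bits(value, n):
    """n-bit representation of value, most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(n))]


def cmp_circuit(x_bits, y_bits):
    """Return (e, g, size, depth): e = 1 iff x == y, g = 1 iff x > y.

    The length n must be a power of two.
    """
    n = len(x_bits)
    if n == 1:
        x, y = x_bits[0], y_bits[0]
        # Assumed base case: 3 gates, depth 2.
        return 1 - (x ^ y), x & (1 - y), 3, 2
    half = n // 2
    e1, g1, s1, d1 = cmp_circuit(x_bits[:half], y_bits[:half])  # high bits
    e2, g2, s2, d2 = cmp_circuit(x_bits[half:], y_bits[half:])  # low bits
    e = e1 & e2
    g = g1 | (e1 & g2)
    return e, g, s1 + s2 + 3, max(d1, d2) + 2
```

With these base values the recurrences give $L_n = 6n - 3$ and $d_n = 2\log_2 n + 2$, e.g. size 21 and depth 6 for $n = 4$.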
Remark. The inequality $x > y$ holds if and only if the number $x + (2^n - 1 - y)$ is greater than $2^n - 1$, which can be checked by looking at the $n$-th bit of this number. Note that $2^n - 1 - y$ is computed very easily (by negating all bits of $y$), so the comparison can be reduced to addition (see Problem 2.12). However, the solution we gave is simpler.
2.10. a) Let $j = j_{l-1} \cdots j_0$. We will gradually narrow the table by taking into account the values of $j_{l-1}$, $j_{l-2}$, and so on. For example, if $j_{l-1} = 0$, we select the first half of the table; otherwise we select the second half. It is clear that the choice is made between $x_{0 j_{l-2} \cdots j_0}$ and $x_{1 j_{l-2} \cdots j_0}$ for each combination of $j_{l-2}, \ldots, j_0$. Such choices are described by the function
$$f(a, b, c) = \begin{cases} b & \text{if } a = 0, \\ c & \text{if } a = 1, \end{cases}$$
which is applied simultaneously to all pairs of table entries. The operation is then repeated with the resulting table, so $f$ is used $2^{l-1} + 2^{l-2} + \cdots + 1 = O(2^l)$ times in $l$ parallel steps. Note that the function $f$ can be computed by a circuit of size $O(1)$.
However, before we can actually apply $f$ multiple times, we need to prepare $2^p$ copies of $j_p$ for each $p$ (recall the bounded fan-out condition). This requires $O(2^l)$ trivial gates arranged in $O(l)$ layers.
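The table-narrowing procedure of part a) can be sketched in Python as below. The names `mux` and `select` are illustrative, and the sketch evaluates the selection sequentially while counting how many times $f$ is used; the copying needed for bounded fan-out is not modeled.

```python
def mux(a, b, c):
    """The function f(a, b, c): b if a = 0, c if a = 1."""
    return c if a else b


def select(table, j_bits):
    """Return (x_j, number of uses of f).

    table: list of 2**l entries x_0, ..., x_{2**l - 1};
    j_bits: the bits [j_{l-1}, ..., j_0] of the index j.
    """
    uses = 0
    for bit in j_bits:
        half = len(table) // 2
        # All pairs (x_{0...}, x_{1...}) are processed in one parallel step.
        table = [mux(bit, table[i], table[half + i]) for i in range(half)]
        uses += half
    return table[0], uses
```

For a table of $2^l$ entries the counter ends at $2^l - 1$, the sum $2^{l-1} + \cdots + 1$ from the text.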
b) The solution is very similar to that of Problem 2.9. Let us construct a circuit $\mathrm{Search}_n$ that outputs $l = \log_2 n$ copies of $y = x_0 \vee \cdots \vee x_{n-1}$, as well as the smallest $j$ such that $x_j = 1$ (if such $j$ exists). The circuit $\mathrm{Search}_n$ can be obtained from two copies of $\mathrm{Search}_{n/2}$ applied to the first and the second half of the string $x$, respectively. Let the results of these applications be $y', \ldots, y', j'$ and $y'', \ldots, y'', j''$. Then we make one additional copy of $y'$ and $y''$ and compute
$$y = y' \vee y'', \qquad j = \begin{cases} j' & \text{if } y' = 1, \\ n/2 + j'' & \text{if } y' = 0 \end{cases}$$
by a circuit of size $O(l)$ and depth $O(1)$ (each copy of $y'$ controls a single bit of $j$).
2.11. Let $\bigl(f_j(q), g_j(q)\bigr) \stackrel{\mathrm{def}}{=} D(q, x_j)$. Then the intermediate states of the automaton and its output symbols are
$$q_{j+1} = f_j f_{j-1} \cdots f_0(q_0) \quad (\text{composition of functions}); \qquad y_j = g_j(q_j).$$
The solution of the problem is divided into 4 stages.
1. We tabulate the functions $f_j$ and $g_j$ (by making $m$ copies of the table of $D$ and narrowing the $j$-th copy according to $x_j$; see the solution to Problem 2.10a). This is done by a circuit of size $\exp(O(k))m$ and depth $O(k + \log m)$.
2. We compute a certain composition of the functions $f_0, \ldots, f_{m-1}$ (see diagram and text below).
3. We compute $q_j$ in a parallel fashion (see below).
4. We apply the functions $g_j$; this is done by a circuit of size $\exp(O(k))m$ and depth $O(k)$.
[Diagram for $l = 3$: the chain $q_8 \leftarrow f_7 \leftarrow q_7 \leftarrow f_6 \leftarrow \cdots \leftarrow q_1 \leftarrow f_0 \leftarrow q_0$, with braces marking $F_{0,p} = f_p$ for $p = 0, \ldots, 7$, then $F_{1,0}, \ldots, F_{1,3}$ over consecutive pairs, $F_{2,0}$ and $F_{2,1}$ over consecutive quadruples, and $F_{3,0}$ over the whole chain.]
At stages 2 and 3 we assume that $m = 2^l$ (since we can always augment the sequence $f_0, \ldots, f_{m-1}$ by identity functions). We organize the functions $f_p$ into a binary tree and compute their compositions $F_{r,p}$ (the case $l = 3$ is shown in the diagram). First we define $F_{0,p} = f_p$. At step 1 we compute the compositions $F_{1,0} = f_1 f_0$ through $F_{1,m/2-1} = f_{m-1} f_{m-2}$; we continue this process for $l$ steps until we get the function $F_{l,0} = f_{m-1} \cdots f_0$. The general formula for step $r$ is as follows:
$$F_{r,p} = F_{r-1,\,2p+1} F_{r-1,\,2p}, \qquad r = 1, \ldots, l, \quad 0 \le p < 2^{l-r}.$$
In this computation functions are represented by value tables. Composing two functions, say $u$ and $v$, amounts to computing $u(v(q))$ for all values of $q$; this is done by a circuit of size $\exp(O(k))$ and depth $O(k)$.
With the functions $F_{r,p}$, the transition from $q_0$ to $q_j$ becomes much quicker. For example, if $j$ lies between $2^{l-1}$ and $2^l$, we can get to $q_{2^{l-1}}$ in one leap; then we make smaller jumps until we stop at the right place. Doing the same thing for all $j$, we compute $q_j$ in the following order (for $l = 3$, starting from the given $q_0$):
step 0: $q_8$;
step 1: $q_4$;
step 2: $q_6$, $q_2$;
step 3: $q_7$, $q_5$, $q_3$, $q_1$.
In general, the computation consists of $l + 1$ steps. At step $s$ we obtain the values of $q_j$ for every $j$ of the form $j = 2^{l-s}(2p + 1)$, using the recurrence relation
$$q_{2^{l-s}(2p+1)} = F_{l-s,\,2p}\bigl(q_{2^{l-s+1}p}\bigr), \qquad s = 0, \ldots, l, \quad 0 \le 2p + 1 \le 2^s.$$
The computation of $F_{r,p}$ and $q_j$ is performed by a $2^l \exp(O(k))$-size, $O(lk)$-depth circuit. (Note that each $q_j$ is used only once at each of the following steps, $s + 1, \ldots, l$, so that we can make copies as we need them, while keeping the fan-out bounded. Therefore, the bounded fan-out condition can be satisfied without increase in depth.)
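Stages 2 and 3 can be sketched in Python as follows. The sketch is an illustration under assumptions: states are the integers $0, \ldots, K-1$, each $f_j$ is given as a value table (a tuple indexed by the state), and the names `compose` and `all_states` are hypothetical. The loops run sequentially here; in the circuit, all compositions of one level and all states of one step are computed in parallel.

```python
def compose(u, v):
    """Value table of the composition q -> u(v(q))."""
    return tuple(u[s] for s in v)


def all_states(fs, q0):
    """Return [q_0, ..., q_m] for transition tables fs = [f_0, ..., f_{m-1}], m = 2**l."""
    m = len(fs)
    l = m.bit_length() - 1
    assert m == 2 ** l, "pad with identity tables to a power of two"
    # Stage 2: F[r][p] = F[r-1][2p+1] composed with F[r-1][2p].
    F = [list(fs)]
    for r in range(1, l + 1):
        prev = F[r - 1]
        F.append([compose(prev[2 * p + 1], prev[2 * p]) for p in range(2 ** (l - r))])
    # Stage 3: q_{2^(l-s) (2p+1)} = F[l-s][2p](q_{2^(l-s+1) p}).
    q = {0: q0}
    for s in range(l + 1):
        step = 2 ** (l - s)
        for p in range((2 ** s + 1) // 2):
            q[step * (2 * p + 1)] = F[l - s][2 * p][q[2 * step * p]]
    return [q[j] for j in range(m + 1)]
```

As a usage example, the automaton with transitions $q \mapsto (2q + x_j) \bmod 3$ reads a binary number and keeps its residue modulo 3; `all_states` then returns the same state sequence as a step-by-step simulation.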
2.12. Suppose we want to compute $z = x + y$. Let $x = x_{n-1} \cdots x_0$, $y = y_{n-1} \cdots y_0$, $z = z_n \cdots z_0$. The standard addition algorithm is described by the formulas
$$q_0 := 0, \qquad q_{j+1} := \begin{cases} 0 & \text{if } x_j + y_j + q_j < 2, \\ 1 & \text{if } x_j + y_j + q_j \ge 2 \end{cases} \qquad (j = 0, \ldots, n - 1),$$
$$z_j := x_j \oplus y_j \oplus q_j, \qquad z_n := q_n,$$
where $q_0, \ldots, q_n$ are the carry bits. This sequence of assignments is a circuit of size and depth $O(n)$. Note that it also corresponds to a finite-state automaton with the input alphabet $A' = \mathbb{B}^2$ (pairs of input bits), the output alphabet $A'' = \mathbb{B}$ (bits of the result) and the state set $\mathbb{B}$ (the value of the carry bit). Hence the result of Problem 2.11 applies.
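The formulas above translate directly into a two-state automaton. The following Python sketch (function names are illustrative; bits are listed least significant first) runs that automaton sequentially, i.e., it is the size-$O(n)$, depth-$O(n)$ version rather than the parallel one.

```python
def to_bits_lsb(value, n):
    """n bits of value, least significant first."""
    return [(value >> j) & 1 for j in range(n)]


def from_bits_lsb(bits):
    """Inverse of to_bits_lsb."""
    return sum(bit << j for j, bit in enumerate(bits))


def add_bits(x_bits, y_bits):
    """Return the n + 1 bits z_0, ..., z_n of x + y; the carry q is the automaton state."""
    q = 0
    z = []
    for xj, yj in zip(x_bits, y_bits):
        z.append(xj ^ yj ^ q)                 # output symbol z_j
        q = 1 if xj + yj + q >= 2 else 0      # next state q_{j+1}
    z.append(q)                               # z_n := q_n
    return z
```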
2.13. Part b) follows from Part a), so it is enough to solve a). The first idea is obvious: we need to organize the input numbers into a binary tree of depth $\lceil \log_2 m \rceil$. Addition of two numbers can be done with depth $O(\log n)$, hence we get a circuit of depth $O(\log m \log n)$. However, this is not exactly what we want.
Fortunately, addition of two numbers can be done with depth $O(1)$ if we use a different encoding for the numbers. Specifically, let us represent a number by the sum of two numbers (obviously, such a representation is not unique).
Lemma. There exists a circuit of size $O(n)$ and depth $O(1)$ that converts the sum of three $n$-digit numbers into the sum of two numbers, i.e., computes a function
$$F : (x, y, z) \mapsto (u, v) \quad \text{such that } u + v = x + y + z \text{ for any } x, y, z.$$
(We will find a particular $F$ such that $u$ has $n + 1$ digits, whereas $v$ has $n$ digits. If $x$ has $n + 1$ digits instead of $n$, then both $u$ and $v$ will be $(n+1)$-digit numbers.)
Proof. Let $x = x_n x_{n-1} \cdots x_0$, $y = y_{n-1} \cdots y_0$, $z = z_{n-1} \cdots z_0$. We can perform the addition bitwise, without carrying bits between binary places. Thus at each place $j$ we get the number $w_j = x_j + y_j + z_j \in \{0, 1, 2, 3\}$ (for $j = 0, \ldots, n - 1$). This number can be represented by two digits: $w_j = u_j v_j$. Then we put
$$u = u_{n-1} \cdots u_0 0, \qquad v = x_n v_{n-1} \cdots v_0. \qquad \square$$
Now let $(x_1, x_2)$ and $(y_1, y_2)$ be pairs of $n$-digit numbers that represent $x = x_1 + x_2$ and $y = y_1 + y_2$. Applying the lemma twice, we get a pair of $(n+1)$-digit numbers, $(z_1, z_2)$, such that $z_1 + z_2 = x + y$. Therefore we can build a circuit of size $O(nm)$ and depth $O(\log m)$ that adds $m$ $n$-digit numbers represented by pairs. This way, we obtain two $n'$-digit numbers, where $n' = n + \lceil \log_2 m \rceil$. At the end, we need to actually add these numbers so that the result appears in the usual form. This can be done by a circuit of size $O(n')$ and depth $O(\log n')$ (see Problem 2.12).
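The lemma and the tree of pairwise additions can be illustrated with Python integers, whose bitwise operators act on all binary places at once. This is a sketch with hypothetical names (`carry_save`, `add_pairs`, `sum_many`); the final ordinary addition `a + b` stands in for the circuit of Problem 2.12, and the input list is assumed to be nonempty.

```python
def carry_save(x, y, z):
    """The map F: (x, y, z) -> (u, v) with u + v = x + y + z.

    At each place j, v_j is the low digit and u_j the high digit of x_j + y_j + z_j.
    """
    v = x ^ y ^ z
    u = ((x & y) | (x & z) | (y & z)) << 1
    return u, v


def add_pairs(p, q):
    """Add two numbers represented as pairs by applying the lemma twice."""
    u, v = carry_save(p[0], p[1], q[0])
    return carry_save(u, v, q[1])


def sum_many(numbers):
    """Sum a nonempty list using a binary tree of constant-depth pair additions."""
    pairs = [(x, 0) for x in numbers]
    while len(pairs) > 1:
        merged = [add_pairs(pairs[i], pairs[i + 1]) for i in range(0, len(pairs) - 1, 2)]
        if len(pairs) % 2:
            merged.append(pairs[-1])
        pairs = merged
    a, b = pairs[0]
    return a + b  # the only carry-propagating addition
```

Each level of the `while` loop corresponds to one layer of the tree, so the number of levels is $\lceil \log_2 m \rceil$.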
2.14. The standard division algorithm can be described as a sequence of $n$ subtractions, each of which is skipped under certain conditions. This corresponds to a circuit of size $O(n^2)$ and depth $O(n \log n)$ (assuming that each subtraction is performed in a parallel fashion, as in Problem 2.12). Unfortunately, this algorithm cannot be parallelized further, not even with tricks like in the previous problem. So, we will use a completely different method which allows parallelization at the cost of some increase in circuit size (by a factor of order $O(\log n)$).
a) Let $y = 1 - x/2$ (it can be represented as $0.0y_1y_2\cdots$, where $y_j = 1 - x_j$). Note that $0 \le y \le 1/2$, so we can express $x^{-1}$ by a rapidly convergent series:
$$x^{-1} = \frac{1}{2}(1 - y)^{-1} = \frac{1}{2}\left(\sum_{k=0}^{m-1} y^k + r_m\right), \qquad \text{where } 0 \le r_m = \frac{y^m}{1 - y} \le 2^{-(m-1)}.$$
We may set $m = 2^l$, $l = \lceil \log_2(n + 2) \rceil$, so that $r_m \le 2^{-(n+1)}$ and
$$\sum_{k=0}^{m-1} y^k = (1 + y)(1 + y^2)(1 + y^4) \cdots (1 + y^{2^{l-1}});$$
let us denote the last expression by $u_m$.
We need to compute $x^{-1} = \frac{1}{2}(u_m + r_m)$ with precision $2^{-n}$. By neglecting $r_m$, we introduce an error $\le 2^{-(n+2)}$ in the result. An additional error, bounded by $2^{-(n+1)}$, comes from the inaccurate knowledge of $x$. Therefore, it suffices to compute $u_m$ with precision $2^{-(n+1)}$. This calculation involves $O(l) = O(\log n)$ multiplications and additions. In doing it, we must take into account that round-off errors may accumulate; therefore we need to keep $n + \Theta(\log n)$ digits. Each multiplication or addition can be done by a circuit of size $O(n^2)$ and depth $O(\log n)$, hence the total size and depth are $O(n^2 \log n)$ and $O((\log n)^2)$, respectively.
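The truncation error of the product formula can be checked with exact rational arithmetic. The sketch below is an illustration only: the name `inverse_approx` is hypothetical, `Fraction` is used so that round-off is absent (the text's bookkeeping of $n + \Theta(\log n)$ digits is therefore not modeled), and $x$ is assumed to lie in $[1, 2]$.

```python
from fractions import Fraction


def inverse_approx(x, n):
    """Approximate 1/x by u_m / 2 with m = 2**l, l = ceil(log2(n + 2)), for 1 <= x <= 2."""
    y = 1 - Fraction(x) / 2
    l = (n + 1).bit_length()      # equals ceil(log2(n + 2)) for n >= 0
    u = Fraction(1)
    power = y                     # runs through y, y^2, y^4, ..., y^(2^(l-1))
    for _ in range(l):
        u *= 1 + power
        power *= power
    return u / 2


if __name__ == "__main__":
    x, n = Fraction(3, 2), 10
    error = 1 / x - inverse_approx(x, n)
    print(float(error), 2.0 ** -(n + 2))
```

The error equals $r_m / 2$, so it is nonnegative and at most $2^{-(n+2)}$, as stated in the text.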
b) First, we find an integer $s$ such that $2^s \le b < 2^{s+1}$ (see Problem 2.10b).
The next step is to find an approximate value of $\lfloor a/b \rfloor$, which is done by a circuit of size $O(k^2 \log k)$ and depth $O((\log k)^2)$. We set $x = 2^{-s} b$ and compute $x^{-1}$ with precision $2^{-k-3}$ by the circuit described above. (The computed value, as well as the exact value, does not exceed 1.) Similarly, we approximate the number $y = 2^{-s} a < 2^{k+1}$ with precision $2^{-2}$. Now we can calculate $a/b = yx^{-1}$ with the overall error $< 1/2$ and replace the result by the closest integer $q$.
Let $r = a - qb$. It is clear that either $q = \lfloor a/b \rfloor$ and $r = (a \bmod b)$, or $q = \lfloor a/b \rfloor + 1$ and $r = (a \bmod b) - b$. We can determine which possibility has been realized by checking if $r$ is negative. If it is, we replace $q$ by $q - 1$, and $r$ by $r + b$.
2.15. To compute the function MAJ we count 1s among the inputs, i.e., compute the sum of the inputs. This can be done by the circuit from Problem 2.13a. Then we compare the result with $\lceil n/2 \rceil$ (see Problem 2.9).
2.16. We begin with some elementary graph theory. Graphs can be described by their adjacency matrices. Rows and columns of the adjacency matrix $A(G)$ of a graph $G$ are indexed by the vertices of the graph. If $(j, k)$ is an edge of $G$, then $a_{jk} = 1$; otherwise $a_{jk} = 0$. We regard the matrix elements as Boolean variables.
We can define the operations $\vee$ and $\wedge$ on Boolean matrices by analogy with the usual matrix addition and multiplication:
$$(P \vee Q)_{uv} = P_{uv} \vee Q_{uv}, \qquad (P \wedge Q)_{uv} = \bigvee_w (P_{uw} \wedge Q_{wv}).$$
Then, $P^k$ is a short notation for $P \wedge \cdots \wedge P$ ($k$ times).
What is the meaning of the matrix $A^k$, where $A = A(G)$ is the adjacency matrix of a graph $G$? Each element of this matrix, $(A^k)_{uv}$, says whether there is a path of length $k$ (i.e., a chain of $k$ edges) between the vertices $u$ and $v$. Similarly, each element of the matrix $(A \vee I)^k$, where $I$ is the "identity matrix" ($I_{uv} = \delta_{uv}$), corresponds to the existence of a path of length at most $k$. Note that if there is a path between $u$ and $v$, then there is a path of length $\le n$, where $n$ is the number of vertices. Therefore, to solve the problem, it suffices to compute $B^k$, where $B = A \vee I$ and $k \ge n$.
Multiplication of $(n \times n)$-matrices (i.e., the operation $\wedge$) can be performed by a circuit of depth $O(\log n)$ and size $O(n^3)$. Let $l = \lceil \log_2 n \rceil$, $k = 2^l$. All we need is to compute $B^2, B^4, \ldots, B^{2^l}$ by repeated squaring. This can be done by a circuit of size $O(n^3 \log n)$ and depth $O((\log n)^2)$.
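The repeated-squaring computation is short enough to write out. In this Python sketch the names `bool_mult` and `reachability` are illustrative, and matrices are lists of lists of 0/1 (or Boolean) entries; the evaluation is sequential, whereas in the circuit each squaring has depth $O(\log n)$.

```python
def bool_mult(P, Q):
    """Boolean matrix product: (P AND Q)_{uv} = OR over w of (P_{uw} AND Q_{wv})."""
    n = len(P)
    return [[any(P[u][w] and Q[w][v] for w in range(n)) for v in range(n)]
            for u in range(n)]


def reachability(A):
    """Entry [u][v] is True iff there is a path from u to v (paths of length 0 included)."""
    n = len(A)
    B = [[bool(A[u][v]) or u == v for v in range(n)] for u in range(n)]  # B = A OR I
    k = 1
    while k < n:          # after the loop B holds (A OR I)^k with k >= n
        B = bool_mult(B, B)
        k *= 2
    return B
```

For the path graph $0 - 1 - 2$ with an extra isolated vertex 3, vertex 2 is reachable from 0, while vertex 3 is not.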
2.17. As with many problems, a detailed solution of this one is tedious (we would have to choose a particular encoding for circuits in the first place), but the idea is rather simple. To begin with, we describe an algorithm for evaluating a formula of depth $d$. Then we will extend that algorithm to circuits of depth $d$ with one output variable (the size of such a circuit is $\exp(O(d))$). The generalization to several output variables is trivial.
The algorithm is simply an implementation of recursion on a Turing machine. Suppose we need to compute a subformula $A = f(A_0, A_1)$, where $f$ denotes an arbitrary gate from the basis. (The specific choice of $f$ is determined by an oracle query.) Let us assume that the computation of $A_0$ and $A_1$ can be done with space $s$. We compute $A_0$ first and keep the result (which is a single bit), while freeing the rest of the space used in the computation. Then we compute $A_1$. Finally, we find $A$ and free the space occupied by the value of $A_0$. Thus the computation of $A$ requires only $s + O(1)$ bits. Likewise, each level of recursion requires a constant number of bits, hence $O(d)$ bits will suffice for the $d$ levels.
Now let $C$ be a circuit of depth $d$ with one output variable. According to Problem 2.5, such a circuit can be "expanded" into a formula $F$ of the same depth. Specifically, subformulas of $F$ are in one-to-one correspondence with paths from the output variable to nodes (i.e., input and auxiliary variables) of the circuit $C$. Note that we do not have enough space to hold the expansion. Instead, we need to look up a path in the circuit $C$ each time we would access a subformula in $F$ according to the original algorithm. Thus the algorithm for circuits involves traversing all paths. Note that space $O(d)$ is sufficient to hold a description of a path. We also need to allocate additional $O(d)$ bits to build an oracle query on this description.
2.18. The idea is to consider the machine $M$ running in space $s$ as an automaton with $\exp(O(s))$ states. This is not quite straightforward, because the machine actively reads bits from random places rather than just receiving a sequence of symbols. However, the difference is not so dramatic. We can arrange that the automaton repeatedly receives the same input string, $xxx\cdots$, each time waiting for a needed bit to come and skipping the others.
Let $V$ be the set of configurations of the machine $M$ with space $s$. Denote by $v_k$ the initial configuration when the machine is asked to compute the $k$-th bit of $f_{n,m,s}(x)$. (Note that $v_k$ does not depend on $x$.) We will construct an automaton with the input and output alphabet $A' = A'' = \mathbb{B}$ and the state set
$$Q = \{0, \ldots, m - 1\} \times V \times \{0, \ldots, |V| - 1\} \times \{0, \ldots, n - 1\},$$
elements of which are denoted by $(k, v, t, j)$. The initial state is $q_0 = (0, v_0, 0, 0)$. What follows is a description of the automaton transition function.
The variable $j$ counts bits of the input string; each time it is incremented by one. Whenever $j$ matches the contents of the machine supplementary tape (which is a part of $v$), the current input bit is accepted as the answer from the oracle; otherwise it is ignored. When $j$ "turns over", i.e., changes from $n - 1$ to 0, the Turing machine clock ticks, meaning that $t$ gets incremented and the machine configuration $v$ changes according to its transition function. Finally, whenever $t$ turns over, the output bit is set to the computation result (contained in $v$), $k$ gets incremented, and the machine configuration is set to $v_{k+1}$.
Let $x$ be a binary string of length $n$. If we feed the string $x^{m|V|} = xx \cdots x$ ($m|V|$ times) to the automaton, and select every $l$-th bit of the output ($l = |V|n$), then we obtain the value of the desired function, $y = f_{n,m,s}(x)$. Therefore we can apply the result of Problem 2.11. Our automaton has $\exp(O(s))$ states and receives $m|V|n = \exp(O(s))$ symbols, hence it can be simulated by a circuit of size $\exp(O(s))$ and depth $O(s^2)$.
2.19. The solution consists of the following steps.
1. We transform $C$ into an equivalent formula $\Gamma$ which operates with elements of a fixed finite group $G$ rather than 0's and 1's. The basis of $\Gamma$ consists of the group operations, i.e., MULTIPLICATION and INVERSION. The Boolean value 0 is represented by the unit element $e$ of the group, whereas 1 is represented by an element $u \ne e$. More formally, let $\varphi : \mathbb{B} \to G$ be the function defined by $\varphi(0) = e$, $\varphi(1) = u$. Then $\Gamma$ computes a function $F(g_1, \ldots, g_N)$ ($N = \exp(O(d))$) such that
$$\varphi\bigl(f(x_1, \ldots, x_n)\bigr) = F(g_1, \ldots, g_N), \qquad \text{where } g_j = \varphi(x_{p_j}) \text{ or } g_j = \mathrm{const}.$$
Each variable $g_j$ is used in the formula $\Gamma$ only once.
2. Using the identity $(ab)^{-1} = b^{-1}a^{-1}$, we transform $\Gamma$ to a form in which the inversion is applied only to input variables. Due to the associativity of multiplication, the tree structure of the formula does not matter. Thus we arrive at the equation
$$\varphi\bigl(f(x_1, \ldots, x_n)\bigr) = h_1 \cdots h_N,$$
where $h_j = \varphi(x_{t_j})$ or $h_j = \varphi(x_{t_j})^{-1}$ or $h_j = \mathrm{const}$.
3. The product of group elements $h_1 \cdots h_N$ is computed by a finite-state automaton with $|G| = O(1)$ states.
4. The work of this automaton is simulated by a circuit of width $O(1)$ and size $O(N) = \exp(O(d))$.
Steps 2, 3 and 4 are rather straightforward, so we only explain the first step.
Let $G = A_5$ be the group of even permutations on 5 elements; it consists of $5!/2 = 60$ elements. (A smaller group will not suffice: our construction works only for unsolvable groups.) We will use the standard cycle notation for permutations, e.g., $(245) : 2 \mapsto 4,\ 4 \mapsto 5,\ 5 \mapsto 2,\ 1 \mapsto 1,\ 3 \mapsto 3$.
Lemma. There are elements $u, v, w \in A_5$ that are conjugate to each other (i.e., $v = aua^{-1}$, $w = bub^{-1}$) and $w = uvu^{-1}v^{-1}$.
Proof. The conditions of the lemma are satisfied by
$$u = (12345), \quad v = (13542), \quad w = (14352), \quad a = (235), \quad b = (245). \qquad \square$$
We will assume that the original formula $C$ is written in the basis $\{\neg, \wedge\}$. Let us set $\varphi(0) = e$, $\varphi(1) = u$ and use the following relations:
$$\varphi(\neg x) = u\,\varphi(x)^{-1}, \qquad \varphi(x \wedge y) = b^{-1}\,\varphi(x)\,a\,\varphi(y)\,a^{-1}\,\varphi(x)^{-1}\,a\,\varphi(y)^{-1}\,a^{-1}\,b.$$
Thus any Boolean formula of depth $d$ is transformed into a formula over $A_5$ with $N \le N(d)$ once-only variables, where $N(d)$ satisfies the following recurrence relation:
$$N(d + 1) = \max\bigl\{N(d) + 1,\ 4N(d) + 6\bigr\}.$$
From this we get $N(d) = \exp(O(d))$.
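The lemma and the two relations can be verified by direct computation. The Python sketch below is an illustration with hypothetical helper names (`cycle`, `mul`, `inv`, `phi_not`, `phi_and`). It assumes that a product of permutations acts as a composition of functions, with the rightmost factor applied first; this is the convention under which the stated elements satisfy the lemma.

```python
def cycle(*points):
    """Permutation of {1,...,5} given by one cycle, stored as a tuple (index 0 unused)."""
    perm = list(range(6))
    for i, p in enumerate(points):
        perm[p] = points[(i + 1) % len(points)]
    return tuple(perm)


def mul(*perms):
    """Product p1 p2 ... pk acting as x -> p1(p2(... pk(x)))."""
    result = tuple(range(6))
    for p in perms:
        result = tuple(result[p[x]] for x in range(6))
    return result


def inv(p):
    """Inverse permutation."""
    q = [0] * 6
    for x in range(6):
        q[p[x]] = x
    return tuple(q)


E = tuple(range(6))
U, V, W = cycle(1, 2, 3, 4, 5), cycle(1, 3, 5, 4, 2), cycle(1, 4, 3, 5, 2)
A, B = cycle(2, 3, 5), cycle(2, 4, 5)


def phi_not(g):
    """phi(NOT x) = u phi(x)^{-1}."""
    return mul(U, inv(g))


def phi_and(gx, gy):
    """phi(x AND y) = b^{-1} gx a gy a^{-1} gx^{-1} a gy^{-1} a^{-1} b."""
    return mul(inv(B), gx, A, gy, inv(A), inv(gx), A, inv(gy), inv(A), B)
```

The ten factors of `phi_and` contain four occurrences of variables and six constants, which is where the term $4N(d) + 6$ in the recurrence comes from.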
S3. Problems of Section 3
3.1. The key to the solution is the following property of Boolean functions: $A \vee \neg B$ and $B \vee C$ imply $A \vee C$ (i.e., if the first two expressions are true for given values of $A$, $B$, $C$, then the third one is also true). Applying this property, called the resolution rule, we can derive new disjunctions from old ones. This process terminates at some set $\bar{F}$ of disjunctions that is not extendible any more; we call it the closure of the original CNF $F$. By construction, $F$ and $\bar{F}$ represent the same Boolean function. The presence of the empty disjunction in $\bar{F}$ indicates that this function is identical to 0, i.e., $F$ is not satisfiable. This way of checking the satisfiability is called the resolution method. It has polynomial running time for 2-CNFs and exponential running time in general.
Let us describe the resolution method in more detail. Recall that a CNF is a formula of the form $F = D_1 \wedge \cdots \wedge D_m$, where $D_1, \ldots, D_m$ are clauses, i.e., disjunctions of literals. A literal is either a variable or its negation. To be precise, each clause is represented in the standard form, which is a set of literals not containing $x_j$ and $\neg x_j$ at the same time. The idea is that a disjunction like $x_1 \vee x_1$ is reduced to $x_1$, whereas $x_1 \vee \neg x_1 \vee x_2$ (which is equal to 1) is removed from the list of clauses. The empty set is identified with the logical constant 0. We will regard $F$ as a set of clauses (i.e., each clause enters $F$ at most once, and the order does not matter).
Suppose $F$ contains clauses $A \vee \neg x_j$ and $C \vee x_j$ for some variable $x_j$.
Let $A \vee C \ne 1$. Then we can reduce $D = A \vee C$ to the standard form
described above. The resolution rule takes $F$ to $F' = F \cup \{D\}$. The repeated
application of this rule takes $F$ to its closure $\bar F$. Note that applying the
resolution rule to 2-clauses and 1-clauses, one can only get 2-clauses,
1-clauses, or the empty clause. (But if some clauses contain 3 or more literals,
the size can grow even further.)
Theorem. $F$ is not satisfiable if and only if $\bar F$ contains the empty clause.
Proof. We will prove a more general statement. Let $Y = l_1 \vee \cdots \vee l_k$ be an
arbitrary clause. Then $F$ implies $Y$ if and only if $\bar F$ contains some clause $D$
that implies $Y$ (i.e., $D \subseteq Y$). The theorem corresponds to the $Y = 0$ case
of this assertion.
The "if" part is obvious since $F$ implies every clause in $\bar F$.
To prove the "only if" part, we will use induction on $k$, going from $k = n$
(the total number of variables) down to 0. If $k = n$, then the function $Y(x)$
takes value 0 at exactly one point $x = x^*$. The condition "$F$ implies $Y$"
means that $F(x^*) = 0$. Therefore $D(x^*) = 0$ for some $D \in F \subseteq \bar F$.
If $k < n$, let $x_j$ be a variable that does not enter $Y$. By the induction
hypothesis, the condition "$F$ implies $Y$" means that there are some
$D_1, D_2 \in \bar F$ such that $D_1$ implies $Y_1 = Y \vee x_j$, and $D_2$ implies
$Y_2 = Y \vee \neg x_j$. If we regard clauses as sets of literals, this condition becomes
$D_1 \subseteq Y_1$, $D_2 \subseteq Y_2$. If $x_j \notin D_1$ or $\neg x_j \notin D_2$, then $D_1 \subseteq Y$ or
$D_2 \subseteq Y$, so we are done. Otherwise we apply the resolution rule to $D_1$ and $D_2$
to obtain a new clause $D \subseteq Y$. □
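The resolution method itself is short to write down. The sketch below is an illustration under stated assumptions, not an optimized procedure: literals are encoded as nonzero integers (`j` for $x_j$, `-j` for $\neg x_j$), clauses as frozensets, and the function names `resolve`, `closure` and `satisfiable` are hypothetical. Input clauses are assumed to be in the standard form already.

```python
from itertools import combinations


def resolve(c1, c2):
    """Yield every non-tautological resolvent of two clauses."""
    for lit in c1:
        if -lit in c2:
            d = (c1 - {lit}) | (c2 - {-lit})
            # Skip D = 1, i.e. a clause containing both x and not-x.
            if not any(-l in d for l in d):
                yield frozenset(d)


def closure(cnf):
    """Apply the resolution rule until no new clause appears."""
    clauses = {frozenset(c) for c in cnf}
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for d in resolve(c1, c2):
                if d not in clauses:
                    new.add(d)
        if not new:
            return clauses
        clauses |= new


def satisfiable(cnf):
    """F is satisfiable iff its closure lacks the empty clause."""
    return frozenset() not in closure(cnf)


# (x1 or x2) and (not x1 or x2) and (not x2) is contradictory.
print(satisfiable([[1, 2], [-1, 2], [-2]]))
# Dropping the last clause makes it satisfiable (x2 = 1).
print(satisfiable([[1, 2], [-1, 2]]))
```

The closure may contain exponentially many clauses in general, which is why this sketch is only practical for tiny inputs.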
The resolution method can be optimized for 2-CNFs. To any 2-CNF $F$
we associate a directed graph $\Gamma(F)$. The vertices of $\Gamma(F)$ are literals. Each
2-clause $a \vee b$ is represented by two edges, $(\neg a, b)$ and $(\neg b, a)$. Each 1-clause
$a$ is represented by the edge $(\neg a, a)$. Let $\widetilde F$ be $\bar F$ without the empty clause.
It is easy to see that $\Gamma(\widetilde F)$ consists of all pairs of vertices that are connected
by paths in $\Gamma(F)$. According to the above theorem, $F$ is not satisfiable if
and only if $\widetilde F$ contains clauses $x_j$ and $\neg x_j$ for some $j$. This is equivalent to
the condition that $\Gamma(F)$ contains paths from $x_j$ to $\neg x_j$ and from $\neg x_j$ to $x_j$.
Therefore, we can use the algorithm from Problem 2.16.
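A minimal sketch of this test for 2-CNFs follows. It makes several assumptions: literals are nonzero integers (`-j` stands for $\neg x_j$), every clause has one or two literals, and plain breadth-first search stands in for the reachability algorithm of Problem 2.16. The function names are hypothetical.

```python
from collections import defaultdict, deque


def implication_graph(clauses):
    """Edges (not a, b) and (not b, a) for a 2-clause; (not a, a) for a 1-clause."""
    graph = defaultdict(set)
    for clause in clauses:
        if len(clause) == 1:
            (lit,) = clause
            graph[-lit].add(lit)
        else:
            lit1, lit2 = clause
            graph[-lit1].add(lit2)
            graph[-lit2].add(lit1)
    return graph


def reachable(graph, src, dst):
    """Breadth-first search for a directed path from src to dst."""
    seen, queue = {src}, deque([src])
    while queue:
        vertex = queue.popleft()
        if vertex == dst:
            return True
        for nxt in graph.get(vertex, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def two_cnf_satisfiable(clauses):
    graph = implication_graph(clauses)
    variables = {abs(lit) for clause in clauses for lit in clause}
    return not any(
        reachable(graph, x, -x) and reachable(graph, -x, x) for x in variables
    )


print(two_cnf_satisfiable([(1, 2), (-1, 2), (-2, 1)]))
print(two_cnf_satisfiable([(1, 2), (-1, 2), (-2, 1), (-1, -2)]))
```

Each variable costs two graph searches here, so the running time is polynomial but not optimal.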
3.2. A necessary condition for the existence of an Euler cycle is the
connectivity of the graph: any two vertices are connected by a path. Another
necessary condition: each vertex has even degree. (The degree of a vertex in
a graph is the number of edges incident to that vertex.) Indeed, if an Euler
cycle visits some vertex $k$ times, then the degree of this vertex is $2k$.
Together these two conditions are sufficient: if a graph is connected and
all vertices have even degrees, it has an Euler cycle. To prove this, let us
start at any vertex and extend the path by adding edges that have not been
used before. Since all vertices have even degrees, this extension process
terminates only when we come to the starting point (i.e., get a cycle). If
not all edges are used, we repeat the process and find another cycle (note
that the unused edges form a graph where each vertex has even degree), etc.
After that our connected graph is covered by several edge-disjoint cycles,
and these cycles can be combined into an Euler cycle.
Fig. S3.1. Extending a partial matching by an alternating path:
a) old matching; b) alternating path; c) new matching.
It remains to note that it is easy to compute the degrees of all vertices
in polynomial time. One can also find out in polynomial time whether the
graph is connected or not.
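Both checks fit in a few lines. The sketch below assumes the graph is given as a vertex count `n` (vertices `0..n-1`, with `n >= 1`) and a list of undirected edges; the function name `has_euler_cycle` is hypothetical.

```python
def has_euler_cycle(n, edges):
    """True iff the graph is connected and every vertex has even degree."""
    adjacency = [[] for _ in range(n)]
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)

    # Condition 1: all degrees are even.
    if any(len(neighbours) % 2 for neighbours in adjacency):
        return False

    # Condition 2: connectivity, checked by a depth-first search from vertex 0.
    seen, stack = {0}, [0]
    while stack:
        vertex = stack.pop()
        for nxt in adjacency[vertex]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == n


print(has_euler_cycle(3, [(0, 1), (1, 2), (2, 0)]))   # triangle
print(has_euler_cycle(3, [(0, 1), (1, 2)]))           # path: odd degrees
```

Both passes take time linear in the size of the graph.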
3.3. Let $F(x_1, \dots, x_n)$ be a propositional formula. How do we find a
satisfying assignment? First, we ask the oracle whether such an assignment
exists. If the answer is "yes", we next learn whether there is a satisfying
assignment with $x_1 = 0$. To this end, we submit the query $F' = F \wedge \neg x_1$. If
the answer is "no" (but $F$ is satisfiable), then there is a satisfying assignment
with $x_1 = 1$. Then we try to fix the value of $x_2$ and so forth.
Similar arguments can be used for the Hamiltonian cycle problem
(and other NP-problems).
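The procedure can be sketched as follows. Everything concrete here is an assumption made for illustration: the formula is taken to be a CNF given as a list of clauses of nonzero integers, fixing $x_i$ is done by appending a one-literal clause, and `brute_force_oracle` is a hypothetical stand-in for the SAT oracle.

```python
from itertools import product


def brute_force_oracle(cnf, n):
    """Stand-in for the oracle: decides satisfiability by exhaustive search."""
    return any(
        all(any((lit > 0) == bits[abs(lit) - 1] for lit in clause) for clause in cnf)
        for bits in product([False, True], repeat=n)
    )


def find_assignment(cnf, n, oracle=brute_force_oracle):
    """Recover a satisfying assignment using n + 1 oracle queries."""
    if not oracle(cnf, n):
        return None
    fixed = list(cnf)
    assignment = []
    for i in range(1, n + 1):
        if oracle(fixed + [[-i]], n):      # is there a solution with x_i = 0?
            fixed.append([-i])
            assignment.append(0)
        else:                              # otherwise x_i = 1 must work
            fixed.append([i])
            assignment.append(1)
    return assignment


print(find_assignment([[1, 2], [-1]], 2))
```

The invariant is that the formula with the clauses fixed so far stays satisfiable, so the final list of values is a satisfying assignment.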
3.4. We will describe a polynomial algorithm for a more general problem:
given a bipartite graph $\Gamma$, find a maximal partial matching, where a partial
matching is a set of graph edges that do not share any vertex.
(A bipartite graph is a graph whose vertex set is divided into two parts,
$U$ and $V$, so that the edges connect only vertices from different parts. Thus
the edge set is $E \subseteq U \times V$.)
We construct a maximal matching stepwise. At each step we have a set
of edges $C \subseteq E$ that provides a one-to-one correspondence between $A \subseteq U$
and $B \subseteq V$. To extend $C$ by one edge, i.e., to find a matching $C'$ of size
$|C| + 1$, we try to find a so-called alternating path (see Figure S3.1).
Definition. An alternating path is a sequence of consecutively connected
vertices $x_0, \dots, x_l$ such that
(1) all vertices in the sequence are distinct;
(2) edges from $C$ and from $E \setminus C$ alternate;
(3) $x_0 \in U \setminus A$, $x_l \in V \setminus B$.
Thus $(x_0, x_1)$ and $(x_{l-1}, x_l)$ belong to $E \setminus C$, and the path length $l$ is odd.
Lemma. The matching $C$ can be extended if and only if an alternating path
exists.
Proof. If an alternating path exists, we can extend $C$ by replacing the edges
$(x_{2j-1}, x_{2j}) \in C$ with the edges $(x_{2j}, x_{2j+1}) \in E \setminus C$. There is one more edge
of the second type than of the first type.
Conversely, suppose a larger matching $C'$ exists. Let us superimpose $C$
and $C'$ to form the symmetric difference $X = C \oplus C' = (C \setminus C') \cup (C' \setminus C)$.
Each vertex is incident to at most one edge from $C \setminus C'$ and at most one
edge from $C' \setminus C$. Therefore the connected components of $X$ are paths and
cycles in which edges from $C \setminus C'$ and $C' \setminus C$ alternate. Since $C'$ is larger
than $C$, at least one of the components contains more edges from $C' \setminus C$
than from $C \setminus C'$. Such a component is an alternating path. □
Therefore, if there is no alternating path, then $C$ is maximal, and the
algorithm stops. It remains to explain how to find an alternating path when
it exists.
Let us introduce direction on edges of the graph according to the following
rule: the edges in $C$ go from $V$ to $U$, and all other edges go from $U$ to
$V$ (as is shown in Figure S3.1b). Then the existence of an alternating path
is equivalent to the existence of a pair of vertices $u \in U \setminus A$ and $v \in V \setminus B$
such that there is a directed path from $u$ to $v$. (Indeed, we may assume
that the path from $u$ to $v$ is simple, i.e., it does not visit the same vertex
twice, and therefore is an alternating path.) The existence of a directed path
between two vertex subsets can be checked by a slightly modified algorithm
of Problem 2.16 (now the graph is directed, but this does not matter). The
algorithm can also be extended to find the path.
Thus we have proved that the perfect matching problem belongs to P.
(The algorithm described above is not optimal.)
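The whole procedure can be sketched in Python. The representation is an assumption made for illustration: left vertices are `0..len(adj)-1`, right vertices are `0..n_right-1`, and `adj[u]` lists the right-side neighbours of `u`. Breadth-first search is used here in place of the algorithm of Problem 2.16, and the function name is hypothetical.

```python
from collections import deque


def maximum_matching(adj, n_right):
    """Grow a matching by repeatedly finding and flipping an alternating path."""
    match_left = [None] * len(adj)     # partner in V of each u, if any
    match_right = [None] * n_right     # partner in U of each v, if any
    while True:
        # Search from all unmatched left vertices. Edges outside the matching
        # are followed from U to V, matching edges from V back to U.
        parent = {}                    # right vertex -> left vertex it was reached from
        queue = deque(u for u in range(len(adj)) if match_left[u] is None)
        visited_left = set(queue)
        end = None
        while queue and end is None:
            u = queue.popleft()
            for v in adj[u]:
                if v in parent or match_left[u] == v:
                    continue
                parent[v] = u
                if match_right[v] is None:     # reached a vertex of V \ B
                    end = v
                    break
                w = match_right[v]
                if w not in visited_left:
                    visited_left.add(w)
                    queue.append(w)
        if end is None:                        # no alternating path: C is maximal
            return [(u, v) for u, v in enumerate(match_left) if v is not None]
        # Flip the edges along the path, walking back to the free left vertex.
        v = end
        while v is not None:
            u = parent[v]
            previous = match_left[u]
            match_left[u], match_right[v] = v, u
            v = previous


print(len(maximum_matching([[0, 1], [0], [1, 2]], 3)))
```

Each round costs one graph search and enlarges the matching by one edge, so the number of rounds is at most the number of left vertices.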
3.5. It is sufficient to solve (b); however, we provide a (somewhat simpler)
solution for (a) first.
The Clique problem becomes an Independent set problem if we replace
the graph by its complement. (Definition of the terms: For each graph
$G$ the complementary graph $\bar G$ has the same vertices, whereas its edges are
nonedges of $G$. An independent set is a set $I$ of vertices such that no two
vertices in $I$ are joined by an edge.) Clearly, cliques in $G$ are independent
sets in $\bar G$, and vice versa.
For any 3-CNF $F$ with $n$ variables and $m$ clauses we construct a graph
$H$ such that $H$ has an independent set of cardinality $m$ if and only if $F$
is satisfiable. The vertices of $H$ are the literal occurrences in $F$. (We assume
that each clause is a disjunction of three literals; therefore $H$ has $3m$ vertices.)
Literals in one clause are joined by edges. (Therefore an independent set
of cardinality $m$ should include one literal from each clause.) Two literals
from different clauses are joined by an edge if they are contradictory (i.e.,
the edge goes between $x_j$ and $\neg x_j$).
If an independent set $I$ of size $m$ exists, it contains one literal from
each clause, and no two literals are contradictory. Therefore we can assign
values to the variables so that all literals in $I$ are true. This will be a
satisfying assignment for the 3-CNF. Conversely, if the 3-CNF has a satisfying
assignment, we choose a true literal from each clause to be a member of $I$.
This construction, however, does not solve (b), because several independent
sets may correspond to the same satisfying assignment.
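The reduction for (a) is easy to try on small inputs. In the sketch below the encoding is an assumption: literals are nonzero integers, a vertex of $H$ is a pair (clause index, literal), and the independent set is found by brute force only to illustrate the equivalence. The function names are hypothetical.

```python
from itertools import combinations


def cnf_to_graph(cnf):
    """Build H: one vertex per literal occurrence, edges as in the text."""
    vertices = [(i, lit) for i, clause in enumerate(cnf) for lit in clause]
    edges = set()
    for p, q in combinations(range(len(vertices)), 2):
        (i, lit1), (j, lit2) = vertices[p], vertices[q]
        if i == j or lit1 == -lit2:    # same clause, or contradictory literals
            edges.add((p, q))
    return vertices, edges


def has_independent_set(n_vertices, edges, size):
    """Exhaustive search; exponential, for illustration only."""
    return any(
        all(pair not in edges for pair in combinations(subset, 2))
        for subset in combinations(range(n_vertices), size)
    )


cnf = [(1, 2, 3), (-1, -2, -3)]
vertices, edges = cnf_to_graph(cnf)
print(has_independent_set(len(vertices), edges, len(cnf)))
```

The graph has $3m$ vertices and is built in polynomial time; only the brute-force search at the end is expensive.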
Fig. S3.2. a) variable; b) clause.
(b) To construct a reduction that preserves cardinality we must be more cautious.
Assume that we have a 3-CNF with $n$ variables and $m$ clauses. For each variable $x_i$
there are two vertices (called V-vertices) labeled $x_i$ and $\neg x_i$ (see Figure S3.2a);
exactly one of them will be in the independent set $I$ of the required size.
Informally, $x_i \in I$ [$\neg x_i \in I$] means that $x_i = 1$ [resp. $x_i = 0$] in
the assignment.
In addition to these $2n$ vertices, we have $4m$ vertices called C-vertices ($m$
is the number of clauses). Specifically, for each clause we have a group of 4
vertices connected by edges (so that an independent set can include only one
vertex of the four). Figure S3.2b shows the C-vertices (unlabeled vertices
that form a small square in the middle) for a generic clause $l_1 \vee l_2 \vee l_3$. Here
$l_1, l_2, l_3$ are literals, i.e., variables or their negations. (If $l_s$ is $\neg x_i$, then $\neg l_s$
denotes the vertex $x_i$.) The figure also shows edges between C-vertices and
V-vertices (labeled by $l_1, l_2, \neg l_1, \neg l_2, \neg l_3$). Note that in the graph each
V-vertex may be connected with several groups of C-vertices, as each variable
may enter several clauses.
We will be looking for an independent set of size $n + m$. Such a set must
include exactly one V-vertex (labeled $x_i$ or $\neg x_i$) for each variable $x_i$ and
exactly one of the four C-vertices for each clause. Depending on whether
$x_i$ or $\neg x_i$ is included, we set $x_i = 1$ or $x_i = 0$. The choice of C-vertices is
determined uniquely by this assignment. Indeed, let us look at Figure S3.2b,
not paying attention to the V-vertex $\neg l_3$. It is easy to check that for each
pair of values of the literals $l_1$ and $l_2$ there is exactly one C-vertex in the
picture that can be included in the independent set. For example, if $l_1$ is
true and $l_2$ is false, then the vertices $l_1$ and $\neg l_2$ are in the independent set
$I$; therefore only the rightmost C-vertex can be in $I$.
With the vertex $\neg l_3$ taken into account, the constructed set $I$ may turn out
not to be independent. This happens when both $\neg l_3$ and the top C-vertex (the
one between $l_1$ and $l_2$) belong to $I$. Then we have $l_1 = l_2 = l_3 = 0$. But this
is exactly the case where the clause $l_1 \vee l_2 \vee l_3$ is false.
Therefore, an independent set of size $m + n$ in the graph exists if and
only if there is a satisfying assignment for the given 3-CNF. The preceding
argument shows that the correspondence between independent sets and
satisfying assignments is one-to-one.
3.6. (a) See solution for (b) (though in fact (a) has a simpler solution
that can be found, e.g., in [67]).
(b) Note that the number of 3-colorings is a multiple of 6 (we can permute
colors).
Consider a 3-CNF $C$ in $n$ variables with $m$ clauses. We construct a
graph $G$ that has $7m + 2n + 3$ vertices and admits a 3-coloring if and only if
$C$ is satisfiable. Moreover, the number of 3-colorings (up to permutations,
i.e., divided by 6) equals the number of satisfying assignments.
Three vertices of $G$ (called 0, 1, 2 in the sequel) are connected to each
other. In any 3-coloring they have different colors; we assume that they
have colors 0, 1, 2 (see Figure S3.3a); this requirement reduces the number
of colorings by a factor of 6.
Fig. S3.3. a) constants; b) variables; c) clauses.
For each of the $n$ variables $x_i$ we have two vertices, which are connected
to each other and to the vertex 2 (see Figure S3.3b, where the vertex 2 is
shown but the vertices 0 and 1 are not). These two vertices are labeled $x_i$
and $\neg x_i$. In a 3-coloring they will have either colors 1 and 0 ("$x_i$ is true")
or 0 and 1 ("$x_i$ is false").
If no other vertices and edges are added, the graph has $2^n$ 3-colorings
that correspond to all possible assignments. We now add some "gadgets"
that correspond to clauses of $C$. For a given assignment each gadget either
has no colorings (if the clause is false for that assignment) or has exactly
one coloring.
Figure S3.3 shows such a gadget that corresponds to the clause $l_1 \vee l_2 \vee l_3$,
where $l_1, l_2, l_3$ are literals (i.e., variables or their negations).
One can check that the required property (no colorings or unique coloring,
depending on $l_1 \vee l_2 \vee l_3$) is indeed true.
3.7. The tiling problem belongs to NP (if Merlin shows a correct tiling,
then Arthur can check its correctness).
Let us show that 3-SAT can be reduced to tiling. For any 3-CNF $C$ we
need to construct an instance of the tiling problem (i.e., select tile types and
boundary conditions) that has a solution if and only if $C$ is satisfiable.
Each tile type will be usable in only one place of the square (this restriction
can be enforced by appropriate labeling); for each position in the square
we will have several types that are usable in this position. Each position can
be regarded as a device that receives information from neighboring devices,
processes it, and sends it further.
Tiles in the bottom row correspond to the variables of $C$. For each
of them we have two possibilities that correspond to values 0 and 1; these
values are propagated to the top unchanged. The other rows are used to
compute the values of clauses, passing intermediate results from left to right.
One additional column on the right side is needed to compute the value of $C$
(the result appears in the upper right corner). Finally, we add a condition
asserting that $C = 1$.
Another possibility is to reduce an arbitrary NP-problem to tiling directly:
the computation table of a TM is a two-dimensional table that satisfies
some local rules that can be interpreted as tiling rules.
3.8. Merlin tells Arthur the prime factors of $x$ (each factor may repeat
several times). Since the multiplication of integers can be performed in
polynomial time, Arthur can check whether or not the factorization provided
by Merlin is correct.
3.9. We prove that Primality $\in$ NP by showing how Merlin constructs
a polynomial size "certificate" of primality that Arthur can verify in
polynomial time.
Let $p > 2$ be a prime number. It is enough to convince Arthur that
$(\mathbb{Z}/p\mathbb{Z})^*$ is a cyclic group of order $p - 1$ (see Appendix A, especially
Theorem A.10). Merlin can show a generator $g$ of this group, and Arthur can
verify whether $g^{p-1} \equiv 1 \pmod p$ (this requires $O(\log p)$ multiplications; see
Section 4.2.2).
This is still not enough since the order of $g$ may be a nontrivial divisor
of $p - 1$. If for some reason Arthur knows the factorization of $p - 1$
(which includes prime factors $q_1, q_2, \dots$), he can check whether
$g^{(p-1)/q_j} \not\equiv 1 \pmod p$. But factoring is a difficult problem, so Arthur
cannot compute the numbers $q_j$ himself.
However, Merlin may communicate the factorization to Arthur. The
only problem is that Merlin has to convince Arthur that the factors are
indeed prime. Therefore Merlin should recursively provide certificates of
primality for all factors.
Thus the complete certificate of primality is a tree. Each node of this
tree is meant to certify that some number $q$ is prime (the root corresponding
to $q = p$). All leaves are labeled by $q = 2$. Each of the remaining nodes is
labeled by a prime number $q > 2$ and also carries a generator $h$ of the group
$(\mathbb{Z}/q\mathbb{Z})^*$; the children of this node are labeled by prime factors of $q - 1$.
Let us estimate the total size of the certificate. The tree we have just
described has at most $n = \lceil \log_2 p \rceil$ leaves (since the product of all factors
of $q - 1$ is less than $q$). The total number of nodes is at most twice the
number of leaves (this is true for any tree). Each node carries a pair of $n$-bit
numbers ($q$ and $h$). Therefore the total number of bits in the certificate is
$O(n^2)$.
Now we estimate the certificate verification complexity. For each of $O(n)$
nodes Arthur checks whether $h^{q-1} \equiv 1 \pmod q$. This requires $O(\log q) = O(n)$
multiplications, which is done by a circuit of size $O(n^3)$. Similar checks
are performed for each parent-child pair, but the number of such pairs is also
$O(n)$. Therefore the certificate check is done by a circuit of size $O(n^4)$, which
can be constructed and simulated on a TM in time poly($n$).
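Arthur's verification can be sketched in Python. The data layout is an assumption made for this example: instead of an explicit tree, the certificate is a dictionary mapping each odd prime $q$ that occurs to a pair (generator $h$, list of prime factors of $q - 1$ with multiplicity), so equal subtrees are shared. The name `verify` is hypothetical.

```python
from math import prod


def verify(q, cert):
    """Check the certificate node for q and, recursively, for its children."""
    if q == 2:                       # leaves of the tree
        return True
    if q < 2 or q not in cert:
        return False
    h, factors = cert[q]
    # The claimed factorization of q - 1 must multiply out correctly.
    if not factors or prod(factors) != q - 1:
        return False
    # h must have order exactly q - 1 modulo q.
    if pow(h, q - 1, q) != 1:
        return False
    return all(
        pow(h, (q - 1) // r, q) != 1 and verify(r, cert)
        for r in set(factors)
    )


# 7 - 1 = 2 * 3 with generator 3;  3 - 1 = 2 with generator 2.
certificate = {7: (3, [2, 3]), 3: (2, [2])}
print(verify(7, certificate))
```

Python's three-argument `pow` performs the modular exponentiation by repeated squaring, which matches the $O(\log q)$ multiplication count used in the estimate.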
S5. Problems of Section 5
5.1. If there is an accepting (i.e., ending with "yes") computational
path, then there is such a path of length $\exp(O(s))$. Indeed, we may think
of machine configurations as points, and possible transitions as edges. Thus
an NTM with space $s$ is described by a directed graph with $N = \exp(O(s))$
vertices, and an accepting path is just a path between two given vertices.
If there is such a path, we can eliminate loops from it and get a path of length
$\le N$. Therefore the proof of Theorem 5.2 is still valid for nondeterministic
Turing machines.
Thus we reduce the NTM to a polynomial game; then we simulate this
game on a deterministic machine (see the first part of Theorem 5.2).
5.2. Negations of predicates from $\Sigma_k$ belong to $\Pi_k$ and vice versa, so
that $\mathrm{P}^{\Sigma_k} = \mathrm{P}^{\Pi_k}$.
Similarly to P, the class $\mathrm{P}^{\Sigma_k}$ is closed under negations. Therefore it
remains to prove the inclusion $\mathrm{P}^{\Sigma_k} \subseteq \Sigma_{k+1}$.
Consider a predicate $F \in \mathrm{P}^{\Sigma_k}$ and a polynomial algorithm $A$ that computes
it using an oracle $G \in \Sigma_k$. Then $F(x)$ is true if and only if there exists
a (polynomial size) sequence $\sigma$ of pairs (query to an oracle, answer bit) such
that (1) it forces $A$ to produce the answer 1; (2) for each pair $(x, 1) \in \sigma$ the
string $x$ is indeed in $G$; (3) for each pair $(x, 0) \in \sigma$ the string $x$ does not
belong to $G$.
Condition (1) has the $\Sigma_1$-form, condition (2) belongs to $\Sigma_k$, and condition (3)
belongs to $\Pi_k$. Standard rules for quantifiers say that any predicate
of the type
\[ \exists x\,[\Sigma_1(x) \wedge \Sigma_k(x) \wedge \Pi_k(x)] \]
belongs to $\Sigma_{k+1}$.
Note that we have used the following property of classes $\Sigma_k$ and $\Pi_k$: if
a predicate $G$ belongs to $\Sigma_k$/$\Pi_k$, then "any element of a finite sequence
$z = \langle z_1, \dots, z_n \rangle$ belongs to $G$" is a $\Sigma_k$/$\Pi_k$-property of $z$. (Indeed, polynomially
many games can be played in parallel.)
S6. Problems of Section 6
6.1. Let $\mathcal{F}$ be an arbitrary space, and $F\colon \mathcal{L} \times \mathcal{M} \to \mathcal{F}$ a bilinear
function. The bilinearity implies that $F(u, v) = \sum_{j,k} u_j v_k F(e_j, f_k)$ for any
$u = \sum_j u_j e_j$ and $v = \sum_j v_j f_j$. If we set
\[ G\Bigl(\sum_{j,k} w_{jk}\, e_j \otimes f_k\Bigr) = \sum_{j,k} w_{jk}\, F(e_j, f_k), \]
then the equation $G(u \otimes v) = F(u, v)$ will hold. Conversely, if
$G'\colon \mathcal{L} \otimes \mathcal{M} \to \mathcal{F}$ is a linear function satisfying $G'(u \otimes v) = F(u, v)$,
then $G'(e_j \otimes f_k) = F(e_j, f_k)$, hence $G' = G$ by linearity.
6.2. Let $\mathcal{F} = \mathcal{L}' \otimes \mathcal{M}'$ and
\[ F\colon \mathcal{L} \times \mathcal{M} \to \mathcal{F}, \qquad F(u, v) = A(u) \otimes B(v). \]
Then the required map $C$ is exactly the function $G$ from the universality
property.
6.3. The abstract tensor product is unique in the following sense. Suppose
that two pairs, $(\mathcal{N}_1, H_1)$ and $(\mathcal{N}_2, H_2)$, satisfy the universality property.
Then there is a unique linear map $G_{21}\colon \mathcal{N}_1 \to \mathcal{N}_2$ such that
$G_{21}(H_1(u, v)) = H_2(u, v)$ for any $u$ and $v$. This map is an isomorphism of linear spaces.
The existence and uniqueness of $G_{21}$ follows from the universality of the
pair $(\mathcal{N}_1, H_1)$: we set $\mathcal{F} = \mathcal{N}_2$, $F = H_2$, and get $G_{21} = G$.
Note that if $(\mathcal{N}_3, H_3)$ also satisfies the universality property, then
$G_{31} = G_{32} G_{21}$ (the composition of maps). Therefore $G_{12} G_{21} = G_{11}$ and
$G_{21} G_{12} = G_{22}$. But $G_{11}$ and $G_{22}$ are the identity maps on $\mathcal{N}_1$ and $\mathcal{N}_2$,
respectively. Thus $G_{12}$ and $G_{21}$ are mutually inverse isomorphisms.
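6.4. Let $\lambda_j^2$ ($\lambda_j > 0$) and $|\nu_j\rangle$ be the nonzero eigenvalues and the
corresponding (orthonormal) eigenvectors of $X^\dagger X$. Then the vectors
$|\xi_j\rangle = \lambda_j^{-1} X |\nu_j\rangle$ also form an orthonormal system. Thus we have
\[ X|\nu_j\rangle = \lambda_j |\xi_j\rangle, \qquad
   X|\psi\rangle = 0 \ \text{ if } \ \langle\psi|\nu_j\rangle = 0 \ \text{ for all } j. \]
This implies (6.2). Finally, we check that $(XX^\dagger)|\xi_j\rangle = \lambda_j^2 |\xi_j\rangle$.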
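6.5.
\[
H[2] = \frac{1}{\sqrt 2}
\begin{pmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & -1
\end{pmatrix},
\]
\[
U[3,1] =
\begin{pmatrix}
u_{00,00} & u_{00,10} & 0 & 0 & u_{00,01} & u_{00,11} & 0 & 0 \\
u_{10,00} & u_{10,10} & 0 & 0 & u_{10,01} & u_{10,11} & 0 & 0 \\
0 & 0 & u_{00,00} & u_{00,10} & 0 & 0 & u_{00,01} & u_{00,11} \\
0 & 0 & u_{10,00} & u_{10,10} & 0 & 0 & u_{10,01} & u_{10,11} \\
u_{01,00} & u_{01,10} & 0 & 0 & u_{01,01} & u_{01,11} & 0 & 0 \\
u_{11,00} & u_{11,10} & 0 & 0 & u_{11,01} & u_{11,11} & 0 & 0 \\
0 & 0 & u_{01,00} & u_{01,10} & 0 & 0 & u_{01,01} & u_{01,11} \\
0 & 0 & u_{11,00} & u_{11,10} & 0 & 0 & u_{11,01} & u_{11,11}
\end{pmatrix}.
\]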
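The first matrix can be reproduced numerically. The sketch below assumes that $H[2]$ on three qubits is the tensor product $I \otimes H \otimes I$ and that basis states $|x_1 x_2 x_3\rangle$ are ordered lexicographically. The helper `kron` is hypothetical, and the factor $1/\sqrt 2$ is left out so that all entries stay integers.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


identity = [[1, 0], [0, 1]]
hadamard = [[1, 1], [1, -1]]        # H without the 1/sqrt(2) factor

h2 = kron(kron(identity, hadamard), identity)

expected = [
    [1, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, -1, 0, 0, 0, 0, 0],
    [0, 1, 0, -1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 0, -1, 0],
    [0, 0, 0, 0, 0, 1, 0, -1],
]
assert h2 == expected
```

The operator mixes exactly those pairs of basis states that differ in the second bit, which is the block pattern visible in the matrix.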
S7. Problems of Section 7
7.1. Sin e the onjun tion ^ and the negation : form a omplete basis
for Boolean ir uits, Lemmas 7.1, 7.2 show that it is suÆ ient to realize the
fun tions ^
�
(i.e., the To�oli gate), :
�
: (x; y) 7! (x; x�y�1) and
he
. But
the To�oli gate is already in the basis, :
�
[1; 2℄ = :[2℄
he
[1; 2℄, so it suÆ es
to realize
he
. Let us introdu e an auxiliary bit u initialized by 0. Then the
a tion of
he
[1; 2℄ an be represented as :[u℄ ^
�
[u; 1; 2℄:[u℄.
S8. Problems of Section 8
8.1. Any rotation in three-dimensional space can be represented as a
composition of three rotations: through an angle $\alpha$ about the $z$ axis, then
through an angle $\beta$ about the $x$ axis, and then through an angle $\gamma$ about
the $z$ axis. Therefore any operator acting on one qubit can be represented
in the form
(S8.1)  $U = e^{i\varphi}\, e^{i(\gamma/2)\sigma_z}\, e^{i(\beta/2)\sigma_x}\, e^{i(\alpha/2)\sigma_z}$.
Each of the operators on the right-hand side of (S8.1) can be expressed in
terms of $H$ and a controlled phase shift:
\[ e^{i\varphi} = \Lambda(e^{i\varphi})\,\sigma_x\,\Lambda(e^{i\varphi})\,\sigma_x, \qquad
   e^{i\varphi\sigma_z} = \Lambda(e^{-i\varphi})\,\sigma_x\,\Lambda(e^{i\varphi})\,\sigma_x, \]
\[ \sigma_x = H\,\Lambda(e^{i\pi})\,H, \qquad e^{i\varphi\sigma_x} = H\,e^{i\varphi\sigma_z}\,H. \]
8.2. Let $U = e^{i\varphi} Z$, where $Z \in SU(2)$. Then $\Lambda(U) = \Lambda(e^{i\varphi})\Lambda(Z)$. The
operator $\Lambda(e^{i\varphi})$ acts only on the control qubit, so it remains to realize $\Lambda(Z)$.
Any operator $Z \in SU(2)$ can be represented in the form
(S8.2)  $Z = A\sigma_x A^{-1} B\sigma_x B^{-1}$, $\quad A, B \in SU(2)$.
Therefore $\Lambda(Z)$ is realized by the circuit shown in Figure S8.1a.
Geometrically, equation (S8.2) is equivalent to the assertion that any
rotation of the three-dimensional space is the composition of two rotations
through $180^\circ$. The proof of this assertion is shown in Figure S8.1b.
Fig. S8.1. Realization of the operator $\Lambda(Z)$, $Z \in SU(2)$.
8.3. Let $e^{i\theta}$ be the required phase shift. The operators $X = e^{i\varphi_1}\Lambda(e^{i\theta})$
and $Y = e^{i\varphi_2}\sigma_x$ can be realized over the basis $\mathcal{A}$. Although the phases $\varphi_1$,
$\varphi_2$ are arbitrary, we can realize $X^{-1} = e^{-i\varphi_1}\Lambda(e^{-i\theta})$ and $Y^{-1} = e^{-i\varphi_2}\sigma_x$ with
exactly the same phases by inverting the circuits for $X$ and $Y$. Therefore
we can realize the operator
\[ e^{i\theta\sigma_z} = X^{-1} Y^{-1} X Y; \]
the unknown phases cancel out. It remains to apply this operator to an
ancilla in the state $|0\rangle$.
8.4. We can construct a circuit for the operator $\Lambda(U)$ using the Fredkin
gate $F = \Lambda(\leftrightarrow)$, a controlled bit exchange. It is defined and realized as
follows:
\[ F|a, b, c\rangle \stackrel{\mathrm{def}}{=}
   \begin{cases} |0, b, c\rangle & \text{if } a = 0, \\ |1, c, b\rangle & \text{if } a = 1, \end{cases} \]
\[ F[1,2,3] = \Lambda^2(\sigma_x)[1,2,3]\ \Lambda^2(\sigma_x)[1,3,2]\ \Lambda^2(\sigma_x)[1,2,3]. \]
Figure S8.2 shows how, given an arbitrary gate $U$ preserving $|0\rangle$, one can
construct a circuit for $\Lambda(U)$. The controlled exchange (shown in rectangles)
is performed with two $n$-qubit registers by applying $n$ copies of the Fredkin
gate. If the controlling qubit contains 1, the state $|\xi\rangle$ will be submitted to
$U$, otherwise the input to $U$ is $|0\rangle$.
Fig. S8.2. Realization of the operator $\Lambda(U)$, assuming that $U|0\rangle = |0\rangle$.
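Since the Toffoli gate permutes basis states, the three-gate realization of the Fredkin gate can be checked on classical bit strings. A small Python sketch follows; the function names are hypothetical and qubits are numbered from 1 as in the formula.

```python
from itertools import product


def toffoli(bits, c1, c2, target):
    """Flip bit `target` when both control bits are 1 (positions start at 1)."""
    bits = list(bits)
    if bits[c1 - 1] and bits[c2 - 1]:
        bits[target - 1] ^= 1
    return tuple(bits)


def fredkin_from_toffolis(bits):
    """Apply the Toffoli gates [1,2,3], [1,3,2], [1,2,3] in turn."""
    for c1, c2, target in [(1, 2, 3), (1, 3, 2), (1, 2, 3)]:
        bits = toffoli(bits, c1, c2, target)
    return bits


for a, b, c in product((0, 1), repeat=3):
    expected = (a, c, b) if a == 1 else (a, b, c)
    assert fredkin_from_toffolis((a, b, c)) == expected
```

With $a = 1$ the three steps are the familiar exchange of two bits by three XOR operations; with $a = 0$ every gate acts trivially.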
8.5 (cf. [7]). We are going to give a solution with $r = 1$, but this will
require some preparation. Our final circuit will be built of subcircuits realizing
$\Lambda^n(\sigma_x)$ for $n < k$ with $r = O(n)$ ancillas. The idea is that if a subcircuit
operates only on some of the qubits, it can borrow the remaining qubits and
use them as indicated, i.e., realizing $U \otimes I_{\mathcal{B}^{\otimes r}}$ instead of $U$. Ancillas used in
this special way will be called dummy qubits.
Fig. S8.3. a) $\Lambda^k(\sigma_x)$; b); c).
Let us introduce a graphic notation for the operator $\Lambda^k(\sigma_x)$; see
Figure S8.3a. The key construction is shown in Figure S8.3b. This circuit
performs the following actions:
1. The value $a$ of the fourth (second to bottom) bit is changed if and only
if the first two bits are 1. (This change can be compensated by applying
the Toffoli gate one more time.)
2. The bottom bit is altered if and only if the first three bits are 1.
The most obvious use of this primitive is shown in Figure S8.3c, where
the operator $\Lambda^k(\sigma_x)$ is realized by two copies of $\Lambda^m(\sigma_x)$ and two copies of
$\Lambda^{k-m+1}(\sigma_x)$, using one dummy qubit (the second to the bottom one).
Now recall the idea we mentioned at the beginning. In implementing
the operators $\Lambda^n(\sigma_x)$ for $n = m$ and $n = k - m + 1$ we can use $k - m + 1$
and $m$ dummy qubits, respectively. If we set $m = \lceil k/2 \rceil$, then in both cases
we will have at least $n - 1$ dummy qubits available. Therefore it remains to
realize $\Lambda^n(\sigma_x)$ using at most $n - 1$ dummy qubits.
Fig. S8.4. Realization of $\Lambda^n(\sigma_x)$ with $n - 2$ dummy qubits.
The circuit in Figure S8.4 realizes $\Lambda^n(\sigma_x)$ with $n - 2$ dummy qubits
(number $n + 1$ through $2n - 2$). The part of the circuit on the right (which
is applied first) changes the values of the bits as follows:
\[ x_{n+j} \mapsto x_{n+j} \oplus x_1 x_2 \cdots x_{j+1}, \qquad j = 1, \dots, n - 2. \]
The left part acts similarly, but $j$ runs from 1 to $n - 1$. These two actions
cancel each other, except for the change in the last bit,
\[ x_{2n-1} \mapsto x_{2n-1} \oplus x_1 x_2 \cdots x_n. \]
8.6. The operator $\Lambda^k(U)$ has been realized by the circuit shown in
Figure 8.5. Each of the operators $\Lambda^j(Z_{k-j})$ in that circuit is represented by
two copies of $\Lambda^j(\sigma_x)$ (or one copy of $\Lambda^j(i\sigma_x)$ and one copy of $\Lambda^j(-i\sigma_x)$)
and four applications of one-qubit gates (cf. Figure S8.1a). Let us examine
this construction once again. We note that for the realization of the operator
$\Lambda^j(\sigma_x)$ one can use up to $k - j$ dummy qubits (see the solution to
Problem 8.5). Thus for $j = 1, \dots, k - 1$ each of these operators can be realized
by a circuit of size $O(k)$. The remaining constant number of operators
$\Lambda^j(\pm i\sigma_x)$ are realized by circuits of size $O(k^2)$ as suggested in the main text
(see Figure 8.4). The total size of the resulting circuit is $O(k^2)$.
8.7. Property (8.10) follows from this chain of inequalities:
\[ \bigl\| XY|\xi\rangle \bigr\| \le \|X\|\, \bigl\| Y|\xi\rangle \bigr\| \le \|X\|\,\|Y\|\, \bigl\| |\xi\rangle \bigr\|. \]
To prove (8.11), we note that the operators $XX^\dagger$ and $X^\dagger X$ have the
same nonzero eigenvalues (see Problem 6.4).
Equation (8.12) follows from the fact that the eigenvalues of the operator
$(X \otimes Y)^\dagger (X \otimes Y) = (X^\dagger X) \otimes (Y^\dagger Y)$ have the form $x_j y_k$, where $x_j$ and $y_k$
are the eigenvalues of $X^\dagger X$ and $Y^\dagger Y$, respectively.
Equation (8.13) is obvious.
8.8. a) The condition that $\tilde U$ approximates $U$ with precision $\delta$ is expressed
by inequality (8.16). Multiplying the expression under the norm by
$\tilde U^{-1}$ on the left and by $U^{-1}$ on the right, we get $\|V U^{-1} - \tilde U^{-1} V\| \le \delta$.
b) It is sufficient to verify the statement for $L = 2$:
\[
\begin{aligned}
\|\tilde U_2 \tilde U_1 V - V U_2 U_1\|
 &= \bigl\|\tilde U_2(\tilde U_1 V - V U_1) + (\tilde U_2 V - V U_2)U_1\bigr\| \\
 &\le \bigl\|\tilde U_2(\tilde U_1 V - V U_1)\bigr\| + \bigl\|(\tilde U_2 V - V U_2)U_1\bigr\| \\
 &\le \|\tilde U_2\|\,\|\tilde U_1 V - V U_1\| + \|\tilde U_2 V - V U_2\|\,\|U_1\| \\
 &= \|\tilde U_1 V - V U_1\| + \|\tilde U_2 V - V U_2\|.
\end{aligned}
\]
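8.9. We rephrase the problem as follows. Let $Q = U \otimes I_{\mathcal{B}^{\otimes(N-n)}}$ and
$\mathcal{M} = \mathcal{B}^{\otimes n} \otimes |0^{N-n}\rangle$. We have
\[ \|\tilde U \Pi_{\mathcal M} - Q \Pi_{\mathcal M}\| \le \delta, \qquad Q\mathcal{M} = \mathcal{M}, \]
and we are looking for a unitary operator $W$ such that $W\Pi_{\mathcal M} = Q\Pi_{\mathcal M}$ and
$\|W - \tilde U\| \le O(\delta)$. Let $\mathcal{L} = \tilde U \mathcal{M}$. We will try to find a unitary operator $X$
such that
(S8.3)  $X\mathcal{L} = \mathcal{M}$, $\quad \|X - I\| \le O(\delta)$.
If such an $X$ exists, then the following operator will serve as a solution:
\[ W = Q\Pi_{\mathcal M} + X \tilde U (I - \Pi_{\mathcal M}). \]
To satisfy the conditions (S8.3), we show first that $\mathcal{L}$ and $\mathcal{M}$ are close
enough:
\[ \|\Pi_{\mathcal L} - \Pi_{\mathcal M}\| = \bigl\|\tilde U \Pi_{\mathcal M} \tilde U^\dagger - Q \Pi_{\mathcal M} Q^\dagger\bigr\| \le 2\delta, \]
since $\|\tilde U \Pi_{\mathcal M} - Q \Pi_{\mathcal M}\| \le \delta$. It remains to apply the following lemma.
Lemma. Let $\mathcal{L}, \mathcal{M} \subseteq \mathcal{N}$ and $\|\Pi_{\mathcal L} - \Pi_{\mathcal M}\| \le \varepsilon < 1/2$. Then there is a
unitary operator $X$ such that $X\mathcal{L} = \mathcal{M}$ and $\|X - I\| = O(\varepsilon)$.
Proof. Let $Y = \Pi_{\mathcal M}\Pi_{\mathcal L} + (I - \Pi_{\mathcal M})(I - \Pi_{\mathcal L})$. It is immediately evident that
$Y$ takes $\mathcal{L}$ into $\mathcal{M}$ and $\mathcal{L}^\perp$ into $\mathcal{M}^\perp$. The norm $\|Y - I\|$ can be estimated
as follows:
\[
\begin{aligned}
\|Y - I\| &= \|\Pi_{\mathcal M}\Pi_{\mathcal L} - \Pi_{\mathcal M} - \Pi_{\mathcal L} + \Pi_{\mathcal M}\Pi_{\mathcal L}\| \\
 &\le \|(\Pi_{\mathcal M} - \Pi_{\mathcal L})\Pi_{\mathcal L}\| + \|\Pi_{\mathcal M}(\Pi_{\mathcal M} - \Pi_{\mathcal L})\| \le 2\varepsilon < 1.
\end{aligned}
\]
The operator $Y$ is not unitary, but the above estimate shows that it is
nondegenerate. Therefore we can define the operator $X = Y(Y^\dagger Y)^{-1/2}$, which
is unitary. Since $Y^\dagger Y$ leaves the subspace $\mathcal{L}$ invariant, $X$ takes $\mathcal{L}$ into $\mathcal{M}$.
To estimate the norm of $X$ we expand $(Y^\dagger Y)^{-1/2}$ into a Taylor series:
\[ (Y^\dagger Y)^{-1/2} = I + \tfrac{1}{2} Z + \tfrac{3}{8} Z^2 + \cdots, \qquad \text{where } Z = I - Y^\dagger Y. \]
Therefore $\|(Y^\dagger Y)^{-1/2} - I\| \le (1 - \|Z\|)^{-1/2} - 1 = O(\varepsilon)$, from which we
obtain $\|X - I\| = O(\varepsilon)$. □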
8.10. Each of the considered rotations generates an everywhere dense
subset in the group of rotations about a fixed axis. Indeed, this group is
isomorphic to $\mathbb{R}/\mathbb{Z}$, and the generated subset consists of elements of the
form $n\alpha \bmod 1$, $n \in \mathbb{Z}$, where $\alpha$ is irrational ($2\pi\alpha$ is the rotation angle).
Therefore it remains to prove that the rotations about two different axes
generate SO(3). For this it suffices to show that the subgroup generated
by all rotations about two different lines acts transitively on the sphere.
This fact becomes obvious by looking at Figure S8.5 (if we can move along
two families of circles, then from any point of the sphere we can reach
any other point). A rigorous proof is obtained similarly to the solution to
Problem 8.11.
Fig. S8.5. Rotating about two axes in $\mathbb{R}^3$.
Remark. This solution is nonconstructive: we cannot give an upper bound
for the number of generators $X, X^{-1}, Y, Y^{-1}$ whose product approximates
a given element $U \in SO(3)$ with a given precision $\delta$, even if $X$ and $Y$ are
fixed. Therefore the implied algorithm for finding such an approximation
may exhibit arbitrarily long running times. The reason for this
nonconstructiveness is as follows. Although the number $\alpha$ is irrational, it might be very
closely approximated by rational numbers (this occurs when the coefficients
of the continued fraction expansion of $\alpha$ grow rapidly). Then any $r \in \mathbb{R}/\mathbb{Z}$ is
approximated by elements of the form $n\alpha$ ($n \in \mathbb{Z}$) with arbitrary precision
$\delta > 0$, but the number $n$ can be arbitrarily large: larger than any specified
function of $\delta$.
A constructive proof and an effective (for fixed $X$ and $Y$) algorithm for
finding an approximation are rather complicated; see Section 8.3.
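8.11. If we set $|\xi'\rangle = V^{-1}|\xi\rangle$, then $H' = V^{-1}HV$ is the stabilizer of
$\mathbb{C}(|\xi'\rangle)$. Then the problem takes the following form: prove that the union of
the stabilizers of two distinct one-dimensional subspaces generates $U(\mathcal{M})$.
It suffices to show that the group $G$ generated by $H \cup H'$ acts transitively
on the set of unit vectors. Indeed, suppose that for each unit vector $|\psi\rangle$ there
exists $U_\psi \in G$ such that $U_\psi|\xi\rangle = |\psi\rangle$. Then
\[ U(\mathcal{M}) = \bigcup_{|\psi\rangle \in \mathcal{M}} U_\psi H. \]
We prove the transitivity of the action of the group $G$. We note that for
any unit vector $|\psi\rangle$,
\[ H|\psi\rangle = Q(\vartheta) \stackrel{\mathrm{def}}{=} \bigl\{ |\eta\rangle : |\langle\eta|\xi\rangle| = \cos\vartheta \bigr\}, \]
\[ H'|\psi\rangle = Q'(\vartheta') \stackrel{\mathrm{def}}{=} \bigl\{ |\eta\rangle : |\langle\eta|\xi'\rangle| = \cos\vartheta' \bigr\}, \]
where $\vartheta = \vartheta(|\psi\rangle)$, $\vartheta' = \vartheta'(|\psi\rangle)$ denote the angles between $|\psi\rangle$ and $|\xi\rangle$, $|\xi'\rangle$,
respectively:
\[ \cos\vartheta = |\langle\psi|\xi\rangle|, \qquad \cos\vartheta' = |\langle\psi|\xi'\rangle|, \qquad 0 \le \vartheta, \vartheta' \le \pi/2. \]
In further formulas we will also use the angle $\alpha$ between the vectors $|\xi\rangle$ and
$|\xi'\rangle$: $\cos\alpha = |\langle\xi|\xi'\rangle|$, $0 \le \alpha \le \pi/2$.
It can be verified that for $\dim\mathcal{M} \ge 3$ the angle between the vector
$|\xi\rangle$ and elements of $Q'(\vartheta')$ varies from $|\alpha - \vartheta'|$ to $\min\{\alpha + \vartheta', \pi/2\}$. Similarly,
the angle between $|\xi'\rangle$ and elements of $Q(\vartheta)$ varies from $|\alpha - \vartheta|$ to
$\min\{\alpha + \vartheta, \pi/2\}$. Therefore
\[ H Q'(\vartheta') = \bigcup_{|\alpha - \vartheta'| \le \vartheta \le \min\{\alpha + \vartheta',\, \pi/2\}} Q(\vartheta), \]
\[ H' Q(\vartheta) = \bigcup_{|\alpha - \vartheta| \le \vartheta' \le \min\{\alpha + \vartheta,\, \pi/2\}} Q'(\vartheta'). \]
It follows that
\[ H'|\xi\rangle = Q'(\alpha), \]
\[ H H'|\xi\rangle = \bigcup_{0 \le \vartheta \le \min\{2\alpha,\, \pi/2\}} Q(\vartheta), \]
\[ H' H H'|\xi\rangle = \bigcup_{0 \le \vartheta' \le \min\{3\alpha,\, \pi/2\}} Q'(\vartheta'), \]
and so forth. Hence, acting on the vector $|\xi\rangle$ alternately with elements from
$H'$ and $H$ sufficiently many times, we can obtain any unit vector $|\psi\rangle$.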
8.12. First, we will realize the operator $\Lambda^2(i) \otimes I_{\mathcal{B}}$, i.e., the application of $\Lambda^2(i)$ to the first two qubits. To this end, we will use the operators $\Lambda^2(\sigma^\alpha)$ ($\alpha = x, y, z$). We note that $\sigma^y = K\sigma^xK^{-1}$, $\sigma^z = H\sigma^xH$, hence
\[ \Lambda^2(\sigma^y)[1,2,3] = K[3]\,\Lambda^2(\sigma^x)[1,2,3]\,K^{-1}[3], \qquad \Lambda^2(\sigma^z)[1,2,3] = H[3]\,\Lambda^2(\sigma^x)[1,2,3]\,H[3]. \]
Using the identity $\sigma^x\sigma^y\sigma^z = iI_{\mathcal{B}}$, we obtain
\[ \Lambda^2(\sigma^x)\Lambda^2(\sigma^y)\Lambda^2(\sigma^z) = \Lambda^2(iI_{\mathcal{B}}) = \Lambda^2(i) \otimes I_{\mathcal{B}}. \]
The circuit for the realization of $\Lambda^2(i) \otimes I_{\mathcal{B}}$ is shown in Figure S8.6. The inverse operator, $\Lambda^2(-i) \otimes I_{\mathcal{B}}$, can be realized in a similar way.
[Figure S8.6: circuit diagram with three $\Lambda^2(\sigma^x)$ gates and the gates $H$, $H$, $K$, $K^{-1}$ on the third qubit.]
Fig. S8.6. Realization of the operator $\Lambda^2(i) \otimes I_{\mathcal{B}}$ over the standard basis.
Now we consider a new basis,
\[ Q' = \bigl\{ H,\ \Lambda^2(i),\ \Lambda^2(-i) \bigr\}. \]
It is sufficient to show that the applications of elements of this basis to two qubits generate a dense subset of $\mathbf{U}(\mathcal{B}^{\otimes 2})/\mathbf{U}(1)$. Let $X = \Lambda(HKH)$; this operator can be realized as follows:
\[ X[1,2] = H[2]\,\Lambda(K)[1,2]\,H[2], \qquad \Lambda(K) = \Lambda^2(i). \]
We act with $X$ on $\mathcal{B}^{\otimes 2}$ in two possible ways: $X_1 = X[1,2]$ and $X_2 = X[2,1]$. The operators $Y_1 = X_1X_2^{-1}$, $Y_2 = X_2^{-1}X_1$ are also realizable over the basis $Q'$.
The operators $X_1$, $X_2$ (and consequently also $Y_1$, $Y_2$) preserve the vectors $|00\rangle$ and $|\eta\rangle = |01\rangle + |10\rangle + |11\rangle$. Direct calculations show that $Y_1$ and $Y_2$ do not commute and have eigenvalues $1, 1, \lambda_+, \lambda_-$, where $\lambda_\pm = (1 \pm \sqrt{-15})/4 = e^{\pm i\varphi/2}$. The last two eigenvalues characterize the action of $Y_1$, $Y_2$ on the subspace $\mathcal{L} = \mathbb{C}\bigl(|00\rangle, |\eta\rangle\bigr)^{\perp}$. In $\mathbf{SO}(3) \cong \mathbf{U}(\mathcal{L})/\mathbf{U}(1)$ an operator with such eigenvalues corresponds to a rotation through the angle $\varphi$ about some axis. We will show that this angle is incommensurate with $\pi$, so that $Y_1$, $Y_2$ generate an everywhere dense subset in $\mathbf{U}(\mathcal{L})/\mathbf{U}(1)$ (see Problem 8.10).
If $\varphi/\pi$ were rational, then $\lambda_+$, $\lambda_-$ would be roots of 1, and hence algebraic integers. The minimal (over rationals) polynomial for an algebraic integer $\lambda$ has the form $f_\lambda(x) = x^n + a_1x^{n-1} + \cdots + a_n$, where $a_j \in \mathbb{Z}$. However, the minimal polynomial for $\lambda_\pm$ is $x^2 - \frac{1}{2}x + 1$; therefore $\lambda_\pm$ are not algebraic integers.
To complete the proof, we apply the result of Problem 8.11 twice. The operators $Y_1$, $Y_2$ generate an everywhere dense subset in $\mathbf{U}(\mathcal{L})/\mathbf{U}(1)$, and the operator $V = \Lambda(K)$ preserves $\mathbb{C}(|00\rangle)$ but not $\mathbb{C}(|\eta\rangle)$, so that $Y_1$, $Y_2$, $V^{-1}Y_1V$, $V^{-1}Y_2V$ generate an everywhere dense set in $\mathbf{U}\bigl(\mathcal{L} \oplus \mathbb{C}(|\eta\rangle)\bigr)/\mathbf{U}(1)$. The operator $H[1]$ does not preserve $\mathbb{C}(|00\rangle)$; applying the result of Problem 8.11 once again, we obtain an everywhere dense subset in $\mathbf{U}(\mathcal{B}^{\otimes 2})/\mathbf{U}(1)$.
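The "direct calculations" in the solution of Problem 8.12 can be spot-checked numerically. The following is only a sketch under stated assumptions: the basis order $|00\rangle, |01\rangle, |10\rangle, |11\rangle$, the matrices $H$ and $K = \mathrm{diag}(1, i)$, and all variable names are choices made for this illustration, not part of the original text.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
K = np.diag([1, 1j])
A = H @ K @ H                                   # one-qubit operator HKH

# Basis order |00>, |01>, |10>, |11>; the first qubit is the left bit.
P0, P1 = np.diag([1, 0]), np.diag([0, 1])
X1 = np.kron(P0, np.eye(2)) + np.kron(P1, A)    # X[1,2]: qubit 1 controls qubit 2
X2 = np.kron(np.eye(2), P0) + np.kron(A, P1)    # X[2,1]: qubit 2 controls qubit 1

Y1 = X1 @ X2.conj().T                           # Y_1 = X_1 X_2^{-1}
Y2 = X2.conj().T @ X1                           # Y_2 = X_2^{-1} X_1

e00 = np.array([1, 0, 0, 0])
eta = np.array([0, 1, 1, 1])                    # |01> + |10> + |11>, not normalized

eigenvalues = np.linalg.eigvals(Y1)
predicted = [(1 + 1j * np.sqrt(15)) / 4, (1 - 1j * np.sqrt(15)) / 4]

print("Y1 fixes |00> and |eta>:", np.allclose(Y1 @ e00, e00), np.allclose(Y1 @ eta, eta))
print("Y1 and Y2 commute:", np.allclose(Y1 @ Y2, Y2 @ Y1))
print("eigenvalues of Y1:", np.round(eigenvalues, 6))
```

The check covers only the finite-dimensional linear algebra; the incommensurability of $\varphi$ with $\pi$ still rests on the algebraic-integer argument above.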
8.13. It is clear that powers of the operator $R^4 = \exp(4\pi i\alpha\sigma^x) \in \mathbf{SU}(2)$ approximate $\exp(i\varphi\sigma^x)$ for any given $\varphi$ with any given precision. Hence powers of $R$ approximate operators of the form
\[ i^s \exp(i\varphi\sigma^x) = i^s \begin{pmatrix} \cos\varphi & i\sin\varphi \\ i\sin\varphi & \cos\varphi \end{pmatrix}, \qquad s = 0, 1, 2, 3. \]
For $s = 3$ and $\varphi = \pi/2$ this expression yields $\sigma^x$, so that we obtain the Toffoli gate: $(\Lambda^2(R))^k \approx \Lambda^2(\sigma^x)$ for suitable $k$. For $s = 1, 3$ and $\varphi = 0$ we get $\Lambda^2(\pm iI_{\mathcal{B}}) = \Lambda^2(\pm i) \otimes I_{\mathcal{B}}$ (the identity factor may be ignored).
Now we show how to eliminate unnecessary control qubits:
\[ \sigma^x[1]\,\Lambda(X)\,\sigma^x[1]\,\Lambda(X) = I_{\mathcal{B}} \otimes X. \]
Thus we can realize $\Lambda(\sigma^x)$, $K = \Lambda(i)$, $K^{-1} = \Lambda(-i)$ and any operator of the form $\exp(i\varphi\sigma^x)$. According to Problem 8.2, these operators form a complete basis.
8.14. a) Let $\Gamma' \subseteq \Gamma$ be a maximal subset with the property that the distance between any pair of distinct elements is greater than $\alpha\delta'$, where $\delta' = \delta/(1 - \alpha)$ ("maximal" means that any larger subset does not have this property). Then $\Gamma'$ is an $\alpha\delta'$-net for $\Gamma$. But $\Gamma$ is a $\delta$-net for $R$, so $\Gamma'$ is an $(\alpha\delta' + \delta)$-net for $R$. Note that $\alpha\delta' + \delta = \delta'$. The intersection of $\Gamma'$ with the $\delta'$-neighborhood of $R$ is an $\alpha$-sparse $\delta'$-net for $R$.
b) The proof is based on a volume consideration. For our purposes it is sufficient to consider the volume on the ambient space $P = \mathbf{L}(\mathbb{C}^M) \supset \mathbf{SU}(M)$ rather than the intrinsic volume on $\mathbf{SU}(M)$. We regard $P$ as a $2M^2$-dimensional real space endowed with a norm. The volume of the $a$-neighborhood of any point in $P$ is $a^{2M^2}$ (up to an arbitrary constant factor).
Let $a = \alpha r/(2q)$, $b = r + r/q + a$. Then the $a$-neighborhoods of the elements of the net do not overlap and are contained in the $b$-neighborhood of $I$. Therefore the number of elements does not exceed $(b/a)^{2M^2}$. But
\[ \frac{b}{a} = \frac{q}{\alpha}\Bigl(2 + \frac{2 + \alpha}{q}\Bigr) \le \frac{5q}{\alpha}, \]
since $\alpha < 1$ and $q > 1$.
8.15. The inclusion $[R_a, R_b] \subseteq R_{2ab}$ is almost obvious: if $\|X\| \le a$, $\|Y\| \le b$, then
\[ \bigl\|[X,Y]\bigr\| = \|XY - YX\| \le 2\|X\|\,\|Y\| \le 2ab. \]
So, we need to prove that $R_{ab/4} \subseteq [R_a, R_b]$. Without loss of generality we may assume that $a = b = 2$. Let us consider an arbitrary $Z \in \mathfrak{su}(M)$ such that $\|Z\| \le 1$. Our goal is to represent $Z$ as $[X,Y]$, where $\|X\|, \|Y\| \le 2$.
Let us choose an arbitrary basis in $\mathbb{C}^M$ in which $Z$ is diagonal,
\[ Z = i\begin{pmatrix} z_1 & & 0 \\ & \ddots & \\ 0 & & z_M \end{pmatrix}, \qquad z_1, \dots, z_M \in \mathbb{R}, \qquad \sum_{j=1}^{M} z_j = 0, \qquad |z_j| \le 1. \]
Lemma. There exists a permutation $\tau : \{1, \dots, M\} \to \{1, \dots, M\}$ such that $0 \le \sum_{j=1}^{k} z_{\tau(j)} \le 2$ for $k = 0, \dots, M$.
Proof. We will pick elements from the set $\{z_1, \dots, z_M\}$ one by one. Suppose that $k-1$ elements $z_{\tau(1)}, \dots, z_{\tau(k-1)}$ have already been chosen so that $w_{k-1} = \sum_{j=1}^{k-1} z_{\tau(j)}$ satisfies the inequality $0 \le w_{k-1} \le 2$. The sum of the remaining elements equals $-w_{k-1}$, so there are some $z_p \ge -w_{k-1}$ and $z_q \le 0$ among them. We set $\tau(k) = p$ if $w_{k-1} < 1$, and $\tau(k) = q$ otherwise. In both cases the number $w_k = w_{k-1} + z_{\tau(k)}$ satisfies $0 \le w_k \le 2$. □
By applying the permutation $\tau$ to the basis vectors, we can arrange that the partial sums
\[ w_k = \sum_{j=1}^{k} z_{\tau(j)}, \qquad k = 0, \dots, M, \]
satisfy the inequality $0 \le w_k \le 2$. Let $v_k = \sqrt{w_k/2}$. Then $Z = XY - YX$, where
\[ X = i\begin{pmatrix} 0 & v_1 & 0 & \cdots & 0 \\ v_1 & 0 & v_2 & \ddots & \vdots \\ 0 & v_2 & 0 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & v_{M-1} \\ 0 & \cdots & 0 & v_{M-1} & 0 \end{pmatrix}, \qquad Y = \begin{pmatrix} 0 & -v_1 & 0 & \cdots & 0 \\ v_1 & 0 & -v_2 & \ddots & \vdots \\ 0 & v_2 & 0 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & -v_{M-1} \\ 0 & \cdots & 0 & v_{M-1} & 0 \end{pmatrix}. \]
We will estimate the norms of $X$ and $Y$, using the fact that $0 \le v_k \le 1$. Note that $X$ and $Y$ are conjugate (by the matrix that has $i^k$ on the diagonal and 0 off the diagonal), so $\|X\| = \|Y\|$. Let us examine the matrix $A = i^{-1}X$, which obviously has the same norm, $\|A\| = \max_j |\lambda_j|$, where $\{\lambda_j\}$ is the spectrum of $A$. All entries of $A$ are nonnegative, so the Perron-Frobenius theorem [50] applies. Thus there exists an eigenvalue $\lambda_{\max} = \lambda_{\max}(A)$ such that $|\lambda_j| \le \lambda_{\max}$ for all $j$. The corresponding eigenvector has nonnegative coefficients. It is easy to see that
\[ \lambda_{\max}(A) = \lim_{N\to\infty} \bigl(\langle\xi|A^N|\xi\rangle\bigr)^{1/N}, \]
where $|\xi\rangle$ is an arbitrary vector with positive coefficients. Therefore $\lambda_{\max}(A)$ is a monotone function of the matrix entries, i.e., $\lambda_{\max}$ cannot decrease if the entries increase. It follows that $\|A\|$ does not exceed the largest eigenvalue of the matrix
\[ B = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 1 & \ddots & \vdots \\ 0 & 1 & 0 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & 1 \\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix}. \]
The latter is equal to $2\cos(\pi/(M+1)) < 2$.
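The construction in the solution of Problem 8.15 is algorithmic, so it may help to see it run. The sketch below is an illustration only: the function names, the tie-breaking rule (take the largest or the smallest remaining element) and the sample vector `z` are assumptions of this example; any element satisfying the inequalities in the lemma would do.

```python
import numpy as np

def order_with_bounded_partial_sums(z):
    """Greedy ordering from the lemma: all partial sums stay in [0, 2].

    Assumes sum(z) == 0 and |z_j| <= 1.
    """
    remaining = list(z)
    ordered, w = [], 0.0
    while remaining:
        if w < 1:
            pick = max(remaining)   # the largest remaining element is >= -w
        else:
            pick = min(remaining)   # the smallest remaining element is <= 0
        remaining.remove(pick)
        ordered.append(pick)
        w += pick
    return ordered

def commutator_factors(z_ordered):
    """Build X, Y with XY - YX = i * diag(z_ordered), as in the solution."""
    w = np.cumsum(z_ordered)[:-1]               # w_1, ..., w_{M-1}
    v = np.sqrt(np.clip(w, 0.0, None) / 2)      # v_k = sqrt(w_k / 2)
    X = 1j * (np.diag(v, 1) + np.diag(v, -1))
    Y = np.diag(-v, 1) + np.diag(v, -1)
    return X, Y

z = [0.75, -0.5, 0.5, -1.0, 0.25, 0.0]          # sample data: sum is 0, |z_j| <= 1
ordered = order_with_bounded_partial_sums(z)
X, Y = commutator_factors(ordered)

print("ordering:", ordered)
print("partial sums:", np.cumsum(ordered))
print("[X, Y] == i diag(z):", np.allclose(X @ Y - Y @ X, 1j * np.diag(ordered)))
print("norms:", np.linalg.norm(X, 2), np.linalg.norm(Y, 2))
```

The sample values are dyadic rationals so that the partial sums are exact in floating point; for general input the `np.clip` call guards against tiny negative rounding errors.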
8.16. By bringing $X$ to the diagonal form, the inequality (8.19) can be derived from the inequality $|e^{ix} - 1 - ix| \le O(x^2)$, $x \in \mathbb{R}$.
To prove (8.20), let us fix $X$ and $Y$ and use the formulas
\[ \exp(X + Y) = \lim_{n\to\infty} \bigl(e^{X/n}e^{Y/n}\bigr)^n, \qquad \exp(X)\exp(Y) = \bigl(e^{X/n}\bigr)^n\bigl(e^{Y/n}\bigr)^n. \]
To pass from $P = \bigl(e^{X/n}e^{Y/n}\bigr)^n$ to $Q = \bigl(e^{X/n}\bigr)^n\bigl(e^{Y/n}\bigr)^n$, one needs to pull all $e^{X/n}$ factors to the left, commuting them with $e^{Y/n}$ factors on the way. Thus the difference $Q - P$ can be represented as a sum of $n(n-1)/2$ terms of the form $U\bigl(e^{X/n}e^{Y/n} - e^{Y/n}e^{X/n}\bigr)V$, where $U$, $V$ are unitary. The norm of each term equals
\[ \bigl\|e^{X/n}e^{Y/n} - e^{Y/n}e^{X/n}\bigr\| \le \frac{1}{n^2}\bigl\|[X,Y]\bigr\| + O\Bigl(\frac{1}{n^3}\Bigr), \]
hence
\[ \bigl\|\exp(X)\exp(Y) - \exp(X + Y)\bigr\| \le \frac{1}{2}\bigl\|[X,Y]\bigr\| \le \|X\|\,\|Y\|. \]
The inequality (8.21) follows from (8.19). Indeed, let $A \approx B$ denote that $\|A - B\| \le O\bigl(\|X\|\,\|Y\|\,(\|X\| + \|Y\|)\bigr)$ (assuming that $X$ and $Y$ are fixed). Then
\[ [\![e^X, e^Y]\!] - I = \bigl((e^X - I)(e^Y - I) - (e^Y - I)(e^X - I)\bigr)e^{-X}e^{-Y} \approx XY - YX = [X,Y] \approx \exp([X,Y]) - I. \]
8.17. a) In view of the result of Problem 8.14a), it is sufficient to show that $[\![\Gamma_1, \Gamma_2]\!]$ is an $\bigl(r_1r_2/4,\ (25/6)\,r_1r_2/q\bigr)$-net.
Let $V \in S_{r_1r_2/4}$. Due to the property of the group commutator (8.18), there are some $V_1 \in S_{r_1}$ and $V_2 \in S_{r_2}$ such that
\[ d\bigl([\![V_1, V_2]\!], V\bigr) \le O(r_1r_2(r_1 + r_2)). \]
Each $V_j$ ($j = 1, 2$) can be approximated by an element $U_j \in \Gamma_j$ with precision $\delta_j = r_j/q$. Using the biinvariance of the distance function and the property of the group commutator (8.17), we obtain
\begin{align*}
d\bigl([\![U_1,U_2]\!], [\![V_1,V_2]\!]\bigr) &\le d\bigl([\![U_1,U_2]\!], [\![V_1,U_2]\!]\bigr) + d\bigl([\![V_1,U_2]\!], [\![V_1,V_2]\!]\bigr) \\
&= d\bigl([\![V_1^{-1}U_1, U_2]\!], I\bigr) + d\bigl(I, [\![V_1, U_2^{-1}V_2]\!]\bigr) \\
&\le 2d(V_1,U_1)\,d(U_2,I) + 2d(V_1,I)\,d(V_2,U_2) \\
&\le 2\delta_1(r_2 + \delta_2) + 2r_1\delta_2 = 4r_1r_2/q + 2r_1r_2/q^2.
\end{align*}
Therefore
\[ d\bigl([\![U_1,U_2]\!], V\bigr) \le O(r_1r_2(r_1 + r_2)) + 4r_1r_2\bigl(q^{-1} + q^{-2}/2\bigr) \le (4 + f(r,q))\,r_1r_2/q, \]
where $f(r,q) = 2q^{-1} + O(r_1 + r_2)\,q$. If $r_1$, $r_2$ are small enough (as in the condition of the problem), then $f(r,q)$ is small too; we may assume that $f(r,q) \le 1/6$. Thus $d\bigl([\![U_1,U_2]\!], V\bigr) \le (25/6)\,r_1r_2/q$.
b) Let $V \in S_{r_1}$. Then there is some $U_1 \in \Gamma_1$ such that $d(V, U_1) \le \delta_1$. Therefore $U_1^{-1}V \in S_{\delta_1} \subseteq S_{r_2}$. It follows that $d(U_1^{-1}V, U_2) \le \delta_2$ for some $U_2 \in \Gamma_2$, but $d(U_1^{-1}V, U_2) = d(V, U_1U_2)$.
c) We just need to iterate the previous argument. Note that the elements $Z_j \in \Gamma_j$ can be found by an effective procedure which involves $O(n)$ group operations and calculations of the distance function.
S9. Problems of Section 9
9.1. First, we apply $k$ copies of the circuit $U$; then we compute the majority function bitwise, i.e., we apply $m$ copies of an operator $M$ that realizes $\mathrm{MAJ}_\oplus$ with $s$ ancillas. Thus the complete circuit can be represented symbolically as $W = M^{\otimes m}U^{\otimes k}$. We need to estimate the probability of obtaining the correct result $y = F(x)$, given by
\[ p(y|x) = \sum_{\substack{y_1,\dots,y_k \\ z_1,\dots,z_k}} \Bigl| \bigl\langle y_1, z_1, \dots, y_k, z_k, y, 0^{ms} \bigr| W \bigl| (x, 0^{N-n})^k, 0^m, 0^{ms} \bigr\rangle \Bigr|^2, \]
assuming that $\sum_z \bigl|\langle F(x), z|U|x, 0^{N-n}\rangle\bigr|^2 = 1 - \varepsilon_x$, where $\varepsilon_x \le \varepsilon < 1/2$ for each $x$.
If more than half of the output registers of the initial circuit contain $F(x)$, then the result of the bitwise majority vote will necessarily be $F(x)$. Therefore, similarly to (4.1) on p. 37, we have
\begin{align*}
1 - p(F(x)|x) &\le \sum_{\substack{S \subseteq \{1,\dots,k\} \\ |S| \le k/2}}\ \sum_{\substack{y_1,\dots,y_k \\ y_j = F(x) \Leftrightarrow j \in S}}\ \sum_{z_1,\dots,z_k} \Bigl| \bigl\langle y_1, z_1, \dots, y_k, z_k \bigr| U^{\otimes k} \bigl| (x, 0^{N-n})^k \bigr\rangle \Bigr|^2 \\
&= \sum_{\substack{S \subseteq \{1,\dots,k\} \\ |S| \le k/2}} (1 - \varepsilon_x)^{|S|}\,\varepsilon_x^{k-|S|} < \lambda^k, \qquad \text{where } \lambda = 2\sqrt{(1 - \varepsilon)\varepsilon}.
\end{align*}
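9.2. Let $|\xi\rangle = |x, 0^{N-n}\rangle$ and $\mathcal{M} = |F(x)\rangle \otimes \mathcal{B}^{\otimes(N-m)}$ (cf. Remark 10.1). We have
\[ \langle\xi|U^\dagger\Pi_{\mathcal{M}}U|\xi\rangle \ge 1 - \varepsilon, \qquad \|\tilde{U} - U\| \le L\delta. \]
Thus, we do the following estimate:
\begin{align*}
\Bigl| \langle\xi|\tilde{U}^\dagger\Pi_{\mathcal{M}}\tilde{U}|\xi\rangle - \langle\xi|U^\dagger\Pi_{\mathcal{M}}U|\xi\rangle \Bigr|
&= \Bigl| \langle\xi|(\tilde{U}^\dagger - U^\dagger)\Pi_{\mathcal{M}}\tilde{U}|\xi\rangle + \langle\xi|U^\dagger\Pi_{\mathcal{M}}(\tilde{U} - U)|\xi\rangle \Bigr| \\
&\le \Bigl| \langle\xi|(\tilde{U}^\dagger - U^\dagger)\Pi_{\mathcal{M}}\tilde{U}|\xi\rangle \Bigr| + \Bigl| \langle\xi|U^\dagger\Pi_{\mathcal{M}}(\tilde{U} - U)|\xi\rangle \Bigr| \\
&\le \|\tilde{U}^\dagger - U^\dagger\| + \|\tilde{U} - U\| \le 2L\delta,
\end{align*}
which implies that $\langle\xi|\tilde{U}^\dagger\Pi_{\mathcal{M}}\tilde{U}|\xi\rangle \ge 1 - \varepsilon - 2L\delta$.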
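9.3. If we change the basis in the controlled qubit, we get the operator $\Lambda^2(-1) = \Lambda(\sigma^z)$. Indeed,
\[ H[2]\,\Lambda(\sigma^x)[1,2]\,H[2] = \Lambda(H\sigma^xH)[1,2] = \Lambda(\sigma^z)[1,2]. \]
[Circuit diagram: controlled $\sigma^x$ with $H$ on both sides of the controlled qubit, equal to controlled $H\sigma^xH$, equal to controlled $\sigma^z$.]
This operator multiplies $|1,1\rangle$ by $-1$ and does not change the remaining basis vectors.
Let us see what happens if we change the basis in the controlling qubit. The resulting operator is $H[1]\,\Lambda(\sigma^x)[1,2]\,H[1]$; we calculate its matrix elements:
\begin{align*}
\bigl\langle a, b \bigr| H[1]\,\Lambda(\sigma^x)[1,2]\,H[1] \bigl| c, d \bigr\rangle
&= \sum_{x,y} \Bigl(\frac{1}{\sqrt{2}}(-1)^{ax}\Bigr) \bigl\langle x, b \bigr| \Lambda(\sigma^x)[1,2] \bigl| y, d \bigr\rangle \Bigl(\frac{1}{\sqrt{2}}(-1)^{cy}\Bigr) \\
&= \frac{1}{2}\sum_{x,y} (-1)^{ax+cy}\,\delta_{x,y}\,\delta_{b,\,y\oplus d} = \frac{1}{2}(-1)^{(a+c)(b+d)}.
\end{align*}
Thus
\[ H[1]\,\Lambda(\sigma^x)[1,2]\,H[1] = \frac{1}{2}\begin{pmatrix} 1 & 1 & 1 & -1 \\ 1 & 1 & -1 & 1 \\ 1 & -1 & 1 & 1 \\ -1 & 1 & 1 & 1 \end{pmatrix} = -\sigma^x[1]\,\sigma^x[2]\,V, \]
where $V = I - 2|\xi\rangle\langle\xi|$ and $|\xi\rangle = \frac{1}{2}\sum_{x,y}|x,y\rangle$; the overall sign is an inessential phase factor. (Recall that a multiqubit version of the operator $V$ was defined on page 85.)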
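Both basis changes discussed in the solution of Problem 9.3 are small enough to check by brute force. The sketch below assumes the basis order $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ with the controlling qubit first; the variable names are introduced for this illustration only. It also makes the sign in the relation to $\sigma^x[1]\sigma^x[2]V$ explicit.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
SX = np.array([[0, 1], [1, 0]])
I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])                 # Lambda(sigma^x)[1,2]

# Basis change in the controlled (second) qubit gives Lambda(sigma^z).
controlled_z = np.kron(I2, H) @ CNOT @ np.kron(I2, H)

# Basis change in the controlling (first) qubit.
W = np.kron(H, I2) @ CNOT @ np.kron(H, I2)

# Closed formula (1/2) (-1)^{(a+c)(b+d)}; rows are (a,b), columns are (c,d).
formula = np.array([[0.5 * (-1) ** ((a + c) * (b + d))
                     for c in (0, 1) for d in (0, 1)]
                    for a in (0, 1) for b in (0, 1)])

xi = np.ones(4) / 2
V = np.eye(4) - 2 * np.outer(xi, xi)

print(np.allclose(controlled_z, np.diag([1, 1, 1, -1])))
print(np.allclose(W, formula))
print(np.allclose(W, -np.kron(SX, SX) @ V))
```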
9.4. Part a) follows from part b).
b) Let us describe a circuit that gives an approximate solution. First of all, we write the recursive formula
\[ |\eta_{n,q}\rangle = \cos\vartheta\,|0\rangle \otimes |\eta_{n-1,q'}\rangle + \sin\vartheta\,|1\rangle \otimes |\eta_{n-1,q''}\rangle, \tag{S9.1} \]
where
\[ q' = 2^{n-1},\quad q'' = q - 2^{n-1},\quad \vartheta = \arccos\sqrt{q'/q} \qquad \text{if } q > 2^{n-1}; \]
\[ q' = q,\quad q'' = 1,\quad \vartheta = 0 \qquad \text{if } q \le 2^{n-1}. \]
We describe a computation procedure based on formula (S9.1). It consists of the following steps.
1. Compute $q'$, $q''$ and $\vartheta/\pi$, with the latter number represented as an approximation by $l$ binary digits. Store the results of the computation in supplementary qubits.
2. Apply the operator
\[ R(\vartheta) = \begin{pmatrix} \cos\vartheta & -\sin\vartheta \\ \sin\vartheta & \cos\vartheta \end{pmatrix} \]
to the first qubit of the register in which we are composing $|\eta_{n,q}\rangle$. (It initially contains $|0^n\rangle$.)
3. In the remaining $n - 1$ bits, produce a state depending on the value of the first bit: if it equals 0, then create the state $|\eta_{n-1,q'}\rangle$ (by calling the procedure recursively); otherwise create $|\eta_{n-1,q''}\rangle$.
4. Reverse step 1 to clear the supplementary memory.
The operator $R(\vartheta)$ is realized approximately. Let $\vartheta/\pi = \sum_{k=1}^{l} a_k2^{-k}$. Then $R(\vartheta) \approx R(\pi/2^l)^{a_l} \cdots R(\pi/2)^{a_1}$ with precision $O(2^{-l})$. Thus the action of the operator $R(\vartheta)$, controlled by $\vartheta$, is represented as the product of operators $\Lambda(R(\pi/2^k))$, where the $k$-th bit of the number $\vartheta/\pi$ controls the application of the operator $R(\pi/2^k)$.
The overall precision of the constructed circuit is $\delta = O(n2^{-l})$; its size, expressed in terms of the length of the input and the precision, is $\mathrm{poly}(n + \log(1/\delta))$.
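The recursion (S9.1) can be traced classically by computing amplitudes instead of building a circuit. The sketch below assumes that $|\eta_{n,q}\rangle$ denotes the uniform superposition $q^{-1/2}\sum_{j<q}|j\rangle$ on $n$ qubits with the first qubit as the most significant bit, that $1 \le q \le 2^n$, and that the angle is computed exactly rather than to $l$ binary digits; the function name is hypothetical.

```python
import math

def uniform_prefix_state(n, q):
    """Amplitudes of |eta_{n,q}> built by the recursion (S9.1).

    Assumes 1 <= q <= 2**n; index j of the returned list is the basis state |j>.
    """
    if n == 0:
        return [1.0]
    half = 2 ** (n - 1)
    if q > half:
        q0, q1 = half, q - half
        theta = math.acos(math.sqrt(q0 / q))
    else:
        q0, q1, theta = q, 1, 0.0
    low = uniform_prefix_state(n - 1, q0)     # branch with first bit 0
    high = uniform_prefix_state(n - 1, q1)    # branch with first bit 1
    return ([math.cos(theta) * a for a in low] +
            [math.sin(theta) * a for a in high])

amplitudes = uniform_prefix_state(3, 5)
print([round(a, 4) for a in amplitudes])
```

In the branch $q \le 2^{n-1}$ the second recursive call is multiplied by $\sin 0 = 0$, which is why the placeholder value $q'' = 1$ is harmless.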
c) We describe the realization of the Fourier transform operator found by D. Coppersmith [19] and, independently, by D. Deutsch.
We enumerate the qubits in descending order from $n - 1$ to 0. Thus a number $x$, $0 \le x < 2^n$, is represented as $x_{n-1} \cdots x_0 = \sum_{k=0}^{n-1} 2^kx_k$, so that the exponent in the definition of the operator $F_q$ ($q = 2^n$) can be written as follows:
\[ \frac{xy}{2^n} = \sum_{k=0}^{n-1}\sum_{j=0}^{n-1} 2^{k+j-n}x_ky_j \equiv \sum_{k+j<n} 2^{k+j-n}x_ky_j \pmod 1. \]
It is convenient to reverse the bit order in $x$, i.e., to replace $k$ by $n - 1 - k$. This way the Fourier transform operator is represented in the form
\[ F_{2^n} = V_nR_n, \]
where $R_n : |x_{n-1}, \dots, x_0\rangle \mapsto |x_0, \dots, x_{n-1}\rangle$, and $V_n$ has the following matrix elements:
\[ \langle y_{n-1}, \dots, y_0|V_n|x_{n-1}, \dots, x_0\rangle = \frac{1}{2^{n/2}} \exp\Bigl( 2\pi i \sum_{0 \le j \le k < n} 2^{-(k-j+1)}x_ky_j \Bigr). \]
[Figure S9.1: circuit diagram on the qubits $x_0, x_1, x_2, x_3$ with four $H$ gates and controlled phase gates $\omega_4$, $\omega_8$, $\omega_{16}$, where $\omega_n = e^{2\pi i/n}$.]
Fig. S9.1. Realization of the Fourier transform operator $F_{2^n}$ (for $n = 4$).
Let us analyze the above equation. If we only keep terms with $j = k$ in the sum, we will get the matrix elements of the operator $H^{\otimes n}$. It is seen by inspection (and can also be proved by induction) that the remaining terms are reproduced by this formula:
\[ V_n = H[n-1]\,P_{n-1}\,H[n-2]\,P_{n-2} \cdots H[1]\,P_1\,H[0], \qquad \langle y|P_k|x\rangle = \exp\Bigl( 2\pi i \sum_{j=0}^{k-1} 2^{-(k-j+1)}x_ky_j \Bigr)\delta_{y,x}. \tag{S9.2} \]
Indeed, computing the matrix element $\langle y|V|x\rangle$ from formula (S9.2) amounts to the summation over paths from $x$ to $y$. Looking at any path with nonzero contribution, we see that $x_k$ passes through $H[0], \dots, H[k-1]$ unchanged, whereas $y_0, \dots, y_{k-1}$ pass through $H[n-1], \dots, H[k]$ unchanged.
It remains to realize the operators $P_k$:
\[ P_k = S_{k,k-1}S_{k,k-2} \cdots S_{k,0}, \qquad \text{where } S_{k,j} = \Lambda^2\bigl(e^{2\pi i/2^{k-j+1}}\bigr)[k,j] \quad (k > j). \]
The resulting circuit for the operator $F_{2^n}$ is shown in Figure S9.1.
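The factorization $F_{2^n} = V_nR_n$ with $V_n$ given by (S9.2) can be compared against the Fourier matrix for small $n$. The sketch below works with dense $2^n \times 2^n$ matrices, which is only meant as a check and not as an efficient simulation. It assumes that qubit $k$ is bit $k$ of the basis index $x = \sum_k 2^kx_k$, that $F_q$ has matrix elements $q^{-1/2}e^{2\pi ixy/q}$, and that each diagonal operator $P_k$ carries the phase $\exp\bigl(2\pi i\sum_{j<k} 2^{-(k-j+1)}x_kx_j\bigr)$; all function names are hypothetical.

```python
import numpy as np

def hadamard_on(k, n):
    """H acting on bit k of an n-bit index (bit n-1 is the leftmost tensor factor)."""
    h = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    return np.kron(np.kron(np.eye(2 ** (n - 1 - k)), h), np.eye(2 ** k))

def phase_block(k, n):
    """Diagonal operator P_k: product of controlled phases between bit k and bits j < k."""
    diag = np.ones(2 ** n, dtype=complex)
    for x in range(2 ** n):
        xk = (x >> k) & 1
        s = sum(xk * ((x >> j) & 1) / 2 ** (k - j + 1) for j in range(k))
        diag[x] = np.exp(2j * np.pi * s)
    return np.diag(diag)

def bit_reversal(n):
    """Permutation R_n reversing the order of the n bits."""
    r = np.zeros((2 ** n, 2 ** n))
    for x in range(2 ** n):
        r[int(format(x, f"0{n}b")[::-1], 2), x] = 1
    return r

def fourier_from_circuit(n):
    v = hadamard_on(0, n)                       # rightmost factor H[0] acts first
    for k in range(1, n):
        v = hadamard_on(k, n) @ phase_block(k, n) @ v
    return v @ bit_reversal(n)

def fourier_matrix(n):
    q = 2 ** n
    idx = np.arange(q)
    return np.exp(2j * np.pi * np.outer(idx, idx) / q) / np.sqrt(q)

print(np.allclose(fourier_from_circuit(4), fourier_matrix(4)))
```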
9.5. $\mathrm{BPP} \subseteq \mathrm{BQP}$. A classical probabilistic computation can be represented by an invertible circuit $U_L \cdots U_1$, which, together with the input word $x$, uses a random sequence $r \in \mathbb{B}^s$ of 0s and 1s. (In addition to the result, the circuit can also produce some garbage; this does not matter.) For the quantum simulation of this circuit, we regard $U_j$ as unitary operators permuting the basis vectors and, instead of the random word $r$, we prepare the state
\[ |\psi\rangle = H^{\otimes s}|0^s\rangle = 2^{-s/2}\sum_{r \in \mathbb{B}^s} |r\rangle. \]
$\mathrm{BQP} \subseteq \mathrm{PP}$. Let a circuit $U_L \cdots U_1$ evaluate the predicate $Q(x)$ for $|x| = n$ with error probability $\le 1/3$, the total number of bits in the circuit being equal to $N$. The probability $p(x)$ of obtaining the result 1 can be expressed in terms of the projection $\Pi^{(1)} = |1\rangle\langle 1|$ applied to the first qubit:
\begin{align*}
p(x) &= \bigl\langle x, 0^{N-n} \bigr| U_1^\dagger U_2^\dagger \cdots U_L^\dagger\,\Pi^{(1)}[1]\,U_LU_{L-1} \cdots U_1 \bigl| x, 0^{N-n} \bigr\rangle \\
&= 2^{-h}\bigl\langle x, 0^{N-n} \bigr| V_LV_{L-1} \cdots V_{-L+1}V_{-L} \bigl| x, 0^{N-n} \bigr\rangle. \tag{S9.3}
\end{align*}
Here $V_L, \dots, V_{-L}$ are the operators $U_1^\dagger, \dots, \Pi^{(1)}[1], \dots, U_1$, which are renumbered and also renormalized as follows: if $U_k = H[m]$ (or if $U_k^\dagger = H[m]$), then the corresponding operator $V_j$ equals $\sqrt{2}\,H[m]$. The number of $H$ gates in the circuit is denoted by $h$.
The matrix elements of the operators $V_j \in \bigl\{\sqrt{2}H,\ K,\ K^\dagger,\ \Lambda(\sigma^x),\ \Lambda^2(\sigma^x),\ \Pi^{(1)}\bigr\}$ belong to the set
\[ M = \{0,\ +1,\ -1,\ +i,\ -i\}. \]
Multiplying the matrices, we obtain a sum of numbers from the set $M$. Since the quantity $p(x)$ we are interested in is real, we can limit ourselves to summing $\pm 1$. The multiplicities of the summands are expressed in the form
\[ \#_a(x) = \bigl|\{w : C_a(x, w) = 1\}\bigr|, \qquad a \in \{1, -1\}, \]
where the predicates $C_a(x, w)$ will be defined below. Thus we obtain the representation
\[ p(x) = 2^{-h}\bigl(\#_1(x) - \#_{-1}(x)\bigr). \]
The remaining part of the proof does not involve any quantum mechanics at all. First, we need an explicit description of the predicates $C_a(x, w)$. To this end, we express the matrix elements of the product $V_L \cdots V_{-L}$ in (S9.3) as a sum over all paths from the initial to the final state:
\[ (V_L \cdots V_{-L})_{ab} = \sum_{x_{L-1}, \dots, x_{-L+1}} (V_L)_{ax_{L-1}} \cdots (V_{-L})_{x_{-L+1}b}. \]
By definition, $C_a(u_{-L}, \dots, u_L)$ equals 1 if and only if
\[ u_{-L} = u_L = (x, 0^{N-n}) \qquad \text{and} \qquad \prod_{j=-L+1}^{L} (V_j)_{u_ju_{j-1}} = a. \]
It is easy to see that $C_a \in \mathrm{P}$: we only need to represent the matrix elements as powers of $i$ and sum the exponents modulo 4.
If $Q(x) = 0$, then $p(x) \le 1/3$; if $Q(x) = 1$, then $p(x) \ge 2/3$. Thus $Q(x) = 1$ if and only if
\[ p(x) = 2^{-h}\bigl(\#_1(x) - \#_{-1}(x)\bigr) > \frac{1}{2}. \]
This is equivalent to the condition
\[ \#_{-1}(x) + 2^{h-1} < \#_1(x). \]
It remains to verify that both sides of the last inequality can be represented in the form
\[ f(x) = \bigl|\{y : F(x, y) = 1\}\bigr|, \qquad F \in \mathrm{P}. \]
(Functions $f$ of this form constitute the so-called class #P.) We have already proved that $\#_a(x)$ has the specified form, so it will suffice to show that the class #P is closed under addition. Let $g(x) = \bigl|\{y : G(x, y) = 1\}\bigr|$, $G \in \mathrm{P}$. Then $f(x) + g(x) = \bigl|\{(y, z) : T(x, y, z) = 1\}\bigr|$, where $T(x, y, 0) = F(x, y)$ and $T(x, y, 1) = G(x, y)$. The proof is completed.
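The path-counting representation $p(x) = 2^{-h}(\#_1(x) - \#_{-1}(x))$ can be tried on a toy circuit. The sketch below is an illustration under assumptions: the one-qubit circuit $U = HKH$ (so $h = 2$), the input $x = 0$, and the helper `count_paths` are all chosen for this example; the enumeration is exponential in the circuit length and only demonstrates the bookkeeping.

```python
import itertools
from collections import Counter
import numpy as np

G = np.array([[1, 1], [1, -1]], dtype=complex)       # sqrt(2) * H
K = np.array([[1, 0], [0, 1j]], dtype=complex)
PROJ1 = np.array([[0, 0], [0, 1]], dtype=complex)    # projection |1><1|

def count_paths(ops, start):
    """Count closed paths start -> ... -> start, grouped by the value of their weight.

    `ops` lists the operators from left to right as they appear in the product.
    """
    dim = ops[0].shape[0]
    counts = Counter()
    for middle in itertools.product(range(dim), repeat=len(ops) - 1):
        states = (start,) + middle + (start,)
        weight = 1 + 0j
        for op, row, col in zip(ops, states, states[1:]):
            weight *= op[row, col]
        if weight != 0:
            counts[(int(round(weight.real)), int(round(weight.imag)))] += 1
    return counts

# Toy circuit U = H K H on one qubit, input |0>; it contains h = 2 Hadamard gates.
h = 2
ops = [G, K.conj().T, G, PROJ1, G, K, G]              # U^dagger, projection, U
counts = count_paths(ops, start=0)
p_paths = (counts[(1, 0)] - counts[(-1, 0)]) / 2 ** h

U = (G / np.sqrt(2)) @ K @ (G / np.sqrt(2))
p_direct = abs(U[1, 0]) ** 2

print(dict(counts), p_paths, p_direct)
```

The paths with weights $+i$ and $-i$ occur equally often, which is the cancellation used in the text when it restricts attention to the summands $\pm 1$.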
$\mathrm{PP} \subseteq \mathrm{PSPACE}$. This is obvious. We introduce two counters: one for $R_0$, the other for $R_1$. We scan through all possible values of $y$ and add 1 to the $k$-th counter ($k = 0, 1$) if $R_k(x, y) = 1$. Then we compare the values of the counters.
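S10. Problems of Section 10
10.1. Let $\rho = \sum_k p_k|\xi_k\rangle\langle\xi_k|$. We verify conditions 1)-3) for $\rho$.
1): This is obvious.
2): $\langle\eta|\rho|\eta\rangle = \sum_k p_k\langle\eta|\xi_k\rangle\langle\xi_k|\eta\rangle = \sum_k p_k\bigl|\langle\eta|\xi_k\rangle\bigr|^2 \ge 0$.
3): $\mathrm{Tr}\,\rho = \sum_k p_k\langle\xi_k|\xi_k\rangle = \sum_k p_k = 1$.
Conversely, if $\rho$ satisfies 1)-3), then $\rho = \sum_k \lambda_k|\xi_k\rangle\langle\xi_k|$, where $\lambda_k$ are the eigenvalues and $\{|\xi_k\rangle\}$ is an orthonormal basis of eigenvectors of $\rho$.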
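10.2. We note that $\mathcal{N} \otimes \mathcal{F} \cong \mathbf{L}(\mathcal{F}^*, \mathcal{N})$, so that the vector $|\psi\rangle \in \mathcal{N} \otimes \mathcal{F}$ can be translated to a linear map $X : \mathcal{F}^* \to \mathcal{N}$. The Schmidt decomposition for $|\psi\rangle$ is basically the singular value decomposition for $X$ (see formula (6.2)); we just need to change the designation of the bra-vectors $\langle\nu_j| \in (\mathcal{F}^*)^* \cong \mathcal{F}$ to $|\eta_j\rangle$. The condition $\lambda_j \le 1$ follows from the fact that $\lambda_j^2$ are the nonzero eigenvalues of the operator $XX^\dagger = \mathrm{Tr}_{\mathcal{F}}\bigl(|\psi\rangle\langle\psi|\bigr)$, which implies that $\sum_j \lambda_j^2 = 1$.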
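10.3. As follows from the solution of the previous problem, the condition $\mathrm{Tr}_{\mathcal{F}}\bigl(|\psi_1\rangle\langle\psi_1|\bigr) = \mathrm{Tr}_{\mathcal{F}}\bigl(|\psi_2\rangle\langle\psi_2|\bigr)$ allows us to choose Schmidt decompositions for $|\psi_1\rangle$ and $|\psi_2\rangle$ with identical $\lambda_j$ and $|\xi_j\rangle$. We write down these decompositions:
\[ |\psi_k\rangle = \sum_j \lambda_j|\xi_j\rangle \otimes |\eta_j^{(k)}\rangle, \qquad k = 1, 2. \]
Since $\bigl\{|\eta_j^{(1)}\rangle\bigr\}$ and $\bigl\{|\eta_j^{(2)}\rangle\bigr\}$ are orthonormal families, there exists a unitary operator $U$ such that $U|\eta_j^{(1)}\rangle = |\eta_j^{(2)}\rangle$ for all $j$. Then
\[ (I_{\mathcal{N}} \otimes U)|\psi_1\rangle = \sum_j \lambda_j|\xi_j\rangle \otimes U|\eta_j^{(1)}\rangle = \sum_j \lambda_j|\xi_j\rangle \otimes |\eta_j^{(2)}\rangle = |\psi_2\rangle. \]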
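10.4. First, we prove (10.6). Let $A = \sum_j \lambda_j|\psi_j\rangle\langle\nu_j|$ be a singular value decomposition of $A$ (see Problem 6.4). Recall that $\lambda_j > 0$ and $\langle\psi_j|\psi_k\rangle = \langle\nu_j|\nu_k\rangle = \delta_{jk}$. The numbers $\lambda_j$ are precisely the nonzero eigenvalues of $\sqrt{A^\dagger A}$, so that $\|A\|_{\mathrm{tr}} = \sum_j \lambda_j$. Therefore
\[ |\mathrm{Tr}\,AB| = \Bigl| \sum_j \lambda_j\langle\nu_j|B|\psi_j\rangle \Bigr| \le \sum_j \lambda_j\bigl|\langle\nu_j|B|\psi_j\rangle\bigr| \le \sum_j \lambda_j\|B\| = \|A\|_{\mathrm{tr}}\|B\| \]
for any $B$. On the other hand, if $U$ is a unitary operator that takes $|\psi_j\rangle$ to $|\nu_j\rangle$, then $\mathrm{Tr}\,AU = \|A\|_{\mathrm{tr}}$.
To prove (10.7), suppose that $\sum_k |\xi_k\rangle\langle\eta_k| = A$, and $U$ is the operator defined above. Then
\[ \|A\|_{\mathrm{tr}} = |\mathrm{Tr}\,AU| = \Bigl| \sum_k \langle\eta_k|U|\xi_k\rangle \Bigr| \le \sum_k \bigl|\langle\eta_k|U|\xi_k\rangle\bigr| \le \sum_k \bigl\||\xi_k\rangle\bigr\|\,\bigl\||\eta_k\rangle\bigr\|. \]
But if we set $|\xi_k\rangle = \lambda_k|\psi_k\rangle$ and $|\eta_k\rangle = |\nu_k\rangle$, then $\sum_k \bigl\||\xi_k\rangle\bigr\|\,\bigl\||\eta_k\rangle\bigr\| = \|A\|_{\mathrm{tr}}$.
Finally, we prove that $\|\cdot\|_{\mathrm{tr}}$ is a norm. The only nontrivial property is the triangle inequality. It can be derived as follows:
\[ \|A_1 + A_2\|_{\mathrm{tr}} = \max_{U \in \mathbf{U}(\mathcal{N})} \bigl|\mathrm{Tr}(A_1 + A_2)U\bigr| \le \max_{U \in \mathbf{U}(\mathcal{N})} |\mathrm{Tr}\,A_1U| + \max_{U \in \mathbf{U}(\mathcal{N})} |\mathrm{Tr}\,A_2U| = \|A_1\|_{\mathrm{tr}} + \|A_2\|_{\mathrm{tr}}. \]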
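10.5. Property a):
\[ \|AB\|_{\mathrm{tr}} = \max_{U \in \mathbf{U}(\mathcal{N})} |\mathrm{Tr}\,ABU| \le \max_{U \in \mathbf{U}(\mathcal{N})} \|A\|_{\mathrm{tr}}\|BU\| \le \|A\|_{\mathrm{tr}}\|B\|. \]
Property b) is a special case of c), so we prove c). Let $A \in \mathbf{L}(\mathcal{N} \otimes \mathcal{M})$; then
\[ \|\mathrm{Tr}_{\mathcal{M}}A\|_{\mathrm{tr}} = \max_{U \in \mathbf{U}(\mathcal{N})} \bigl|\mathrm{Tr}\bigl((\mathrm{Tr}_{\mathcal{M}}A)U\bigr)\bigr| = \max_{U \in \mathbf{U}(\mathcal{N})} \bigl|\mathrm{Tr}\bigl(A(U \otimes I_{\mathcal{M}})\bigr)\bigr| \le \|A\|_{\mathrm{tr}}. \]
Property d):
\[ \|A \otimes B\|_{\mathrm{tr}} = \mathrm{Tr}\sqrt{(A \otimes B)^\dagger(A \otimes B)} = \mathrm{Tr}\Bigl(\sqrt{A^\dagger A} \otimes \sqrt{B^\dagger B}\Bigr) = \|A\|_{\mathrm{tr}}\|B\|_{\mathrm{tr}}. \]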
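10.6. a) Let $|\xi\rangle$ and $|\eta\rangle$ be two unit vectors, $a = \langle\xi|\eta\rangle$. We can represent $|\eta\rangle$ as $a|\xi\rangle + \sqrt{1 - |a|^2}\,|\psi\rangle$, where $|\psi\rangle$ is another unit vector orthogonal to $|\xi\rangle$. Hence
\[ \bigl\||\xi\rangle - |\eta\rangle\bigr\|^2 = |1 - a|^2 + (1 - |a|^2) = 2(1 - \mathrm{Re}\,a). \]
In the definition of the fidelity distance, one can multiply $|\eta\rangle$ by an arbitrary phase factor without leaving the minimization domain. This corresponds to multiplying $a$ by the same factor. Therefore the minimum is achieved when $a$ is real, nonnegative and the largest possible, i.e., $a = \sqrt{F(\rho, \gamma)}$.
b) Let $\mathcal{F} = \mathcal{N}^*$, so that the vectors $|\xi\rangle, |\eta\rangle \in \mathcal{N} \otimes \mathcal{N}^*$ can be associated with operators $X, Y \in \mathbf{L}(\mathcal{N})$ (due to the isomorphism $\mathcal{N} \otimes \mathcal{N}^* \cong \mathbf{L}(\mathcal{N})$). The condition $\mathrm{Tr}_{\mathcal{F}}(|\xi\rangle\langle\xi|) = \rho$ becomes $XX^\dagger = \rho$. One solution to this equation is $X = \sqrt{\rho}$; the most general solution is $X = \sqrt{\rho}\,U$, where $U$ is an arbitrary unitary operator (cf. Problem 10.3). Similarly, $Y = \sqrt{\gamma}\,V$, where $V$ is unitary. Thus
\[ \langle\xi|\eta\rangle = \mathrm{Tr}(X^\dagger Y) = \mathrm{Tr}\bigl(U^\dagger\sqrt{\rho}\sqrt{\gamma}\,V\bigr) = \mathrm{Tr}\bigl(\sqrt{\rho}\sqrt{\gamma}\,W\bigr), \qquad \text{where } W = VU^\dagger; \]
\[ F(\rho, \gamma) = \max_{W \in \mathbf{U}(\mathcal{N})} \bigl|\mathrm{Tr}\bigl(\sqrt{\rho}\sqrt{\gamma}\,W\bigr)\bigr|^2 = \bigl\|\sqrt{\rho}\sqrt{\gamma}\bigr\|_{\mathrm{tr}}^2. \]
c) Let $|\xi\rangle$ and $|\eta\rangle$ provide the maximum in (10.8). Then
\[ \|\rho - \gamma\|_{\mathrm{tr}} = \bigl\|\mathrm{Tr}_{\mathcal{F}}\bigl(|\xi\rangle\langle\xi| - |\eta\rangle\langle\eta|\bigr)\bigr\|_{\mathrm{tr}} \le \bigl\||\xi\rangle\langle\xi| - |\eta\rangle\langle\eta|\bigr\|_{\mathrm{tr}} = 2\sqrt{1 - \bigl|\langle\xi|\eta\rangle\bigr|^2} = 2\sqrt{1 - F(\rho, \gamma)}. \]
Thus $F(\rho, \gamma) \le 1 - \frac{1}{4}\|\rho - \gamma\|_{\mathrm{tr}}^2$, which is the required upper bound for the fidelity.
To obtain the lower bound, we will need the following lemma.
Lemma. Let $X$ and $Y$ be nonnegative Hermitian operators. Then
\[ \mathrm{Tr}(X - Y)^2 \le \|X^2 - Y^2\|_{\mathrm{tr}}. \]
Proof. Let $|\psi_j\rangle$, $\lambda_j$ be orthonormal eigenvectors and the corresponding eigenvalues of the operator $X - Y$. We have the following bound:
\[ \|X^2 - Y^2\|_{\mathrm{tr}} \ge \sum_j \bigl|\langle\psi_j|(X^2 - Y^2)|\psi_j\rangle\bigr|. \]
(Indeed, the right-hand side can be represented as $\mathrm{Tr}((X^2 - Y^2)U)$, where $U = \sum_j \pm|\psi_j\rangle\langle\psi_j|$.) To proceed, we need to estimate each term in the sum,
\begin{align*}
\langle\psi_j|(X^2 - Y^2)|\psi_j\rangle &= \frac{1}{2}\langle\psi_j|(X - Y)(X + Y)|\psi_j\rangle + \frac{1}{2}\langle\psi_j|(X + Y)(X - Y)|\psi_j\rangle \\
&= \lambda_j\langle\psi_j|(X + Y)|\psi_j\rangle, \\
\langle\psi_j|(X + Y)|\psi_j\rangle &\ge |\lambda_j|.
\end{align*}
Thus
\[ \sum_j \bigl|\langle\psi_j|(X^2 - Y^2)|\psi_j\rangle\bigr| \ge \sum_j \lambda_j^2 = \mathrm{Tr}(X - Y)^2. \]
□
Now we use the lemma:
\[ \sqrt{F(\rho, \gamma)} = \bigl\|\sqrt{\rho}\sqrt{\gamma}\bigr\|_{\mathrm{tr}} \ge \mathrm{Tr}\bigl(\sqrt{\rho}\sqrt{\gamma}\bigr) = 1 - \frac{1}{2}\mathrm{Tr}\bigl(\sqrt{\rho} - \sqrt{\gamma}\bigr)^2 \ge 1 - \frac{\|\rho - \gamma\|_{\mathrm{tr}}}{2}. \]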
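The two bounds of Problem 10.6 c), $1 - \frac{1}{2}\|\rho - \gamma\|_{\mathrm{tr}} \le \sqrt{F(\rho, \gamma)}$ and $F(\rho, \gamma) \le 1 - \frac{1}{4}\|\rho - \gamma\|_{\mathrm{tr}}^2$, can be sampled numerically using the formula $F(\rho, \gamma) = \|\sqrt{\rho}\sqrt{\gamma}\|_{\mathrm{tr}}^2$ from part b). The sketch below is not a proof; the random test states, the seed and the helper names are assumptions of this example, and the trace norm is computed as the sum of singular values.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """A random positive semidefinite matrix with unit trace."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def psd_sqrt(a):
    """Square root of a nonnegative Hermitian matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def trace_norm(a):
    """Sum of the singular values of a."""
    return np.linalg.svd(a, compute_uv=False).sum()

def fidelity(rho, gamma):
    return trace_norm(psd_sqrt(rho) @ psd_sqrt(gamma)) ** 2

rng = np.random.default_rng(0)
rho, gamma = random_density_matrix(3, rng), random_density_matrix(3, rng)
F = fidelity(rho, gamma)
t = trace_norm(rho - gamma)
print("lower bound holds:", 1 - t / 2 <= np.sqrt(F) + 1e-12)
print("upper bound holds:", F <= 1 - t ** 2 / 4 + 1e-12)
```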
S11. Problems of Section 11
11.1. We will solve this problem together with Problem 11.2 if we prove the following three things:
a) Any superoperator of type 2 or 3 (as described in the main text) has an operator sum decomposition (11.2).
b) The set of superoperators of the form (11.2) is closed under multiplication.
c) Any such superoperator can be represented as $\mathrm{Tr}_{\mathcal{F}}(V \cdot V^\dagger)$.
We proceed with the proofs.
a) Superoperators of type 3 already have the required form. For the partial trace $\mathrm{Tr}_{\mathcal{F}}$ we have the following representation:
\[ \mathrm{Tr}_{\mathcal{F}}\,\rho = \sum_m W_m\rho W_m^\dagger, \qquad W_m = I_{\mathcal{N}} \otimes \langle m| : \mathcal{N} \otimes \mathcal{F} \to \mathcal{N}, \]
where $\{|m\rangle\}$ is an orthonormal basis of $\mathcal{F}$. We note that $W_m\bigl(|j, k\rangle\bigr) = \delta_{mk}|j\rangle$ and $W_m^\dagger\bigl(|j\rangle\bigr) = |j, m\rangle$.
b) Let $T = \sum_m A_m \cdot A_m^\dagger$ and $R = \sum_k B_k \cdot B_k^\dagger$. Then
\[ RT = \sum_k B_k\Bigl(\sum_m A_m \cdot A_m^\dagger\Bigr)B_k^\dagger = \sum_{k,m} (B_kA_m) \cdot (B_kA_m)^\dagger, \]
\[ \sum_{k,m} (B_kA_m)^\dagger(B_kA_m) = \sum_m A_m^\dagger\Bigl(\sum_k B_k^\dagger B_k\Bigr)A_m = \sum_m A_m^\dagger A_m = I. \]
c) Let a superoperator $T$ be decomposed into the sum (11.2) of $s$ terms, and let $\mathcal{F}$ be an $s$-dimensional space with the basis vectors denoted by $|m\rangle$. The linear map
\[ V = \sum_m A_m \otimes |m\rangle : |\xi\rangle \mapsto \sum_m A_m|\xi\rangle \otimes |m\rangle \]
is an isometric embedding since
\[ V^\dagger V = \sum_{k,m} \bigl(A_k^\dagger \otimes \langle k|\bigr)\bigl(A_m \otimes |m\rangle\bigr) = \sum_{k,m} A_k^\dagger A_m\langle k|m\rangle = I. \]
Moreover, $T = \mathrm{Tr}_{\mathcal{F}}(V \cdot V^\dagger)$. Indeed,
\[ \mathrm{Tr}_{\mathcal{F}}(V\rho V^\dagger) = \sum_{m,k} \mathrm{Tr}_{\mathcal{F}}\bigl(A_m\rho A_k^\dagger \otimes |m\rangle\langle k|\bigr) = \sum_m A_m\rho A_m^\dagger. \]
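Part c) of the solution of Problem 11.1 is a concrete recipe: stack the operators $A_m$ into $V = \sum_m A_m \otimes |m\rangle$. The sketch below carries it out for one example. The choice of an amplitude-damping-type pair $A_0, A_1$ with parameter $p = 0.3$, the test matrix `rho` and the function names are assumptions of this illustration.

```python
import numpy as np

def stinespring_isometry(kraus_ops):
    """V = sum_m A_m (x) |m>, as a matrix from N to N (x) F with F of dimension s."""
    s = len(kraus_ops)
    v = 0
    for m, a in enumerate(kraus_ops):
        e_m = np.zeros((s, 1))
        e_m[m, 0] = 1.0
        v = v + np.kron(a, e_m)               # A_m (x) |m>
    return v

def partial_trace_last(op, dim_keep, dim_traced):
    """Trace out the second tensor factor of an operator on C^dim_keep (x) C^dim_traced."""
    reshaped = op.reshape(dim_keep, dim_traced, dim_keep, dim_traced)
    return np.einsum('ambm->ab', reshaped)

p = 0.3                                        # example parameter, chosen arbitrarily
kraus = [np.array([[1, 0], [0, np.sqrt(1 - p)]]),
         np.array([[0, np.sqrt(p)], [0, 0]])]

rho = np.array([[0.6, 0.2 - 0.1j],
                [0.2 + 0.1j, 0.4]])

V = stinespring_isometry(kraus)
via_isometry = partial_trace_last(V @ rho @ V.conj().T, 2, len(kraus))
via_operator_sum = sum(a @ rho @ a.conj().T for a in kraus)

print("V is an isometry:", np.allclose(V.conj().T @ V, np.eye(2)))
print("both forms agree:", np.allclose(via_isometry, via_operator_sum))
```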
11.2. See the solution to the previous problem.
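11.3.
\begin{align*}
\mathrm{Tr}_{\mathcal{F}}\bigl((U \otimes Y)\rho(U \otimes Y)^\dagger\bigr)
&= \mathrm{Tr}_{\mathcal{F}}\Bigl((U \otimes Y)\sum_{j,l,k,m} \rho_{jlkm}|j, l\rangle\langle k, m|\,(U \otimes Y)^\dagger\Bigr) \\
&= \sum_{j,l,k,m} \rho_{jlkm}\,\mathrm{Tr}_{\mathcal{F}}\Bigl(\bigl(U|j\rangle\langle k|U^\dagger\bigr) \otimes \bigl(Y|l\rangle\langle m|Y^\dagger\bigr)\Bigr) \\
&= \sum_{j,l,k,m} \rho_{jlkm}\bigl(U|j\rangle\langle k|U^\dagger\bigr)\underbrace{\mathrm{Tr}\bigl(Y|l\rangle\langle m|Y^\dagger\bigr)}_{\delta_{lm}} = U(\mathrm{Tr}_{\mathcal{F}}\,\rho)U^\dagger.
\end{align*}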
11.4. The physical realizability of $T$ is equivalent to the existence of a decomposition $T = \sum_m A_m \cdot A_m^\dagger$ such that $\sum_m A_m^\dagger A_m = I$. In the coordinate form, these equations read as follows:
\[ T_{(j'j)(k'k)} = \bigl\langle j' \bigr| T(|j\rangle\langle k|) \bigl| k' \bigr\rangle = \sum_m \langle j'|A_m|j\rangle\,\langle k|A_m^\dagger|k'\rangle = \sum_m a_{m(j'j)}\,a^*_{m(k'k)}, \tag{S11.1} \]
\[ \sum_{m,l} a^*_{m(lk)}\,a_{m(lj)} = \delta_{kj}, \tag{S11.2} \]
where $a_{m(j'j)} = \langle j'|A_m|j\rangle$. If we replace each index pair in parentheses by a single index, (S11.1) becomes $T_{JK} = \sum_m a_{mJ}a^*_{mK}$. This is a general form of a nonnegative Hermitian matrix; therefore (S11.1) is equivalent to conditions b) and c) in question. Equation (S11.2) is equivalent to condition a), provided (S11.1) is the case.
11.5. Properties a) and b) are equivalent to properties a) and b) in the previous problem.
It follows from the operator sum decomposition that a physically realizable superoperator takes nonnegative operators to nonnegative operators. On the other hand, if $T$ is physically realizable, it has the form $T : \rho \mapsto \mathrm{Tr}_{\mathcal{F}}(V\rho V^\dagger)$, hence the superoperator
\[ T \otimes I_{\mathbf{L}(\mathcal{G})} : \rho \mapsto \mathrm{Tr}_{\mathcal{F}}\bigl((V \otimes I_{\mathcal{G}})\rho(V \otimes I_{\mathcal{G}})^\dagger\bigr) \]
is also physically realizable. Therefore $T \otimes I_{\mathbf{L}(\mathcal{G})}$ takes nonnegative operators to nonnegative operators.
For the proof of the converse assertion, we will deduce property c) of the previous problem from property c) of the present problem. Specifically, we will show that the matrix $(T_{JK})$ (where $J = (j'j)$ and $K = (k'k)$) corresponds to an operator $Y$ that can be represented as $(T \otimes I_{\mathbf{L}(\mathcal{G})})X$ for some nonnegative Hermitian operator $X \in \mathbf{L}(\mathcal{N} \otimes \mathcal{G})$.
Let $\dim\mathcal{G} = \dim\mathcal{N}$, and let $|j\rangle$ denote the basis vectors in both spaces. We set
\[ X = \sum_{j,k} |j\rangle\langle k| \otimes |j\rangle\langle k| = |\psi\rangle\langle\psi|, \qquad \text{where } |\psi\rangle = \sum_j |j\rangle \otimes |j\rangle. \]
Then
\[ Y = (T \otimes I_{\mathbf{L}(\mathcal{G})})X = \sum_{j',j,k',k} T_{(j'j)(k'k)}\,|j'\rangle\langle k'| \otimes |j\rangle\langle k|, \]
hence $\langle j', j|Y|k', k\rangle = T_{(j'j)(k'k)}$.
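The operator $Y = (T \otimes I_{\mathbf{L}(\mathcal{G})})X$ from the solution of Problem 11.5 gives a practical test. The sketch below builds $Y$ for two maps on one qubit: an operator-sum map with an amplitude-damping-type pair of operators, and matrix transposition, which takes nonnegative operators to nonnegative operators but fails the test after tensoring with the identity. Both example maps, the parameter $p$ and the function names are assumptions of this illustration.

```python
import numpy as np

def choi_matrix(superop, dim):
    """Y = sum_{j,k} T(|j><k|) (x) |j><k|, i.e. (T (x) I) applied to |psi><psi|."""
    y = np.zeros((dim * dim, dim * dim), dtype=complex)
    for j in range(dim):
        for k in range(dim):
            unit = np.zeros((dim, dim), dtype=complex)
            unit[j, k] = 1.0
            y += np.kron(superop(unit), unit)
    return y

p = 0.3                                        # example parameter, chosen arbitrarily
kraus = [np.array([[1, 0], [0, np.sqrt(1 - p)]]),
         np.array([[0, np.sqrt(p)], [0, 0]])]

def damping(rho):
    """Operator-sum map rho -> sum_m A_m rho A_m^dagger."""
    return sum(a @ rho @ a.conj().T for a in kraus)

def transposition(rho):
    """Positive, but not physically realizable."""
    return rho.T

min_eig_damping = np.linalg.eigvalsh(choi_matrix(damping, 2)).min()
min_eig_transposition = np.linalg.eigvalsh(choi_matrix(transposition, 2)).min()

print("smallest eigenvalue, damping:      ", round(float(min_eig_damping), 9))
print("smallest eigenvalue, transposition:", round(float(min_eig_transposition), 9))
```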
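11.6. Let us represent $T$ in the form $T = \mathrm{Tr}_{\mathcal{F}'}(V \cdot V^\dagger)$, where $V : \mathcal{N} \to \mathcal{N} \otimes \mathcal{F} \otimes \mathcal{F}'$ is a unitary embedding (cf. Problem 11.1). By assumption, for any pure state $|\xi\rangle \in \mathcal{N}$ we have
\[ \mathrm{Tr}_{\mathcal{F} \otimes \mathcal{F}'}\bigl(V|\xi\rangle\langle\xi|V^\dagger\bigr) = \mathrm{Tr}_{\mathcal{F}}\Bigl(\mathrm{Tr}_{\mathcal{F}'}\bigl(V|\xi\rangle\langle\xi|V^\dagger\bigr)\Bigr) = \mathrm{Tr}_{\mathcal{F}}\bigl(T(|\xi\rangle\langle\xi|)\bigr) = |\xi\rangle\langle\xi|. \]
Therefore $|\psi\rangle = V|\xi\rangle$ is a product state: $V|\xi\rangle = |\xi\rangle \otimes |\eta\rangle$ (this follows from the observation made after the formulation of Problem 10.2). A priori, $|\eta\rangle$ depends on the vector $|\xi\rangle$, but the linearity of $V$ implies that $|\eta\rangle$ is actually constant. Therefore $TX = X \otimes \gamma$, where $\gamma = \mathrm{Tr}_{\mathcal{F}'}(|\eta\rangle\langle\eta|)$.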
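11.7. A superoperator $T$ of the type $\mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N}) \times \{1, \dots, r\}$ has the form $T = \sum_m T_m \otimes |m\rangle\langle m|$, where $T_m : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{N})$. If $T$ is physically realizable, then it satisfies conditions a)-c) of Problem 11.5, hence each superoperator $T_m$ satisfies conditions b) and c). If $T$ is consistent with (11.4), then $T_m(|\xi\rangle\langle\xi|) = \delta_{mj}|\xi\rangle\langle\xi|$ for any $|\xi\rangle \in \mathcal{L}_j$; this condition extends by linearity as follows:
\[ T_mX = \delta_{mj}X \qquad \text{for any } X \in \mathbf{L}(\mathcal{L}_j). \tag{S11.3} \]
Under these assumptions, we will prove that $T_m = \Pi_{\mathcal{L}_m} \cdot \Pi_{\mathcal{L}_m}$.
Let $\bigl\{|\xi_{j,p}\rangle : p = 1, \dots, \dim\mathcal{L}_j\bigr\}$ be an orthonormal basis of the space $\mathcal{L}_j$. Formula (S11.3) determines the value of $T_m(|\xi_{j,p}\rangle\langle\xi_{k,s}|)$ in the case $j = k$. Therefore it is sufficient to prove that if $j \ne k$, then $T_m(|\xi_{j,p}\rangle\langle\xi_{k,s}|) = 0$.
Let us fix $m, j, p, k, s$ and denote the operator $T_m(|\xi_{j,p}\rangle\langle\xi_{k,s}|)$ by $A$. It suffices to prove that $\mathrm{Tr}\,AB = 0$ for any $B$ or, equivalently, for any $B$ of the form $|\eta\rangle\langle\eta|$. Let $a = \mathrm{Tr}(A|\eta\rangle\langle\eta|) = \bigl\langle\eta\bigr|T_m(|\xi_{j,p}\rangle\langle\xi_{k,s}|)\bigl|\eta\bigr\rangle$. Consider the function
\[ f(x, y) \stackrel{\mathrm{def}}{=} \bigl\langle\eta\bigr|T_m(|\psi\rangle\langle\psi|)\bigl|\eta\bigr\rangle, \qquad \text{where } |\psi\rangle = x|\xi_{j,p}\rangle + y|\xi_{k,s}\rangle; \]
\[ f(x, y) = \delta_{mj}\,|\langle\eta|\xi_{j,p}\rangle|^2\,|x|^2 + axy^* + a^*x^*y + \delta_{mk}\,|\langle\eta|\xi_{k,s}\rangle|^2\,|y|^2. \]
Since the operator $T_m(|\psi\rangle\langle\psi|)$ is nonnegative, $f(x, y) \ge 0$ for any $x$ and $y$. But the condition $j \ne k$ implies that $\delta_{mj} = 0$ or $\delta_{mk} = 0$. Therefore $a = 0$.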
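11.8. In formula (11.7), let $k$ run from 1 to $r$, whereas $\rho \in \mathbf{L}(\mathcal{N})$. We define the "larger space" to be $\mathcal{N}' = \mathcal{N} \otimes \mathcal{M}$, where $\mathcal{M} = \mathbb{C}\bigl(|1\rangle, \dots, |r\rangle\bigr)$. The isometric embedding $V : \mathcal{N} \to \mathcal{N}'$ (which takes $\rho$ to $\gamma = V\rho V^\dagger$) and the subsequent projective measurement $\gamma \mapsto \sum_j \mathrm{Tr}(\gamma\Pi_{\mathcal{L}_j}) \cdot (j)$ are defined by the formulas
\[ V|\xi\rangle = \sum_k \sqrt{X_k}\,|\xi\rangle \otimes |k\rangle, \qquad \mathcal{L}_j = \mathcal{N} \otimes \mathbb{C}(|j\rangle). \]
It is clear that $\Pi_{\mathcal{L}_j} = I_{\mathcal{N}} \otimes |j\rangle\langle j|$, hence $\mathrm{Tr}(V\rho V^\dagger\Pi_{\mathcal{L}_j}) = \mathrm{Tr}(\rho X_j)$.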
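11.9. We assume that the measurement is destructive (which is indicated in Figure S11.1 by the measured qubits being discarded to the trash bin). Thus the measurement is the following transformation of two quantum bits into two classical bits:
\[ T : \rho \mapsto \sum_{a,b} \langle\xi_{ab}|\rho|\xi_{ab}\rangle \cdot (a, b). \]
To realize the transformation $T$, Alice applies the unitary operator
\[ H[1]\,\Lambda(\sigma^x)[1,2] : |\xi_{ab}\rangle \mapsto |b, a\rangle, \]
interchanges the qubits and measures them in the basis $\{|0\rangle, |1\rangle\}$. Then she sends the measurement results to Bob.
[Figure S11.1: circuit diagram with a controlled $\sigma^x$ and $H$ on Alice's side, classically controlled $\sigma^x$ and $\sigma^z$ on Bob's side, and the auxiliary state $|\xi_{00}\rangle$.]
Fig. S11.1. The circuit for quantum teleportation. The ~ symbol indicates the quantum state being teleported. The other two input symbols denote Alice's and Bob's halves of the auxiliary state $|\xi_{00}\rangle$. The dashed lines represent classical bits sent by Alice to Bob.
Suppose the initial state of the first qubit was $\rho$. After the measurement, the overall state of the system becomes
\[ \gamma = (T \otimes I_{\mathbf{L}(\mathcal{B})})\bigl(\rho \otimes |\xi_{00}\rangle\langle\xi_{00}|\bigr) = \sum_{a,b} \bigl(a, b, \gamma_{ab}\bigr) = \sum_{a,b} p_{ab} \cdot \bigl(a, b, p_{ab}^{-1}\gamma_{ab}\bigr), \qquad p_{ab} = \mathrm{Tr}\,\gamma_{ab}, \]
\[ \gamma_{ab} = \bigl(\langle\xi_{ab}| \otimes I_{\mathcal{B}}\bigr)\bigl(\rho \otimes |\xi_{00}\rangle\langle\xi_{00}|\bigr)\bigl(|\xi_{ab}\rangle \otimes I_{\mathcal{B}}\bigr). \]
Here $p_{ab}$ is the probability to get the measurement outcome $(a, b)$, whereas $p_{ab}^{-1}\gamma_{ab}$ is the corresponding conditional quantum state. Note that $\langle\xi_{ab}|$ and $|\xi_{ab}\rangle$ are regarded as operators of types $\mathcal{B}^{\otimes 2} \to \mathbb{C}$ and $\mathbb{C} \to \mathcal{B}^{\otimes 2}$ (resp.), so that $\langle\xi_{ab}| \otimes I_{\mathcal{B}} : \mathcal{B}^{\otimes 3} \to \mathcal{B}$ and $|\xi_{ab}\rangle \otimes I_{\mathcal{B}} : \mathcal{B} \to \mathcal{B}^{\otimes 3}$.
We now describe Bob's actions aimed at the recovery of the initial state $\rho$. Without loss of generality we may assume that $\rho = |\psi\rangle\langle\psi|$. (Indeed, the whole process of measurement and recovery can be described by a superoperator. If it preserves pure states, it also preserves mixed states, due to the linearity.) In this case the state after the measurement is also pure: $\gamma_{ab} = |\psi_{ab}\rangle\langle\psi_{ab}|$. Let $|\psi\rangle = z_0|0\rangle + z_1|1\rangle$; then
\begin{align*}
|\psi_{ab}\rangle &= \bigl(\langle\xi_{ab}| \otimes I_{\mathcal{B}}\bigr)\bigl(|\psi\rangle \otimes |\xi_{00}\rangle\bigr) = \bigl(\langle\xi_{ab}| \otimes I_{\mathcal{B}}\bigr)\Bigl(\sum_{c,d} z_c|c\rangle \otimes \frac{1}{\sqrt{2}}|d, d\rangle\Bigr) \\
&= \frac{1}{\sqrt{2}}\sum_{c,d} z_c\langle\xi_{ab}|c, d\rangle\,|d\rangle = \frac{1}{2}\sum_{c,d} z_c(-1)^{bc}\delta_{c\oplus a,\,d}\,|d\rangle \\
&= \frac{1}{2}\sum_c z_c(-1)^{bc}|a \oplus c\rangle = \frac{1}{2}(\sigma^x)^a(\sigma^z)^b|\psi\rangle.
\end{align*}
Note that the probability $p_{ab} = \langle\psi_{ab}|\psi_{ab}\rangle = \frac{1}{4}$ does not depend on the initial state $|\psi\rangle$. (In fact, if it depended on $|\psi\rangle$, then the recovery would not be possible; this follows from the result of Problem 11.6.) To restore the initial state, Bob applies the operators $\sigma^x$ and $\sigma^z$ with classical control: the controlling parameters are the measured values of $a$ and $b$. The result is as follows:
\[ (\sigma^z)^b(\sigma^x)^a\,\frac{|\psi_{ab}\rangle}{\sqrt{p_{ab}}} = |\psi\rangle. \]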
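The identity $|\psi_{ab}\rangle = \frac{1}{2}(\sigma^x)^a(\sigma^z)^b|\psi\rangle$ and Bob's correction can be checked branch by branch. The sketch below assumes the convention $|\xi_{ab}\rangle = 2^{-1/2}\sum_c (-1)^{bc}|c, c \oplus a\rangle$, which is read off from the matrix elements $\langle\xi_{ab}|c, d\rangle$ used in the derivation above; the test state and the function names are choices made for this example.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def bell(a, b):
    """|xi_ab> = 2^{-1/2} sum_c (-1)^{bc} |c, c XOR a>  (convention assumed here)."""
    v = np.zeros(4, dtype=complex)
    for c in (0, 1):
        v[2 * c + (c ^ a)] = (-1) ** (b * c) / np.sqrt(2)
    return v

def teleport_branch(psi, a, b):
    """Return (p_ab, recovered state) for the measurement outcome (a, b)."""
    state = np.kron(psi, bell(0, 0))                               # qubits 1, 2, 3
    bra = np.kron(bell(a, b).conj().reshape(1, 4), np.eye(2))      # <xi_ab| (x) I_B
    psi_ab = bra @ state                                           # unnormalized
    p_ab = np.vdot(psi_ab, psi_ab).real
    correction = np.linalg.matrix_power(SZ, b) @ np.linalg.matrix_power(SX, a)
    return p_ab, correction @ psi_ab / np.sqrt(p_ab)

psi = np.array([0.6, 0.8j])                                        # arbitrary unit vector
for a in (0, 1):
    for b in (0, 1):
        p_ab, recovered = teleport_branch(psi, a, b)
        print(a, b, round(p_ab, 6), np.allclose(recovered, psi))
```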
Remark. One may ask this question: what if there is some other quantum system, which is not involved in the teleportation procedure, but the qubit being teleported is initially entangled with it? Will the state be preserved in this case? The answer is "yes". Indeed, we have proved that the measurement followed by the recovery effects the superoperator $R = I_{\mathbf{L}(\mathcal{B})}$ on the teleported qubit. When the additional system is taken into account, the superoperator becomes $R \otimes I_{\mathbf{L}(\mathcal{G})} = I_{\mathbf{L}(\mathcal{B} \otimes \mathcal{G})}$.
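11.10. To express the fidelity $F\bigl(\operatorname{Tr}_{\mathcal{M}}(A\rho A^\dagger),\ \operatorname{Tr}_{\mathcal{M}}(B\gamma B^\dagger)\bigr)$, we start with some purifications of $\rho$ and $\gamma$ over the auxiliary space $\mathcal{G} = \mathcal{N}^*$, i.e., $\rho = \operatorname{Tr}_{\mathcal{G}}(|\xi\rangle\langle\xi|)$, $\gamma = \operatorname{Tr}_{\mathcal{G}}(|\eta\rangle\langle\eta|)$, where $|\xi\rangle, |\eta\rangle \in \mathcal{N}\otimes\mathcal{G}$. The states
\[ |\xi'\rangle = (A\otimes I_{\mathcal{G}})|\xi\rangle, \qquad |\eta'\rangle = (B\otimes I_{\mathcal{G}})|\eta\rangle, \qquad |\xi'\rangle, |\eta'\rangle \in \mathcal{F}\otimes\mathcal{M}\otimes\mathcal{G} \]
are particular purifications of $\rho' = \operatorname{Tr}_{\mathcal{M}}(A\rho A^\dagger)$ and $\gamma' = \operatorname{Tr}_{\mathcal{M}}(B\gamma B^\dagger)$ ($\rho', \gamma' \in \mathbf{L}(\mathcal{F})$) over the auxiliary space $\mathcal{M}\otimes\mathcal{G}$. The fidelity can be defined in terms of general purifications over the same space.$^1$ As follows from the result of Problem 10.3, all purifications of $\rho'$ and $\gamma'$ have the form $(I_{\mathcal{F}}\otimes U)|\xi'\rangle$ or $(I_{\mathcal{F}}\otimes V)|\eta'\rangle$ for some unitary $U$ and $V$. Therefore
\[ \begin{aligned}
\sqrt{F(\rho',\gamma')} &= \max_{U,V\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \bigl|\langle\eta'|(I_{\mathcal{F}}\otimes V^\dagger)(I_{\mathcal{F}}\otimes U)|\xi'\rangle\bigr|
 = \max_{W\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \bigl|\langle\eta'|\,I_{\mathcal{F}}\otimes W\,|\xi'\rangle\bigr| \\
 &= \max_{W\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \bigl|\langle\eta|(B^\dagger\otimes I_{\mathcal{G}})(I_{\mathcal{F}}\otimes W)(A\otimes I_{\mathcal{G}})|\xi\rangle\bigr| \\
 &= \max_{W\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \Bigl|\operatorname{Tr}\Bigl(W\,\operatorname{Tr}_{\mathcal{F}}\bigl((A\otimes I_{\mathcal{G}})|\xi\rangle\langle\eta|(B^\dagger\otimes I_{\mathcal{G}})\bigr)\Bigr)\Bigr| \\
 &= \max_{W\in\mathbf{U}(\mathcal{M}\otimes\mathcal{G})} \Bigl|\operatorname{Tr}\Bigl(W\,(T\otimes I_{\mathbf{L}(\mathcal{G})})(|\xi\rangle\langle\eta|)\Bigr)\Bigr|
 = \bigl\|(T\otimes I_{\mathbf{L}(\mathcal{G})})(|\xi\rangle\langle\eta|)\bigr\|_{\mathrm{tr}} .
\end{aligned} \]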
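The fidelity should be maximized over $\rho$ and $\gamma$, which is equivalent to maximizing the last expression over all unit vectors $|\xi\rangle$ and $|\eta\rangle$. On the other hand, Theorem 11.1 and formula (10.7) imply that
\[ \begin{aligned}
\|T\|_{\diamond} &= \|T\otimes I_{\mathbf{L}(\mathcal{G})}\|_1 = \sup_{A\neq 0} \frac{\|(T\otimes I_{\mathbf{L}(\mathcal{G})})A\|_{\mathrm{tr}}}{\|A\|_{\mathrm{tr}}}
 = \sup_{|\xi_k\rangle,\,|\eta_k\rangle} \frac{\bigl\|(T\otimes I_{\mathbf{L}(\mathcal{G})})\sum_k|\xi_k\rangle\langle\eta_k|\bigr\|_{\mathrm{tr}}}{\sum_k \bigl\||\xi_k\rangle\bigr\|\,\bigl\||\eta_k\rangle\bigr\|} \\
 &= \max\Bigl\{ \bigl\|(T\otimes I_{\mathbf{L}(\mathcal{G})})(|\xi\rangle\langle\eta|)\bigr\|_{\mathrm{tr}} \;:\; \bigl\||\xi\rangle\bigr\| = \bigl\||\eta\rangle\bigr\| = 1 \Bigr\}.
\end{aligned} \]
The two results agree.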
S12. Problems of Section 12
12.1. We will use the following simple property of the operator norm: if $X_j \in \mathbf{L}(\mathcal{N}_j, \mathcal{M}_j)$ and $X = \bigoplus_j X_j : \bigoplus_j \mathcal{N}_j \to \bigoplus_j \mathcal{M}_j$, then $\|X\| = \max_j \|X_j\|$. Indeed, the set of the eigenvalues for $X^\dagger X$ is the union of the corresponding sets for $X_j^\dagger X_j$.
$^1$ (Footnote to the solution of Problem 11.10.) The definition applies directly only if $\dim\mathcal{F} = (\dim\mathcal{N})(\dim\mathcal{M})$. As far as the general case is concerned, we refer to the remark after formula (10.8).
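Let $V : \mathcal{K} \to \mathcal{K}\otimes\mathcal{B}^{\otimes N}$ be the standard embedding, i.e., $V|\xi\rangle = |\xi\rangle\otimes|0^N\rangle$. Then $\|\tilde{U}_j V - V U_j\| \le \delta$ for each $j$. Therefore
\[ \bigl\|\tilde{W}(I_{\mathcal{N}}\otimes V) - (I_{\mathcal{N}}\otimes V)W\bigr\| = \Bigl\|\bigoplus_j \Pi_{\mathcal{L}_j}\otimes\bigl(\tilde{U}_j V - V U_j\bigr)\Bigr\| \le \delta . \]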
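12.2. We have
\[ \tilde{U} = \sum_{j\in\Omega} \Pi_{\mathcal{L}_j}\otimes P_j, \qquad P_j = \sum_{y\in\Delta} Q_y\otimes\bigl(R_j^\dagger\,\Pi_{\mathcal{M}_y}R_j\bigr). \]
Due to the result of Problem 12.1, it is sufficient to show that $P_j$ approximates $Q_{f(j)}$ for each $j$.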
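Let $|\xi\rangle \in \mathcal{K}$ be a unit vector. We need to estimate the norm of the vector $|\psi\rangle = |\tilde\eta\rangle - |\eta\rangle\otimes|0^N\rangle$, where $|\eta\rangle = Q_{f(j)}|\xi\rangle$ and $|\tilde\eta\rangle = P_j\bigl(|\xi\rangle\otimes|0^N\rangle\bigr)$. Such an estimate follows from the calculation
\[ \begin{aligned}
\langle\psi|\psi\rangle &= 2 - \bigl(\langle\eta|\otimes\langle 0^N|\bigr)|\tilde\eta\rangle - \langle\tilde\eta|\bigl(|\eta\rangle\otimes|0^N\rangle\bigr)
 = 2 - 2\operatorname{Re}\,\bigl(\langle\eta|\otimes\langle 0^N|\bigr)|\tilde\eta\rangle \\
 &= 2 - 2\operatorname{Re}\sum_{y\in\Delta} \langle\xi|Q_{f(j)}^\dagger Q_y|\xi\rangle\,\langle 0^N|R_j^\dagger\,\Pi_{\mathcal{M}_y}R_j|0^N\rangle \\
 &= 2\sum_{y\in\Delta} \bigl(1 - \operatorname{Re}\langle\xi|Q_{f(j)}^\dagger Q_y|\xi\rangle\bigr)\,\mathbf{P}(y|j)
 \le 2\sum_{y\neq f(j)} 2\,\mathbf{P}(y|j) \le 4\varepsilon .
\end{aligned} \]
Thus $\bigl\||\psi\rangle\bigr\| \le 2\sqrt{\varepsilon}$.
In the case where $V$ is the copy operator, $Q_{f(j)}^\dagger Q_y = \delta_{y,f(j)} I_{\mathcal{K}}$, so we get $2\varepsilon$ instead of $4\varepsilon$ in the above estimate. Hence $\bigl\||\psi\rangle\bigr\| \le \sqrt{2\varepsilon}$.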
S13. Problems of Section 13
13.1. We denote the required probability by $p(X, l)$. If $h_1, \dots, h_l$ do not generate the whole group $X$, then they are contained in some maximal proper subgroup $Y \subset X$. For each $Y$ the probability of such an event does not exceed $2^{-l}$, because $|Y| \le |X|/2$. Therefore $p(X, l) \ge 1 - K(X)\cdot 2^{-l}$, where $K(X)$ is the number of maximal proper subgroups of the group $X$. Subgroups of an Abelian group $X$ are in one-to-one correspondence with subgroups of the character group $X^*$; maximal proper subgroups correspond to minimal nonzero subgroups. Each minimal nonzero subgroup is generated by a single nonzero element, so that $K(X) < |X^*| = |X|$.
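The bound $p(X,l) \ge 1 - K(X)\cdot 2^{-l}$ can be illustrated on a small example. The sketch below restricts itself, as an assumption made for simplicity, to the cyclic group $\mathbb{Z}_n$, where $h_1,\dots,h_l$ generate the group exactly when $\gcd(h_1,\dots,h_l,n) = 1$ and the maximal proper subgroups are $p\mathbb{Z}_n$ for the prime divisors $p$ of $n$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import gcd

def generation_probability(n, l):
    # Exact probability that l independent uniform elements of Z_n generate Z_n
    # (brute-force enumeration of all n**l tuples).
    good = 0
    for hs in product(range(n), repeat=l):
        g = n
        for h in hs:
            g = gcd(g, h)
        good += (g == 1)
    return Fraction(good, n ** l)

def count_maximal_subgroups(n):
    # For the cyclic group Z_n the maximal proper subgroups are p*Z_n, p prime, p | n.
    return sum(1 for p in range(2, n + 1)
               if n % p == 0 and all(p % d for d in range(2, p)))

n, l = 12, 3
p_exact = generation_probability(n, l)
bound = 1 - Fraction(count_maximal_subgroups(n), 2 ** l)
```

For $n = 12$, $l = 3$ the exact value is $(1 - 2^{-3})(1 - 3^{-3}) = 91/108$, which indeed exceeds the bound $1 - 2\cdot 2^{-3} = 3/4$.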
13.2. We construct a classical operator $V_b \in \mathbf{L}(\mathcal{B}\otimes\mathcal{B}^{\otimes n})$ (the basis vectors in $\mathcal{B}^{\otimes n}$ are numbered from $0$ to $2^n - 1$) such that
\[ V_b|0, 0\rangle = |0, 1\rangle, \qquad V_b|1, 0\rangle = |1, b\rangle . \]
Then the circuit $V_b^{-1}[0, B]\; U[B, A]\; V_b[0, B]$ realizes the operator $\Lambda(U_b)[0, A]$, where $B$ denotes a set of $n$ ancillas.
13.3. Let us calculate the expected value of $\exp\bigl(h\bigl(\sum_{r=1}^{s} v_r - sp\bigr)\bigr)$ and choose $h$ so as to minimize the result:
\[ \mathbf{E}\Bigl[\exp\Bigl(h\Bigl(\sum_{r=1}^{s} v_r - sp\Bigr)\Bigr)\Bigr] = \Bigl(e^{-hp}\bigl((1 - p_*) + p_* e^{h}\bigr)\Bigr)^{s} = \exp\bigl(-H(p, p_*)\,s\bigr) \]
for $h = \ln\frac{p}{1-p} - \ln\frac{p_*}{1-p_*}$, where $H(p, p_*) = (1-p)\ln\frac{1-p}{1-p_*} + p\ln\frac{p}{p_*}$. Now we use the obvious inequality $\mathbf{E}[e^{f}] \ge \Pr[f \ge 0]$, which holds for any random variable $f$. Note that $h \ge 0$ if $p \ge p_*$, and $h \le 0$ if $p \le p_*$. Thus we get the inequalities
\[ \Pr\Bigl[s^{-1}\sum_{r=1}^{s} v_r \ge p\Bigr] \le \exp\bigl(-H(p, p_*)\,s\bigr) \quad\text{if } p \ge p_*, \]
\[ \Pr\Bigl[s^{-1}\sum_{r=1}^{s} v_r \le p\Bigr] \le \exp\bigl(-H(p, p_*)\,s\bigr) \quad\text{if } p \le p_* . \]
This is a sharper version of Chernoff's bound (cf. [61]).
The function $H(p, p_*)$ (called the relative entropy) can be represented as $(1-p)\ln\frac{1}{1-p_*} + p\ln\frac{1}{p_*} - H(p)$, where $H(p) = (1-p)\ln\frac{1}{1-p} + p\ln\frac{1}{p}$ is the entropy of the probability distribution $(w_0 = 1 - p,\ w_1 = p)$. It is easy to check that
\[ H(p_*, p_*) = 0, \qquad \frac{\partial H(p, p_*)}{\partial p}\Bigr|_{p = p_*} = 0, \qquad \frac{\partial^2 H(p, p_*)}{\partial p^2} = -\frac{\partial^2 H(p)}{\partial p^2} \ge 4; \]
hence $H(p, p_*) \ge 2(p - p_*)^2$. The inequality (13.4) follows.
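The two estimates of this solution can be compared with an exact binomial tail. This is a minimal sketch, assuming that the $v_r$ are independent Bernoulli variables with $\Pr[v_r = 1] = p_*$; the function names and the sample parameters are chosen here for illustration only.

```python
from math import ceil, comb, exp, log

def rel_entropy(p, p_star):
    # H(p, p_*) = (1-p) ln((1-p)/(1-p_*)) + p ln(p/p_*), for 0 < p, p_* < 1
    return (1 - p) * log((1 - p) / (1 - p_star)) + p * log(p / p_star)

def upper_tail(s, p_star, p):
    # Exact Pr[(1/s) * sum v_r >= p] for s independent Bernoulli(p_star) variables.
    k_min = ceil(p * s - 1e-9)          # smallest count k with k/s >= p
    return sum(comb(s, k) * p_star ** k * (1 - p_star) ** (s - k)
               for k in range(k_min, s + 1))

s, p_star, p = 50, 0.3, 0.5             # here p >= p_*, so the first inequality applies
tail = upper_tail(s, p_star, p)
chernoff = exp(-rel_entropy(p, p_star) * s)
quadratic = exp(-2 * (p - p_star) ** 2 * s)
```

The expected ordering is `tail <= chernoff <= quadratic`: the relative-entropy bound is at least as sharp as the one obtained from $H(p,p_*) \ge 2(p-p_*)^2$.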
13.4. The Fourier transform operator $F = F_q$ acts on $n = \lceil\log_2 q\rceil$ qubits; more precisely, it acts on the subspace $\mathcal{N} = \mathbb{C}\bigl(|0\rangle, \dots, |q-1\rangle\bigr) \subseteq \mathcal{B}^{\otimes n}$. Let
\[ |\psi_x\rangle = \frac{1}{\sqrt{q}}\sum_{y=0}^{q-1} \exp\Bigl(-2\pi i\,\frac{xy}{q}\Bigr)|y\rangle, \qquad x = 0, \dots, q-1 . \]
These are the eigenvectors of the operator $X : |y\rangle \mapsto |(y+1) \bmod q\rangle$, the corresponding eigenvalues being equal to $\lambda_x = e^{2\pi i\varphi_x}$, $\varphi_x = x/q$. Obviously, $F|x\rangle = |\psi_{-x}\rangle$. The Fourier transform operator is also characterized by its action on the vectors $|\psi_x\rangle$: $F|\psi_x\rangle = |x\rangle$.
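Thus, we need to transform $|\psi_x\rangle$ into $|x\rangle$. The general scheme of the solution is as follows:
\[ |\psi_x\rangle\otimes|0\rangle \xrightarrow{\ W\ } |\psi_x\rangle\otimes|x\rangle \xrightarrow{\ \leftrightarrow\ } |x\rangle\otimes|\psi_x\rangle \xrightarrow{\ V\ } |x\rangle\otimes|\psi_0\rangle \xrightarrow{\ I\otimes U^{-1}\ } |x\rangle\otimes|0\rangle . \]
(The extra $|0\rangle$ in the first and the last expression corresponds to ancillas.) We will realize each of the operators $W$, $V$, $U$ with precision $\delta' = \delta/3$.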
$W$ is a measuring operator with respect to the orthogonal decomposition of $\mathcal{N}$ into the eigenspaces of $X$; it performs a garbage-free measurement of the parameter $x$ that corresponds to the eigenvalue $\lambda_x$. To realize $W$ with precision $\delta'$, we need to measure $x$ with error probability $\le \varepsilon = (\delta')^2/2 = \Theta(2^{-2l})$ and remove the garbage (see Section 12.3). To measure $x$, we estimate the corresponding phase $\varphi_x = x/q$ with precision $2^{-(n+1)}$ (this is a different kind of precision!), multiply by $q$ and round to an integer.
According to Theorem 13.3, the phase estimation is done by a circuit of size $O(n\log n + nl)$ and depth $O(\log n + \log l)$ with the aid of the operator
\[ \Lambda_m(X) : |p, y\rangle \mapsto |p,\ (y + p) \bmod q\rangle, \qquad p = 0, \dots, 2^m - 1;\quad y = 0, \dots, q-1, \]
where $m = n + k$, $k = O(\log l + \log\log n)$. Note that the operator $\Lambda_m(X)$ itself has smaller complexity: it can be realized by a circuit of size $O(nk + k^2)$ and depth $O(\log n + (\log k)^2)$ (see Problem 2.14b). However, the multiplication of the estimated phase by $q$ makes a significant contribution to the overall size, namely, $O(n^2)$.
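The operator $V$ acts as follows:
\[ V|x, y\rangle = \exp\Bigl(2\pi i\,\frac{xy}{q}\Bigr)|x, y\rangle . \]
(This phase cancels the factor $\exp(-2\pi i\,xy/q)$ in $|\psi_x\rangle$, so that $V$ takes $|x\rangle\otimes|\psi_x\rangle$ to $|x\rangle\otimes|\psi_0\rangle$.) To realize this operator, we compute $xy/q$ with precision $2^{-m}$, where $m = l + O(1)$. More exactly, we find a $p \in \{0, \dots, 2^m - 1\}$ such that
\[ \bigl|2^{-m}p - xy/q\bigr|_{\bmod 1} \le 2^{-m} . \]
Then we apply the operator $\Lambda_m(e^{2\pi i/2^m})$ controlled by this $p$, and uncompute $p$.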
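The relations used in the scheme ($X|\psi_x\rangle = \lambda_x|\psi_x\rangle$, $F|\psi_x\rangle = |x\rangle$, and $V$ turning $|x\rangle\otimes|\psi_x\rangle$ into $|x\rangle\otimes|\psi_0\rangle$) can be verified numerically for a small modulus. The sketch below assumes the conventions $|\psi_x\rangle = q^{-1/2}\sum_y e^{-2\pi i xy/q}|y\rangle$ and $F|x\rangle = q^{-1/2}\sum_y e^{2\pi i xy/q}|y\rangle$; vectors are plain Python lists, and all function names are illustrative.

```python
import cmath
import math

q = 5   # a small modulus that is not a power of two

def psi(x):
    # |psi_x> = q^{-1/2} * sum_y exp(-2 pi i x y / q) |y>
    return [cmath.exp(-2j * math.pi * x * y / q) / math.sqrt(q) for y in range(q)]

def fourier(vec):
    # F|x> = q^{-1/2} * sum_y exp(2 pi i x y / q) |y>, extended by linearity
    return [sum(vec[x] * cmath.exp(2j * math.pi * x * y / q) for x in range(q))
            / math.sqrt(q) for y in range(q)]

def shift(vec):
    # X|y> = |(y + 1) mod q>: the new amplitude at y is the old amplitude at y - 1
    return [vec[(y - 1) % q] for y in range(q)]

def apply_V(x, vec):
    # V|x, y> = exp(2 pi i x y / q) |x, y>, with the first register fixed to |x>
    return [cmath.exp(2j * math.pi * x * y / q) * vec[y] for y in range(q)]

def close(u, v, tol=1e-9):
    return all(abs(s - t) < tol for s, t in zip(u, v))

def basis(x):
    return [1.0 if y == x else 0.0 for y in range(q)]

checks = []
for x in range(q):
    lam = cmath.exp(2j * math.pi * x / q)                          # eigenvalue lambda_x
    checks.append(close(shift(psi(x)), [lam * t for t in psi(x)]))  # X psi_x = lambda_x psi_x
    checks.append(close(fourier(psi(x)), basis(x)))                # F psi_x = |x>
    checks.append(close(apply_V(x, psi(x)), psi(0)))               # V removes the phases
```

All entries of `checks` should be `True`; in particular the last check fails if the sign in the exponent of $V$ is reversed.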
When estimating the value of $xy/q$, we operate with $\Theta(l)$-digit real numbers; therefore $p$ is computed by a circuit of size $O(l^2\log l)$ and depth $O((\log l)^2)$ (see Problem 2.14a). The operator $\Lambda_m(e^{2\pi i/2^m})$ has approximately the same complexity (see Lemma 13.4). Thus $V$ is realized by an $O(l^2\log l)$-size, $O((\log l)^2)$-depth circuit over the standard basis.
Finally, we need to realize the operator $U : |0\rangle \mapsto |\psi_0\rangle$. If $q = 2^n$, this is very simple: $U = H^{\otimes n}$. The general case was considered in Problem 9.4a, but the procedure proposed in the solution does not parallelize. We now describe a completely different procedure, which lends itself to parallelization. Instead of constructing the vector $|\psi_0\rangle = |\eta_{0,q}\rangle$, we will try to create the vector
\[ |\eta_{a,q}\rangle = \sqrt{\frac{1 - e^{-2a}}{1 - e^{-2aq}}}\ \sum_{x=0}^{q-1} e^{-ax}|x\rangle \]
for $a = c_1 2^{-(n+l)}$, where $c_1$ is a suitable constant. It is obvious that
\[ \bigl\||\eta_{a,q}\rangle - |\eta_{0,q}\rangle\bigr\| \le O(aq), \]
so the desired precision $\Theta(2^{-l})$ is achieved this way.
Let us consider the vector $|\eta_{a,\infty}\rangle$, which belongs to the infinite-dimensional Hilbert space $\mathcal{H} = \mathbb{C}^{\mathbb{N}}$ (where $\mathbb{N} = \{0, 1, 2, \dots\}$). Of course, it is impossible to create this state exactly, but the $m$-qubit state $|\eta_{a,2^m}\rangle$ may be a good approximation, provided $m$ is large enough. Clearly,
\[ \bigl\||\eta_{a,r}\rangle - |\eta_{a,\infty}\rangle\bigr\| \le O(e^{-ar}), \]
so it suffices to choose $r = 2^m$ such that $e^{-ar} \le c_2 2^{-l}$ for a suitable constant $c_2$. Thus $m = \bigl\lceil\log_2\bigl(-a^{-1}\ln(c_2 2^{-l})\bigr)\bigr\rceil = n + \Theta(l)$. We will show how to construct the state $|\eta_{a,2^m}\rangle$ later.
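Now, we invoke the function
\[ G : \mathbb{N} \to \mathbb{N}\times\{0, \dots, q-1\}, \qquad G(x) = \bigl(\lfloor x/q\rfloor,\ (x \bmod q)\bigr) \]
and the corresponding linear map $G_q : \mathcal{H} \to \mathcal{H}\otimes\mathbb{C}\bigl(|0\rangle, \dots, |q-1\rangle\bigr)$. It transforms the state $|\eta_{a,\infty}\rangle$ into $|\eta_{aq,\infty}\rangle\otimes|\eta_{a,q}\rangle$, because $e^{-ax} = e^{-aq\lfloor x/q\rfloor}\,e^{-a(x \bmod q)}$. Thus, the state $|\eta_{a,q}\rangle$ can be obtained as follows: we create $|\eta_{a,\infty}\rangle$, split it into $|\eta_{a,q}\rangle$ and $|\eta_{aq,\infty}\rangle$, and get rid of the last state (which is as difficult as creating it).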
In the computation, we must replace the operator $G_q$ by its finite version, $|x, 0^n\rangle \mapsto \bigl|\lfloor x/q\rfloor,\ (x \bmod q)\bigr\rangle$, where $x \in \{0, \dots, 2^m - 1\}$. Note that the ratio $x/q$ ranges from $0$ to $2^{m-n+1} = 2^{O(l)}$, hence the corresponding circuit has size $O(nl + l^2\log l)$ and depth $O(\log n + (\log l)^2)$.
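Thus, it remains to show how to create the state $|\eta_{b,2^m}\rangle$ for $b = a$ and $b = aq$. This is not hard because $|\eta_{b,2^m}\rangle$ is a product state:
\[ |\eta_{b,2^m}\rangle = |\zeta(\theta_{m-1})\rangle\otimes\cdots\otimes|\zeta(\theta_0)\rangle, \qquad |\zeta(\theta)\rangle = \cos(\pi\theta)|0\rangle + \sin(\pi\theta)|1\rangle, \]
where $\theta_j = (1/\pi)\arctan\bigl(e^{-2^{j} b}\bigr)$. Each vector $|\zeta(\theta_j)\rangle$ is obtained from $|0\rangle$ as follows:
\[ |\zeta(\theta_j)\rangle = \exp(-i\pi\theta_j\sigma^y)|0\rangle = K^{-1}H\exp(i\pi\theta_j\sigma^z)H\,|0\rangle . \]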
Moreover, all these vectors can be easily constructed at once by a circuit over the standard basis. Indeed,
\[ \exp(i\pi\theta_{m-1}\sigma^z)\otimes\cdots\otimes\exp(i\pi\theta_0\sigma^z)\,|x_{m-1}, \dots, x_0\rangle = \exp\Bigl(2\pi i\sum_{j=0}^{m-1}(1/2 - x_j)\,\theta_j\Bigr)|x_{m-1}, \dots, x_0\rangle . \]
Therefore we just need to compute the sum in the exponent with precision $\Theta(2^{-l})$, and proceed by analogy with the operator $V$.
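The product-state claim can be checked directly for small $m$. The sketch below assumes that $|\eta_{b,2^m}\rangle$ denotes the normalized vector with amplitudes proportional to $e^{-bx}$ for $x = 0, \dots, 2^m - 1$, and that the $j$-th binary digit of $x$ corresponds to the factor with angle $\theta_j = (1/\pi)\arctan(e^{-2^j b})$; the function names are illustrative.

```python
import math

def eta(b, m):
    # Normalized vector with amplitudes proportional to exp(-b x), x = 0 .. 2^m - 1.
    amps = [math.exp(-b * x) for x in range(2 ** m)]
    norm = math.sqrt(sum(t * t for t in amps))
    return [t / norm for t in amps]

def product_state(b, m):
    # Tensor product of one-qubit states cos(pi theta_j)|0> + sin(pi theta_j)|1>,
    # where bit j of x selects the cosine (bit 0) or sine (bit 1) amplitude.
    thetas = [math.atan(math.exp(-(2 ** j) * b)) / math.pi for j in range(m)]
    out = []
    for x in range(2 ** m):
        amp = 1.0
        for j in range(m):
            angle = math.pi * thetas[j]
            amp *= math.sin(angle) if (x >> j) & 1 else math.cos(angle)
        out.append(amp)
    return out

b, m = 0.1, 4
max_deviation = max(abs(s - t) for s, t in zip(eta(b, m), product_state(b, m)))
```

The two vectors should agree up to rounding errors, since $\sum_x e^{-2bx} = \prod_j (1 + e^{-2^{j+1}b})$ for $x$ ranging over all $m$-bit numbers.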
The circuits for the creation of the vectors $|\eta_{a,2^m}\rangle$ and $|\eta_{aq,2^m}\rangle$ have size $O(nl + l^2\log l)$ and depth $O(\log n + (\log l)^2)$. This estimate does not include the complexity of computing the numbers $\theta_j$ because such computation belongs to the pre-processing stage. We mention, however, that it can be done in time $\mathrm{poly}(l)$.
To summarize, the Fourier transform operator $F_q$ is realized by a circuit with the following parameters:
\[ \text{size} = O(n^2 + l^2\log l), \qquad \text{depth} = O(\log n + (\log l)^2). \]
S15. Problems of Section 15
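15.1. We represent the total Hilbert space in the form $\mathcal{B}^{\otimes n} = \mathcal{K}\otimes\mathcal{L}$, where $\mathcal{K}$ is the state space of qubits in $A$, and $\mathcal{L}$ is the state space of the remaining qubits. We need to reconstruct the state $\rho$ from $\operatorname{Tr}_{\mathcal{K}}\rho$.
Let us write an operator sum decomposition for the superoperator $T : \rho \mapsto \operatorname{Tr}_{\mathcal{K}}\rho$ (cf. Problem 11.2):
\[ T = \sum_m A_m\cdot A_m^\dagger, \qquad A_m = \langle m|\otimes I_{\mathcal{L}} : \mathcal{K}\otimes\mathcal{L} \to \mathcal{L}, \]
where $\{|m\rangle\}$ is an orthonormal basis of $\mathcal{K}$. We see that $T \in \mathcal{D}\cdot\mathcal{D}^\dagger$, where
\[ \mathcal{D} = \mathcal{K}^*\otimes I_{\mathcal{L}} \subseteq \mathbf{L}(\mathcal{K}\otimes\mathcal{L},\ \mathcal{L}). \]
If $X, Y \in \mathcal{D}$, then $Y^\dagger X \in \mathbf{L}(\mathcal{K})\otimes I_{\mathcal{L}} = \mathcal{E}(A)$. Consequently, the code $\mathcal{M}$ corrects errors from $\mathcal{D}$. It remains to apply Theorem 15.3.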
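15.2 (Cf. [11, 41]). Suppose that $\mathcal{M}$ is a code of type $(4, 1)$ correcting one-qubit errors. Then it must detect two-qubit errors, in particular errors in qubits 1, 2 as well as in qubits 3, 4. This means that an arbitrary state $\rho \in \mathbf{L}(\mathcal{M})$ can be restored both from the first two qubits and from the last two qubits (cf. Problem 15.1). Let us show that this is impossible.
Let $\mathcal{N}_1$ be the space of qubits 1, 2, and $\mathcal{N}_2$ the space of qubits 3, 4. Then $\mathcal{M}$ is a subspace of $\mathcal{N}_1\otimes\mathcal{N}_2$. Denote the inclusion map $\mathcal{M} \to \mathcal{N}_1\otimes\mathcal{N}_2$ by $V$. We have assumed the existence of error-correcting transformations, i.e., physically realizable superoperators $P_1 : \mathcal{N}_1 \to \mathcal{M}$ and $P_2 : \mathcal{N}_2 \to \mathcal{M}$ satisfying
\[ P_1\operatorname{Tr}_{\mathcal{N}_2}(V\rho V^\dagger) = P_2\operatorname{Tr}_{\mathcal{N}_1}(V\rho V^\dagger) = \rho \quad\text{for any } \rho \in \mathbf{L}(\mathcal{M}). \]
Therefore we can define a physically realizable superoperator
\[ P = (P_1\otimes P_2)(V\cdot V^\dagger) : \mathcal{M} \to \mathcal{M}\otimes\mathcal{M}, \]
which has the following properties:
\[ \operatorname{Tr}_{\mathcal{N}_2}P\rho = \operatorname{Tr}_{\mathcal{N}_2}\bigl((P_1\otimes P_2)(V\rho V^\dagger)\bigr) = P_1\operatorname{Tr}_{\mathcal{N}_2}(V\rho V^\dagger) = \rho, \]
\[ \operatorname{Tr}_{\mathcal{N}_1}P\rho = \operatorname{Tr}_{\mathcal{N}_1}\bigl((P_1\otimes P_2)(V\rho V^\dagger)\bigr) = P_2\operatorname{Tr}_{\mathcal{N}_1}(V\rho V^\dagger) = \rho . \]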
According to Problem 11.6, the first identity implies that $P\rho = \rho\otimes\gamma_2$, where $\gamma_2$ does not depend on $\rho$. Similarly, $P\rho = \gamma_1\otimes\rho$. We have arrived at a contradiction: $\gamma_1\otimes\rho = \rho\otimes\gamma_2$ for any $\rho$.
15.3. We will only give the idea of the solution. It is sufficient to examine the phase component $g^{(z)}$ of the error $g = g^{(x)} + g^{(z)}$ (the classical component $g^{(x)}$ is treated similarly). A syndrome bit equals 1 if and only if the star of the corresponding vertex contains an odd number of edges from $g^{(z)}$. Therefore we obtain the following problem. Let $D$ be the boundary of a 1-chain $C$ with $\mathbb{Z}_2$-coefficients (i.e., $D$ is a set of an even number of lattice vertices); we need to find such a chain $C_{\min}$ which contains the smallest number of edges.
It is not difficult to figure out that $C_{\min}$ is the disjoint union of paths that connect pairs of vertices of the set $D$ (two different paths cannot have a common edge). Therefore the problem of determining the error by its syndrome is reduced to the following weighted matching problem. There is a graph $G$ (in our case, the complete graph whose vertices are the vertices of the lattice) with a weight assigned to each edge (in our case, the shortest path length on the lattice). It is required to find a perfect matching with minimal total weight. (A perfect matching on a bipartite graph was defined in Problem 3.4, but here we talk about matching on an arbitrary unoriented graph.)
There exist polynomial algorithms solving the weighted matching problem (see, for example, [52, Chapter 11], where an algorithm based on ideas of linear programming is described).
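To make the reduction concrete, here is a small illustration of the matching step. It is not the polynomial algorithm referred to above: the sketch uses exhaustive search, which is exponential in $|D|$ and only suitable for a handful of syndrome vertices. It also assumes, for illustration, a square lattice of side `size` with periodic boundary conditions (a torus), so that the shortest path length between two vertices is the sum of the cyclic coordinate distances. All names are hypothetical.

```python
def torus_distance(u, v, size):
    # Shortest path length between lattice vertices u and v on a size x size torus.
    dx = abs(u[0] - v[0])
    dy = abs(u[1] - v[1])
    return min(dx, size - dx) + min(dy, size - dy)

def min_weight_matching(points, size):
    # Brute force: pair the first vertex with every other vertex and recurse.
    # Returns (total weight, list of matched pairs); expects an even number of points.
    if not points:
        return 0, []
    first, rest = points[0], points[1:]
    best = None
    for i, partner in enumerate(rest):
        weight, pairs = min_weight_matching(rest[:i] + rest[i + 1:], size)
        weight += torus_distance(first, partner, size)
        if best is None or weight < best[0]:
            best = (weight, [(first, partner)] + pairs)
    return best

syndrome = [(0, 0), (0, 1), (3, 3), (5, 3)]   # vertices with syndrome bit 1
weight, pairs = min_weight_matching(syndrome, size=6)
```

In this example the best matching pairs $(0,0)$ with $(0,1)$ and $(3,3)$ with $(5,3)$, for a total of $1 + 2 = 3$ edges; each matched pair is then connected by a shortest path to obtain $C_{\min}$.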
Appendix A
Elementary Number Theory
In this Appendix we outline some basic definitions and theorems of arithmetic. This, of course, is not a substitute for more detailed books; see e.g., [70, 33].
A.1. Modular arithmetic and rings. One says that $a$ is congruent to $b$ modulo $q$ and writes
\[ a \equiv b \pmod{q} \]
if $a - b$ is a multiple of $q$. This condition can be also expressed by the notation $q \mid (a - b)$, which reads "$q$ divides $a - b$". A set of all $(\mathrm{mod}\ q)$-congruent integers is called a congruence class. (For example, the set of even numbers and the set of odd numbers are congruence classes modulo 2.) Each class can be characterized by its canonical representative, i.e., an integer $r \in \{0, \dots, q-1\}$. We write
\[ a \bmod q = r, \]
which means precisely that $r$ is the residue of $a$ (the remainder of integer division of $a$ by $q$), i.e., $a = mq + r$, where $m \in \mathbb{Z}$, $r \in \{0, \dots, q-1\}$. In most cases we do not need to make a distinction between congruence classes and their canonical representatives, so the term "residue" refers to both. Thus $7 \bmod 3 = 1$, but we may also say that $(7 \bmod 3)$ is the congruence class containing 7 (i.e., the set $\{\dots, -5, -2, 1, 4, 7, \dots\}$) which has 1 as its canonical representative. In any case, the operation $x \mapsto (x \bmod q)$ takes an integer to a residue.
Residues can be added, subtracted or multiplied by performing the corresponding operation on integers they represent. Thus $r = r_1 r_2$ (modular multiplication) if and only if $a \equiv a_1 a_2 \pmod{q}$, where $a \bmod q = r$, $a_1 \bmod q = r_1$, and $a_2 \bmod q = r_2$. It is important to note that $a_1$ and $a_2$ can be replaced by any $(\mathrm{mod}\ q)$-congruent numbers, the product being congruent too. Indeed,
\[ \text{if } a_1 \equiv b_1 \text{ and } a_2 \equiv b_2, \text{ then } a_1 a_2 \equiv b_1 b_2 \pmod{q}. \]
What are the common properties of integer arithmetic operations and operations with residues? In the most abstract form, the answer is that both integers and $(\mathrm{mod}\ q)$ residues form commutative rings.
Definition A.1. A ring is a set $R$ equipped with two binary operations, "$+$" and "$\cdot$" (the dot is usually suppressed in writing), and two special elements, 0 and 1, so that the following relations hold:
\[ (a + b) + c = a + (b + c), \qquad a + b = b + a, \qquad a + 0 = a, \]
\[ (ab)c = a(bc), \qquad 1\cdot a = a\cdot 1 = a, \]
\[ (a + b)c = ac + bc, \qquad c(a + b) = ca + cb . \]
For any $a \in R$ there exists an element $v$ such that $a + v = 0$.
If, in addition, $ab = ba$ for any $a$ and $b$, then $R$ is called a commutative ring.
In what follows we consider only commutative rings, so we omit the adjective "commutative".
Note that the element $v$ in the above definition is unique. Indeed, if another element $v'$ satisfies $a + v' = 0$, then
\[ v' = v' + 0 = v' + (a + v) = (v' + a) + v = (a + v') + v = 0 + v = v + 0 = v . \]
Such $v$ is denoted by $-a$. The relations on the list imply other well-known relations, for example, $a\cdot 0 = 0$. Indeed,
\[ a\cdot 0 = a\cdot 0 + a\cdot 0 + (-(a\cdot 0)) = a(0 + 0) + (-(a\cdot 0)) = a\cdot 0 + (-(a\cdot 0)) = 0 . \]
The difference between two elements of a ring is defined as $a - b \stackrel{\mathrm{def}}{=} a + (-b)$. Any ring becomes an Abelian group if we forget about the multiplication and 1, but keep $+$ and 0. This group is called the additive group of the ring. The ring of residues modulo $q$ is denoted by $\mathbb{Z}/q\mathbb{Z}$, whereas the corresponding additive group is denoted by $\mathbb{Z}_q$ (this is just the cyclic group of order $q$).
What are the differences between the ring of integers $\mathbb{Z}$ and residue rings $\mathbb{Z}/q\mathbb{Z}$? There are many; for example, $\mathbb{Z}$ is infinite while $\mathbb{Z}/q\mathbb{Z}$ is finite. Another important distinction is as follows. For integers, $xy = 0$ implies that $x = 0$ or $y = 0$. This is not true for all residue rings (specifically, this is false in the case where $q$ is a composite number). Example: $2\cdot 5 \equiv 0 \pmod{10}$, although both 2 and 5 represent nonzero elements of $\mathbb{Z}/10\mathbb{Z}$. We say that an element $x$ of a ring $R$ is a zero divisor if $\exists\, y \neq 0\ (xy = 0)$. For example, 0, 2, 3, 4, 6, 8, 9, 10 are zero divisors in $\mathbb{Z}/12\mathbb{Z}$, whereas 1, 5, 7, 11 are not. It will be shown below that $r$ is a zero divisor in $\mathbb{Z}/q\mathbb{Z}$ if and only if $r$ and $q$ (regarded as integers) have a nontrivial common divisor.$^1$
Let us introduce another important concept. An element $x \in R$ is called invertible if there exists $y$ such that $xy = 1$; in this case we write $y = x^{-1}$. For example, $7 = 4^{-1}$ in $\mathbb{Z}/9\mathbb{Z}$, since $4\cdot 7 \equiv 1 \pmod{9}$. It is obvious that if $a$ and $b$ are invertible, then $ab$ is also invertible, and that $(ab)^{-1} = a^{-1}b^{-1}$. Therefore invertible elements form an Abelian group with respect to multiplication, which is denoted by $R^*$. For example, $\mathbb{Z}^* = \{1, -1\}$ and $(\mathbb{Z}/12\mathbb{Z})^* = \{1, 5, 7, 11\}$. In the latter case, invertible elements are exactly the elements which are not zero divisors. This is not a coincidence.
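The examples above are easy to reproduce by exhaustive search. The following sketch lists the zero divisors and the invertible elements of $\mathbb{Z}/q\mathbb{Z}$ directly from the definitions; the function names are chosen here for illustration, and the search is quadratic in $q$, so it is meant only for small moduli.

```python
def zero_divisors(q):
    # x is a zero divisor if x*y = 0 in Z/qZ for some nonzero residue y.
    return [x for x in range(q) if any((x * y) % q == 0 for y in range(1, q))]

def units(q):
    # x is invertible if x*y = 1 in Z/qZ for some residue y.
    return [x for x in range(q) if any((x * y) % q == 1 for y in range(q))]

def inverse(x, q):
    # Inverse of x in Z/qZ by exhaustive search (None if x is not invertible).
    return next((y for y in range(q) if (x * y) % q == 1), None)

zd12 = zero_divisors(12)
units12 = units(12)
inv_4_mod_9 = inverse(4, 9)
```

For $q = 12$ the two lists are complementary, in agreement with the remark that invertible elements of a finite ring are exactly those which are not zero divisors.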
Proposition A.1. If an element $x \in R$ is invertible, then $x$ is not a zero divisor. In the case where $R$ is finite, the converse is also true.
Proof. Suppose that $x$ is invertible and $xy = 0$. Then $y = (x^{-1}x)y = x^{-1}(xy) = x^{-1}\cdot 0 = 0$.
Now let us assume that $x$ is not a zero divisor, and $R$ is finite. Then some elements in the sequence $1, x, x^2, x^3, \dots$ must repeat, so there are some $n > m \ge 0$ such that $x^n = x^m$. Therefore $x^m(x^{n-m} - 1) = 0$. This implies that $x^{m-1}(x^{n-m} - 1) = 0$ because $x$ is not a zero divisor. Iterating this argument, we get $x^{n-m} - 1 = 0$, i.e., $x^{n-m} = 1$. Hence $x^{-1} = x^{n-m-1}$. $\square$
A.2. Greatest common divisor and unique factorization. One of the most fundamental properties of integers is as follows.
Theorem A.2. Let $a$ and $b$ be integers, at least one of which is not 0. Then there exists $d \ge 1$ such that
\[ d \mid a, \qquad d \mid b, \qquad d = ma + nb, \quad\text{where } m, n \in \mathbb{Z}. \]
Such $d$ is called the greatest common divisor of $a$ and $b$ and denoted by $\gcd(a, b)$. Explanation of the name: firstly, $d$ divides $a$ and $b$ by definition. Inasmuch as $d = ma + nb$, any common divisor of $a$ and $b$ also divides $d$; therefore it is not greater than $d$.
There are several ways to prove Theorem A.2. There is a constructive proof which actually provides an efficient algorithm for finding the numbers $d$, $m$ and $n$. This algorithm will be described in Section A.6. For now, we will use a shorter but more abstract argument. We note that the set $M = \{ma + nb : m, n \in \mathbb{Z}\}$ is a group with respect to addition, and that $a, b \in M$. Therefore Theorem A.2 can be obtained from the following more general result.
$^1$ One can easily prove this assertion by factoring $r$ and $q$ into prime numbers. The argument, however, relies on the fact that the factorization is unique. The uniqueness of factorization is actually a theorem which requires a proof. We will derive this theorem from an equivalent of the above assertion.
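For orientation, the numbers $d$, $m$ and $n$ of Theorem A.2 can be computed as follows. This is a sketch of the standard extended Euclidean algorithm and is not claimed to coincide with the presentation in Section A.6; the function name is an assumption made here.

```python
def extended_gcd(a, b):
    # Returns (d, m, n) with d = gcd(a, b) >= 0 and d == m*a + n*b.
    # Invariants: old_r == old_m*a + old_n*b and r == m*a + n*b at every step.
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_m, m = m, old_m - k * m
        old_n, n = n, old_n - k * n
    if old_r < 0:                       # normalize the sign so that d >= 1
        old_r, old_m, old_n = -old_r, -old_m, -old_n
    return old_r, old_m, old_n

d, m, n = extended_gcd(240, 46)
```

Here `d` equals $\gcd(240, 46) = 2$, and the returned coefficients satisfy $240m + 46n = 2$; the coefficients themselves are not unique.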
Theorem A.3. Let $M \neq \{0\}$ be a subgroup in the additive group of integers. Then $M$ is generated by a single element $d \ge 1$, i.e., $M = (d)$, where $(d) \stackrel{\mathrm{def}}{=} \{kd : k \in \mathbb{Z}\}$.
Proof. It is clear that $M$ contains at least one positive element. Let $d$ be the smallest positive element of $M$. Obviously, $(d) \subseteq M$, so it suffices to prove that any integer $x \in M$ is contained in $(d)$.
Let $r = x \bmod d$, i.e., $x = kd + r$, where $0 \le r < d$. Then $r = x - kd \in M$ (because $x, d \in M$, and $M$ is a group). Since $d$ is the smallest positive element in $M$, we conclude that $r = 0$. $\square$
Now we derive a few corollaries from Theorem A.2.
Corollary A.2.1. The residue $r = (a \bmod b) \in \mathbb{Z}/b\mathbb{Z}$ is invertible if and only if $\gcd(a, b) = 1$.
Proof. The residue $r = (a \bmod b)$ being invertible means that $ma \equiv 1 \pmod{b}$ for some $m$, i.e., the equation $ma + nb = 1$ is satisfied for some integers $m$ and $n$. But this is exactly the condition that $\gcd(a, b) = 1$. $\square$
If $p$ is a prime number, then every nonzero element of the ring $\mathbb{Z}/p\mathbb{Z}$ is invertible. A ring with this property is called a field.$^2$ The field $\mathbb{Z}/p\mathbb{Z}$ is also denoted by $\mathbb{F}_p$.
Corollary A.2.2. If $\gcd(x, q) = 1$ and $\gcd(y, q) = 1$, then $\gcd(xy, q) = 1$.
Proof. Using Corollary A.2.1, we reformulate the statement as follows: if $(x \bmod q)$ and $(y \bmod q)$ are invertible residues, then $(xy \bmod q)$ is also invertible. But this is obvious. $\square$
Theorem A.4 ("Unique factorization"). Any nonzero integer $x$ can be represented in the form $x = \pm p_1\cdots p_m$, where $p_1, \dots, p_m$ are prime numbers. This representation is unique up to the order of factors.
Proof. Existence. Without loss of generality we may assume that $x$ is positive. If $x = 1$, then the factorization is trivial: the number of factors is zero. For $x > 1$ we use induction. If $x$ is a prime, then we are done; otherwise $x = yz$ for some $y, z < x$, and $y$ and $z$ are already factored into primes.
$^2$ Fields play an important role in many parts of mathematics, e.g., in linear algebra: vectors with coefficients in an arbitrary field have essentially the same properties as real or complex vectors.
Uniqueness. This is less trivial. Again, we use induction: assume that $x > 1$ and that the uniqueness holds for all numbers from 1 through $x - 1$. Suppose that $x$ has two factorizations:
\[ x = p_1\cdots p_m = q_1\cdots q_n . \]
First, we show that $p_1 \in \{q_1, \dots, q_n\}$. Indeed, if it were not the case, we would have $\gcd(p_1, q_j) = 1$ for all $j$. Hence $\gcd(p_1, x) = 1$ (due to Corollary A.2.2), which contradicts the fact that $p_1 \mid x$ (due to the first factorization).
By changing the order of $q$'s, we can arrange that $p_1 = q_1$. Then we infer that there is a number $y < x$ with two factorizations, namely, $y = p_2\cdots p_m = q_2\cdots q_n$. By the induction assumption, the factorizations of $y$ coincide (up to the order of factors), so that the same is true for $x = p_1 y$. $\square$
It is often convenient to gather repeating prime factors, i.e., to write
\[ x = \pm p_1^{\alpha_1}\cdots p_k^{\alpha_k}, \]
where all $p_j$ are distinct. Theorem A.4 implies many other "obvious" properties of integers, e.g., the following one.
Corollary A.4.1. Let $a$ and $b$ be nonzero integers, and
\[ c = \frac{|ab|}{\gcd(a, b)} . \]
Then, for any integer $x$, the conditions $a \mid x$ and $b \mid x$ imply that $c \mid x$.
The number $c$ is called the least common multiple of $a$ and $b$.
A.3. Chinese remainder theorem. Let $b$ and $q$ be positive integers such that $b \mid q$. Then $(\mathrm{mod}\ q)$-residues can be unambiguously converted to $(\mathrm{mod}\ b)$-residues. Indeed,
\[ \text{if } x' \equiv x'' \pmod{q}, \text{ then } x' \equiv x'' \pmod{b}. \]
Therefore a map $\lambda_{q,b} : \mathbb{Z}/q\mathbb{Z} \to \mathbb{Z}/b\mathbb{Z}$ is defined, for example,
\[ \lambda_{6,3} : \quad 0 \mapsto 0, \quad 1 \mapsto 1, \quad 2 \mapsto 2, \quad 3 \mapsto 0, \quad 4 \mapsto 1, \quad 5 \mapsto 2 . \]
This map is a ring homomorphism, i.e., it is consistent with the arithmetic operations (see Definition A.2 below).
Now, let $b_1 \mid q$ and $b_2 \mid q$. Any $(\mathrm{mod}\ q)$-residue $x$ can be converted into a $(\mathrm{mod}\ b_1)$-residue $x_1$, as well as a $(\mathrm{mod}\ b_2)$-residue $x_2$. Thus a map $x \mapsto (x_1, x_2)$ is defined; we denote it by $\lambda_{q,(b_1,b_2)}$.
Theorem A.5 (Chinese remainder theorem). Let $q = b_1 b_2$, where $b_1$ and $b_2$ are positive integers such that $\gcd(b_1, b_2) = 1$. Then the map
\[ \lambda_{q,(b_1,b_2)} : \mathbb{Z}/q\mathbb{Z} \to (\mathbb{Z}/b_1\mathbb{Z})\times(\mathbb{Z}/b_2\mathbb{Z}) \]
is an isomorphism of rings.
(The ring structure on the Cartesian product of rings will be defined below; see Definition A.3.)
Abstract terminology aside, Theorem A.5 says that the map $\lambda_{q,(b_1,b_2)}$ is one-to-one. In other words, for any $a_1$, $a_2$ the system
\[ x \equiv a_1 \pmod{b_1}, \qquad x \equiv a_2 \pmod{b_2} \tag{A.1} \]
has a unique, up to $(\mathrm{mod}\ q)$-congruence, solution. Indeed, the existence of a solution says that $\lambda_{q,(b_1,b_2)}$ is a surjective (onto) map, whereas the uniqueness is equivalent to the condition that $\lambda_{q,(b_1,b_2)}$ is injective.
Proof. We will first prove that $\lambda_{q,(b_1,b_2)}$ is injective, i.e., any two solutions to system (A.1) are congruent modulo $q$. Let $x'$ and $x''$ be such solutions; then $x = x' - x''$ satisfies
\[ x \equiv 0 \pmod{b_1}, \qquad x \equiv 0 \pmod{b_2}, \]
i.e., $b_1 \mid x$ and $b_2 \mid x$. Therefore $c \mid x$, where $c$ is the least common multiple of $b_1$ and $b_2$ (see Corollary A.4.1). But $\gcd(b_1, b_2) = 1$, hence $c = b_1 b_2 = q$.
Thus, $\lambda_{q,(b_1,b_2)}$ maps the set $\mathbb{Z}/q\mathbb{Z}$ to the set $(\mathbb{Z}/b_1\mathbb{Z})\times(\mathbb{Z}/b_2\mathbb{Z})$ injectively. Both sets consist of the same number of elements, $q = b_1 b_2$; therefore $\lambda_{q,(b_1,b_2)}$ is a one-to-one map. $\square$
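The proof above is a counting argument and does not say how to solve system (A.1). A standard constructive way, given here only as a sketch, is to look for the solution in the form $x = a_1 + b_1 t$ and determine $t$ modulo $b_2$ using the inverse of $b_1$ in $\mathbb{Z}/b_2\mathbb{Z}$ (which exists by Corollary A.2.1). The function name `crt` is an assumption made here; the three-argument `pow` with exponent $-1$ requires Python 3.8 or later.

```python
def crt(a1, b1, a2, b2):
    # Solves x = a1 (mod b1), x = a2 (mod b2), assuming gcd(b1, b2) == 1.
    # Returns the canonical representative in {0, ..., b1*b2 - 1}.
    inv = pow(b1, -1, b2)            # inverse of b1 modulo b2
    t = ((a2 - a1) * inv) % b2       # x = a1 + b1*t must be congruent to a2 mod b2
    return (a1 + b1 * t) % (b1 * b2)

x = crt(2, 3, 3, 5)                  # x = 2 (mod 3), x = 3 (mod 5)

# The map x -> (x mod 3, x mod 5) on Z/15Z, whose inverse is given by crt.
pairs = {(y % 3, y % 5) for y in range(15)}
```

For the sample system the solution is $x = 8$, and the set `pairs` contains all $15$ possible pairs, in agreement with the bijectivity asserted by Theorem A.5.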
We will now explain the abstract terms used in the formulation of Theorem A.5 and derive one corollary.
Definition A.2. Let $A$ and $B$ be rings. Denote the zero and the unit elements in these rings by $0_A$, $0_B$, $1_A$, $1_B$, respectively. A map $f : A \to B$ is called a ring homomorphism if
\[ f(x + y) = f(x) + f(y), \qquad f(xy) = f(x)\,f(y), \qquad f(1_A) = 1_B \]
for any $x, y \in A$. (Note that the property $f(0_A) = 0_B$ follows automatically.)
If the homomorphism $f$ is a one-to-one map, it is called an isomorphism. (In this case the inverse map $f^{-1}$ exists and is also an isomorphism.)
Definition A.3. The direct product of rings $A_1$ and $A_2$ is the set of pairs $A_1\times A_2 = \{(x_1, x_2) : x_1 \in A_1,\ x_2 \in A_2\}$ endowed with componentwise ring operations:
\[ (x_1, x_2) + (y_1, y_2) = (x_1 + y_1,\ x_2 + y_2), \]
\[ (x_1, x_2)\cdot(y_1, y_2) = (x_1 y_1,\ x_2 y_2), \]
\[ 0_{A_1\times A_2} = (0_{A_1}, 0_{A_2}), \qquad 1_{A_1\times A_2} = (1_{A_1}, 1_{A_2}). \]
Similarly one can define the product of any number of rings.
Corollary A.5.1. If $q = p_1^{\alpha_1}\cdots p_k^{\alpha_k}$ is the factorization of $q$, then there exists a ring isomorphism
\[ \mathbb{Z}/q\mathbb{Z} \cong \mathbb{Z}/p_1^{\alpha_1}\mathbb{Z}\times\cdots\times\mathbb{Z}/p_k^{\alpha_k}\mathbb{Z}. \]
A.4. The structure of finite Abelian groups. We assume that the reader is familiar with the basic concepts of group theory (group homomorphism, cosets, quotient group, etc.) and can use them in simple cases, e.g., $\mathbb{Z}_4/\mathbb{Z}_2 \cong \mathbb{Z}_2$ but $\mathbb{Z}_4 \not\cong \mathbb{Z}_2\times\mathbb{Z}_2$. Also important is Lagrange's theorem, which says that the order of a subgroup divides the order of the group.
First, we consider a cyclic group $G = \{\zeta^k : k = 0, \dots, q-1\} \cong \mathbb{Z}_q$. Note that the choice of the generator $\zeta \in G$ is not unique: any element of the form $\zeta^k$, where $\gcd(k, q) = 1$, also generates the group. Another simple observation: if $q = p_1^{\alpha_1}\cdots p_k^{\alpha_k}$ is the factorization of $q$, then
\[ \mathbb{Z}_q \cong \mathbb{Z}_{q_1}\times\cdots\times\mathbb{Z}_{q_k}, \qquad q_j = p_j^{\alpha_j}. \tag{A.2} \]
(We emphasize that here $\cong$ stands for an isomorphism of groups, not rings.) This property follows from Corollary A.5.1: we just need to keep the additive structure of the rings and forget about the multiplication. We will call the group $\mathbb{Z}_{p^\alpha}$ (where $p$ is prime) a primitive cyclic group.
Theorem A.6. Let $G$ be a finite Abelian group of order $q = p_1^{\alpha_1}\cdots p_k^{\alpha_k}$. Then $G$ can be decomposed into a direct product of primitive cyclic groups:
\[ G \cong \Biggl(\prod_{r=1}^{\alpha_1}\bigl(\mathbb{Z}_{p_1^r}\bigr)^{m(p_1,r)}\Biggr)\times\cdots\times\Biggl(\prod_{r=1}^{\alpha_k}\bigl(\mathbb{Z}_{p_k^r}\bigr)^{m(p_k,r)}\Biggr). \tag{A.3} \]
The numbers $m(p_j, r)$ are uniquely determined by the group $G$.
Note that the isomorphism in (A.3) may not be unique; there is no preferred way to choose one decomposition over another. However, the products in parentheses are defined canonically by the group $G$.
Corollary A.6.1. If an Abelian group $G$ of order $q$ is not cyclic, then there is a nontrivial divisor $n \mid q$ such that $\forall x \in G\ (x^n = 1)$.
Proof. Let $q = p_1^{\alpha_1}\cdots p_k^{\alpha_k}$ be the factorization of $q$. The group $G$ is cyclic if and only if
\[ m(p_j, r) = \begin{cases} 1 & \text{if } r = \alpha_j, \\ 0 & \text{otherwise}. \end{cases} \]
Since $G$ is not cyclic, the above condition is violated for some $j$. However, $\sum_r r\cdot m(p_j, r) = \alpha_j$; therefore $m(p_j, \alpha_j) = 0$. If $n = q/p_j$, then $x^n = 1$ for all $x \in G$. $\square$
The proof of Theorem A.6 requires some preparation. Since $G$ is Abelian, the map $\varphi_a : x \mapsto x^a$ (where $a$ is an arbitrary integer) is a homomorphism of $G$ into itself. Let us define the following subgroups in $G$:
\[ G^{(a)} = \operatorname{Im}\varphi_a = \{x^a : x \in G\}, \qquad G_{(a)} = \operatorname{Ker}\varphi_a = \{x \in G : x^a = 1\}. \tag{A.4} \]
Lemma A.7. If $\gcd(a, b) = 1$, then $G_{(ab)} = G_{(a)}\times G_{(b)}$. In other words, for any $z \in G_{(ab)}$ there are unique $x \in G_{(a)}$ and $y \in G_{(b)}$ such that $xy = z$.
Proof. Let $ma + nb = 1$. Then we can choose $x = z^{nb}$, $y = z^{ma}$, which proves the existence of $x$ and $y$.
On the other hand, $G_{(a)}\cap G_{(b)} = \{1\}$. Indeed, if $u \in G_{(a)}\cap G_{(b)}$, then $u = u^{ma+nb} = u^{ma}u^{nb} = 1\cdot 1 = 1$. This implies the uniqueness. $\square$
Lemma A.8. If $p$ is a prime factor of $|G|$, then $G$ contains an element of order $p$.
Proof. We will use induction on $q = |G|$. Suppose that $q > 1$ and that the lemma holds for all Abelian groups of order $q' < q$. It suffices to show that $G$ has an element whose order is a multiple of $p$.
Let $x$ be a nontrivial element of $G$. If $p$ divides the order of $x$, then we are done. Otherwise, let $H$ be the subgroup generated by $x$, and $G' = G/H$ the corresponding quotient group. In this case $p$ divides $|G'|$.
By the induction assumption, there is a nontrivial element $y' \in G'$ of order $p$. It is clear that $(y')^k = 1$ if and only if $p \mid k$. Let $y \in G$ be a member of the coset $y'$ (recall that the quotient group is formed by cosets). Then $y^k \in H$ if and only if $p \mid k$. Therefore the order of $y$ is a multiple of $p$. $\square$
Proof of Theorem A.6. Lemma A.7 already shows that
\[ G = \prod_{j=1}^{k} G_{(q_j)}, \qquad\text{where } q_j = p_j^{\alpha_j}. \]
By Lemma A.8, $|G_{(q_j)}|$ has no prime factors other than $p_j$; therefore $|G_{(q_j)}| = q_j = p_j^{\alpha_j}$. We need to split the subgroups $G_{(q_j)}$ even further.
Let us fix $j$ and drop it from notation. The subgroup $G_{(p)}$ is special in that all its elements have order $p$ (or 1). Therefore we may regard it as a linear space over the field $\mathbb{F}_p$ (the group multiplication plays the role of vector addition). Let
\[ L_{p,r} = G_{(p)}\cap G^{(p^r)}, \qquad m(p, r) = \dim L_{p,r-1} - \dim L_{p,r}. \]
It is clear that $G_{(p)} = L_{p,0} \supseteq L_{p,1} \supseteq \cdots \supseteq L_{p,\alpha-1} \supseteq L_{p,\alpha} = \{1\}$.
We choose a basis in the space $L_{p,0}$ in such a way that the first $m(p, \alpha)$ basis vectors belong to $L_{p,\alpha-1}$, the next $m(p, \alpha-1)$ vectors belong to $L_{p,\alpha-2}$, and so on. For each basis vector $e \in L_{p,r-1}\setminus L_{p,r}$ (where $\setminus$ denotes the set difference, i.e., $e \in L_{p,r-1}$ but $e \notin L_{p,r}$) we find an element $v \in G_{(q)}$ such that $v^{p^{r-1}} = e$. Powers of $v$ form a cyclic subgroup of order $p^r$. One can show that $G_{(q)}$ is the direct product of these subgroups (the details are left to the reader). $\square$
A.5. The structure of the group $(\mathbb{Z}/q\mathbb{Z})^*$. Let $q = p_1^{\alpha_1}\cdots p_k^{\alpha_k}$ be the factorization of $q$. Due to Corollary A.5.1, there is a group isomorphism
\[ (\mathbb{Z}/q\mathbb{Z})^* \cong \bigl(\mathbb{Z}/p_1^{\alpha_1}\mathbb{Z}\bigr)^*\times\cdots\times\bigl(\mathbb{Z}/p_k^{\alpha_k}\mathbb{Z}\bigr)^* . \]
Therefore it is sufficient to study the group $(\mathbb{Z}/p^{\alpha}\mathbb{Z})^*$, where $p$ is a prime number. We begin with the case $\alpha = 1$.
Let $p$ be a prime number. All nonzero $(\mathrm{mod}\ p)$-residues are invertible, hence $\bigl|(\mathbb{Z}/p\mathbb{Z})^*\bigr| = p - 1$. The order of a group element always divides the order of the group (by Lagrange's theorem); therefore the order of any element in $(\mathbb{Z}/p\mathbb{Z})^*$ divides $p - 1$. Thus we have obtained the following result.
Theorem A.9 (Fermat's little theorem). If $p$ is a prime number and $x \not\equiv 0 \pmod{p}$, then $x^{p-1} \equiv 1 \pmod{p}$.
The next theorem fully characterizes the group $(\mathbb{Z}/p\mathbb{Z})^*$.
Theorem A.10. If $p$ is a prime number, then $(\mathbb{Z}/p\mathbb{Z})^*$ is a cyclic group of order $p - 1$.
Proof. Suppose the Abelian group $G = (\mathbb{Z}/p\mathbb{Z})^*$ is not cyclic. By Corollary A.6.1, there exists some integer $n$, $0 < n < p - 1$, such that $x^n = 1$ for all $x \in G$. Therefore the equation $x^n - 1 = 0$ has $p - 1$ solutions in the field $\mathbb{F}_p$ (the solutions are the nonzero elements of the field).

The function $f(x) = x^n - 1$ is a polynomial of degree $n$ with $\mathbb{F}_p$ coefficients. A polynomial of degree $n$ with coefficients in an arbitrary field has at most $n$ roots in that field. Indeed, if $a_1$ is a root, then $f(x) = (x - a_1) f_1(x)$, where $f_1$ is a polynomial of degree $n - 1$. If $a_2$ is another root, then $f_1(a_2) = (a_2 - a_1)^{-1} f(a_2) = 0$; therefore $f_1(x) = (x - a_2) f_2(x)$. This process can continue for at most $n$ steps because each polynomial $f_k$ has degree $n - k$.

But the number of roots is $p - 1 > n$. We have arrived at a contradiction. □
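Example. The group $(\mathbb{Z}/13\mathbb{Z})^*$ is generated by the element 2; see the table:
$$\begin{array}{l|cccccccccccc}
k \in \mathbb{Z}_{12} & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 \\ \hline
(2^k \bmod 13) \in (\mathbb{Z}/13\mathbb{Z})^* & 1 & 2 & 4 & 8 & 3 & 6 & 12 & 11 & 9 & 5 & 10 & 7
\end{array}$$
Note that 6, 11 and 7 are also generators of $(\mathbb{Z}/13\mathbb{Z})^*$. Indeed, if $\zeta$ is a generator of $(\mathbb{Z}/p\mathbb{Z})^*$, then the function $k \mapsto \zeta^k$ maps $\mathbb{Z}_{p-1}$ to $(\mathbb{Z}/p\mathbb{Z})^*$ isomorphically. Thus the element $\zeta^k$ generates the group $(\mathbb{Z}/p\mathbb{Z})^*$ if and only if $k$ generates $\mathbb{Z}_{p-1}$, i.e., if $\gcd(k, p - 1) = 1$. In our case, the numbers 2, 6, 11, 7 (the generators of $(\mathbb{Z}/13\mathbb{Z})^*$) correspond to the invertible (mod 12)-residues $k = 1, 5, 7, 11$.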
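The example can be reproduced by brute force. The sketch below (the helper names `order` and `generators` are our own) finds all generators of $(\mathbb{Z}/13\mathbb{Z})^*$ directly and compares them with the powers $2^k$ for exponents $k$ coprime to 12.

```python
from math import gcd


def order(x: int, p: int) -> int:
    """Multiplicative order of x modulo the prime p (x not divisible by p)."""
    k, y = 1, x % p
    while y != 1:
        y = (y * x) % p
        k += 1
    return k


def generators(p: int) -> list[int]:
    """All generators of (Z/pZ)^*, found by exhaustive search."""
    return [x for x in range(1, p) if order(x, p) == p - 1]


p = 13
gens = generators(p)
# Powers 2^k with gcd(k, p - 1) = 1 should give exactly the same set.
from_powers = sorted(pow(2, k, p) for k in range(p - 1) if gcd(k, p - 1) == 1)
print(gens, from_powers)
```

Exhaustive search is adequate only for small primes; it is meant as an illustration of the statement, not as an efficient method.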
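Theorem A.11. Let $p$ be a prime number, and $\alpha \ge 1$ an integer.
1. If $p \ne 2$, then $(\mathbb{Z}/p^{\alpha}\mathbb{Z})^* \cong \mathbb{Z}_{p-1} \times \mathbb{Z}_{p^{\alpha-1}} \cong \mathbb{Z}_{(p-1)p^{\alpha-1}}$.
2. $(\mathbb{Z}/2^{\alpha}\mathbb{Z})^* \cong \mathbb{Z}_2 \times \mathbb{Z}_{2^{\alpha-2}}$ for $\alpha \ge 2$.
(The isomorphism $\mathbb{Z}_{p-1} \times \mathbb{Z}_{p^{\alpha-1}} \cong \mathbb{Z}_{(p-1)p^{\alpha-1}}$ is due to formula (A.2).)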
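Proof. An element $x \in \mathbb{Z}/p^{\alpha}\mathbb{Z}$ is invertible if and only if $(x \bmod p) \ne 0$. Therefore $|(\mathbb{Z}/p^{\alpha}\mathbb{Z})^*| = (p - 1)p^{\alpha-1}$.

Let us denote $G = (\mathbb{Z}/p^{\alpha}\mathbb{Z})^*$ and introduce a sequence of subgroups $G \supseteq H_0 \supseteq H_1 \supseteq \cdots \supseteq H_{\alpha-1} = \{1\}$ defined as follows:
$$H_r = \{\, 1 + p^{r+1} x : x \in \mathbb{Z}/p^{\alpha-r-1}\mathbb{Z} \,\} \qquad \text{for } r = 0, \dots, \alpha - 1. \tag{A.5}$$
Note that if $a \in H_{r-1}$, then $a^p \in H_r$ (for $1 \le r \le \alpha - 1$). Indeed,
$$(1 + p^r x)^p = 1 + \binom{p}{1} p^r x + \binom{p}{2} p^{2r} x^2 + \cdots \in H_r \qquad (x \in \mathbb{Z}/p^{\alpha-r}\mathbb{Z}). \tag{A.6}$$
In the rest of the proof we consider the two cases separately.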
1. $p \ne 2$. In this case $G = G_{(p-1)} \times G_{(p^{\alpha-1})}$ (the notation was defined in Section A.4; see (A.4)). The subgroup $G_{(p^{\alpha-1})}$ is easily identified: it coincides with $H_0$ defined by (A.5). Indeed, if $a \in H_0$, then $a^{p^{\alpha-1}} = 1$; therefore $H_0 \subseteq G_{(p^{\alpha-1})}$. On the other hand, $|G_{(p^{\alpha-1})}| = |H_0| = p^{\alpha-1}$.

We will not find the subgroup $G_{(p-1)}$ explicitly, but rather give an abstract argument:
$$G_{(p-1)} \cong G / G_{(p^{\alpha-1})} = G / H_0 \cong (\mathbb{Z}/p\mathbb{Z})^* \cong \mathbb{Z}_{p-1}.$$
It remains to check that $G_{(p^{\alpha-1})} = H_0$ is isomorphic to $\mathbb{Z}_{p^{\alpha-1}}$. To this end, it suffices to prove that any element of the set difference $H_0 \setminus H_1$ has order $p^{\alpha-1}$. If we apply the map $\varphi_p : x \mapsto x^p$ repeatedly, $H_0$ is mapped to $H_1$, then to $H_2$, and so on. We need to show that this shift over the subgroups takes place one step at a time, i.e., if $a \in H_{r-1} \setminus H_r$, then $a^p \in H_r \setminus H_{r+1}$.

The condition $a \in H_{r-1} \setminus H_r$ can be represented as follows: $a = 1 + p^r x$, where $x \in \mathbb{Z}/p^{\alpha-r}\mathbb{Z}$ and $x \not\equiv 0 \pmod{p}$. We use Equation (A.6) again. Note that the terms denoted by the ellipsis contain $p$ raised to the power $3r \ge r + 2$ or higher, so they are not important. Moreover, $\binom{p}{2} = \frac{p(p-1)}{2}$ is divisible by $p$. Therefore,
$$a^p = (1 + p^r x)^p \equiv 1 + p^{r+1} x + p^{r+1} \binom{p}{2} p^{r-1} x^2 \equiv 1 + p^{r+1} x \pmod{p^{r+2}},$$
so that $a^p \in H_r$ but $a^p \notin H_{r+1}$.
2. $p = 2$. The only reason why the previous proof does not work is that $\binom{2}{2} = 1$ is not divisible by 2. But the last argument is still correct for $r > 1$; therefore $H_1 \cong \mathbb{Z}_{2^{\alpha-2}}$. On the other hand, $G = \{1, -1\} \times H_1$. □
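Theorem A.11 can be checked numerically for small moduli. The sketch below computes the orders of all invertible residues modulo $n$ by brute force and tests whether some element has order equal to the size of the group; the function names `unit_orders` and `is_cyclic` are illustrative assumptions.

```python
from math import gcd


def unit_orders(n: int) -> list[int]:
    """Multiplicative orders of all invertible residues modulo n."""
    orders = []
    for x in range(1, n):
        if gcd(x, n) != 1:
            continue
        k, y = 1, x
        while y != 1:
            y = (y * x) % n
            k += 1
        orders.append(k)
    return orders


def is_cyclic(n: int) -> bool:
    """(Z/nZ)^* is cyclic iff some element has order equal to the group size."""
    orders = unit_orders(n)
    return max(orders) == len(orders)


# Odd prime powers: the group should be cyclic.
print(is_cyclic(3**4), is_cyclic(5**3))
# A power of two with alpha = 5: the largest order should be 2^(alpha - 2).
print(is_cyclic(2**5), max(unit_orders(2**5)))
```

For $2^5$ the group has 16 elements but no element of order 16, in agreement with the decomposition $\mathbb{Z}_2 \times \mathbb{Z}_{2^{\alpha-2}}$.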
A.6. Euclid's algorithm. Once again, let $a$ and $b$ be integers, at least one of which is not 0. How does one compute $\gcd(a, b)$ and solve the equation $ma + nb = \gcd(a, b)$ efficiently?
Euclid's algorithm. Without loss of generality, we may assume that $b > 0$. We set $x_0 = a$, $y_0 = b$ and iterate the transformation
$$(x_{j+1}, y_{j+1}) = (y_j,\ x_j \bmod y_j) \tag{A.7}$$
until we get a pair of the form $(x_t, y_t) = (d, 0)$. This $d$ is equal to $\gcd(a, b)$. Indeed, any common divisor of $x$ and $y$ is a common divisor of $y$ and $x \bmod y$, and vice versa. Therefore $\gcd(a, b) = \gcd(d, 0) = d$.

The complexity of the algorithm will be estimated later, after we describe additional steps that are needed to find $m$ and $n$ satisfying $ma + nb = d$.
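Transformation (A.7) translates almost literally into code. The following minimal sketch (the name `euclid_gcd` is our own) returns both $d$ and the number of iterations $t$, assuming $b > 0$ as in the text.

```python
def euclid_gcd(a: int, b: int) -> tuple[int, int]:
    """Iterate (x, y) -> (y, x mod y) as in (A.7); return (d, t)."""
    if b <= 0:
        raise ValueError("this sketch assumes b > 0, as in the text")
    x, y, t = a, b, 0
    while y != 0:
        # For y > 0, Python's % always returns a value in [0, y),
        # which matches the mathematical "x mod y" even for negative x.
        x, y = y, x % y
        t += 1
    return x, t


print(euclid_gcd(1071, 462))
print(euclid_gcd(-7, 3))
```

Returning $t$ alongside $d$ is a design choice made here so that the step count can be compared with the complexity estimate.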
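We first give some analysis of the algorithm. Procedure (A.7) can be represented as follows:
$$\begin{pmatrix} x_j \\ y_j \end{pmatrix} = \begin{pmatrix} k_j & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_{j+1} \\ y_{j+1} \end{pmatrix}, \qquad \begin{pmatrix} x_t \\ y_t \end{pmatrix} = \begin{pmatrix} d \\ 0 \end{pmatrix}, \tag{A.8}$$
where $k_j = \lfloor x_j / y_j \rfloor$. Note that $k_0$ is an arbitrary integer, while $k_1, \dots, k_{t-1}$ are positive; moreover, $k_{t-1} > 1$ if $t > 1$. Thus we have
$$\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} k_0 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} k_{t-1} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} d \\ 0 \end{pmatrix}. \tag{A.9}$$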
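The product of the matrices here is denoted by $A_t$. Let us also introduce partial products,
$$A_j = \begin{pmatrix} k_0 & 1 \\ 1 & 0 \end{pmatrix} \cdots \begin{pmatrix} k_{j-1} & 1 \\ 1 & 0 \end{pmatrix}. \tag{A.10}$$
It is easy to see that $\det(A_j) = (-1)^j$, and that
$$A_j = \begin{pmatrix} p_j & p_{j-1} \\ q_j & q_{j-1} \end{pmatrix}; \qquad \begin{aligned} p_0 &= 1, & p_{-1} &= 0, & p_{j+1} &= k_j p_j + p_{j-1}, \\ q_0 &= 0, & q_{-1} &= 1, & q_{j+1} &= k_j q_j + q_{j-1}. \end{aligned} \tag{A.11}$$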
Equation (A.9) says that $a = p_t d$ and $b = q_t d$. (Therefore $p_t / q_t$ is an irreducible fraction representation of the rational number $a/b$.) On the other hand, $p_t q_{t-1} - p_{t-1} q_t = \det(A_t) = (-1)^t$, hence
$$(-1)^t (q_{t-1} a - p_{t-1} b) = d.$$
Thus we have solved the equation $ma + nb = d$. We summarize the result as follows.
Extended Euclid's algorithm (for solving the equation $ma + nb = d$). We iterate transformation (A.7), computing the ratios $k_j = \lfloor x_j / y_j \rfloor$ on the way. Then we compute $p_j$, $q_j$ according to (A.11) (this can be made a part of the iterative procedure as well). The answer to the problem is as follows:
$$m = (-1)^t q_{t-1}, \qquad n = (-1)^{t-1} p_{t-1}. \tag{A.12}$$
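The procedure can be sketched in Python as follows. The recurrences (A.11) are folded into the main loop, and the final answer is read off from formula (A.12); the function name `extended_euclid` and the test pairs are illustrative assumptions.

```python
from math import gcd


def extended_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, m, n) with m*a + n*b == d == gcd(a, b), assuming b > 0."""
    if b <= 0:
        raise ValueError("this sketch assumes b > 0, as in the text")
    x, y = a, b
    # (p, p_prev) holds (p_j, p_{j-1}); (q, q_prev) holds (q_j, q_{j-1}); j = 0 here.
    p, p_prev = 1, 0
    q, q_prev = 0, 1
    t = 0
    while y != 0:
        k = x // y                       # k_j = floor(x_j / y_j)
        x, y = y, x - k * y              # transformation (A.7)
        p, p_prev = k * p + p_prev, p    # recurrences (A.11)
        q, q_prev = k * q + q_prev, q
        t += 1
    d = x
    sign = 1 if t % 2 == 0 else -1       # (-1)^t
    m, n = sign * q_prev, -sign * p_prev  # formula (A.12)
    return d, m, n


for a, b in [(1071, 462), (-7, 3), (17, 5), (0, 9)]:
    d, m, n = extended_euclid(a, b)
    assert m * a + n * b == d == gcd(a, b)
    print(a, b, "->", d, m, n)
```

Python's floor division `//` rounds toward negative infinity, so it agrees with $\lfloor x_j / y_j \rfloor$ even when $a$ is negative.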
Let us estimate the complexity of the algorithm in terms of the problem size $s$, i.e., the total number of digits in the binary representations of $a$ and $b$. Using formula (A.8) and the conditions $k_1, \dots, k_{t-1} \ge 1$, $d \ge 1$, we obtain the componentwise inequality
$$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \ge \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{t-1} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} F_t \\ F_{t-1} \end{pmatrix}, \qquad \text{where } F_t = \frac{\phi^t - (-\phi)^{-t}}{\sqrt{5}}.$$
(Here $F_0 = 0$, $F_1 = 1$, $F_{j+1} = F_j + F_{j-1}$ are the Fibonacci numbers, whereas $\phi = \frac{1 + \sqrt{5}}{2}$ is the golden ratio.) Therefore $b = x_1 \ge F_t = \Omega(\phi^t)$, and $t \le O(\log b) = O(s)$. Each application of transformation (A.7), as well as the computation of $k_j$, $p_j$, $q_j$, is performed by an $O(s^2)$-size circuit. Therefore the overall complexity is $O(s^3)$.
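The bound on the number of steps can be observed on consecutive Fibonacci numbers, which force every quotient $k_j$ except the last to equal 1. The sketch below counts the iterations of (A.7) and compares them with the bound $t \le \log_\phi(\sqrt{5}\, b + 1)$, which follows from $b \ge F_t \ge (\phi^t - 1)/\sqrt{5}$; the helper names are our own.

```python
from math import log, sqrt


def euclid_steps(a: int, b: int) -> int:
    """Number t of applications of (A.7) needed to reach (d, 0), for b > 0."""
    x, y, t = a, b, 0
    while y != 0:
        x, y = y, x % y
        t += 1
    return t


def fib(n: int) -> int:
    """Fibonacci numbers with F_0 = 0, F_1 = 1."""
    f_prev, f = 0, 1
    for _ in range(n):
        f_prev, f = f, f_prev + f
    return f_prev


phi = (1 + sqrt(5)) / 2
for n in (5, 10, 20, 40):
    a, b = fib(n + 1), fib(n)
    t = euclid_steps(a, b)
    bound = log(sqrt(5) * b + 1, phi)   # from b >= F_t >= (phi^t - 1) / sqrt(5)
    print(n, t, round(bound, 2))
```

The step count grows linearly in $n$, that is, logarithmically in $b$, as the estimate predicts.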
A.7. Continued fractions. Euclid's algorithm can be viewed as a procedure for converting the fraction $z = a/b$ into the irreducible fraction $p_t / q_t$. It turns out that some steps in this procedure (namely, the computation of $k_0, \dots, k_{t-1}$) can be formulated in terms of rational numbers, or even real numbers. Indeed, let us define $z_j = x_j / y_j$. Then equations (A.7) and (A.8) become
$$z_{j+1} = \frac{1}{\operatorname{frac}(z_j)}, \qquad k_j = \lfloor z_j \rfloor, \qquad \text{where } \operatorname{frac}(x) \overset{\mathrm{def}}{=} x - \lfloor x \rfloor; \tag{A.13}$$
$$z = z_0, \qquad z_j = k_j + \frac{1}{z_{j+1}}, \qquad z_t = \infty. \tag{A.14}$$
(Note that $z_j > 1$ for all $j \ge 1$.) Thus we obtain a representation of $z$ in the form of a finite continued fraction $[k_0, k_1, \dots, k_{t-1}]$ with terms $k_j$, which is defined as follows:
$$[k_0, k_1, \dots, k_{t-1}] \overset{\mathrm{def}}{=} k_0 + \cfrac{1}{k_1 + \cfrac{1}{\ddots + \cfrac{1}{k_{t-1} + \cfrac{1}{\infty}}}}, \qquad k_j \in \mathbb{Z}, \quad k_1, \dots, k_{t-1} \ge 1. \tag{A.15}$$
We call a continued fraction canonical if $t = 1$ or $k_{t-1} > 1$; the procedure described by equation (A.13) guarantees this property. (If we started with an irrational number $z$, we would get an infinite continued fraction, which is always considered canonical.)
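For a rational input, the iteration (A.13) can be carried out exactly. The sketch below uses Python's `fractions.Fraction` to avoid rounding errors; the function name `continued_fraction` is an illustrative choice.

```python
from fractions import Fraction
from math import floor


def continued_fraction(z: Fraction) -> list[int]:
    """Canonical terms [k_0, k_1, ..., k_{t-1}] of a rational number z."""
    terms = []
    while True:
        k = floor(z)          # k_j = floor(z_j)
        terms.append(k)
        frac = z - k          # frac(z_j), an exact Fraction in [0, 1)
        if frac == 0:         # z_{j+1} would be infinity: stop
            return terms
        z = 1 / frac          # z_{j+1} = 1 / frac(z_j)


print(continued_fraction(Fraction(1071, 462)))
print(continued_fraction(Fraction(-7, 3)))
```

The terms coincide with the quotients $k_j$ produced by Euclid's algorithm on the numerator and denominator, which is the correspondence described above.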
Proposition A.12.
1. Any real number has exactly one canonical continued fraction representation.
2. A rational number with canonical representation $[k_0, k_1, \dots, k_{t-1}]$ has exactly one noncanonical representation, namely, $[k_0, k_1, \dots, k_{t-1} - 1, 1]$.
(The proof is left as an exercise to the reader.)
What are continued fractions good for? We will see that the first $j$ terms of the canonical continued fraction for $z$ provide a good approximation of $z$ by a rational number $p_j / q_j$, meaning that $|z - p_j / q_j| = O(q_j^{-2})$. All sufficiently good approximations are obtained by this procedure (see Theorem A.13 below). To put it differently: if $z$ is a sufficiently good approximation of a rational number $p/q$, we can find that number by examining the continued fraction representation of $z$.
To deal with partial continued fraction expansions, we define the function
$$[k_0, k_1, \dots, k_{j-1}](u) \overset{\mathrm{def}}{=} k_0 + \cfrac{1}{k_1 + \cfrac{1}{\ddots + \cfrac{1}{k_{j-1} + \cfrac{1}{u}}}}. \tag{A.16}$$
It allows us to represent $z$ as follows: $z = [k_0, k_1, \dots, k_{j-1}](z_j)$.
To obtain an explicit formula for function (A.16), we note that it is a composition of fractional linear functions,
$$[k_0, k_1, \dots, k_{j-1}](u) = g_{k_0}(g_{k_1}(\cdots g_{k_{j-1}}(u) \cdots)), \qquad \text{where } g_k(u) = \frac{ku + 1}{u}.$$
Composing fractional linear functions is equivalent to multiplying $2 \times 2$ matrices: if $f_1(u) = (a_1 u + b_1)/(c_1 u + d_1)$ and $f_2(v) = (a_2 v + b_2)/(c_2 v + d_2)$, then $f_1(f_2(v)) = (av + b)/(cv + d)$, where
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix}.$$
Therefore
$$[k_0, k_1, \dots, k_{j-1}](u) = \frac{p_j u + p_{j-1}}{q_j u + q_{j-1}}, \tag{A.17}$$
where the integers $p_j$, $q_j$ are defined by equation (A.11).
Substituting $u = \infty$ into (A.17), we obtain the rational number
$$\frac{p_j}{q_j} = [k_0, k_1, \dots, k_{j-1}]. \tag{A.18}$$
This number is called the $j$-th convergent of $z$. For example, $p_t / q_t = z$, $p_1 / q_1 = k_0 = \lfloor z \rfloor$; we may also define the 0-th convergent, $p_0 / q_0 = 1/0 = \infty$. Note that the continued fraction in (A.18) is not necessarily canonical.
Let us examine the properties of convergents. Inasmuch as $p_j q_{j-1} - p_{j-1} q_j = \det(A_j) = (-1)^j$ and $q_0 < q_1 \le q_2 < q_3 < \cdots$, the following relations hold:
$$\frac{p_j}{q_j} - \frac{p_{j-1}}{q_{j-1}} = \frac{(-1)^j}{q_j q_{j-1}}, \qquad \frac{p_1}{q_1} < \frac{p_3}{q_3} < \cdots \le z \le \cdots < \frac{p_2}{q_2} < \frac{p_0}{q_0}. \tag{A.19}$$
(To put $z$ in the middle, we have used the fact that $z = [k_0, k_1, \dots, k_{j-1}](z_j)$ for $1 < z_j \le \infty$.) This justifies the name "convergent".
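The relations (A.19) can be observed on a small example. The sketch below computes the pairs $(p_j, q_j)$ from a list of terms by the recurrences (A.11) and checks the determinant identity; the name `convergents` and the sample terms $[2, 3, 7]$ (the expansion of $51/22$) are illustrative assumptions.

```python
from fractions import Fraction


def convergents(terms: list[int]) -> list[tuple[int, int]]:
    """Pairs (p_j, q_j) for j = 0, ..., t, computed by the recurrences (A.11)."""
    p, p_prev = 1, 0
    q, q_prev = 0, 1
    result = [(p, q)]                    # the 0-th convergent 1/0
    for k in terms:
        p, p_prev = k * p + p_prev, p
        q, q_prev = k * q + q_prev, q
        result.append((p, q))
    return result


terms = [2, 3, 7]                        # continued fraction of 51/22
conv = convergents(terms)
z = Fraction(51, 22)
for j in range(1, len(conv)):
    (p, q), (p0, q0) = conv[j], conv[j - 1]
    assert p * q0 - p0 * q == (-1) ** j  # det(A_j) = (-1)^j
    # Odd-numbered convergents lie below z, even-numbered ones above.
    assert (Fraction(p, q) <= z) if j % 2 == 1 else (Fraction(p, q) >= z)
    print(j, Fraction(p, q))
```

The 0-th convergent $1/0$ is kept as a pair of integers, since it cannot be represented as a `Fraction`.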
Theorem A.13. Let $z$ be a real number, $p/q$ an irreducible fraction, $q > 1$.
1. If $p/q$ is a convergent of $z$, then $|z - p/q| < 1/(q(q + 1))$.
2. If $|z - p/q| < 1/(q(2q - 1))$, then $p/q$ is a convergent of $z$.
Proof. Let us consider a more general problem: given the number $w = p/q$, find the set of real numbers $z$ that have $w$ among their convergents. We can represent $w$ as a canonical continued fraction $[k_0, k_1, \dots, k_{t-1}]$ and define $p_j$, $q_j$ ($j = 0, \dots, t$) using this fraction. Note that $t > 1$ (because $w$ is not an integer), and that $p_t = p$, $q_t = q$. It is easy to see that three cases are possible.
1. $z = w = p_t / q_t$.
2. The canonical continued fraction for $z$ has the form $[k_0, k_1, \dots, k_{t-1}, \dots]$. Then $z = (p_t z_t + p_{t-1})/(q_t z_t + q_{t-1})$ for $1 < z_t < \infty$; therefore $z$ lies between $p_t / q_t$ and $(p_t + p_{t-1})/(q_t + q_{t-1})$ (the ends of the interval are not included).
3. The canonical continued fraction for $z$ is $[k_0, k_1, \dots, k_{t-1} - 1, 1, \dots]$. In this case $z = (p_t z_t + p_{t-1})/(q_t z_t + q_{t-1})$, where
$$k_{t-1} + \frac{1}{z_t} = k_{t-1} - 1 + \cfrac{1}{1 + \cfrac{1}{z_{t+1}}}, \qquad 1 < z_{t+1} < \infty.$$
Thus $z_t < -2$, so that $z$ lies between $p_t / q_t$ and $(2p_t - p_{t-1})/(2q_t - q_{t-1})$.
Combining these cases, we conclude that $w$ is a convergent of $z$ if and only if $z \in I$, where $I$ is the open interval with these endpoints:
$$\frac{p_t + p_{t-1}}{q_t + q_{t-1}} = \frac{p_t}{q_t} - (-1)^t \frac{1}{q_t (q_t + q_{t-1})}, \qquad \frac{2p_t - p_{t-1}}{2q_t - q_{t-1}} = \frac{p_t}{q_t} + (-1)^t \frac{1}{q_t (2q_t - q_{t-1})}. \tag{A.20}$$
But $1 \le q_{t-1} \le q_t - 1$; therefore
$$S\bigl(p_t / q_t,\ 1/(q_t (2q_t - 1))\bigr) \subseteq I \subseteq S\bigl(p_t / q_t,\ 1/(q_t (q_t + 1))\bigr),$$
where $S(x, \delta)$ stands for the $\delta$-neighborhood of $x$. □
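Part 2 of Theorem A.13 suggests a practical recipe: given an approximation $z$ of an unknown fraction $p/q$ with a known bound on $q$, scan the convergents of $z$ and keep the last one whose denominator is within the bound. The sketch below is one way to do this under the assumption that $z$ is given exactly as a `Fraction`; the name `recover_fraction`, the parameter `max_q` (assumed to be at least 1) and the sample value are hypothetical.

```python
from fractions import Fraction
from math import floor


def recover_fraction(z: Fraction, max_q: int) -> Fraction:
    """Last convergent p_j/q_j of z whose denominator does not exceed max_q."""
    p, p_prev = 1, 0
    q, q_prev = 0, 1
    best = None
    while True:
        k = floor(z)                     # next term of the continued fraction
        p, p_prev = k * p + p_prev, p    # recurrences (A.11)
        q, q_prev = k * q + q_prev, q
        if q > max_q:
            break
        best = Fraction(p, q)
        frac = z - k
        if frac == 0:                    # expansion finished: z itself was reached
            break
        z = 1 / frac
    return best


# A hypothetical input: 853/2048 approximates 5/12 to within 1/(12 * 23).
z = Fraction(853, 2048)
print(recover_fraction(z, 16))
```

Here $|853/2048 - 5/12| < 1/(12 \cdot 23)$, so the theorem guarantees that $5/12$ appears among the convergents, and the next convergent already has a denominator above the bound.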