

Information Processing Letters 111 (2010) 40–45


www.elsevier.com/locate/ipl

An improved lower bound on query complexity for quantum PAC learning

Chi Zhang

Department of Computer Science, Columbia University, New York, NY 10027, USA

Article history: Received 21 October 2009; Received in revised form 5 October 2010; Accepted 8 October 2010; Available online 12 October 2010. Communicated by B. Doerr.

Keywords: PAC (Probably Approximately Correct) learning; Lower bound; Quantum algorithm; VC dimension; Computational complexity

Abstract: In this paper, we study the quantum PAC learning model, offering an improved lower bound on the query complexity. For a concept class with VC dimension d, the lower bound is Ω((1/ε)(d^{1−e} + log(1/δ))) for ε accuracy and 1 − δ confidence, where e can be an arbitrarily small positive number. The lower bound is very close to the best lower bound known on the query complexity for the classical PAC learning model, which is Ω((1/ε)(d + log(1/δ))).

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

In recent years quantum computing has attracted much attention. There are quantum algorithms which can solve certain problems significantly faster than the best known classical algorithms [1,2]. Some of these algorithms are closely related to learning theory. An example is one of the earliest quantum algorithms, the Deutsch–Jozsa algorithm [3], which decides whether a boolean function over the domain {0,1}^n is constant or balanced. Deciding the problem exactly requires Θ(2^n) queries in the classical setting, but only O(1) queries in the quantum setting. Another example is the Bernstein–Vazirani algorithm [4], where the goal is to learn a concept from the concept class of boolean functions c_s(x) = x · s mod 2, where x, s ∈ {0,1}^n. The problem has a query complexity of Ω(n) in the classical setting, while it can be solved using only one quantum query. To systematically study learning problems in the quantum setting, Bshouty and Jackson [5] introduced the quantum PAC learning model as a generalization of the standard PAC learning model. They also offered a polynomial-time algorithm for learning DNF in the quantum PAC learning model, while no polynomial-time algorithm is known in the classical PAC learning model. The relationship between the number of quantum queries and the number of classical queries required for PAC learning was first studied by R. Servedio and S. Gortler [6], and then improved by A. Atici and R. Servedio [7]. In this paper, we improve the lower bound given in [7], offering a new quantum lower bound which is extremely close to the best lower bound known in the classical PAC learning model.

E-mail address: [email protected].

0020-0190/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.ipl.2010.10.007

The PAC (Probably Approximately Correct) model of concept learning was introduced by Valiant [8] and has been extensively studied. In the model, the learning algorithm has access to an example oracle EX(c, D), where c ∈ C is the unknown target concept, C is a known concept class and D is an unknown distribution over {0,1}^n. The oracle EX(c, D) takes no inputs; when invoked, it generates a labeled example (x, c(x)) according to the distribution D. The goal of the learning algorithm is to output a hypothesis h : {0,1}^n → {0,1} which is an ε-approximation to c under the distribution D, i.e., a hypothesis h such that Pr_{x∼D}[h(x) ≠ c(x)] ≤ ε. An algorithm A is a PAC learning algorithm for C if the following condition holds: given 0 < ε, δ < 1, for all c ∈ C and all distributions D over {0,1}^n, with probability at least 1 − δ, the algorithm A


outputs a hypothesis h which is an ε-approximation to c under D. In the literature, ε is usually known as the accuracy and 1 − δ as the confidence. The query complexity of the learning algorithm A for C is the maximum number of queries to EX(c, D) that A makes, for any c ∈ C and any distribution D over {0,1}^n.

The quantum PAC learning model is a generalization of the classical PAC learning model to the quantum computing model. Detailed descriptions of the quantum computing model can be found in [9,10]; here we only outline the basics needed in this paper. The most widely used model for quantum computing is the quantum circuit model. The basic building blocks of this model are qubits, which can be described as vectors in the two-dimensional complex Hilbert space C^2 equipped with the unit vector basis {|0⟩, |1⟩}. The quantum circuit can be viewed as a sequence of unitary transformations

U_T, Q_c, U_{T−1}, Q_c, …, U_1, Q_c, U_0

acting on an m-qubit quantum register, where each U_i is an arbitrary unitary transformation on m qubits, for i = 0, …, T, and Q_c is a unitary transformation which corresponds to an oracle call. At every stage of the execution, the state of the register is a unit vector in the 2^m-dimensional complex Hilbert space, which can be represented as a superposition

∑_{z∈{0,1}^m} α_z |z⟩,

where the α_z are complex numbers satisfying ∑_{z∈{0,1}^m} |α_z|^2 = 1. If this state is measured, then with probability |α_z|^2 the string z ∈ {0,1}^m is observed and the state collapses to |z⟩. After the final transformation U_T, a measurement is performed on some subset of the qubits in the register and the observed value is used as the output of the computation.
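As an illustrative sketch (our own addition, not part of the paper), the measurement rule above can be simulated classically by sampling a basis string z with probability |α_z|²; the helper name `measure` is hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sampled counts are reproducible

def measure(amplitudes):
    """Sample an index z with probability |amplitudes[z]|^2 (the Born rule)."""
    probs = [abs(a) ** 2 for a in amplitudes]
    return random.choices(range(len(amplitudes)), weights=probs, k=1)[0]

# A 2-qubit register (m = 2) in the uniform superposition over {0,1}^2:
# each of the four basis strings has amplitude 1/2, hence probability 1/4.
state = [0.5, 0.5, 0.5, 0.5]
counts = [0, 0, 0, 0]
for _ in range(10000):
    counts[measure(state)] += 1
# each outcome is observed roughly 2500 times
```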

In the quantum PAC learning model, the algorithm has access to a quantum example oracle QEX(c, D) which generates a superposition of all labeled examples, where each labeled example (x, c(x)) appears in the superposition with amplitude proportional to the square root of D(x). More precisely, for a distribution D over {0,1}^n, the quantum example oracle QEX(c, D) is a quantum gate which transforms the initial state |0^n, 0⟩ as

|0^n, 0⟩ → ∑_{x∈{0,1}^n} √D(x) |x, c(x)⟩.

The oracle QEX(c, D) is only defined on the initial state |0^n, 0⟩, and it is required that a quantum learning algorithm only invokes the QEX(c, D) oracle on the basis state |0^n, 0⟩. A quantum learning algorithm is a quantum PAC learning algorithm for C if it has the following property: given 0 < ε, δ < 1, for all c ∈ C and all distributions D over {0,1}^n, with probability at least 1 − δ the algorithm outputs a representation of a hypothesis h which is an ε-approximation to c under the distribution D. As in the classical PAC learning model, ε is usually known as the accuracy and 1 − δ as the confidence. The quantum query complexity of a quantum PAC learning algorithm is the maximum number of queries to QEX(c, D) for any c ∈ C and any distribution D over {0,1}^n.

Because of the special structure of the quantum circuits used in the quantum PAC learning model, they can be simplified as follows. First, since all the information used for computation comes from the oracle calls, the initial state of the quantum register can be assumed to be |0⟩^m. Secondly, since QEX(c, D) is required to be invoked only on the state |0^n, 0⟩, without loss of generality, all QEX(c, D) calls in the quantum learning algorithm can be assumed to occur at the beginning of the algorithm. Finally, after a set of queries to QEX(c, D), the algorithm performs a sequence of unitary transformations and one measurement. Since we are chiefly interested in the query complexity and not the circuit size, the quantum circuit can be replaced by a POVM (Positive Operator-Valued Measure) measurement, which is represented by a set of positive self-adjoint operators {Π_h}, where ∑_h Π_h = I [9]. More precisely, let

|ψ_c⟩ = ∑_{x∈{0,1}^n} √D(x) |x, c(x)⟩.

A quantum learning algorithm with query complexity T can be represented as a POVM measurement {Π_h} on |ψ_c⟩^⊗T. If the algorithm is a quantum PAC learning algorithm with accuracy ε and confidence 1 − δ for the concept class C, then for any c ∈ C,

∑_{h∈H_c} ⟨ψ_c|^⊗T Π_h |ψ_c⟩^⊗T > 1 − δ,   (1)

where H_c is the set of hypotheses h such that h is an ε-approximation to c under the distribution D.

In the PAC learning model, the Vapnik–Chervonenkis (VC) dimension plays a key role in measuring the query complexity of a concept class. For S ⊆ {0,1}^n, we write Π_C(S) to denote {c ∩ S: c ∈ C}, so |Π_C(S)| is the number of different dichotomies which the concepts in C induce on the points in S. A subset S ⊆ {0,1}^n is said to be shattered by C if |Π_C(S)| = 2^{|S|}, i.e., if C induces every possible dichotomy on the points in S. For a concept class C, its VC dimension is the cardinality d of the largest set S shattered by C [11]. In the classical PAC learning model, an algorithm with ε accuracy and 1 − δ confidence, for a concept class with VC dimension d, requires at least

Ω((1/ε)(d + log(1/δ)))

examples [12]. The lower bound is nearly optimal, since an upper bound of

O((1/ε)(d log(1/ε) + log(1/δ)))

is given in [13]. Since a quantum PAC learning algorithm can be directly converted to a classical PAC learning algorithm [9], the upper bound in the classical PAC learning model is also an upper bound in the quantum PAC learning model. On the other hand, in the quantum PAC learning model, the best lower bound known is given by A. Atici and R. Servedio [7], which is

Ω(d + (1/ε)(√d + log(1/δ))).

In this paper, we will improve the lower bound to

Page 3: An improved lower bound on query complexity for quantum PAC learning

42 C. Zhang / Information Processing Letters 111 (2010) 40–45

Ω((1/ε)(d^{1−e} + log(1/δ))),

for any e > 0. Hence, we show that the lower bounds on query complexity are almost the same in the quantum and classical PAC learning models.
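To get a concrete sense of the gap being closed, the following rough comparison (our own illustration; ε, δ and e are fixed arbitrarily, and the natural log is used since the base is absorbed by the Ω notation) evaluates the d-dependence of the three bounds mentioned above:

```python
import math

eps, delta = 0.1, 0.05
e = 0.1  # the arbitrarily small constant in the new bound

results = {}
for d in (10**2, 10**4, 10**6):
    classical = (d + math.log(1 / delta)) / eps                   # Omega((1/eps)(d + log(1/delta)))
    old_quantum = d + (math.sqrt(d) + math.log(1 / delta)) / eps  # Omega(d + (1/eps)(sqrt(d) + log(1/delta))) [7]
    new_quantum = (d ** (1 - e) + math.log(1 / delta)) / eps      # Omega((1/eps)(d^(1-e) + log(1/delta)))
    results[d] = (classical, old_quantum, new_quantum)

# for these parameters the new bound sits strictly between the old
# quantum bound and the classical bound at every sampled d
```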

2. Former results and improvements

In this section, we first review the proof of the lower bound Ω((1/ε)√d) given in [7], and then provide a new lower bound Ω((1/ε)(d/ln d)^{2/3}) as an improvement of that proof. In the next section, we show how to further improve the lower bound to Ω((1/ε) d^{1−e}), for an arbitrarily small positive number e.

To prove the lower bound Ω((1/ε)√d), the authors of [7] used the following strategy. First, they designed a set of concepts such that, under a given distribution, the difference between any two concepts is no less than 2ε. Hence, the quantum states generated from these concepts, |ψ_1⟩, …, |ψ_N⟩, must be distinguishable with confidence 1 − δ, where N is the number of concepts. Then they constructed corresponding quantum states |φ_i⟩ which are close to the quantum states |ψ_i⟩^t, for i = 1, …, N, where t is decided later; if the states |ψ_i⟩^{tT} can be distinguished, so can the states |φ_i⟩^T. At the same time, the polynomial-based argument [14] gives a lower bound on the number of copies of |φ_i⟩ needed to distinguish them from each other. From this lower bound and the condition that |φ_i⟩ is close to |ψ_i⟩^t, a lower bound on the number of copies of |ψ_i⟩, i.e., on tT, is derived in [7], which is Ω((1/ε)√d).

We find that if we choose the quantum states |φ_i⟩ to be closer to |ψ_i⟩^t, for i = 1, …, N, we can derive improved lower bounds compared to the one in [7]. In this section, we show how to derive the lower bound Ω((1/ε)(d/ln d)^{2/3}) as a simple example. In the next section, we prove that for any arbitrarily small positive number e, there is a lower bound Ω((1/ε) d^{1−e}) on the query complexity.

To start the proof, we introduce the Gilbert–Varshamov bound [15], which is well known in coding theory.

Lemma 1. There exists a set {z_1, …, z_N} of d-bit binary strings such that for all i ≠ j the strings z_i and z_j differ in at least d/4 bit positions, where

N ≥ 2^d / ∑_{j=0}^{d/4−1} (d choose j) ≥ 2^d / ∑_{j=0}^{d/4} (d choose j) ≥ 2^{d(1−H(1/4))} > 2^{d/6},

where H(p) = −p log p − (1 − p) log(1 − p) is the binary entropy function.
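As a small sanity check on Lemma 1 (our own sketch, with hypothetical helper names such as `greedy_code`): for d = 12, a greedy construction already produces well over 2^{d/6} strings at pairwise Hamming distance d/4.

```python
import itertools
import math

def hamming(a, b):
    """Number of positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def greedy_code(d, min_dist):
    """Greedily collect d-bit strings with pairwise Hamming distance >= min_dist."""
    code = []
    for w in itertools.product((0, 1), repeat=d):
        if all(hamming(w, c) >= min_dist for c in code):
            code.append(w)
    return code

def H(p):
    """Binary entropy function."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

d = 12
code = greedy_code(d, d // 4)  # pairwise distance >= 3

# Lemma 1 promises at least 2^(d(1 - H(1/4))) > 2^(d/6) such strings.
assert len(code) >= 2 ** (d * (1 - H(0.25)))
assert len(code) > 2 ** (d / 6)
```

Any maximal code found this way has size at least 2^d divided by the volume of a Hamming ball of radius min_dist − 1, which is exactly the counting argument behind the lemma.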

Theorem 1. Let C be any concept class with VC dimension d + 1. Any quantum algorithm with a QEX(c, D) oracle that learns a concept belonging to C with ε accuracy and 4/5 confidence must have quantum query complexity at least

Ω((1/ε) (d/ln d)^{2/3}).

Proof. Let {x_0, x_1, …, x_d} be a set of inputs which can be shattered by C. As in [7], we consider the same distribution D, that is,

D(x_0) = 1 − 8ε  and  D(x_i) = 8ε/d,

for i = 1, …, d. From Lemma 1, there are N strings z_1, …, z_N of length d whose pairwise Hamming distance¹ is at least d/4, where N > 2^{d/6}. For i = 1, …, N, let c_i be a concept such that c_i(x_0) = 0 and c_i(x_j) equals the j-th element of the string z_i. Since for any i ≠ j,

Pr_D[c_i(x) ≠ c_j(x)] ≥ (8ε/d) · (d/4) = 2ε,

the quantum PAC learning algorithm must distinguish the corresponding quantum states with confidence 4/5. The quantum states generated from QEX(c, D) are

|ψ_i⟩ = √(1 − 8ε) |x_0, 0⟩ + ∑_{j=1}^{d} √(8ε/d) |x_j, c_i(x_j)⟩
      = √(1 − 8ε) |α⟩ + √(8ε) |β_i⟩,   (2)

where |α⟩ = |x_0, 0⟩ and

|β_i⟩ = (1/√d) ∑_{j=1}^{d} |x_j, c_i(x_j)⟩,

for i = 1, …, N. Note that |β_1⟩, …, |β_N⟩ and |α⟩ are all normalized.

Before we approximate |ψ_i⟩^⊗t, we define a series of quantum states which are not normalized. Let

|μ_0(i, n)⟩ = |α⟩^n,

and

|μ_k(i, n)⟩ = ∑_{j=0}^{n−k} |μ_{k−1}(i, n − 1 − j)⟩ ⊗ |β_i⟩ ⊗ |α⟩^j.

The state |μ_k(i, n)⟩ has the direct physical meaning that it is the sum of all n-fold product states in which k of the subsystems are in the state |β_i⟩ and the others are all in the state |α⟩. For example,

|μ_1(i, t)⟩ = |α⟩^{t−1}|β_i⟩ + |α⟩^{t−2}|β_i⟩|α⟩ + ⋯ + |β_i⟩|α⟩^{t−1}

and

|μ_2(i, t)⟩ = |α⟩^{t−2}|β_i⟩^2 + |α⟩^{t−3}|β_i⟩|α⟩|β_i⟩ + ⋯ + |β_i⟩|α⟩^{t−2}|β_i⟩ + |α⟩^{t−3}|β_i⟩^2|α⟩ + |α⟩^{t−4}|β_i⟩|α⟩|β_i⟩|α⟩ + ⋯ + |β_i⟩|α⟩^{t−3}|β_i⟩|α⟩ + ⋯ + |β_i⟩^2|α⟩^{t−2}.

From its physical meaning, it is easy to see that

⟨μ_k(i, n)|μ_l(i, n)⟩ = (n choose k) δ_{k,l},   (3)

(Footnote 1: The Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.)

Page 4: An improved lower bound on query complexity for quantum PAC learning

C. Zhang / Information Processing Letters 111 (2010) 40–45 43

where δ_{k,l} = 1 if k = l, and δ_{k,l} = 0 if k ≠ l. Since |ψ_i⟩ = √(1 − 8ε) |α⟩ + √(8ε) |β_i⟩,

|ψ_i⟩^t = ∑_{k=0}^{t} (1 − 8ε)^{(t−k)/2} (8ε)^{k/2} |μ_k(i, t)⟩.   (4)
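Identities (3) and (4) can be checked directly for small t by representing |α⟩ and |β_i⟩ as any two orthonormal vectors; the following is our own verification sketch (plain lists instead of a quantum library, helper names ours):

```python
import itertools
import math

def kron(u, v):
    """Tensor (Kronecker) product of two vectors given as flat lists."""
    return [a * b for a in u for b in v]

def mu(k, n, alpha, beta):
    """Sum of all n-fold tensor products with beta in exactly k slots."""
    total = [0.0] * (len(alpha) ** n)
    for slots in itertools.combinations(range(n), k):
        vec = [1.0]
        for i in range(n):
            vec = kron(vec, beta if i in slots else alpha)
        total = [x + y for x, y in zip(total, vec)]
    return total

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

alpha, beta = [1.0, 0.0], [0.0, 1.0]  # any orthonormal pair works
t, eps = 4, 0.02

# Eq. (3): <mu_k | mu_l> = (t choose k) * delta_{k,l}
for k in range(t + 1):
    for l in range(t + 1):
        expected = math.comb(t, k) if k == l else 0.0
        assert abs(dot(mu(k, t, alpha, beta), mu(l, t, alpha, beta)) - expected) < 1e-9

# Eq. (4): |psi>^t = sum_k (1-8eps)^((t-k)/2) (8eps)^(k/2) |mu_k(i,t)>
psi = [math.sqrt(1 - 8 * eps) * a + math.sqrt(8 * eps) * b for a, b in zip(alpha, beta)]
psi_t = [1.0]
for _ in range(t):
    psi_t = kron(psi_t, psi)
rhs = [0.0] * (2 ** t)
for k in range(t + 1):
    coeff = (1 - 8 * eps) ** ((t - k) / 2) * (8 * eps) ** (k / 2)
    rhs = [x + coeff * y for x, y in zip(rhs, mu(k, t, alpha, beta))]
max_err = max(abs(a - b) for a, b in zip(psi_t, rhs))
assert max_err < 1e-12
```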

In the proof in [7], the authors define |φ_i⟩ from the first two terms of Eq. (4) plus an additional orthogonal term, while in the current proof we define it from the first three terms of Eq. (4), i.e.,

|φ_i⟩ = (1 − 8ε)^{t/2} |μ_0(i, t)⟩ + (1 − 8ε)^{(t−1)/2} (8ε)^{1/2} |μ_1(i, t)⟩ + (1 − 8ε)^{(t−2)/2} (8ε) |μ_2(i, t)⟩ + |z⟩,   (5)

for i = 1, …, N, where |z⟩ is an orthogonal term which normalizes |φ_i⟩. Then |φ_i⟩ is an approximation to |ψ_i⟩^t such that

⟨ψ_i|^t |φ_i⟩ ≥ (1 − 8ε)^t + (1 − 8ε)^{t−1} · 8εt + (1 − 8ε)^{t−2} (8ε)^2 · t(t−1)/2
  ≥ 1 − 8tε + (t choose 2)(8ε)^2 − (t choose 3)(8ε)^3 + 8tε(1 − 8(t−1)ε) + (t(t−1)/2)(8ε)^2 − (t(t−1)(t−2)/2)(8ε)^3
  = 1 − (2t(t−1)(t−2)/3)(8ε)^3.   (6)
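Inequality (6) can be sanity-checked numerically (our own sketch). The sum of the three retained terms equals 1 − Pr{X ≥ 3} for X ~ Binomial(t, 8ε), which by Lemma 2 is at least 1 − (t choose 3)(8ε)³, and hence at least the weaker bound stated in (6):

```python
import math

def overlap_lower(t, eps):
    """The three retained terms of <psi_i|^t |phi_i> from the proof of Theorem 1."""
    p = 8 * eps
    return ((1 - p) ** t
            + t * p * (1 - p) ** (t - 1)
            + math.comb(t, 2) * p ** 2 * (1 - p) ** (t - 2))

ok = True
for t in range(3, 60):
    for eps in (0.001, 0.005, 0.01, 0.02):
        bound = 1 - (2 * t * (t - 1) * (t - 2) / 3) * (8 * eps) ** 3
        ok = ok and overlap_lower(t, eps) >= bound
assert ok
```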

Since

|x_j, c_i(x_j)⟩ = (1 − c_i(x_j)) |x_j, 0⟩ + c_i(x_j) |x_j, 1⟩,

|β_i⟩ = (1/√d) ∑_{j=1}^{d} (1 − c_i(x_j)) |x_j, 0⟩ + (1/√d) ∑_{j=1}^{d} c_i(x_j) |x_j, 1⟩,

so |μ_k(i, t)⟩ is a superposition in which each amplitude is a d-variate polynomial of degree at most k. Hence each amplitude of |φ_i⟩ is a d-variate polynomial of degree at most 2, for any i. So, for any measurement on |φ_i⟩^⊗T, there exists a set of d-variate polynomials P_i of degree 4T, for i = 1, …, N, such that P_i(z_j) equals the probability of the outcome associated with |φ_i⟩ when the actual quantum state is |φ_j⟩^⊗T, where z_j = (c_j(x_1), c_j(x_2), …, c_j(x_d)) [14].

Consider the scenario in which the states |φ_i⟩ can be distinguished with confidence 2/3 using T copies. Consider the N × N matrix L whose (i, j) entry is L(i, j) = P_i(z_j). Then

L_{i,i} ≥ 2/3 > 1/3 ≥ ∑_{j≠i} L_{j,i}.

Hence L is a strictly diagonally dominant matrix. By the Levy–Desplanques theorem [16], any strictly diagonally dominant matrix has full rank, so the rank of L is N. All d-variate polynomials of degree at most 4T lie in the linear space spanned by the monomials over d variables of degree at most 4T. For instance, if d = 2 and T = 1, the space is spanned by {1, x_1, x_2, x_1^2, x_1x_2, x_2^2, x_1^3, x_1^2x_2, x_1x_2^2, x_2^3, x_1^4, x_1^3x_2, x_1^2x_2^2, x_1x_2^3, x_2^4}. The dimension of this linear space is

N_T = ∑_{k=0}^{4T} (k + d − 1 choose k) = (d + 4T choose 4T).
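The dimension count N_T, together with the d = 2, T = 1 example above, can be checked directly (our own sketch; `monomial_count` is a hypothetical helper):

```python
import math

def monomial_count(d, deg):
    """Number of monomials x_1^{a_1}...x_d^{a_d} with a_1 + ... + a_d <= deg."""
    return sum(math.comb(k + d - 1, k) for k in range(deg + 1))

# The identity behind N_T: sum_{k=0}^{m} C(k+d-1, k) = C(d+m, m).
for d in range(1, 8):
    for m in range(0, 10):
        assert monomial_count(d, m) == math.comb(d + m, m)

# The example in the text: d = 2, T = 1, degree 4T = 4 gives 15 monomials.
count_example = monomial_count(2, 4)
assert count_example == 15
```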

Since each P_i lies in an N_T-dimensional space, so does each row of L, which is (P_i(z_1), P_i(z_2), …, P_i(z_N)). Hence N ≤ N_T, i.e.,

N_T = (d + 4T choose 4T) ≥ N ≥ 2^{d/6}.   (7)

If T < d/4, then

(d + 4T choose 4T) ≤ (d + 4T)^{4T} ≤ (2d)^{4T}.

Consequently, (2d)^{4T} ≥ 2^{d/6}, so

T ≥ (ln 2/24) · d/(ln d + ln 2) ≥ (ln 2/48) · d/ln d,   (8)

for d ≥ 2. On the other hand, if T ≥ d/4, then T is also greater than (ln 2/48) · d/ln d. Hence, to distinguish the states |φ_i⟩, we need

T ≥ (ln 2/48) · d/ln d   (9)

copies.

If

⟨φ_i|^⊗T |ψ_i⟩^⊗tT ≥ 5/6,

and |ψ_i⟩^⊗tT can be distinguished with confidence 4/5, then |φ_i⟩^⊗T can be distinguished with confidence 2/3 by the same measurement. From Eq. (6), when

t = (1/(8ε)) (1/(4T))^{1/3},   (10)

the condition is satisfied, hence Eq. (9) holds. Combining Eqs. (10) and (9), we derive that

tT ≥ (1/8) (1/4)^{1/3} (ln 2/48)^{2/3} (1/ε) (d/ln d)^{2/3} ≥ 0.004 (1/ε) (d/ln d)^{2/3}.

Hence there is a lower bound on the query complexity, which is

Ω((1/ε) (d/ln d)^{2/3}).  □
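The arithmetic behind Eq. (8) and the final constant 0.004 can be checked numerically (our own sketch):

```python
import math

# the constant (1/8)(1/4)^(1/3)(ln 2/48)^(2/3) in front of (1/eps)(d/ln d)^(2/3)
const = (1 / 8) * (1 / 4) ** (1 / 3) * (math.log(2) / 48) ** (2 / 3)
assert const > 0.004

# Eq. (8): (2d)^(4T) >= 2^(d/6) rearranges to T >= (ln 2/24) d/(ln d + ln 2);
# since ln 2 <= ln d for d >= 2, this is at least (ln 2/48) d/ln d.
for d in (3, 10, 1000):
    T_exact = (math.log(2) / 24) * d / (math.log(d) + math.log(2))
    assert (2 * d) ** (4 * T_exact) >= 2 ** (d / 6) * (1 - 1e-9)  # equality up to rounding
    assert T_exact >= (math.log(2) / 48) * d / math.log(d)
```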

In the above analysis, the lower bound on the query complexity is improved by approximating |ψ_i⟩^t with more terms of Eq. (4). This suggests a strategy for further improvement: use more and more terms of Eq. (4) to construct |φ_i⟩, as discussed in the next section.


3. A lower bound on query complexity

We start this section with a lemma about the tails of the binomial distribution [17], which will be used in the proof of the main result of this paper.

Lemma 2. Consider a sequence of n Bernoulli trials, where success occurs with probability p. Let X be a random variable denoting the total number of successes. Then for 0 ≤ k ≤ n, the probability of at least k successes is

Pr{X ≥ k} = ∑_{i=k}^{n} b(i; n, p) ≤ (n choose k) p^k.

Proof. For S ⊆ {1, 2, …, n}, let N_S denote the event that the i-th trial is a success for every i ∈ S. Clearly Pr{N_S} = p^k if |S| = k. We have

Pr{X ≥ k} = Pr{there exists S ⊆ {1, 2, …, n} with |S| = k such that N_S occurs}
          = Pr{ ⋃_{S ⊆ {1,2,…,n}: |S|=k} N_S }
          ≤ ∑_{S ⊆ {1,2,…,n}: |S|=k} Pr{N_S}
          = (n choose k) p^k.  □
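Lemma 2's tail bound can also be checked exactly over a small grid (our own sketch):

```python
import math

def tail(n, k, p):
    """Pr{X >= k} for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

ok = True
for n in range(1, 15):
    for k in range(0, n + 1):
        for p in (0.05, 0.2, 0.5, 0.9):
            # the union bound of Lemma 2: Pr{X >= k} <= C(n, k) p^k
            ok = ok and tail(n, k, p) <= math.comb(n, k) * p ** k + 1e-12
assert ok
```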

We now state and prove the main result of this paper.

Theorem 2. Let C be any concept class of VC dimension d + 1. Any quantum algorithm with a QEX(c, D) oracle which learns a concept belonging to C with ε accuracy and 4/5 confidence must have quantum query complexity

Ω((1/ε) d^{1−e}),

for an arbitrarily small constant e > 0.

Proof. As in the proof of Theorem 1, let {x_0, x_1, …, x_d} be a set of inputs which can be shattered by C. We consider the same distribution D and the same concepts c_1, …, c_N given in the proof of Theorem 1, where N > 2^{d/6}. Then

Pr_D[c_i(x) ≠ c_j(x)] ≥ (8ε/d) · (d/4) = 2ε.

Hence, the quantum PAC learning algorithm should successfully distinguish between any two concepts c_i and c_j with confidence 4/5, i.e., it can distinguish each quantum state generated by QEX(c, D) with confidence 4/5, where the quantum states are

|ψ_i⟩ = √(1 − 8ε) |x_0, 0⟩ + ∑_{j=1}^{d} √(8ε/d) |x_j, c_i(x_j)⟩,

for i = 1, …, N.

As in the proof of Theorem 1, we will construct a quantum state |φ_i⟩ which is close to |ψ_i⟩^t such that if the states |ψ_i⟩^{tT} can be distinguished, so can the states |φ_i⟩^T, where t will be decided later. Then, by providing a lower bound on the number of copies of |φ_i⟩ needed to distinguish them from each other, we can derive a lower bound on the number of copies of |ψ_i⟩.

From Eq. (4),

|ψ_i⟩^t = ∑_{k=0}^{t} (1 − 8ε)^{(t−k)/2} (8ε)^{k/2} |μ_k(i, t)⟩,   (11)

for i = 1, …, N, where μ_k(i, t) has the same definition as in the previous section. Then we define the quantum states |φ_i⟩ as an approximation of |ψ_i⟩^t,

|φ_i⟩ = ∑_{k=0}^{s−1} (1 − 8ε)^{(t−k)/2} (8ε)^{k/2} |μ_k(i, t)⟩,   (12)

where s = 2/e. Then

⟨ψ_i|^t |φ_i⟩ ≥ ∑_{k=0}^{s−1} (t choose k) (8ε)^k (1 − 8ε)^{t−k} ≥ 1 − (t choose s) (8ε)^s.   (13)

The last inequality comes from the bound on the tail of the binomial distribution given in Lemma 2:

∑_{i=k}^{n} b(i; n, p) ≤ (n choose k) p^k.
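As with Eq. (6), inequality (13) can be verified numerically for small s (our own sketch):

```python
import math

def truncated_overlap(t, s, p):
    """sum_{k=0}^{s-1} C(t,k) p^k (1-p)^{t-k}, the overlap of Eq. (13) with p = 8*eps."""
    return sum(math.comb(t, k) * p ** k * (1 - p) ** (t - k) for k in range(s))

ok = True
for s in (2, 3, 4, 5):
    for t in range(s, 40):
        for eps in (0.001, 0.005, 0.02):
            p = 8 * eps
            # Eq. (13): the truncated sum is at least 1 - C(t, s) p^s
            ok = ok and truncated_overlap(t, s, p) >= 1 - math.comb(t, s) * p ** s - 1e-12
assert ok
```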

Consider the scenario in which the states |φ_i⟩ can be distinguished with confidence 2/3 using T copies. Since each amplitude of |μ_k(i, t)⟩ is a d-variate polynomial of degree at most k, each amplitude of |φ_i⟩ is a d-variate polynomial of degree at most s, for any i. Hence there exists a set of d-variate polynomials P_i of degree 2sT, for i = 1, …, N, such that P_i(z_j) equals the probability of the outcome associated with |φ_i⟩ when the actual quantum state is |φ_j⟩^⊗T, where z_j = (c_j(x_1), c_j(x_2), …, c_j(x_d)). Through an analysis similar to that given in the proof of Theorem 1, we know that if the states |φ_i⟩ can be distinguished with confidence 2/3 using T copies, then

N_T = (d + 2sT choose 2sT) ≥ N ≥ 2^{d/6},

where N_T is the dimension of the linear space spanned by the d-variate polynomials of degree at most 2sT. So,

T ≥ (ln 2/24s) · d/ln d.   (14)

Moreover, if

⟨φ_i|^⊗T |ψ_i⟩^⊗tT ≥ 5/6,   (15)

and |ψ_i⟩ can be distinguished with confidence 4/5 using tT copies, then |φ_i⟩^⊗T can be distinguished with confidence 2/3 by the same measurement. In order to satisfy Eq. (15), by Eq. (13) it is enough to let

t = (1/(8ε)) (1/(6T))^{1/s}.   (16)

Combining Eqs. (16) and (14), the lower bound on the query complexity is

T t ≥ (1/8) (1/6)^{1/s} (ln 2/24s)^{1−1/s} (1/ε) (d/ln d)^{1−1/s} = Ω((1/ε) (d/ln d)^{1−1/s}).

Since s = 2/e and (d/ln d)^{1−e/2} = Ω(d^{1−e}), the number of quantum queries is at least T t = Ω((1/ε) d^{1−e}).  □

In [7], the authors proved another lower bound for the

number of queries in the quantum PAC learning model, which is Ω((1/ε) log(1/δ)). Combining it with Theorem 2, we have the following theorem.

Theorem 3. Let C be any concept class of VC dimension d. Any quantum algorithm with a QEX(c, D) oracle that learns a concept belonging to C with ε accuracy and 1 − δ confidence must have quantum query complexity

Ω((1/ε)(d^{1−e} + log(1/δ))),

for an arbitrarily small constant e > 0.
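A numeric illustration (our own) of why the (d/ln d)^{1−e/2} bound from the proof of Theorem 2 dominates the stated d^{1−e} for large d; here e = 0.2, and the crossover point grows as e shrinks:

```python
import math

e = 0.2
margins = []
for d in (1e20, 1e30, 1e40):
    lhs = (d / math.log(d)) ** (1 - e / 2)  # what the proof gives (up to constants)
    rhs = d ** (1 - e)                      # what the theorem states
    margins.append(lhs / rhs)
# every margin exceeds 1 at these sizes, so the stated bound is the weaker one
```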

Acknowledgements

We are grateful to Rocco A. Servedio and Joseph F. Traub, Columbia University, for their very helpful discussions and comments. This work has been supported in part by the National Science Foundation.

References

[1] P. Shor, Algorithms for quantum computation: discrete logarithms and factoring, in: Proc. 35th IEEE Symp. on Foundations of Computer Science, 1994.
[2] L. Grover, A fast quantum mechanical algorithm for database search, in: Proc. 28th ACM Symp. on the Theory of Computing, 1996.
[3] D. Deutsch, R. Jozsa, Rapid solution of problems by quantum computation, Proc. R. Soc. Lond. Ser. A 439 (1992) 553.
[4] E. Bernstein, U. Vazirani, Quantum complexity theory, SIAM J. Comput. 26 (5) (1997).
[5] N.H. Bshouty, J.C. Jackson, Learning DNF over the uniform distribution using a quantum example oracle, SIAM J. Comput. 28 (3) (1999) 1136–1153.
[6] S. Gortler, R.A. Servedio, Equivalences and separations between quantum and classical learnability, SIAM J. Comput. 33 (6) (2004).
[7] A. Atici, R.A. Servedio, Improved bounds on quantum learning algorithms, Quantum Inf. Process. 4 (5) (2005).
[8] L.G. Valiant, A theory of the learnable, Commun. ACM 27 (11) (1984) 1134–1142.
[9] M.A. Nielsen, I.L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.
[10] A.Y. Kitaev, A.H. Shen, M.N. Vyalyi, Classical and Quantum Computation, Grad. Stud. Math., vol. 47, 2002.
[11] M.J. Kearns, U.V. Vazirani, An Introduction to Computational Learning Theory, MIT Press, 1994.
[12] A. Ehrenfeucht, D. Haussler, M. Kearns, L. Valiant, A general lower bound on the number of examples needed for learning, Inform. and Comput. 82 (3) (1989) 247–251.
[13] A. Blumer, A. Ehrenfeucht, D. Haussler, M.K. Warmuth, Learnability and the Vapnik–Chervonenkis dimension, J. ACM 36 (4) (1989) 929–965.
[14] R. Beals, H. Buhrman, R. Cleve, M. Mosca, R. de Wolf, Quantum lower bounds by polynomials, in: Proc. 39th IEEE Symp. on Foundations of Computer Science, 1998, pp. 352–361.
[15] J.H. Van Lint, Introduction to Coding Theory, Springer-Verlag, 1992.
[16] R.A. Horn, C.R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
[17] T.H. Cormen, C.E. Leiserson, R.L. Rivest, C. Stein, Introduction to Algorithms, MIT Press, 2001.