Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Int. J. Appl. Math. Comput. Sci., 2019, Vol. 29, No. 1, 139–150DOI: 10.2478/amcs-2019-0011
RECOMMENDATION SYSTEMS WITH THE QUANTUM K–NN AND GROVERALGORITHMS FOR DATA PROCESSING
MAREK SAWERWAIN a,∗, MAREK WRÓBLEWSKI a
aInstitute of Control and Computation EngineeringUniversity of Zielona Góra, Licealna 9, 65-417 Zielona Góra, Poland
e-mail: M.Sawerwain,[email protected]
In this article, we discuss the implementation of a quantum recommendation system that uses a quantum variant of thek-nearest neighbours algorithm and the Grover algorithm to search for a specific element in an unstructured database.In addition to the presentation of the recommendation system as an algorithm, the article also shows the main steps inconstruction of a suitable quantum circuit for realisation of a given recommendation system. The computational complexityof individual calculation steps in the recommendation system is also indicated. The verification of the correctness ofthe proposed system is analysed as well, indicating an algebraic equation describing the probability of success of therecommendation. The article also shows numerical examples presenting the behaviour of the recommendation system fortwo selected cases.
Keywords: quantum k-NN algorithm, recommendation systems, Grover algorithm, big data.
1. Introduction
The development of quantum computational methods(Nielsen and Chuang, 2010; Galindo and Martin-Delgado,2002) allows their use in areas such as machine learning(Wiebe et al., 2015), decisions models (Busemeyer andBruza, 2012) or big data (Nielsen, 2016), as well asin classification tasks (Sergioli et al., 2017; Santucci,2017; Montanaro, 2017). The classical methods ofanalysing data sets of big data are widely used, butthe use of quantum computers that allow the processingof exponential amounts of data seems to be extremelyimportant in this area (Stefanowski et al., 2017; Velosoet al., 2015). In addition, due to the features ofquantum information, the lack of the possibility of cloninginformation and direct comparison of two quantumregisters, the construction of such a system requires aslightly different approach. In this article we show thatthe recommendation process can be based on the methodof k-nearest neighbours classification (Hechenbichler andSchliep, 2004). Such an approach allows us to create asystem where the effectiveness of the recommendationcan be very high.
In the article, we discuss the construction of
∗Corresponding author
a recommendation system based on two quantumcomputing algorithms. The first one is the quantumalgorithm of k-nearest neighbours (termed quantumk-NN) (Pinkse et al., 2013; Schuld et al., 2014;Wiebe et al., 2015), based on the method presentedby Trugenberger (2002). In this algorithm, we usethe Hamming distance to classify the elements tobe recommended (Wisniewska and Sawerwain, 2018).Grover’s algorithm (Grover, 1996; Arikan, 2003; Galindoand Martin-Delgado, 2000) is used to improve the qualityof recommendations. The use of both the methodsallows the construction of a recommendation algorithmwhich is characterized in the recommendation process bybetter computational complexity than classical approaches(Alpaydin, 2004; Armbrust et al., 2010). However, itshould be emphasized that the described solution, likeother quantum algorithms (Shor, 1999), is probabilistic,although the probability of indicating recommendedelements is very high.
The solution proposed in this work can also be basedon the classifiers proposed by Santucci (2017), Sergioliet al. (2018) and Chakrabarty et al. (2017). However, theuse of the k-nearest neighbours algorithm and the Groveralgorithm, due to the relative simplicity of circuit designfor these algorithms allows us to expect that their physical
140 M. Sawerwain and M. Wróblewski
implementation of Q Experience (IBM, 2018) is possibleat the moment. In addition, the proposed method, due toits direct translation into the language of quantum gates,allows providing its exact computational complexity in theprocess of recommendation. Additionally, owing to theuse of the Grover algorithm, it is possible to achieve ahigh probability of indicating the right elements. In thefurther part of the work, the value of this probability isdescribed by the value Pmax.
It should be noted that this article is an extensionof a previous publication (Sawerwain and Wróblewski,2019). The current paper presents more preciselythe structure of the quantum register and numericalexamples showing the behaviour of the recommendingsystem. The computational complexity of individual stepsimplemented in the recommendation process was alsoindicated.
The organization of the article is as follows: InSection 2, selected issues of quantum computing, suchas quantum register and superpositions, are presented.These concepts characteristic for quantum computingare directly applied in the discussed recommendationsystem. In Section 3, we discuss the construction of therecommendation system, present the algorithm of conductand discuss the construction of the quantum circuit whichimplements the discussed solution. The correctness ofthe recommendation system, i.e., the description of theprobability of indicating the correct element (or elements),is presented in Section 3.4. Numerical experimentsare also implemented and discussed in Section 4. Theexperiment shows the behaviour of the system in twocases, when one and two elements are recommended.The summary of the article is given in Section 5, wherethe final conclusions and some comments about possiblefurther work on the method described in this article aregiven.
2. Brief introduction to quantuminformation processing
The concept of a classic bit, which is the basic unit ofinformation, can be extended to the qubit for quantumsystems (Walther et al., 2005; Steane, 1998). For thispurpose, we consider a two-dimensional Hilbert space H2
and indicate its orthonormal basis:
|0〉 =[
10
], |1〉 =
[01
]. (1)
In the last equation, Dirac’s notation is used. Thevector |0〉 usually means zero. The natural thing is that |1〉represents one. The given 1 base is also called a standardcomputation base.
In quantum information, the notion of a qubit isthe equivalent of the classical bit. The qubit state is
represented by a vector in the two-dimensional Hilbertspace H:
|ψ〉 = α|0〉+ β|1〉. (2)
This vector is normalized, so that |α|2 + |β|2 = 1,where α, β ∈ C, the set of complex numbers. Thepresented state |ψ〉 is called a vector state or a pure state.
Following the classic computer science,concatenation of multiple classical bits creates aregister. The |ψ〉 register with n qubits is built using thetensor product
|ψ〉 = |ψ0〉 ⊗ |ψ1〉 ⊗ |ψ2〉 ⊗ · · · ⊗ |ψn−1〉, (3)
where ⊗ is the tensor product operation.
Remark 1. (Quantum entanglement) It should be addedthat there are cases where the register cannot be written inthe form of a tensor product. This state of the register iscalled an entangled state.
It should also be noted that, in addition to thedescription in the form of pure states, the so-calledunknown density matrices (the pure state of qubit |ψ〉)taking the following form are also used:
ρ = |ψ〉〈ψ| =[α2 αβαβ β2
], (4)
where the vector 〈ψ| signifies the transpose of |ψ〉.In general, the representation for a mixed state for
which only pure states are included, |ψi〉, takes thefollowing form:
ρ =∑i
λi|ψi〉〈ψi|, (5)
where λi means the probability of the state |ψi〉 and∑i λi = 1.
Remark 2. (Exponential capacity of a quantum regis-ter) One of the main differences between the classicand quantum registers is the exponential capacity of thelatter. The amount of classical information contained inthe quantum register described by the state vector canbe described by 2n, where n is the number of qubits,which makes classical simulation of the quantum registerineffectual using classical computations methods. Thereare, of course, special cases such as the so-called CHPcircuits (Aaronson and Gottesman, 2004), but in generalthe quantum register requires exponential computationalresources to simulate the operation of the quantum registerusing a classical machine. This situation is much worsewhen we using a density matrix, as their size is dn × dn.
The state vector for 16 qubits needs 512 kB ofmemory, and 256 MB for a density matrix, assumingthat we use double precision floating point numbersdescribing individual amplitudes of probability. However,
Recommendation systems with the quantum k-NN and Grover algorithms . . . 141
doubling the number of qubits to 32 due to the exponentialcapacity of the quantum register already requires 32 GBof available memory, and 1 ZB (zetta bytes) for the densitymatrix.
Several additional operations can be performed onthe quantum register. As part of this brief introduction,we will only quote the most important examples ofoperations, i.e., unitary and measurement ones.
The realization of a unitary operation of U for thequantum state given by the vector is represented by theequation
U |ψ0〉 = |ψ1〉. (6)
For a density matrix, the unitary operation U is describedby the following relationship:
Uρ0U† = ρ1. (7)
A very important thing is the method of creating a unitaryoperation for the quantum register. In the case of statemodifications for the first and the third qubit, the unitaryunit construction takes the following form:
U = u1 ⊗ I ⊗ u2. (8)
The unitary operation is reversible (also called anuncompute one), i.e., for the operation U you can alwaysenter the operation U †, where the symbol † represents theHermitian adjoint operation. If U = U †, then U is calledas a self-adjoint operation.
The basic set of unit operations includes the so-calledPauli gates (operators): X , Y , Z . Their matrixrepresentation is
X =
(0 11 0
), Y =
(0 −ii 0
),
Z =
(1 00 −1
).
(9)
The Hadamard gate
H =1√2
(1 11 −1
)(10)
is also very important because it is used to introduce theso-called superposition for quantum states. For example,let |Ξ〉 be an n-qubit state
|Ξ〉 = |ξ0〉 ⊗ |ξ1〉 ⊗ · · · ⊗ |ξn−2〉 ⊗ |ξn−1〉. (11)
Using the |Ξ〉 register, you can easily show how theuse of a single Hadamard gate works:
H |Ξ〉 = H(|ξ0〉 ⊗ |ξ1〉 ⊗ · · · ⊗ |ξn−2〉 ⊗ |ξn−1〉)= H |ξ0〉 ⊗H |ξ1〉 ⊗ · · · ⊗H |ξn−2〉 ⊗H |ξn−1〉
=
n−1⊗i=0
H |ξi〉 = 1√2n
[n−1⊗i=0
(|0〉+ (−1)ξi |1〉)].
(12)
The use of the Hadamard gateway results in theamplitude values being equal to the absolute value 1/
√2n
for all the states taken by the specified quantum register.We should be also pay attention to the construction of
controlled gates, whose construction requires, in additionto the tensor product, the use of a direct sum of matricesand also uses the projection matrix.
An example may be one of the possibleimplementations of the CNOT gate (controlled negationgate) for qubits:
U = |000〉〈000|I + |010〉〈010|X + |111〉〈111|I. (13)
This gate performs a qubit negation if the so-called controlqubit takes the state one.
The second type of the basic operation is theso-called measurement operation. We will present onlyone example of this type of operation: a von Neumanntype measurement. It begins with the preparation ofobservables. The operator is presented in the followingway:
M =∑i
λiPi, (14)
where Pi is a projector onto the operator’s eigenspace Mwith the eigenvalue λi.
The results of the measurement performed representthe eigenvalues λi, where the individual results occur withthe probability
P (λi) = 〈ψ|Pi|ψ〉. (15)
The obtained result λi indicates that the register |ψ〉has been transformed to
|ψ′〉 = Pi|ψ〉√λi
, (16)
with the probability determined by Eqn. (15).
3. Quantum recommendation system
This section presents the construction of a quantumrecommendation system. An example of a databaseis indicated from which the elements chosen by userswill be recommended. The quantum register is alsodescribed in Section 3.1. Section 3.2 describes thenecessary calculation steps for the implementation of thequantum recommendation algorithm. The construction ofa quantum circuit is also shown in Section 3.3. Analysisof the correctness of the proposed algorithm is given inSection 3.4. An analytical description of the probability ofsuccess of indicating the recommended element (or manyelements) is presented.
142 M. Sawerwain and M. Wróblewski
3.1. Database and quantum register structure. Theproposal for building a quantum recommendation system(QRS) presented in this article is hybrid. The databasefrom which items are recommended based on the user’ssuggestions and needs is naturally stored in the classicalsystem. An example of a classical database on the basisof which a quantum recommendation system can be builtis the IMDB movie database (OMDb, 2018). Figure 1shows several records selected from the database, with afield describing the feature of the particular movie.
It should be emphasized that there is no need to storethe entire database into the quantum system. Usuallyonly two columns of data are relevant: the identifier ofrecommended elements and the element’s feature savedas a binary word. The length of words that are used todescribe the identifier and characteristics is important. It isassumed that these will be binary numbers: the identifierwill be described with q bits, while the feature with lbits. Usually the entire classic database can be dividedinto sections referring to elements with common features.Based on this, you can create a quantum register thatcontains information about the database. A representationof this register and its possible division are shown inFig. 1. If we use the entire database in this case, we haveonly one register, or we can divide the whole register intosub-registers with common features, e.g., all historicalfilms would be collected in one register.
In view of the remarks about the form of the databaseand the structure of the quantum register, we distinguishfour main parts:
|Ψ〉 = |ψq〉 ⊗ |ψff 〉 ⊗ |ψfu〉 ⊗ |ψk〉. (17)
The first q qubits of the registry |Ψ〉, i.e., |ψq〉 areused to encode the identifier of the recommended element.The next part of the |ψff 〉 register describes the features ofindividual database elements. They refer naturally to theidentifier of the individuals items. The l qubits are used todescribe the feature, because thanks to the superpositionproperties we shall create 2l various features in thedatabase containing 2q classical elements. The thirdpart of |ψfu〉 is a description of the feature desired bythe user. Once again, l qubits are used, although wedescribe only one feature desired by the user. Again bythe superposition principle, applying only l operations(the feature is described by a binary number with a widthof l bits), we connect information about user featureswith the rest of the database. The last part, denoted bythe |ψk〉 register, represents additional/auxiliary qubits.More precisely we only need one qubit for the k-NNquantum part and one additional qubit for the Groveramplification. We point out where there qubits areused later in Section 3.3. Section 3.4 presents ananalytical analysis, and the success probability of therecommendation is described by Pmax.
Table1.
Sam
plestructure
ofthe
classicdatabase.
Itshouldbe
notedthatin
thequantum
partofthe
databaseonly
two
columns
areused:
identifier
(“Id”)and
descriptionof
thefeature
(“Feature.”)T
heindicated
fields
arealso
them
ainkeys
within
theclassicalbase.
IdYear
Title
Tim
eFeatu
reLan
guage
Ratin
gVotes
Type
Featu
rein
bits
8312004
Vendetta:
NoCon
science,
NoMercy
112min
Crim
e,Dram
a,Action
English
4.5239
movie
1101001100001010001090
1976TheBig
Bus
88min
Com
edy,
Action
English
5.62165
movie
0111001011000000001597
1940Flash
Gord
onCon
quers
theUniverse
220min
Action
,Sci-F
i,Adven
ture
English
7.0819
movie
1011001011101000002533
1999Batm
anBeyon
d:TheMovie
132min
Anim
ation,Sci-F
i,Action
English
7.83201
movie
0101101011101011002624
1958Kings
GoForth
109min
Dram
a,War,
Action
English
,Fren
ch,Germ
an6.6
909movie
1100001101101010002977
1987Innersp
ace120
min
Action
,Adven
ture,
Com
edy
English
6.738517
movie
1011001001100110004540
2012Aftersh
ock
89min
Action
,Adven
ture,
Horror
Span
ish,English
4.87531
movie
1011001001101110005267
2005TheProp
hecy
:Uprisin
g88
min
Thriller,
Action
,Fan
tasyEnglish
,Rom
anian
5.41603
movie
1100101011000000005898
2000Epicen
ter102
min
Action
,Adven
ture,
Crim
eEnglish
3.01059
movie
1011001001101100006334
2002Escap
efrom
Afgh
anistan
90min
Action
,War
English
3.0141
movie
1011001101100000007775
2000U-571
116min
Action
,War
English
,Germ
an6.6
62229movie
1011001101100000008364
1973Yaad
onKiBaaraat
164min
Dram
a,Rom
ance,
Action
Hindi
7.41267
movie
1100001001001010008430
1994TheKungFuMaster
45min
Action
Can
tonese
7.545
series101100000000000000
86431974
TheAren
a83
min
Action
,Adven
ture
English
5.4650
movie
1011001001100000009110
2003TheForeign
er96
min
Action
,Thriller
English
,Dan
ish,Germ
an,Polish
3.25272
movie
1011001100100000009299
2001American
Outlaw
s94
min
Action
,Western
English
6.011659
movie
10110010101000000012510
1994ISpyRetu
rns
95min
Adven
ture,
Com
edy,
Action
English
5.2236
movie
100110011100101000
Recommendation systems with the quantum k-NN and Grover algorithms . . . 143
Quantum
Musical
Fantasy
Comedy
Biography
History
Documentary
Romance
Adventure
jQDi =
jqd0i =
jqd1i =
jqd2i =
jqd3i =
jqd22i =
jqd23i =
jqd24i =
News 0,000%Music 1,718%
Game-Show 0,025%Talk-Show 0,017%Reality-TV 0,083%
Sport 0,108%Animation 4,846%
Family 1,029%Mystery 0,573%Western 0,647%Action 14,464%Sci-Fi 0,622%Drama 21,923%Thriller 1,129%Crime 5,817%War 0,133%
Horror 4,771%Adult 0,199%
8>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>:
9>>>>>>>>>>>>>>>>>>>>>>>>=
>>>>>>>>>>>>>>>>>>>>>>>>;
jqd25i =
100%
0,531%
0,531%
23,865%
3,203%
0,100%
8,489%
0,647%
4,531%
Database
Fig. 1. Partition of the main quantum registry into subregis-ters in the proposed recommendation system. It is as-sumed that individual sub-registers contain entries withthe same leading feature.
Remark 3. (Quantity of recommendations) In thedescription of the registry structure, we have providedonly one feature for the user |ψfu〉; of course, you can addadditional user features, e.g., |ψfu1
〉, |ψfu2〉, . . . , |ψfuK
〉,to compute a K-th recommendation. However, in theremainder of this article, for simplicity, we limit ouranalysis to a single feature.
3.2. Main QRS algorithm. The proposed approachto the recommendation system is based on two mainstages. The first one points the database elementswhose features are closest to those indicated by the user.For this purpose, the quantum version of the k-nearestneighbours algorithm is used. Then, to amplify theeffectiveness of the recommendation, Grover’s algorithmis used. The individual computational steps are presentedas Algorithm 1.
The algorithm can also be represented as a controlflow in the proposed approach to the recommendationsystem. The corresponding diagram is shown inFig. 2. Because the algorithm is based on measurementoperations, an additional qubit called c0 has beenintroduced in the quantum register. Performing the
Algorithm 1. Quantum recommendation system.Successive steps in the quantum recommendation system:
(I) Creation of a database; in this step we use |ψq〉 and|ψff 〉 registers.
(II) The user determines which features recommendedelements should have; information is represented bythe register |ψfu〉.
(III) For a given feature, the appropriate sub-register (or awhole quantum register) is selected representing therelevant part of the database; data are encoded in thestate of register |ψff 〉.
(IV) A recommendation process is performed using aquantum algorithm of k-nearest neighbours; tworegisters are used, |ψfu〉 and |ψff 〉, and we also useone auxiliary qubit from |ψk〉 called |c0〉.
(Va) If the state of qubit |c0〉 after measurement is|0〉, then the obtained probability distribution ofthe recommended elements from Step (IV) can beamplified by Grover’s algorithm to improve theprobabilistic properties of the best recommendedelements; only the state of register |ψfu〉 is amplified.Additionally, we use one auxiliary qubit from |ψk〉called |qA〉. We can jump to Step (VI).
(Vb) If the state of qubit |c0〉 after measurement is |1〉, thenwe uncompute the q-kNN part and Step (IV) must berepeated.
(VI) Performing the second measurement on the quantumregister representing the database. Finally, therecommended element will be obtained (i.e., theelement that will have the highest probability ofmeasurement).
measurement operation on the qubit c0 allows us todetermine whether the registry has successively beenconverted to the correct probability distribution. Themeasurement result c0 assuming the state |0〉 meansthat the quantum register has been transformed into aproper state, and its probability distribution for the nextmeasurement will indicate a recommended element.
3.3. Quantum circuit for QRS and computa-tional complexity. Figure 3(a) shows the generalstructure of the quantum circuit implementing thequantum recommendation system. Three samplecircuits implementing the most important stages of therecommendation system are also given. The examples arebased on a 16-element database shown in Fig. 3(b).
Figure 3(d) shows the database initialization process,
144 M. Sawerwain and M. Wróblewski
START
STOP
User feature vector
Selection of
Perform quantum k-NN
Feature amplitude
amplification
Measure ofc0 qubit
Result is j1i
Result is j0i
Measure subregister j ff i
database subregister
Result jfii
uncomputequantum k-NN
Fig. 2. Control flow of the proposed the quantum recommenda-tion system.
which consists of three stages. The first one createsdatabase identifiers, and for this purpose it applies exactlyl Hadamard gates. The next stage is the encoding offeatures: unfortunately, it requires a larger amount ofoperation, which depends on the number of elements inthe database.
In general, after creating identifiers in the |ψq〉register, a permutation of elements must be performed inthe register |ψff 〉. Permutation generally requires no morethan N(N − 1)/2 single qubits and controlled gates (Liet al., 2013), with N = 2q. Unfortunately, the processof creating a database will not be significantly better thanthe creation of a classic database, although the structure offeature codes may be helpful in the process of creating adatabase. This can also be seen in the presented example,where six operators of the controlled negation are enough(the given relation defines the upper limit, therefore for16 elements, the maximum number of gates is 120) togenerate 16 database entries. The third stage, due to thesuperposition principle, will require only l negation gatesto build the user’s feature. Computational complexity in
the O notation of this stage can be written as
O1(l, N) = 2l+N(N − 1)/2. (18)
However, as shown in the correctness analysis inSection 3.4, you do not need to repeat the processof building the entire database. Circuit (d) presentsthe process of calculating the Hamming distance andthe so-called sum of distances to make a classification.The recommendation process is completed after themeasurement has been made by auxiliary qubits c0 if theresult measurement is |0〉; further details are presented inSection 3.4. The number of operations to be performedwhen calculating the Hamming distance (d) is linear anddepends on the width of the feature, i.e., the number ofclassic bits l describing the feature:
O2(l) = 3l+ 2. (19)
Remark 4. (Complexity of the quantum k-NN part)Particular attention should be paid to the fact that thecomputational complexity of this step depends only onthe width of the pattern, unlike in the classical methods,where the number of rows should be taken into accountfor the analysis of computational complexity.
The construction of the amplifying amplitude circuitfor one or several amplitudes with the Grover algorithmdepends on the form of the feature described by the user.Figure 3(e) shows the amplification of one amplitude forthe sample database shown in Fig. 3(b). The number andarrangement of the negation gates in the oracle section arean exact mapping of the feature specified by the user. Thedefinition of an oracle can be simplified by using onlynegation on the bits that are encoded, which encode themain feature, e.g., a historical movie can be encoded withthe oldest or the youngest bit. However, it can be assumedthat the feature requires the use of the largest number ofgates, defined as
O3(l) = 7l + 2c+ 3, (20)
where l is the width of the feature, and cmeans the numberof gates after decomposing the negation gate in the oraclestage of the Grover algorithm and in the controlled gateZ in the part performing the rotation around the average.Decomposition of controlled gates can be done accordingto Barenco et al. (1995) or Shende and Markov (2009).However, because we have l wide feature then finally wecan obtain polynomial complexity (in addition, we stilloperate on 2q elements of the classical data).
All computational complexity is naturally dominatedby the database creation process, but the calculationprocess of the recommendation depends only on the widthof the feature, and not on the amount of data in thedatabase.
Recommendation systems with the quantum k-NN and Grover algorithms . . . 145
Fig. 3. Quantum circuit for the discussed recommendation system. Part (a) presents three main elements of the recommendationprocess: preparation of the database, implementation of a quantum algorithm and amplifying the recommended element (withindicated circuits representing individual stages of the recommendation algorithm). Part (b) presents a system of recommendeditems that were used to construct sample circuits (column H shows the Hamming distance between the element and the examplefeature 101011 defined by the user). Part (c) represents the initialization part of the database. Part (d) shows an example ofimplementation of the main part of the quantum k-NN algorithm: Hamming distance calculation and the so-called quantumsumming of the distance. Part (e) represents the operation of amplifying amplitudes for the user’s feature, i.e., 101011.
3.4. Correctness of the quantum recommendationsystem. We will begin the analysis of the correctnessof the quantum recommendation system by defining theinitial state of a quantum register during the first stage,i.e., preparation of the database (for the clarity of ouranalysis, the qubits describing the record identifier |ψq〉are omitted):
|ψ0〉 = |0〉⊗2l+1. (21)
The first l qubits represent features from the databasewhile the next l qubits represent the user feature vector.Initializing the sub-register with the feature database givesthe following state:
|Ψ1〉 = 1√L
L∑p=1
|rp1 , . . . , rpl 〉, (22)
146 M. Sawerwain and M. Wróblewski
where L = 2l, and rpi , i = 1, . . . , l, is a bit descriptionof the features of a selected row in the database. Afterentering the form of the user’s feature vector, the state ofthe quantum register is given as
|Ψ2〉 = 1√L
(L∑
p=1
|rp1 , . . . , rpl 〉)
⊗ |t1, . . . , tl〉 (23)
⊗ 1√2(|0〉+ |1〉),
where |rpi 〉, i = 1, . . . , l, represents features from thedatabase and |ti〉, i = 1, . . . , l, represents the userfeatures.
In the next step, we can calculate the Hammingdistance between the user’s feature and those stored inthe database. Responsible for distance calculation is thequantum circuit in Fig. 3(d), which is described by thefollowing unitary operation U :
U = e−i π2l H , (24)
H =
[1 00 0
]⊗l
⊗ Il×l ⊗[
1 00 −1
].
The operation U can be decomposed as the set ofsingle qubit gates
P1 =
[e−i π
2l 00 1
], (25)
and the controlled gate
P2 = |0〉〈0|+ |1〉〈1| ⊗ P1−2. (26)
A further discussion about decomposition of U canbe found in the work of Trugenberger (2002). Asuitable quantum circuit which implements operationU isdepicted as the circuit in Fig. 3(d) as “Quantum summingof distance.”
The use of the U operation described by Eqn. (24)results in the state
|ψ4〉 = 1√L
L∑p=1
(ei
π2ld(t,r
p)|dp1, . . . , dpl 〉
⊗|t1, . . . , tl〉 ⊗ |0〉+ e−i π2ld(t,r
p)|dp1, . . . , dpl 〉⊗|t1, . . . , tl〉 ⊗ |1〉) , (27)
where |dpi 〉, i = 1, . . . , l, represents the quantum registerwhich possesses information about the distance between afeature from the database and the user feature, and d(t, rp)represents the value of the Hamming distance between thefeatures. The performing of a Hadamard operation on theadditional qubit c0 allows us to obtain the description ofthe final state for the stage related to the quantum k-NN
algorithm:
|ψ5〉 = 1√L
L∑p=1
(cos
( π2ld(t, rp)
)|dp1, . . . , dpl 〉
⊗|t1, . . . , tl〉 ⊗ |0〉+ sin
( π2ld(t, rp)
)|dp1, . . . , dpl 〉
⊗|t1, . . . , tl〉 ⊗ |1〉) .
(28)
The probability of measuring state zero on qubit c0,i.e., of success in obtaining the desired probability withthe correct indication of the recommended elements, is
P (c0) =1
L
∑p
cos2( π2ld(t, rp)
). (29)
If we have measured the state |1〉 on qubits c0, thenrestoring the state |ψ4〉 is necessary. We can do this byperforming an inverse operation to U given by Eqn. (24)i.e., U †. It will require the same amount of work asthe operation U . This operation restores the state ofthe registry to the state before the Hamming distancecalculation and summing. In this way, we avoid the costlyprocess of rebuilding the database.
If, by simplifying, we denote by P the probabilityof receiving a particular recommendation, then obtaininga probability distribution with the accuracy of ε requiresgenerally O(P × (1 − P ) × 1/ε2) repetitions of theimplementation of a quantum algorithm of k-nearestneighbours. It should be emphasized, however, thatusing classical multiprocessor solutions we can use manyquantum machines to solve the same task, for example,to obtain linear calculation time of the probabilitydistribution for the exponential amount of data L,described by Eqn. (22).
Remark 5. (Improving the distribution of the prob-ability of success) The distribution of the probabilityobtained is not satisfactory if the user’s designated featuredetermines the choice of one compatible item or a fewitems from a very close neighbourhood. Therefore, usingthe Grover algorithm significantly improves the finalprobability distribution of the obtained recommendation(Brassard and Hoyer, 1997).
Therefore, assuming that zero was obtained bymeasuring the state of c0, we get the state of
|ψ6〉 = 1√L
L∑p=1
mp|ψrcmd〉, (30)
where |ψrcmd〉 = |dp1, . . . , dpl 〉 ⊗ |t1, . . . , tl〉 ⊗ |0〉 and mp
is represented by the probability amplitudes obtained afterthe measurement. It is possible to emphasize g amplitudes
Recommendation systems with the quantum k-NN and Grover algorithms . . . 147
for recommended elements denoted as mrp and also L− g
amplitudes with lower compatibility denoted by mnrp :
|ψ7〉 = 1√L
(g∑
p=1
mrp|ψrcmd〉+
L∑p=g+1
mnrp |ψrcmd〉
).
(31)The average for amplitudes and variance for probabilityamplitudes for state |ψ7〉 take the following form basedon the work of (Biham et al., 1999):
mr(t) =1
g
g∑p=1
mrp(t),
mnr(t) =1
g
L∑p=g+1
mnrp (t),
σ2nr(t) =
1
L− g
L∑p=g+1
|mnrp (t)−mnr(t)|2,
(32)
where t is the iteration number commonly referred toas the execution time of the Grover algorithm, fort = 0; of course, the initial values of the distributionof probability amplitudes are known (Biham et al.,1999). The highest probability of measuring the existingrecommended elements from the database is described as
Pmax = 1− (L − g)σ2nr
− 1
2
((L− g)|mnr(0)|2 + g|mr(0)|2
)
+
(1
2|(L− g)mnr(0)
2+ gmr(0)
2|),
(33)
with the O(√L/g) iteration.
4. Numerical experiments
The given algebraic relations describing the states ofthe quantum register after particular stages of theimplementation of Algorithm 1 allow us to give numericalexamples presenting the behaviour of the recommendingsystem. Figure 4 presents probability distributionsshowing the state of the registry for a database of 16elements. There have been cases where one element f15or two elements f6 and f13 are amplified in the databasethat match the user’s expectations
Regardless of the case, Fig. 4(a) shows theprobability distribution for the database after theexecution of the k-NN part. In both the cases we haveno explicit indication of the element nearest to the userfeature, although, for the eligible feature, for example,f10, we have the highest probability of measuring thisstate. However, since the difference in the Hammingdistance for the other features was not significantlylarge, the other elements also have a high probability ofmeasurement. If we stop the process of recommendation
at this stage, we should repeat the implementation of Steps(I)–(IV) from Algorithm 1 and the measurement operationto collect the results with the probability described byEqn. (29).
It should also be emphasized that in Figs. 4(b) and(c) we describe the case of a recommendation of only oneelement. However, the quality of the amplification willbe good for two or more elements that will be indicatedby a user-specified characteristic, which is presented inFigs. 4(e) and (f). It should be emphasized that the sum ofprobabilities of recommended elements is determined byEqn. (33).
Remark 6. (Use of Grover’s algorithm) At this moment,one can ask whether instead of using k-NN we can employonly the Grover algorithm to amplify the recommendedelement or more elements. The answer is affirmative onlyif there is an element or elements with the same vectorof features as the user’s requirements. If, however, thereis no such element, then the Grover algorithm will notbe able to amplify elements similar to those defined bythe user. It should be added that you do not have tospecify that we are interested in amplify only one elementthat fully complies with the user-defined characteristic.You can specify that we amplify the k-NN result for aleading user-specified characteristic (e.g., historical films,and the idea of such a gain is shown in Figs. 4(f)–(h)).Therefore, the combination of two quantum k-NNalgorithms and amplitude amplification using the Groveralgorithm allows building a recommendation system.
According to Remark 6, it is reasonable to usethe Grover algorithm to amplify the probability for theelement (elements) indicated by k-NN. For t > 1we already have a very high probability (close to themaximum theoretical value) to measure the recommendedelement. However, even for smaller values of t it amplifiesthe desired effect, while maintaining the probabilitydistribution obtained after the quantum part k-NN. Thisalso means shortening the circuit’s operational time,because you do not have to do additional iterations duringthe amplification with the help of the Grover algorithm.
The legitimacy of the application of the Groveralgorithm is also shown the cases described byFig. 4 (g)–(i) where the user’s feature does not preciselyindicate a single element or two, but the whole group. Theamplification for the group also retains the differences inprobabilities for individual elements. Additionally, thebigger group of amplified amplitudes makes the valueof time smaller for other cases where only one or twoelements were recommended. In the case when thereis no recommendation, the initial probability distributionobtained after executing the k-NN part will be preserved,which is presented in the example described Fig. 4(j)–(l).
148 M. Sawerwain and M. Wróblewski
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.02
0.04
0.06
0.08
0.10
Probability
Probability distribution after qk-NN part
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.000
0.025
0.050
0.075
0.100
0.125
0.150
0.175
Probability
Probability distribution after Grover iteration t=0.25
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.0
0.1
0.2
0.3
0.4
0.5
Probability
Probability distribution after Grover iteration t=1.0
(a) (b) (c)
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.02
0.04
0.06
0.08
0.10
Probability
Probability distribution after qk-NN part
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.000
0.025
0.050
0.075
0.100
0.125
0.150
0.175
Probability
Probability distribution after Grover iteration t=0.25
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.05
0.10
0.15
0.20
0.25
0.30
0.35
0.40
Probability
Probability distribution after Grover iteration t=1.0
(d) (e) (f)
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.02
0.04
0.06
0.08
0.10
Probability
Probability distribution after qk-NN part
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.02
0.04
0.06
0.08
0.10
0.12
Probability
Probability distribution after Grover iteration t=0.05
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.02
0.04
0.06
0.08
0.10
0.12
0.14
Probability
Probability distribution after Grover iteration t=0.2
(g) (h) (i)
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.02
0.04
0.06
0.08
0.10
Probability
Probability distribution after qk-NN part
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.02
0.04
0.06
0.08
0.10
Probability
Probability distribution after Grover iteration t=0.25
f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15
Labels for features
0.00
0.02
0.04
0.06
0.08
0.10
Probability
Probability distribution after Grover iteration t=1.0
(j) (k) (l)
Fig. 4. Distribution of probability in the quantum register (for a database with 16 elements) at three main stages in the proposedrecommendation system. The probability after the stage of database initialization is not shown because of an even probabilitydistribution for each case. In this case each feature fi has a probability of 1/16. Cases (a), (d), (g) and (j) represent the quantumstate after the execution of the part realisations of the k-NN algorithm. Cases (b), (c) and (e), (f) show the process of amplifyingthe amplitude of one (feature f10) and two recommended elements (f3, f9) in the database using the Grover algorithm. Cases(h), (i) describe a situation when the feature described by the user indicates a larger group of searched elements and Groveramplification affects a group of matching elements (labels from f9 to f15 and labels are grouped for better clarity of plots).Cases (k), (l) present a situation when in a database there are no elements compatible with the user’s feature; in this case theinitial probability distribution is preserved and the Grover algorithm does not amplify any elements.
Recommendation systems with the quantum k-NN and Grover algorithms . . . 149
5. Conclusions
The article presented the structure of the recommendationsystem based on the quantum k-NN and Groveralgorithms. The discussed approach is characterized bybetter computational complexity in the recommendationprocess. However, the construction of the database canbe done only once, at the beginning of the mentionedstage. The process of its construction depends onlyon the amount of data. The probability value of therecommendation’s success was also indicated, showingthat it depends directly on the Hamming distance andamplitude amplification using the Grover algorithm.
One of the next tasks may be a different approachto verifying the correctness operation of the system usingthe quantum predicate system (D’Hondt and Panangaden,2006; Gielerak and Sawerwain, 2010). An attempt toimplement the presented system is also planned to checkthe quality of recommendations on existing systems ofexperimental installations of quantum computing systems(IBM, 2018). They offer access to a quantum registerof 20 qubits. This is not a sufficient number to build alarger system. However, technological advances seem tosoon allow the construction of a system containing over50 qubits, and this size is already available in the system(IBM, 2018).
The sample full database includes over 12,000records. However, in the case of quantum calculations,the quantum register stores an exponential amount ofdata. Therefore, in the case of the OMDB movie databaseand indexes of individual films, we only need 15 qubits:215 > 12, 000. Therefore, if nearby quantum processingsystems offer access to 50 qubits, this number will allowfull implementation of the proposed solution.
Acknowledgment
We appreciate the useful discussions within the Q-INFO group at the Institute of Control and ComputationEngineering of the University of Zielona Góra, Poland.We would like also to thank the anonymous referees fortheir comments on the preliminary version of this paper.The numerical results were obtained using the hardwareand software available at the GPU μ-Lab located at theInstitute of Control and Computation Engineering of theUniversity of Zielona Góra.
References
Aaronson, S. and Gottesman, D. (2004). Improved simulation ofstabilizer circuits, Physical Review A 70(5): 052328, DOI:10.1103/PhysRevA.70.052328.
Alpaydin, E. (2004). Introduction to Machine Learning (Adap-tive Computation and Machine Learning), MassachusettsInstitute of Technology Press, Cambridge, MA.
Arikan, E. (2003). An information-theoretic analysis of Grover’salgorithm, in A.S. Shumovsky and V.I. Rupasov (Eds.),Quantum Communication and Information Technologies,Springer Netherlands, Dordrecht, pp. 339–347.
Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., Katz, R.,Konwinski, A., Lee, G., Patterson, D., Rabkin, A., Stoica,I. and Zaharia, M. (2010). A view of cloud computing,Communications of the Association for Computing Ma-chinery 53(4): 50–58, DOI: 10.1145/1721654.1721672.
Barenco, A., Bennett, C.H., Cleve, R., DiVincenzo, D.P.,Margolus, N., Shor, P., Sleator, T., Smolin, J.A. andWeinfurter, H. (1995). Elementary gates for quantumcomputation, Physical Review A 52(5): 3457–3467, DOI:10.1103/PhysRevA.52.3457.
Biham, E., Biham, O., Biron, D., Grassl, M. and Lidar,D. (1999). Grover’s quantum search algorithm for anarbitrary initial amplitude distribution, Physical Review60(4): 2742–2745, DOI: 10.1103/PhysRevA.60.2742.
Brassard, G. and Hoyer, P. (1997). An exact quantumpolynomial-time algorithm for Simon’s problem, Pro-ceedings of the 5th Israeli Symposium on Theory ofComputing and Systems, Ramat Gan, Israel, DOI:10.1109/ISTCS.1997.595153.
Busemeyer, J. and Bruza, P. (2012). Quantum Models ofCognition and Decision, Cambridge University Press,Cambridge.
Chakrabarty, I., Khan, S. and Singh, V. (2017). Dynamicgrover search: Applications in recommendation systemsand optimization problems, Quantum Information Pro-cessing 16(6): 153, DOI: 10.1007/s11128-017-1600-4.
D’Hondt, E. and Panangaden, P. (2006). Quantum weakestpreconditions, Mathematical Structures in Computer Sci-ence 16(3): 429–451.
Galindo, A. and Martin-Delgado, M. A. (2000). A family ofGrover’s quantum searching algorithms, Physical ReviewA 62(6): 062303, DOI: 10.1103/PhysRevA.62.062303.
Galindo, A. and Martin-Delgado, M.A. (2002). Informationand computation: Classical and quantum aspects,Reviews of Modern Physics 74(2): 347–423, DOI:10.1103/RevModPhys.74.347.
Gielerak, R. and Sawerwain, M. (2010). Generalised quantumweakest preconditions, Quantum Information Processing9(4): 441–449, DOI: 10.1007/s11128-009-0151-8.
Grover, L.K. (1996). A fast quantum mechanical algorithmfor database search, Proceedings of the 28th An-nual ACM Symposium on Theory of Computing,STOC’96, Philadelphia, PA, USA, pp. 212–219, DOI:10.1145/237814.237866.
Hechenbichler, K. and Schliep, K. (2004). Weightedk-nearest-neighbor techniques and ordinal classification,Technical report, Ludwig-Maximilians-UniversitätMünchen, München, https://epub.ub.uni-muenchen.de/1769/1/paper_399.pdf.
IBM (2018). Q Experience, https://quantumexperience.ng.bluemix.net/.
150 M. Sawerwain and M. Wróblewski
Li, C.-K., Roberts, R. and Yin, X. (2013). Decompositionof unitary matrices and quantum gates, InternationalJournal of Quantum Information 11(1): 1350015, DOI:10.1142/S0219749913500159.
Montanaro, A. (2017). Quantum pattern matchingfast on average, Alghoritmica: An InternationalJournal in Computer Science 77(1): 16–39, DOI:10.1007/s00453-015-0060-4.
Nielsen, M. and Chuang, I. (2010). Quantum Computationand Quantum Information: 10th Anniversary Edition,Cambridge University Press, Cambridge.
Nielsen, P. (2016). Big data analytics—a brief researchsynthesis, in L. Borzemski et al. (Eds.), Information Sys-tems Architecture and Technology, Springer InternationalPublishing, Cham, pp. 3–9.
OMDb (2018). Homepage, http://www.omdbapi.com/.
Pinkse, P., Goorden, S., Horstmann, M., Skoric, B. andMosk, A. (2013). Quantum pattern recognition, Con-ference on Lasers and Electro-Optics Europe (CLEOEUROPE/IQEC) and International Quantum ElectronicsConference, Munich, Germany, p. 1–1.
Santucci, E. (2017). Quantum minimum distance classifier, En-tropy 19(12): 659, DOI: 10.3390/e19120659.
Sawerwain, M. and Wróblewski, M. (2019). Applicationof quantum k-nn and Grover’s algorithms forrecommendation big-data system, in L. Borzemskiet al. (Eds.), Information Systems Architecture andTechnology, Springer International Publishing, Cham,pp. 235–244.
Schuld, M., Sinayskiy, I. and Petruccione, F. (2014). Quantumcomputing for pattern classification, in D.-N. Pham andS.-B. Park (Eds.), PRICAI 2014: Trends in ArtificialIntelligence, Springer International Publishing, Cham,pp. 208–220.
Sergioli, G., Bosyk, G.M., Santucci, E. and Giuntini, R.(2017). A quantum-inspired version of the classificationproblem, International Journal of Theoretical Physics56(12): 3880–3888, DOI: 10.1007/s10773-017-3371-1.
Sergioli, G., Santucci, E., Didaci, L., Miszczak, J.A. andGiuntini, R. (2018). A quantum-inspired version of thenearest mean classifier, Soft Computing 22(3): 691–705,DOI: 10.1007/s00500-016-2478-2.
Shende, V. and Markov, I.L. (2009). On the CNOT-cost ofTOFFOLI gates, Quantum Information & Computation9(5): 461–486.
Shor, P. (1999). Polynomial-time algorithms for primefactorization and discrete logarithms on a quantumcomputer, SIAM Review 41(2): 303–332, DOI:10.1137/S0036144598347011.
Steane, A. (1998). Quantum computing, Reportson Progress in Physics 61(2): 117–173, DOI:10.1088/0034-4885/61/2/002.
Stefanowski, J., Krawiec, K. and Wrembel, R. (2017). Exploringcomplex and big data, International Journal of AppliedMathematics and Computer Science 27(4): 669–679, DOI:10.1515/amcs-2017-0046.
Trugenberger, C.A. (2002). Quantum pattern recognition,Quantum Information Processing 1(6): 471–493, DOI:10.1023/A:1024022632303.
Veloso, B., Malheiro, B. and Burguillo, J.C. (2015).A multi-agent brokerage platform for media contentrecommendation, International Journal of Applied Math-ematics and Computer Science 25(3): 513–527, DOI:10.1515/amcs-2015-0038.
Walther, P., Resch, K.J., Rudolph, T., Schenck, E., Weinfurter,H., Vedral, V., Aspelmeyer, M. and Zeilinger, A.(2005). Experimental one-way quantum computing, Na-ture 434(0): 169–176, DOI: 10.1038/nature03347.
Wiebe, N., Kapoor, A. and Svore, K. (2015). Quantumalgorithms for nearest-neighbor methods for supervisedand unsupervised learning, Quantum Information andComputation 15(3–4): 316–356.
Wisniewska, J. and Sawerwain, M. (2018). Recognizing thepattern of binary Hermitian matrices by quantum knnand SVM methods, Vietnam Journal of Computer Science5(3): 197–204, DOI: 10.1007/s40595-018-0115-y.
Marek Sawerwain received the MSc and PhDdegrees at the Faculty of Electrical Engineer-ing, Computer Science and Telecommunications,University of Zielona Góra, Poland, in 2004 and2010, respectively. He is currently with the samefaculty as an assistant professor. His present re-search interests include quantum communicationmethods, quantum computation methods, the the-ory of quantum programming languages, simula-tions of quantum computations models and cre-
ating effective algorithms for solutions based on modern multicore CPUand GPU technology.
Marek Wróblewski received the MSc degree atthe Faculty of Computer, Electrical and ControlEngineering, University of Zielona Góra, Poland,in 2015. He is currently associated with the samefaculty as an assistant lecturer and a PhD student.His present research interests include big datatechnologies, quantum computation, databasesand programming languages.
Received: 20 November 2018Revised: 7 January 2019Accepted: 18 January 2019