
SPARSE REPRESENTATIONS FOR IMAGE CLASSIFICATION USING QUANTUM D-WAVE 2X MACHINE

Nga Nguyen (1), Amy Larson (1), Carleton Coffrin (2), and Garrett Kenyon (1,3)
(1) CCS-3 and (2) A-1, Los Alamos National Laboratory; (3) New Mexico Consortium

D-Wave Debrief, LANL, April 27, 2017

OUTLINE

A. SPARSE CODING REPRESENTATIONS
B. IMPLEMENTATION ON A D-WAVE MACHINE
C. SPARSE CODING FOR OBJECT DETECTION
D. SUMMARY AND FUTURE WORK


A. SPARSE CODING REPRESENTATIONS

Solving a sparse-coding (SC) problem

Objective function is of the form: (reconstruction error) + (Lp-sparseness penalty)

• non-convex problem
• NP-hard class
• for p = 0, the sparseness penalty is the L0-norm

Olshausen and Field, Nature 381, 607 (1996)
Rozell, Johnson, Baraniuk, and Olshausen, Neur. Comp. 20, 2526 (2008)

[Figure courtesy of D-Wave]
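A common form of the sparse-coding objective, consistent with the "reconstruction error" and "Lp-sparseness penalty" labels and with the cited formulations, can be written as follows. The symbols (input image I, feature dictionary Φ, activation vector a, trade-off parameter λ) are assumptions for illustration, not notation taken from the slide.

```latex
E = \min_{\mathbf{a}} \Big[
      \underbrace{\tfrac{1}{2}\,\lVert \mathbf{I} - \boldsymbol{\Phi}\,\mathbf{a} \rVert_2^2}_{\text{reconstruction error}}
      \;+\;
      \underbrace{\lambda\,\lVert \mathbf{a} \rVert_p}_{L_p\text{-sparseness penalty}}
    \Big]
```

For p = 0 the penalty counts the nonzero entries of a, which is the case that makes the problem combinatorial.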


[Figure: an example of SC reconstruction, courtesy of Xinhua Zhang]

B. IMPLEMENTATION ON A D-WAVE MACHINE

SC ON A QUANTUM D-WAVE MACHINE

Mapping the sparse-coding problem onto a Quadratic Unconstrained Binary Optimization (QUBO) problem:

D-Wave Hamiltonian: [equation not recovered]

This mapping is achieved by the relations: [equations not recovered]

analogous to an L0-sparseness penalty [Nguyen and Kenyon, PMES-16 (2016)]
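One way to make the mapping concrete, assuming binary activations a_i in {0, 1} and an objective of the form 1/2 ||I - Φa||² + λ ||a||_0: expanding the square and using a_i² = a_i gives linear coefficients h_i = -φ_i·I + λ + 1/2 φ_i·φ_i and couplings Q_ij = φ_i·φ_j. These relations follow from that assumed objective rather than from the slide itself, and the function names below are hypothetical. The sketch builds h and Q with NumPy and solves a tiny instance by exhaustive search instead of on hardware.

```python
import itertools
import numpy as np

def sc_to_qubo(phi, image, lam):
    """Return (h, Q) with E(a) = h.a + sum_{i<j} Q_ij a_i a_j, up to a constant."""
    gram = phi.T @ phi                               # feature overlaps phi_i . phi_j
    h = -phi.T @ image + lam + 0.5 * np.diag(gram)   # linear (bias) terms
    Q = np.triu(gram, k=1)                           # pairwise couplings, i < j only
    return h, Q

def qubo_energy(a, h, Q):
    return h @ a + a @ Q @ a

def sc_objective(a, phi, image, lam):
    residual = image - phi @ a
    return 0.5 * residual @ residual + lam * np.count_nonzero(a)

# Small random instance (assumed sizes): 16 pixels, 6 unit-norm features.
rng = np.random.default_rng(0)
phi = rng.standard_normal((16, 6))
phi /= np.linalg.norm(phi, axis=0)
image = rng.standard_normal(16)
lam = 0.1

h, Q = sc_to_qubo(phi, image, lam)
const = 0.5 * image @ image                          # a-independent offset dropped by the QUBO

# Exhaustive search stands in for the annealer on this 6-variable problem.
states = [np.array(s, dtype=float) for s in itertools.product((0, 1), repeat=6)]
best = min(states, key=lambda a: qubo_energy(a, h, Q))
```

Because the QUBO energy differs from the sparse-coding objective only by the constant 1/2 I·I, the state that minimizes one also minimizes the other.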


C. SPARSE CODING FOR OBJECT DETECTION

DATASET: CIFAR-10 (airplane, automobile, ship, truck)

32x32 images, 24x24 patches, edge detection

Features

8 hand-designed features: “row” and “column”

orthogonality → number of features

[Figure: original and reconstructed (recon) images]

Desire: randomly generated features
• to fulfill the Chimera orthogonality, apply the Gram-Schmidt algorithm
• the way the features are generated defines the architecture of the mapping
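A minimal sketch of how random features could be made Chimera-orthogonal, under the assumption that "Chimera orthogonality" means φ_i·φ_j = 0 whenever qubits i and j share no physical coupler (so the corresponding QUBO coupling vanishes). It uses a single K_{4,4} unit cell and a Gram-Schmidt-style projection; the function names and sizes are hypothetical.

```python
import numpy as np

def unit_cell_adjacency(m=4):
    """Couplers of one K_{m,m} Chimera unit cell: left qubits couple only to right qubits."""
    n = 2 * m
    adj = np.zeros((n, n), dtype=bool)
    adj[:m, m:] = True
    adj[m:, :m] = True
    return adj

def chimera_orthogonalize(features, adj):
    """Orthogonalize column i against every earlier column j that has no coupler to i."""
    phi = np.array(features, dtype=float)
    for i in range(phi.shape[1]):
        uncoupled = [j for j in range(i) if not adj[i, j]]
        v = phi[:, i]
        if uncoupled:
            basis = phi[:, uncoupled]
            # Remove the component of v that lies in the span of the uncoupled features.
            coeffs, *_ = np.linalg.lstsq(basis, v, rcond=None)
            v = v - basis @ coeffs
        phi[:, i] = v / np.linalg.norm(v)
    return phi

rng = np.random.default_rng(0)
adj = unit_cell_adjacency()
phi = chimera_orthogonalize(rng.standard_normal((576, 8)), adj)  # 24x24 pixels, 8 features
gram = phi.T @ phi
```

In this sketch the order in which features are processed, together with the adjacency matrix, fixes which overlaps survive, which is one reading of "the way the features are generated defines the architecture of the mapping".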

Building features

24x24 patch images (airplane, automobile, ship, truck)

8 and 32 features; stride: 2, 4
[Figure: original images with reconstructions "recon 8x12x12" and "recon 32x6x6"]

32 and 1152 features; stride: 24, 4
[Figure: original images with reconstructions "recon 1152x1x1" and "recon 32x6x6"]

1100 active qubits, 3068 coupling strengths
overcomplete order: [value not recovered]
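The feature counts and strides quoted for the 24x24 patches can be checked with a little arithmetic. The sketch below assumes that a feature is placed at every stride-spaced position, so that positions per side = patch size / stride, and that the pairings are 8 features at stride 2, 32 features at stride 4, and 1152 features at stride 24. The function name is hypothetical.

```python
def num_neurons(patch_size, stride, num_features):
    """Neurons needed when each feature is replicated at every stride-spaced position."""
    positions_per_side = patch_size // stride
    return num_features * positions_per_side * positions_per_side

layouts = {
    "8 features, stride 2": num_neurons(24, 2, 8),         # 8 x 12 x 12
    "32 features, stride 4": num_neurons(24, 4, 32),       # 32 x 6 x 6
    "1152 features, stride 24": num_neurons(24, 24, 1152), # 1152 x 1 x 1
}
```

Under these assumptions all three layouts need the same number of neurons, 1152, which is the nominal qubit count of a 12x12 Chimera lattice of 8-qubit cells and close to the 1100 active qubits quoted.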

[Figure: energy versus ensemble (1 to 5) for images A, B, and C; GS energy versus images; energy versus ensembles]

CLASSIFICATION RESULTS

12x12 patch images; overcomplete order: 8; stride: 2; 24x24, Nf = 288

[Figure: original images and reconstructions (recon) for airplane, automobile, ship, truck]

class    accur. (binary)
air      89.21%
auto     93.38%
bird     90.87%
cat      89.42%
deer     94.71%
dog      88.94%
frog     87.98%
horse    89.9%
ship     89.9%
truck    85.58%

Classification task: SVM (liblinear), 1042 training / 208 test images

Nguyen and Kenyon, PMES-16 (2016)

COMPARISON WITH A CLASSICAL SOLVER

So far, quantum computation (D-Wave 2X) has NOT outperformed its classical counterpart (GUROBI). Both are comparable.

We already made the problem hard. We need to make it harder.

How can we make the SC problem harder for both?

From the SC perspective: the more overcomplete the problem, the harder it is to solve… Meanwhile, the full Chimera graph in D-Wave offers a certain set of (nearest-neighbor) connections…


EMBEDDING technique

• Embedding exploits the ability to tie qubits together
• Employ all bipartite couplings
• Small number of nodes (qubits) but more couplings for neurons

[Figure: 5x5]

In practice (D-Wave 2X):
• Fully connected: 48 and 49 nodes on DW2X and DW2X_VFYC, respectively
• Partially orthogonal: 72 nodes
• Feature optimization!
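As a toy illustration of tying qubits together, the sketch below embeds three fully connected neurons onto a hypothetical 4-qubit bipartite cycle, which contains no triangle. Neuron A is represented by a chain of two qubits held equal by the penalty s(x0 + x1 - 2 x0 x1). The graph, the weights, and the chain strength s are illustrative assumptions, and exhaustive search replaces the annealer.

```python
import itertools

def energy(x, h, Q):
    """QUBO energy: sum_i h_i x_i + sum_(i,j) Q_ij x_i x_j."""
    return (sum(h[i] * x[i] for i in h)
            + sum(w * x[i] * x[j] for (i, j), w in Q.items()))

def ground_state(h, Q):
    """Exhaustive search; only feasible for a handful of variables."""
    names = sorted(h)
    states = (dict(zip(names, bits))
              for bits in itertools.product((0, 1), repeat=len(names)))
    return min(states, key=lambda x: energy(x, h, Q))

# Logical problem: three mutually coupled neurons A, B, C (a triangle).
h_log = {"A": -1.0, "B": -1.0, "C": 0.5}
Q_log = {("A", "B"): 2.5, ("A", "C"): -1.0, ("B", "C"): -0.5}

# Hypothetical hardware graph: cycle q0-q1-q2-q3-q0 (bipartite).
# Neuron A becomes the chain {q0, q1}; B maps to q2; C maps to q3.
chain_strength = 4.0
h_phys = {
    "q0": h_log["A"] / 2 + chain_strength,  # A's bias is split over the chain
    "q1": h_log["A"] / 2 + chain_strength,
    "q2": h_log["B"],
    "q3": h_log["C"],
}
Q_phys = {
    ("q0", "q1"): -2 * chain_strength,      # chain coupler tying q0 and q1
    ("q1", "q2"): Q_log[("A", "B")],
    ("q0", "q3"): Q_log[("A", "C")],
    ("q2", "q3"): Q_log[("B", "C")],
}

logical = ground_state(h_log, Q_log)
physical = ground_state(h_phys, Q_phys)
decoded = {"A": physical["q0"], "B": physical["q2"], "C": physical["q3"]}
```

The chain gives neuron A access to couplers on both of its physical qubits, which is the sense in which a small number of logical nodes gains more couplings. The price is that each chain consumes extra qubits.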



STARTING TO SEE SOMETHING GOOD…

No. of Hamiltonians: 1

problem                                   GUROBI (best classical solver)    D-Wave 2X (ISING)
                                          Energy     Time                   Energy     Time
47 nodes: fully connected                 -27.84     ~ 300 seconds          -27.84     < 60 seconds
72 nodes: partially Chimera-orthogonal    -48.476    30 min                 -51.294    few seconds

STARTING TO SEE SOMETHING GOOD…

No. of Hamiltonians: 1

problem                                   GUROBI (best classical solver)    D-Wave 2X (ISING)
                                          Energy     Time                   Energy     Time
47 nodes: fully connected                 -27.84     ~ 300 seconds          -27.84     < 60 seconds
70 nodes: partially Chimera-orthogonal    -43.251    ~2000 seconds          -43.251    < 60 seconds

Feature Learning (in progress)

Feature optimization: stochastic gradient descent

{ given a set of neuron activities generated by D-Wave 2X, do:
    for iteration
        for mini_batch    %[1:size(sampling)]
            %update weights
        end
    end
end }

[Figure: 5x5 features, before… and …after feature optimization]

many “lazy” features
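The weight update itself is not spelled out in the loop above. A minimal sketch, assuming the usual gradient step on the reconstruction error, ΔΦ ∝ (I - Φa) aᵀ, averaged over each mini-batch, is given below. The binary activations are random stand-ins for samples that would come from the D-Wave 2X, and all names, sizes, and hyperparameters are hypothetical.

```python
import numpy as np

def learn_features(phi, images, activations, lr=0.05, batch_size=10,
                   iterations=20, seed=0):
    """Mini-batch SGD on 0.5 * ||images - phi @ activations||^2 with activations held fixed."""
    rng = np.random.default_rng(seed)
    phi = phi.copy()
    num_samples = images.shape[1]
    for _ in range(iterations):                      # "for iteration"
        order = rng.permutation(num_samples)
        for start in range(0, num_samples, batch_size):   # "for mini_batch"
            batch = order[start:start + batch_size]
            x, a = images[:, batch], activations[:, batch]
            residual = x - phi @ a
            phi += lr * (residual @ a.T) / len(batch)     # "update weights"
    return phi

def reconstruction_error(phi, images, activations):
    return 0.5 * np.sum((images - phi @ activations) ** 2)

# Synthetic stand-in data: 5x5 patches (25 pixels), 16 features, 200 samples.
rng = np.random.default_rng(1)
true_phi = rng.standard_normal((25, 16))
activations = (rng.random((16, 200)) < 0.2).astype(float)  # placeholder for annealer samples
images = true_phi @ activations

phi_before = rng.standard_normal((25, 16))
phi_after = learn_features(phi_before, images, activations)
```

A feature whose activation is never 1 in the samples receives no gradient in this scheme and stays at its initial value, which is one possible reading of the "lazy" features noted above. Renormalizing each feature after an update is a common addition that this sketch leaves out.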

…THE UNEXPECTED

GENERATING FEATURES: imprinting technique

• randomly generated features
• randomly sampled imprinting features

Does this enhance the “hardness”?
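A minimal sketch of imprinting, assuming the term means that each feature is copied from a randomly sampled image patch and then normalized, in contrast to features drawn from a random number generator. The function name, patch size, and feature count are hypothetical, and random arrays stand in for the edge-detected images.

```python
import numpy as np

def imprint_features(images, num_features, patch=5, seed=0):
    """Build a (patch*patch, num_features) dictionary from randomly sampled image patches."""
    rng = np.random.default_rng(seed)
    num_images, height, width = images.shape
    features = np.empty((patch * patch, num_features))
    for k in range(num_features):
        i = rng.integers(num_images)                 # random image
        r = rng.integers(height - patch + 1)         # random top-left corner
        c = rng.integers(width - patch + 1)
        v = images[i, r:r + patch, c:c + patch].reshape(-1).astype(float)
        features[:, k] = v / (np.linalg.norm(v) + 1e-12)  # unit norm, guarding blank patches
    return features

rng = np.random.default_rng(0)
images = rng.random((10, 24, 24))                    # placeholder for 24x24 patch images
imprinted = imprint_features(images, num_features=47)
random_features = rng.standard_normal((25, 47))
random_features /= np.linalg.norm(random_features, axis=0)
```

Imprinted features inherit the correlations of the data, so their pairwise overlaps, and hence the QUBO couplings, are generally less uniform than those of purely random features.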

…THE GREAT! UNEXPECTED

Imprinting technique

Energy: -129.533 and -131.14 (cutoff)
Time: < 60 seconds on D-Wave 2X (ISING) versus ~ 9 hours

Sparse coding

Feature learning




100% adaptive features


D. SUMMARY

• first demonstration of sparse coding using a quantum computer
• mapping of visual features to D-Wave 2X Chimera
• benchmark results on standard image classification task
• compare D-Wave 2X performance with GUROBI
• obtained solutions to the problems where D-Wave 2X significantly outperforms GUROBI

work in progress…

[Figure: CIFAR-10 images (airplane, automobile, ship, truck), 32x32 and 30x30, edge and color]

D. (IN PROGRESS &) FUTURE WORK

• optimize features
• add colors
• hierarchy model
• TrueNorth comparison
