Deep belief nets experiments and some ideas. Karol Gregor NYU/Caltech


Outline

DBN Image database experiments

Temporal sequences

Deep belief network

Input → H1 → H2 → H3 → Labels

Fine-tuned with backprop
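The stack above (Input → H1 → H2 → H3) is pretrained greedily: each layer is trained as an RBM on the hidden activities of the layer below, before supervised fine-tuning. A minimal numpy sketch of CD-1 and layer-wise stacking (the `RBM` class and `pretrain_stack` are illustrative names, not code from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Minimal binary RBM trained with one step of contrastive divergence (CD-1)."""
    def __init__(self, n_vis, n_hid, lr=0.1):
        self.W = rng.normal(0, 0.01, (n_vis, n_hid))
        self.b_vis = np.zeros(n_vis)
        self.b_hid = np.zeros(n_hid)
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hid)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_vis)

    def cd1_step(self, v0):
        # positive phase: hidden probabilities given the data
        h0 = self.hidden_probs(v0)
        # negative phase: one Gibbs step (sample h, reconstruct v, re-infer h)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # approximate gradient of the log-likelihood
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
        self.b_vis += self.lr * (v0 - v1).mean(axis=0)
        self.b_hid += self.lr * (h0 - h1).mean(axis=0)

def pretrain_stack(data, layer_sizes, epochs=5):
    """Greedy layer-wise pretraining: each layer's hidden probabilities
    become the 'data' for the next RBM in the stack (H1 -> H2 -> H3)."""
    stack, x = [], data
    for n_hid in layer_sizes:
        rbm = RBM(x.shape[1], n_hid)
        for _ in range(epochs):
            rbm.cd1_step(x)
        stack.append(rbm)
        x = rbm.hidden_probs(x)
    return stack
```

After pretraining, a label layer is attached on top and the whole stack is fine-tuned with backprop, as the slide indicates.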

Preprocessing – Bag of words of SIFT

Images → Features (using SIFT) → Group them (e.g. K-means) → Bag of words

        Image1  Image2
Word1     23      11
Word2     12      55
Word3     92      33
…          …       …

With: Greg Griffin (Caltech)
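The bag-of-words step above can be sketched in a few lines: assign each local descriptor to its nearest codeword and count occurrences per image. A hedged numpy sketch (the function name is illustrative; the codebook is assumed to come from K-means over descriptors pooled from many images):

```python
import numpy as np

def bag_of_words(descriptors, codebook):
    """Turn a variable-size set of local descriptors (e.g. SIFT vectors)
    into a fixed-length histogram over a learned codebook."""
    # squared distance between every descriptor and every codeword
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)  # index of the nearest codeword
    return np.bincount(words, minlength=len(codebook))
```

Each image then contributes one column of counts, as in the Word/Image table above.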

13 Scenes Database – test error and train error (plots)

- Pre-training on a larger dataset
- Comparison to SVM, SPM

Explicit representations?

Compatibility between databases

Pretraining: Corel database
Supervised training: 15 Scenes database

Temporal Sequences

Simple prediction

X = inputs at times t-1, t-2, t-3;  Y = input at time t;  W = weights mapping X to Y

Supervised learning
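One minimal reading of this supervised setup: learn W by least squares so that the concatenated inputs at t-1, t-2, t-3 predict the input at t. A sketch under that assumption (`fit_predictor` is an illustrative name, not code from the slides):

```python
import numpy as np

def fit_predictor(seq, order=3):
    """Least-squares fit of W so that W applied to the concatenation
    [x_{t-1}, x_{t-2}, x_{t-3}] approximates x_t."""
    # one row of features per predictable time step
    X = np.stack([np.concatenate(seq[t - order:t][::-1])
                  for t in range(order, len(seq))])
    Y = np.stack(seq[order:])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W  # shape (order * dim, dim)
```

For data generated by an exact linear recurrence, the fit recovers the recurrence and prediction error is essentially zero; hidden units (next slide) are needed when no such linear map suffices.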

With hidden units (need them for several reasons)

X = inputs at t-1, t-2, t-3;  Y = input at t;  H = hidden units at t;  G = prediction of H from H at t-1, t-2, t-3

E = Σ_{ijk} W^{XYH}_{ijk} X_i Y_j H_k + Σ_{jk} W^{YH}_{jk} Y_j H_k + Σ_j W^Y_j Y_j + Σ_k W^H_k H_k

Memisevic, R. F. and Hinton, G. E., Unsupervised Learning of Image Transformations. CVPR-07
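The energy above can be evaluated directly: the three-way term lets the past input X gate the interaction between the current input Y and the hidden units H. A sketch assuming all vectors are observed (`gated_energy` is an illustrative name; sign conventions for energies vary across papers):

```python
import numpy as np

def gated_energy(x, y, h, W_xyh, W_yh, w_y, w_h):
    """Energy of a gated model in the style of Memisevic & Hinton (CVPR-07):
    a three-way tensor term X_i Y_j H_k plus pairwise and bias terms."""
    return (np.einsum('i,j,k,ijk->', x, y, h, W_xyh)   # Σ_ijk W^{XYH} X Y H
            + np.einsum('j,k,jk->', y, h, W_yh)        # Σ_jk  W^{YH} Y H
            + w_y @ y                                  # Σ_j   W^Y Y
            + w_h @ h)                                 # Σ_k   W^H H
```

Inference of H given X and Y (and of Y given X and H) is tractable because the energy is bilinear in each vector once the others are fixed.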

Example

pred_xyh_orig.m

Additions

X = input at t-1;  Y = input at t;  H = hidden units at t;  G = prediction of H from H at t-1

E = Σ_{ijk} W^{XYH}_{ijk} X_i Y_j H_k + Σ_{jk} W^{YH}_{jk} Y_j H_k + Σ_j W^Y_j Y_j + Σ_k W^H_k H_k

Sparsity: When inferring the H the first time, keep only the largest n units on

Slow H change: After inferring the H the first time, take H=(G+H)/2
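Both additions are one-liners on the hidden vector; a sketch with illustrative function names (the slides specify the operations, not this code):

```python
import numpy as np

def sparsify_top_n(h, n):
    """Sparsity: keep only the n largest hidden activations, zero the rest."""
    out = np.zeros_like(h)
    idx = np.argsort(h)[-n:]   # indices of the n largest units
    out[idx] = h[idx]
    return out

def smooth_h(h_new, g):
    """Slow H change: average the freshly inferred H with the
    temporal prediction G, i.e. H = (G + H) / 2."""
    return 0.5 * (g + h_new)
```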

Examples

pred_xyh.m

present_line.m

present_cross.m

(Diagram: cortical hierarchy with the Hippocampus at the top; input from the senses, e.g. the eye through the retina and LGN; output to the muscles through sub-cortical structures.)

e.g. See: Jeff Hawkins: On Intelligence

Cortical patch: complex structure (not a single-layer RBM)

From Alex Thomson and Peter Bannister, (see numenta.com)

Desired properties

A B C D E F G H J K L    E F H

1) Prediction

2) Explicit representations for sequences

("VISION RESEARCH" presented letter by letter over time)

3) Invariance discovery (e.g. complex cell: the same unit stays on as the input varies over time)

4) Sequences of variable length

("VISION", "RESEARCH" over time)

5) Long sequences

Layer1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15    1 1 2 3 5 8 13 21 34 55 89 144
Layer2: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1  ? ?  2 2 2 2 2 2 2 2 2 2

6) Multilayer

("VISION RESEARCH" over time)

- Inferred only after some time

7) Smoother time steps

8) Variable speed

- Can fit a knob with a small speed range

9) Add a clock for actual time


In Addition

- Top-down attention
- Bottom-up attention
- Imagination
- Working memory
- Rewards

Training data

- Videos
  - Of the real world
  - Simplified: cartoons (e.g. The Simpsons)
- A robot in an environment
  - Problem: hard to grasp objects
- An artificial environment with 3D objects that are easy to manipulate (e.g. Grand Theft Auto IV with objects)