
Page 1: Visual Attention and Recognition Through Neuromorphic Modeling of “Where” and “What” Pathways

Michigan State University 1

Visual Attention and Recognition Through Neuromorphic Modeling of “Where” and “What” Pathways

Zhengping Ji
Embodied Intelligence Laboratory
Computer Science and Engineering
Michigan State University, Lansing, USA

Page 2

Outline

- Attention and recognition: the chicken-and-egg problem
- Motivation: brain-inspired, neuromorphic modeling of the brain’s visual pathway
- Saliency-based attention
- Where-What Network (WWN):
  - How to integrate saliency-based attention and top-down attention control
  - How attention and recognition help each other
- Conclusions and future work

Page 3

What is attention?

Page 4

Bottom-up Attention (Saliency)

Page 5

Bottom-up Attention (Saliency)

Page 6

Attention Shifting

Page 7

Attention Shifting

Page 8

Attention Shifting

Page 9

Attention Shifting

Page 10

Spatial Top-down Attention Control

Page 11

Spatial Top-down Attention Control

e.g. pay attention to the center

Page 12

Object-based Top-down Attention Control

Page 13

Object-based Top-down Attention Control

e.g. pay attention to the square

Page 14

Chicken-egg Problem

Without attention, recognition cannot do well: recognition requires attended areas for further processing.

Without recognition, attention is limited: attention relies not only on bottom-up saliency-based cues, but also on top-down object-dependent signals and top-down spatial controls.

Page 15

Problem

Page 16

Challenge

- High-dimensional space
- Background noise
- Large variance: scale, shape, illumination, viewpoint, …

Page 17

Saliency-based Attention (I)

[Diagram: an IHDR tree drives two parts. The boundary detection part maps two visual images to the correct road boundary type for each sub-window (reinforcement learning); the action generation part maps the road boundary type to the correct heading direction (supervised learning). Attention windows Win1–Win6, with errors e1–e6, are placed along the desired path.]

Naïve way: choose the attention window by guessing.

Page 18

Saliency-based Attention (II)

Low-level image processing (Itti, Koch & Niebur, 1998)
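The Itti-Koch model builds a saliency map from center-surround contrasts over several feature channels (intensity, color, orientation). As a toy sketch of the center-surround idea only (a single intensity channel with box filters, not the published pipeline), one might write:

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window at every pixel (edge-padded)."""
    pad = np.pad(img, r, mode='edge')
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = pad[i:i + 2 * r + 1, j:j + 2 * r + 1].mean()
    return out

def toy_saliency(img, center=1, surround=4):
    """Center-surround contrast on one intensity channel, scaled to [0, 1]."""
    contrast = np.abs(box_mean(img, center) - box_mean(img, surround))
    return contrast / (contrast.max() + 1e-12)

# A small bright blob on a dark background pops out as the salient region.
img = np.zeros((20, 20))
img[8:12, 8:12] = 1.0
sal = toy_saliency(img)
r, c = np.unravel_index(sal.argmax(), sal.shape)
print(r, c)   # inside the blob
```

On this toy input the most salient location falls inside the bright blob; the full model would repeat this across scales and channels and combine the resulting conspicuity maps.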

Page 19

Review

- Attention and recognition: the chicken-and-egg problem
- Motivation: brain-inspired, neuromorphic modeling of the brain’s visual pathway
- Saliency-based attention
- Where-What Network (WWN):
  - How to integrate saliency-based attention and top-down attention control
  - How attention and recognition help each other
- Conclusions and future work

Page 20

Biological Motivations

Page 21

Challenge: Foreground Teaching

How does a neuron separate a foreground from a complex background?
- No need for a teacher to hand-segment the foreground
- Fixed foreground, changing background (e.g., during a baby’s object tracking)
- The background weights are averaged out (no effect during neuronal competition)
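The “background weights are averaged out” point can be demonstrated with a toy incremental average: hold the foreground fixed while the background changes every frame, and the neuron’s weight vector retains the foreground pattern while its background entries flatten toward the (uninformative) background mean. All dimensions and counts below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

D = 100                       # total input dimension (pixels)
fg = np.ones(20)              # fixed foreground pattern (first 20 pixels)

w = np.zeros(D)               # the neuron's bottom-up weight vector
for t in range(1, 2001):
    x = rng.uniform(0.0, 1.0, D)   # fresh random background every frame
    x[:20] = fg                    # the foreground does not change
    lr = 1.0 / t                   # incremental averaging
    w = (1.0 - lr) * w + lr * x

print(w[:20].mean())          # ~1.0  : foreground pattern is kept
print(w[20:].std())           # small : background variation averaged out
```

The flat background weights contribute a near-constant term to every neuron’s pre-response, so they have no effect on which neuron wins the competition.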

Page 22

Novelty

- Bottom-up attention: Koch & Ullman, 1985; Itti, Koch & Niebur, 1998; Baker et al., 2001; etc.
- Position-based top-down control: Olshausen et al., 1993; Tsotsos et al., 1995; Mozer et al., 1996; Schill et al., 2001; Rao et al., 2004; etc.
- Object-based top-down control: Deco & Rolls, 2004 (no performance evaluation); etc.
- Our work:
  - Saliency is based on developed features
  - Both bottom-up and top-down control
  - Top-down: either object, position, or none
  - Attention and recognition form a single process

Page 23

ICDL Architecture

[Architecture diagram: Image (40*40) → V1 → V2 → “what” motor and “where” motor. Receptive fields: 11*11 and 21*21. “Where” motor: (r, c), a 40*40 pixel-based representation. Foreground size fixed: 20*20. Connections to the motors are global.]

Page 24

Multi-level Receptive Fields

Page 25

Layer Computation

- Compute the pre-response of cell (i, j) at time t
- Sort: z1 ≥ z2 ≥ … ≥ zk ≥ … ≥ zm
- Only the top-k neurons respond, to keep selectiveness and long-term memory
- The response range is normalized
- Update the local winners
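A minimal sketch of the layer computation above, assuming a normalized inner-product (cosine) pre-response and the rescaling y_i = (z_i - z_{k+1}) / (z_1 - z_{k+1}) for the top-k winners; the slide does not spell out either formula, so both choices are assumptions:

```python
import numpy as np

def layer_response(x, W, k=3):
    """Top-k competition for one layer.

    x: (d,) input; W: (m, d) bottom-up weights, one row per neuron.
    Pre-responses are normalized inner products; only the top-k neurons
    keep a nonzero response, rescaled into (0, 1]."""
    x = x / (np.linalg.norm(x) + 1e-12)
    Wn = W / (np.linalg.norm(W, axis=1, keepdims=True) + 1e-12)
    z = Wn @ x                        # pre-responses z_1 ... z_m
    order = np.argsort(z)[::-1]       # sorted so z[order[0]] is largest
    zk1 = z[order[k]]                 # the (k+1)-th largest pre-response
    y = np.zeros_like(z)
    top = order[:k]
    y[top] = (z[top] - zk1) / (z[order[0]] - zk1 + 1e-12)
    return y

rng = np.random.default_rng(1)
W = rng.standard_normal((10, 5))
y = layer_response(rng.standard_normal(5), W, k=3)
print(np.count_nonzero(y))            # only k neurons respond
```

Because all non-winners are held at zero, the sparse response vector both selects which neurons learn (selectiveness) and shields the other neurons’ weights from change (long-term memory).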

Page 26

In-place Learning Rule

- Do not use back-prop:
  - Not biologically plausible
  - Does not give long-term memory
- Do not use any distribution model (e.g., Gaussian mixture): avoid the high complexity of the covariance matrix
- New Hebbian-like rule, with automatic plasticity scheduling: only winners update
- Minimum error toward the target in every incremental estimation stage (local first principal component)
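In the spirit of the rule above, an in-place update can be sketched as follows: only the winning neuron moves its weight vector toward the response-weighted input, with a scheduled learning rate. The three-section amnesic schedule and its parameters here are illustrative assumptions, not the slide’s exact formula:

```python
import numpy as np

def inplace_update(w, n, x, y, t1=20.0, t2=200.0, c=2.0, r=2000.0):
    """Hebbian-like in-place update for one winning neuron.

    w: the neuron's weight vector; n: its firing age (update count);
    x: the input it won on; y: its response.  The amnesic term mu(n)
    keeps the learning rate from shrinking to zero, so the neuron
    stays plastic while older inputs are still remembered."""
    n += 1
    if n < t1:                         # plain incremental averaging
        mu = 0.0
    elif n < t2:                       # linear transition section
        mu = c * (n - t1) / (t2 - t1)
    else:                              # near-constant plasticity
        mu = c + (n - t2) / r
    lr = (1.0 + mu) / n                # scheduled learning rate
    w = (1.0 - lr) * w + lr * y * x    # retention + Hebbian increment
    return w, n

# With a constant winning input and full response, the weights
# converge onto that input.
w, n = np.zeros(4), 0
x = np.array([1.0, 2.0, 3.0, 4.0])
for _ in range(5):
    w, n = inplace_update(w, n, x, 1.0)
print(w)   # [1. 2. 3. 4.]
```

Because each update pulls w toward y*x, the weights incrementally estimate the response-weighted input average, in the direction of a local first principal component, without ever storing a covariance matrix.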

Page 27

Top-down Attention

Recruit & identify class-invariant features

Recruit & identify position-invariant features

Page 28

Experiment

- Foreground objects defined by the “what” motor (20*20)
- Attended areas defined by the “where” motor
- Randomly selected background patches (40*40)

Page 29

Developed Layer 1

Bottom-up synaptic weights of neurons in Layer 1, developed from randomly selected patches of natural images.

Page 30

Developed Layer 2

Bottom-up synaptic weights of neurons in Layer 2.

Not intuitive to interpret!

Page 31

Response Weighted Stimuli for Layer 2

Page 32

Experimental Result I

Recognition rate with incremental learning

Page 33

Experimental Result II

(a) Examples of input images; (b) responses of the attention (“where”) motor when supervised by the “what” motor; (c) responses of the attention (“where”) motor when “what” supervision is not available.

Page 34

Summary

- The “what” motor helps direct the network’s attention to the features of a particular object.
- The “where” motor helps direct attention to positional information (from 45% to 100% accuracy when “where” information is present).
- Saliency-based bottom-up attention, location-based top-down attention, and object-based top-down attention are integrated in the top-k spatial competition rule.

Page 35

Problems

- The accuracy of the “where” motor is not good: 45.53%
- Layer 1 was developed offline
- More layers are needed to handle more positions
- The “where” motor should be given externally, instead of as a retina-based representation
- No internal iterations, especially when the number of hidden layers is larger than one
- No cross-level projections

Page 36

Fully Implemented WWN (Original Design)

[Architecture diagram: Image (40*40) → V1 (40*40) → V2 (40*40) → V4 (40*40); V3, LIP, MT, and PP lead to the “where” motor ((r, c), 25 centers, fixed-size motor); IT (40*40) leads to the “what” motor (4 objects). Receptive fields: 11*11, 21*21, 31*31. Connections to the motors are global.]

Page 37

Problems

- The accuracies of the “where” and “what” motors are not good: 25.53% for the “what” motor and 4.15% for the “where” motor
- Too many parameters to tune
- Training is extremely slow
- How to do the internal iterations:
  - “Sweeping” way: always use the most recently updated weights and responses
  - Always use the weights and responses from iteration p-1, where p is the current iteration number
  - The response should not be normalized within each lateral inhibition neighborhood

Page 38

Modified Simple Architecture

[Architecture diagram: Image (40*40) → V1 → V2 → “what” motor (5 objects) and “where” motor ((r, c), 5 centers, retina-based supervision). Receptive fields: 11*11 and 21*21. Foreground size fixed: 20*20. Connections to the motors are global.]

Page 39

Advantage

- Internal iterations are not necessary
- The network runs much faster
- It is easier to track neural representations and evaluate performance
- Performance evaluation:
  - The “what” motor reaches 100% accuracy on the disjoint test
  - The “where” motor reaches 41.09% accuracy on the disjoint test

Page 40

Problems

[Diagram: total responses = bottom-up responses + top-down responses (top-down projection from the motor).]

Dominance by the top-down projection.

Page 41

Solution

- Sparsify the bottom-up responses by keeping only the local top-k winners of the bottom-up responses
- The performance of the “where” motor increases from around 40% to 91%
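The fix can be sketched as a local top-k filter applied to the bottom-up response map before the top-down term is added, so top-down projections can only amplify positions that already win locally bottom-up. The neighborhood radius and k below are illustrative:

```python
import numpy as np

def sparsify_local_topk(bottom_up, k=1, radius=1):
    """Keep, at each location, only responses that are among the top-k
    within their (2*radius+1)^2 neighborhood; zero out the rest."""
    h, w = bottom_up.shape
    out = np.zeros_like(bottom_up)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            patch = bottom_up[i0:i1, j0:j1].ravel()
            kth = np.sort(patch)[-min(k, patch.size)]
            if bottom_up[i, j] >= kth:
                out[i, j] = bottom_up[i, j]
    return out

resp = np.array([[0.1, 0.9, 0.2],
                 [0.8, 0.3, 0.1],
                 [0.2, 0.1, 0.7]])
sparse = sparsify_local_topk(resp, k=1, radius=1)
print(sparse)   # only the local maxima (0.9 and 0.7) survive
```

After sparsification, a strong top-down projection added to the zeroed positions cannot create a winner out of nothing, which prevents the dominance problem on the previous slide.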

Page 42

Fully Implemented WWN (Latest)

[Architecture diagram: Image (40*40) → V1 (40*35) → V2 (40*40) → V4 (40*40); MT (40*40) leads to the “where” motor ((r, c), 3*3 centers, fixed size 20*20, smoothed by a Gaussian); the “what” motor has 5 objects (smoothed by a Gaussian). Receptive fields: 11*11 and 21*21. Each cortex uses the modified ADAST.]

Page 43

Modified ADAST

[Laminar diagram: previous cortex → L4 → L2/3 → next cortex, with L6 (ranking) and L5 (ranking) modulating the L4 → L2/3 pathway.]

Page 44

Other improvements

- Smooth the external motors using a Gaussian function
- “Where” motors are evaluated by regression error
- Local top-k is adaptive to neuron positions
- The network does not converge by internal iterations
- The learning rate for top-down excitation is adaptive over internal iterations
- Use context information
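Gaussian smoothing of an external motor turns a one-hot supervision vector into a soft bump, so neurons near the correct position or class also receive a (smaller) teaching signal. A minimal sketch, with an illustrative sigma:

```python
import numpy as np

def gaussian_smooth_motor(index, length, sigma=1.0):
    """Replace a one-hot motor vector (1 at `index`) with a Gaussian
    bump centered on `index`, normalized to peak at 1."""
    pos = np.arange(length)
    bump = np.exp(-0.5 * ((pos - index) / sigma) ** 2)
    return bump / bump.max()

motor = gaussian_smooth_motor(index=2, length=5, sigma=1.0)
print(np.round(motor, 3))   # peak of 1 at position 2, decaying neighbors
```

For the “where” motor the same smoothing would be applied in two dimensions over the (r, c) grid; for the “what” motor, over the object classes.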

Page 45

Layer 1 – Bottom-up Weights

Page 46

Layer 2 – Response-weighted Stimuli

Page 47

Layer 3 (Where) – Top-down Weights

Page 48

Layer 3 (What) – Top-down Weights

Page 49

Test Samples

- Input
- “Where” motor (ground truth)
- “What” motor (ground truth)
- “Where” output (saliency-based)
- “Where” output (“what”-supervised)
- “What” output (saliency-based)
- “What” output (“where”-supervised)

Page 50

Performance Evaluation

                                         Without supervision   Supervise “Where”   Supervise “What”
“Where” motor (regression error: MSE)    4.137 pixels          N/A                 4.137 pixels
“What” motor (classification error: %)   12.7%                 12.1%               N/A

Average error for the “where” and “what” motors (250 test samples).

Page 51

Discussions