Top-down Neural Attention

Page 1: Top-down Neural Attention

Top-down Neural Attention: Selective Attention from a Deep Neural Net

[Image: General's Family by Octavio Ocampo]

Page 2: Top-down Neural Attention

Background

Understanding Artificial Neural Networks

© Jianming Zhang, derivative work. Original image credit: soul wind / stock.adobe.com

Page 3: Top-down Neural Attention

Problem Definition

[Figure: a Deep CNN labels an image with tags; a top-down signal picks one tag and produces a top-down attention map.]

Tags:
• animal
• elephant
• zebra
• grass
• africa

Top-down signal: elephant

Page 4: Top-down Neural Attention

Probabilistic Winner-Take-All

[1] Tsotsos et al. “Modeling Visual Attention via Selective Tuning.” Artificial Intelligence, 1995.

[Figure: Winner-Take-All [1] compared with Probabilistic WTA, both starting from the output layer.]

Marginal Winning Probability (MWP): equivalent to an Absorbing Markov Chain process.
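The Markov chain reading can be made concrete with a small sketch. It assumes that each layer provides a table of conditional winning probabilities whose rows sum to 1. The function name `propagate_mwp` and the toy numbers are hypothetical and do not come from the slides.

```python
def propagate_mwp(parent_probs, transition):
    """Push winning probabilities one layer down.

    parent_probs[i]  : P(a_i), winning probability of parent neuron i
    transition[i][j] : P(a_j | a_i), chance that child j wins given parent i wins
    returns          : P(a_j) = sum_i P(a_j | a_i) * P(a_i)
    """
    child_probs = [0.0] * len(transition[0])
    for p_i, row in zip(parent_probs, transition):
        for j, p_ji in enumerate(row):
            child_probs[j] += p_ji * p_i
    return child_probs


# Toy network (assumed): 2 output neurons -> 3 hidden neurons -> 4 input locations.
# Every row sums to 1, so probability mass is conserved on the way down.
top_to_hidden = [
    [0.5, 0.5, 0.0],
    [0.0, 0.25, 0.75],
]
hidden_to_input = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.25, 0.75],
]

prior = [1.0, 0.0]  # top-down signal: select output neuron 0
hidden = propagate_mwp(prior, top_to_hidden)
attention = propagate_mwp(hidden, hidden_to_input)
print(attention)  # [0.5, 0.25, 0.25, 0.0]
```

The input locations act as the absorbing states of the chain: the walk starts at the selected output neuron, and `attention` is the probability of ending at each location.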

Page 5: Top-down Neural Attention

Excitation Backprop

Assumptions:
§ The response of the activation neuron is non-negative.
§ An activation neuron is tuned to detect certain visual features. Its response is positively correlated with its confidence in the detection.

[Figure: connections from Activation Layer N to Activation Layer N-1; excitatory neurons are marked "+" and inhibitory neurons are marked "-".]
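The two assumptions suggest how a parent neuron could share its winning probability among its children. The sketch below assumes that a child's share is proportional to its response times its connection weight, and that only excitatory (positive) weights count. The slides do not spell this rule out, and `excitation_backprop_layer` is a hypothetical name.

```python
def excitation_backprop_layer(parent_probs, weights, child_acts):
    """One top-down step through a fully connected layer.

    parent_probs[i] : winning probability of neuron i in layer N
    weights[i][j]   : weight from neuron j (layer N-1) to neuron i (layer N)
    child_acts[j]   : non-negative response of neuron j in layer N-1
    """
    child_probs = [0.0] * len(child_acts)
    for p_i, w_row in zip(parent_probs, weights):
        # Keep excitatory connections only; inhibitory ones get a zero score.
        scores = [a * w if w > 0 else 0.0 for a, w in zip(child_acts, w_row)]
        total = sum(scores)
        if total == 0.0:
            continue  # no excitatory input, so this parent passes nothing down
        for j, s in enumerate(scores):
            child_probs[j] += p_i * s / total
    return child_probs


# One parent neuron with three excitatory inputs and one inhibitory input.
child_acts = [1.0, 2.0, 1.0, 4.0]
weights = [[1.0, 0.5, 2.0, -1.0]]
child_probs = excitation_backprop_layer([1.0], weights, child_acts)
print(child_probs)  # [0.25, 0.25, 0.5, 0.0]
```

The inhibitory child receives no probability even though it has the largest response, because its connection cannot raise the parent's activation.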

Page 7: Top-down Neural Attention

A Common Issue: Insensitiveness to Top-down Signals

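[Figure: attention maps for the top-down signals "zebra" and "elephant".]

Dominant neurons always win, so the attention map barely changes with the top-down signal.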

Page 8: Top-down Neural Attention

Contrastive Attention

[Figure: attention maps for the top-down signals "zebra" and "elephant".]

Page 9: Top-down Neural Attention

Negating the Output Layer for Contrastive Signals

[Figure: a zebra classifier and a non-zebra classifier, obtained by negating the output layer, produce a zebra map and a non-zebra map.]

Thanks to our Excitation Backprop formulation:
§ The contrastive attention map can be computed in a single pass.
§ The pair of maps is well normalized within our probabilistic framework.
§ Both maps are positive-valued.
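A toy sketch of how negating the output layer could produce a contrastive map. It makes three assumptions that are not stated on the slides: probability is shared in proportion to response times excitatory weight, the difference of the two maps is propagated below the output layer, and negative values are truncated at zero. All names and numbers are hypothetical.

```python
def backprop_layer(parent_probs, weights, child_acts):
    """Share each parent's value among its excitatory children (assumed rule)."""
    child_probs = [0.0] * len(child_acts)
    for p_i, w_row in zip(parent_probs, weights):
        scores = [a * w if w > 0 else 0.0 for a, w in zip(child_acts, w_row)]
        total = sum(scores)
        if total == 0.0:
            continue
        for j, s in enumerate(scores):
            child_probs[j] += p_i * s / total
    return child_probs


def contrastive_attention(class_weights, hidden_acts, hidden_weights, input_acts):
    # Map for the target classifier, e.g. "zebra".
    target_top = backprop_layer([1.0], [class_weights], hidden_acts)
    # Map for the dual classifier, e.g. "non-zebra": same weights, negated.
    negated = [-w for w in class_weights]
    dual_top = backprop_layer([1.0], [negated], hidden_acts)
    # Both maps sum to 1, so they can be subtracted directly. The step below
    # is linear, so the difference needs only one pass through lower layers.
    diff_top = [t - d for t, d in zip(target_top, dual_top)]
    contrast = backprop_layer(diff_top, hidden_weights, input_acts)
    return [max(c, 0.0) for c in contrast]  # assumed: keep positive evidence only


# Two hidden neurons share input 1, which has the strongest response.
class_weights = [1.0, -1.0]
hidden_acts = [1.0, 1.0]
hidden_weights = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
input_acts = [1.0, 2.0, 1.0]
contrast_map = contrastive_attention(
    class_weights, hidden_acts, hidden_weights, input_acts
)
print(contrast_map)  # only input 0 keeps a positive value
```

In this example the dominant input (index 1) would win under both the target and the dual classifier, so its contributions cancel and only the input that is specific to the target remains.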

Page 10: Top-down Neural Attention

Evaluation: The Pointing Game

§ Task:
  › Given an image and an object category, point to the targets.
§ Metric:
  › Pointing accuracy.
  › Pointing anywhere on the targets is fine.
§ Datasets:
  › VOC07 (20 categories)
  › COCO (80 categories)
§ CNN Models:
  › CNN-S [Chatfield et al. BMVC’14]
  › VGG16 [Simonyan et al. ICLR’15]
  › GoogLeNet [Szegedy et al. CVPR’15]
§ Model training:
  › Multi-label cross-entropy loss
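The metric can be sketched as follows. The sketch assumes that "pointing" means taking the location of the maximum of the attention map, and that accuracy is averaged over categories. The function names and the tiny example data are hypothetical.

```python
def is_hit(attention, target_mask):
    """True if the maximum of the attention map lies on a target pixel."""
    positions = (
        (r, c) for r, row in enumerate(attention) for c in range(len(row))
    )
    best_r, best_c = max(positions, key=lambda rc: attention[rc[0]][rc[1]])
    return bool(target_mask[best_r][best_c])


def mean_pointing_accuracy(hits_by_category):
    """Average of per-category accuracy, where accuracy = hits / (hits + misses)."""
    per_category = [sum(h) / len(h) for h in hits_by_category.values() if h]
    return sum(per_category) / len(per_category)


# A 2x2 attention map whose maximum falls on the target region.
attention = [[0.1, 0.7], [0.1, 0.1]]
target_mask = [[False, True], [False, False]]
hit = is_hit(attention, target_mask)

# Hit/miss records per category (made-up values for illustration).
hits_by_category = {"zebra": [True, False], "elephant": [True, True]}
accuracy = mean_pointing_accuracy(hits_by_category)
print(hit, accuracy)  # True 0.75
```

Averaging per category first keeps frequent categories from dominating the score.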

credit: elena milevska / stock.adobe.com

Page 11: Top-down Neural Attention

Results

Mean Accuracy over Object Categories in the Pointing Game

Page 12: Top-down Neural Attention

Qualitative Comparison

Page 13: Top-down Neural Attention

Text-to-Region Association

§ Visualizing the top-down attention of a CNN classifier for ~18K tags.