1
What do gates learn? Visualization & Analysis
Zhengbo [email protected]
Shanghai Jiao Tong University
2
Outline
1. Why do we want to visualize?
2. How can we visualize?
3. Experiment results
4. Future work
3
Why visualize?
What do gates learn? Weights! Correct, but meaningless to human beings.
Weight matrix between two adjacent layers:
4
Why visualize?
We are far better at processing pictures than numbers.
Our eyes are powerful: their resolution is often estimated at over 500 megapixels.
5
About data & network
Data: 1000 random rectangles on a 10 by 10 grid
Example:
Structure: [100, 10, 100], fully connected
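A minimal sketch of this setup, to make the data and the [100, 10, 100] shape concrete. The slides do not say how the rectangles are sampled or which activation function is used, so the filled axis-aligned rectangles, the uniform sampling, the sigmoid, and every name below (`random_rectangle`, `make_dataset`, `init_layer`, `forward`) are assumptions made for illustration. The weights are untrained.

```python
import math
import random

GRID = 10  # the slides use a 10 by 10 grid


def random_rectangle(rng):
    """Return (flat 100-pixel image, area) for one filled rectangle (assumed format)."""
    top, left = rng.randrange(GRID), rng.randrange(GRID)
    height = rng.randint(1, GRID - top)
    width = rng.randint(1, GRID - left)
    pixels = [
        1.0 if top <= r < top + height and left <= c < left + width else 0.0
        for r in range(GRID)
        for c in range(GRID)
    ]
    return pixels, height * width


def make_dataset(n=1000, seed=0):
    """Build n random rectangles with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [random_rectangle(rng) for _ in range(n)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(layer, inputs):
    """Apply one fully connected layer followed by a sigmoid (assumed activation)."""
    weights, biases = layer
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


dataset = make_dataset()
rng = random.Random(1)
encoder = init_layer(100, 10, rng)  # 100 input pixels -> 10 hidden gates
decoder = init_layer(10, 100, rng)  # 10 hidden gates -> 100 output pixels
hidden = forward(encoder, dataset[0][0])
output = forward(decoder, hidden)
```

The 10 values in `hidden` are the gate activations that the rest of the talk visualizes.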
6
How to visualize
If a gate learns some property, we may expect a relationship between that property and the gate's activation.
For each gate:
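One way to quantify such a relationship, sketched here for the area property, is the Pearson correlation between area and activation for each gate. This is an illustrative choice rather than the method used in the talk. The helper names and the toy numbers are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def area_correlations(areas, activations):
    """activations[i][g] is the activation of gate g on sample i."""
    n_gates = len(activations[0])
    return [pearson(areas, [row[g] for row in activations]) for g in range(n_gates)]


# Toy data, invented for illustration: gate 0 is linear in area, gate 1 is not.
areas = [1, 4, 9, 16, 25]
activations = [[0.02 * a, 0.5 if i % 2 else 0.1] for i, a in enumerate(areas)]

corrs = area_correlations(areas, activations)
most_linear = max(range(len(corrs)), key=lambda g: abs(corrs[g]))
```

A correlation near +1 or -1 suggests a linear relationship. A scatter plot of activation against area for each gate shows the same information visually.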
8
Experiment Results
For 10 hidden gates:
9
What can we find?
For one of the gates, there seems to be a linear relationship between the activation and the area of the rectangle. We call it a linear gate.
Across many independent experiments, we always find exactly one linear gate.
It may help to visualize during the training process.
10
Start of training
11
Middle of training
12
End of training
13
Some explanation
Autoencoder → dimensionality reduction
Less redundancy between the reduced dimensions
Linear gate + linear gate = linear gate, so a second linear gate would be redundant: it is better to have only one!
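The last point can be written out as a short derivation. This is one reading of the argument, under the assumption that a linear gate's activation is an affine function of the rectangle area $A$. The symbols $h_1, h_2, a_i, b_i, w_i$ are introduced here for illustration only.

```latex
\begin{align*}
h_1 &= a_1 A + b_1, \qquad h_2 = a_2 A + b_2, \\
w_1 h_1 + w_2 h_2 &= (w_1 a_1 + w_2 a_2)\, A + (w_1 b_1 + w_2 b_2).
\end{align*}
```

Any weighted sum of the two gates that the next layer can form is again affine in $A$, so a single linear gate already carries that information and the second hidden unit is free to encode something else.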
14
Visualization method 2
If a gate learns some property, the gate is activated when an input rectangle has this property.
For each gate, sum the rectangles that cause a high activation of this gate.
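A minimal sketch of this second method: for one gate, sum the pixel grids of all rectangles whose activation exceeds a cutoff. The slides do not define "high activation", so the `threshold=0.8` cutoff is an assumption. The function names and the toy inputs are also made up for illustration.

```python
GRID = 10  # 10 by 10 grid, as in the slides


def filled_rectangle(top, left, height, width):
    """Flat 100-pixel image of one filled rectangle (assumed input format)."""
    return [
        1.0 if top <= r < top + height and left <= c < left + width else 0.0
        for r in range(GRID)
        for c in range(GRID)
    ]


def overlay_for_gate(images, activations, gate, threshold=0.8):
    """Sum the images whose activation for `gate` is at least `threshold`."""
    heat = [0.0] * (GRID * GRID)
    for pixels, acts in zip(images, activations):
        if acts[gate] >= threshold:
            heat = [h + p for h, p in zip(heat, pixels)]
    # Reshape the flat sum into GRID rows so it can be shown as a heat map.
    return [heat[r * GRID:(r + 1) * GRID] for r in range(GRID)]


# Toy inputs, invented for illustration: two gates, three rectangles.
images = [
    filled_rectangle(0, 0, 2, 2),
    filled_rectangle(0, 0, 3, 3),
    filled_rectangle(7, 7, 3, 3),
]
activations = [[0.90, 0.10], [0.95, 0.20], [0.05, 0.85]]

heat0 = overlay_for_gate(images, activations, gate=0)
heat1 = overlay_for_gate(images, activations, gate=1)
```

Bright cells in a gate's heat map mark the grid regions covered by the rectangles that activate it most.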
15
Experiment Result 2
16
Explanation 2
Gates learn about rectangles by being sensitive to different areas of the grid.
17
Future work
1. Find more visualization methods.
2. Try different network structures.
3. Try different sets of inputs (circles, diamonds).
4. Formalize the problem and try to prove something.
18
Thank you for listening!
Comments & Suggestions?