Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Learning, Inference, and Replay of Hidden StateSequences in Recurrent Spiking Neural Networks
Dane Corneil, Emre Neftci, Giacomo Indiveri & Michael Pfeiffer
Institute of Neuroinformatics, ETH Zurich & University of Zurich; Laboratory of Computational Neuroscience, EPFL
NCL
Abstract
Learning to recognize, predict, and generate spatio–temporal patternsand sequences of spikes is a key feature of nervous systems, andessential for solving basic tasks like localization and navigation. Howthis can be done by a spiking network, however, remains an openquestion. Here we present a STDP-based framework extending aprevious model [1], that can simultaneously learn to abstract hiddenstates from sensory inputs and learn transition probabilities [2] betweenthese states in recurrent connection weights.
Network Structure
WTAz1 zK...
STDP-basedlearning rule
sensoryevidence
spiking neuronsW
lateralinhibition
xmx1 . . .
recurrentexcitation
y1 . . . yn
ff
Wrec
Zt-1
Xt-1
Zt
Xt
Zt+1
Xt+1
zt-1 zt zt+1
xt-1 xt xt+1
Each output neuron zk encodes a hidden cause over the input, where
p(zk fires at time t) ∝ exp(uk(t)− I(t))
uk(t) =N∑i=1
wffki · yi(t) +
K∑k′=1
wreckk′ · zk′(t)
wffki = log p(yi = 1|zk = 1,w)
wreckk′ = log p(zk = 1|zk′ = 1,w).
Trained on input generated by a Hidden Markov Model, the activity ofthe Winner-Take-All (WTA) network evolves as a single–sample(unitary) particle filter [3].
Recurrent Learning Rule
Depression on every presynaptic spike; weight– and time– dependentpotentiation if postsynaptic neuron fires within a time window.
∆wreckk′ ∝
{−1 after presynaptic spike, ande−wkk′·zk′(t) after pre–post pair
E[∆wreckk′] = 0⇔ wrec
kk′ = log p(zk = 1|zk′ = 1,w)
Learning a Hidden Markov Model
A four–state HMM was presented to the network, with states definedby differing Poisson firing statistics over 225 input neurons.
Inp
ut
Neu
rons
A
0 2 4Time (s)
Sta
te N
euro
ns
B
C
D
0.018(0.019) 0.023
(0.025)
0.035(0.035)
0.035(0.034)
0.018(0.018)
0.047(0.047)
0.953(0.951)
0.947(0.947)
0.965(0.963)
0.959(0.956)
Optimal
A
BC
D
(Learned)
State–dependent Inference
A network model with additional transition neurons for encoding andlearning Finite State Machines (FSMs) was trained on observationsfrom a four–state maze traversal task. After training, the networkresolved the current state given only transition symbols.
1 2
34
Left
Right
Down
Up
Left
Right
Down
Up
4
1 2
3
Right1 to 2
Left2 to 1
Right4 to 3
Left3 to 4
Up4 to 1
Up3 to 2
Down1 to 4
Down2 to 3
Symmetric Asymmetric
Training Condition
0.30
0.35
0.40
0.45
0.50
0.55
Pro
bab
ility
of
Counte
r-C
lock
wis
e T
urn
Trained Probabilities
Learned Probabilities
Inference Results
0 5 10Time [s]
Right (4 to 3)Left (2 to 1)
Down (2 to 3)Up (4 to 1)
Right (1 to 2)Left (3 to 4)
Down (1 to 4)Up (3 to 2)
Room 4Room 3Room 2Room 1
Sta
te N
euro
ns
Input
Neuro
ns
StationaryRight
LeftDown
Up
0 5 10
0 5 10Time [s]
Right (4 to 3)Left (2 to 1)
Down (2 to 3)Up (4 to 1)
Right (1 to 2)Left (3 to 4)
Down (1 to 4)Up (3 to 2)
Room 4Room 3Room 2Room 1
Input
Neuro
ns
Stationary
0 5 10
Moving
Sta
te N
euro
ns
Temporal Replay
The network was trained on input defined by a spatio–temporal patternwith two stochastically branching trajectories.I Input neurons indicated the current color (black or white) of the pixelsI Network learned both the spatial attributes (feedforward weights) and the temporal progression
(recurrent weights) of the input pattern
67%
33%31.7%
35.0%
65.0%
68.3%
After training, the network produced samples of states corresponding toa stochastic “replay” of observed trajectories.
0.0 0.5 1.0 1.5 2.0 2.5
Time (s)
Conclusions
We have presented a recurrent spiking neural network architecture,which can be trained to perform dynamical Bayesian inference of hiddenstates. The same network can be used to generate random sampletrajectories through the state space by spontaneous activity in the WTA(using the recurrent connections as a generative Bayesian model) or canbe used as a probabilistic FSM in which transitions are activelytriggered by movement signals.
References
[1] B. Nessler, M. Pfeiffer, L. Busing, and W. Maass. Bayesian computation emerges in generic cortical
microcircuits through spike-timing-dependent plasticity. PLoS Comp. Biol., 9(4):e1003037, 2013.
[2] Nessler B. Kappel, D. and W. Maass. STDP installs in winner-take-all circuits an online approximation to
Hidden Markov Models learning. PLoS Computational Biology, 2014, in press.
[3] C. Fox and T. Prescott. Hippocampal formation as unitary coherent particle filter. BMC Neuroscience,
10(Suppl 1):P275, 2009.
[4] Senn W. Brea, J. and J-P Pfister. Sequence learning with hidden units in spiking neural networks. In
Advances in Neural Information Processing Systems 24, pages 1422–1430, 2011.
[5] P. Baldi and Y. Chauvin. Smooth on-line learning algorithms for Hidden Markov Models. Neural
Computation, 6(2):307–318, 1994.