Transforming the BCPNN Learning Rule for Spiking Units to a Learning Rule for Non-Spiking Units

Antoine Bergel

Master of Science Thesis, Stockholm, Sweden 2010



Master’s Thesis in Biomedical Engineering (30 ECTS credits)
at the School of Computer Science and Engineering
Royal Institute of Technology, year 2010
Supervisor at CSC was Örjan Ekeberg
Examiner was Anders Lansner
TRITA-CSC-E 2010:059
ISRN-KTH/CSC/E--10/059--SE
ISSN-1653-5715
Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.kth.se/csc


Abstract

The Bayesian Confidence Propagation Neural Network (BCPNN) model has been developed over the past thirty years for specific tasks such as classification, content-addressable memory and data mining, among others. It uses a Bayesian-Hebbian learning rule, which performs fairly well both as a counter model and as a continuously operating incremental learner. This learning rule has never been up and running in networks of spiking units: one is bound to use the outcome of learning with non-spiking units and to transpose it to the spiking context afterwards, which is highly restrictive.

The aim of this Master’s thesis project is to transform the existing BCPNN learning rule for non-spiking units, including the bias term, to the domain of spiking neural networks based on the Hodgkin-Huxley cell model. The goal is a model running in NEURON that exhibits the same features observed with non-spiking units. A secondary goal is to compare the new learning rule to the old one, as well as to other Spike-Timing Dependent Plasticity (STDP) learning rules.

To achieve this, we introduce a new version of the BCPNN learning rule that can account for spiking input activities. This learning rule is based on synaptic traces, used as local variables to keep track of the frequency of, and timing between, spikes. It includes three stages of processing, all based on low-pass filtering with three different temporal dynamics, in order to estimate the probabilities used to compute the Bayesian weights and biases. The Bayesian weights are mapped to a synaptic conductance, updated according to the values of these synaptic traces, and the bias term is mapped to an activity-regulated potassium channel.

We present results of the new spiking version of the BCPNN learning rule in single-synapse learning and retrieval. We implement two main models: the first based on abstract units in MATLAB, the second on Hodgkin-Huxley spiking units in NEURON. The latter model accounts for spike-frequency adaptation and can be used to study the effect of the exact timing between presynaptic and postsynaptic spikes under repeated stimulation.


Acknowledgements

I would first like to thank Anders Lansner for allowing me to work at the department of Computational Biology and Neuroscience at KTH, for devoting time and patience to assuming both roles of supervisor and examiner of this Master’s thesis, and for always helping me, guiding me and giving me the best conditions in which to produce valuable work. This first step into research at a high-level scientific department has been a very enriching experience, which I will always remember. I would also like to thank Örjan Ekeberg, for accepting to tutor this Master’s thesis from abroad at first, and later for all the precious comments about the report, the presentation and the structure of this work.

This past year at the department, I have had the chance to meet a lot of people, from different backgrounds and countries. They have made the working atmosphere very special, warm and welcoming: Claudia, who has been here from the very beginning; Charles, for his ping-pong and chess skills when a break was needed; Aymeric, Dave, Simon, Pawel, Pierre and all the others, for making me discover new bars and restaurants. I want to give special thanks to Mikael, for interesting talks; to Pradeep and David, for their availability, kindness and help with NEURON; and finally to Bernhard, who has not only always been eager to answer my numerous questions and investigate new problems with me, but has also been a devoted friend, offering me tremendous support and help when time pressure was high.

I cannot cite all the people I have met these past two years, but I want to say how getting to know all of them, and all the conversations and moments we had together, have changed me and made me realise that there are no geographic borders to friendship and love. So, I want to thank Natasha, for the time she spent improving the language of my report, and simply for always being supportive and making me feel that she was here with me, though on the other side of the world. This year would have been so much different without my lovely room-mates Birte, Isabel, Stefan F., Stefan T. and Volker, for August mondays among other things, and my two French buddies Fred and Joseph, for lunch breaks, poker sessions and crazy parties. I want to give special thanks to my two Italian friends, who showed me that beyond neighbourly rivalry we just have so much in common and so much to share: Enrico, the craziest person I have ever lived with, and Sara, probably the best pizza and focaccia cook ever.

Finally, I want to thank my parents, who have always helped me with all the problems one can encounter when studying abroad for two years: I know how lucky I am to have them with me and I hope they measure the respect I have for them. A little word to my siblings, my sister Karen and my brother Samuel, whom I will be very happy to meet and live with again.


Contents

1 Introduction
  1.1 Context
  1.2 Motivations
  1.3 Outline

2 The BCPNN Model
  2.1 Context and Definitions
  2.2 Bayesian Confidence Propagation
    2.2.1 Using Neurons as probability estimators
    2.2.2 Derivation of Network Architecture
    2.2.3 Bayesian-Hebbian Learning
  2.3 Gradual Development of the BCPNN model
    2.3.1 Naive Bayes Classifier
    2.3.2 Higher Order Bayesian Model
    2.3.3 Graded units
    2.3.4 Recurrent Network
  2.4 BCPNN Learning Implementations
    2.4.1 Counter Model
    2.4.2 Incremental Learning
  2.5 Performance Evaluation and Applications

3 A spiking BCPNN Learning Rule
  3.1 Formulation
  3.2 Features
    3.2.1 Synaptic traces as local state variables
    3.2.2 Spike-timing Dependence
    3.2.3 Delayed-Reward Learning
    3.2.4 Long-term Memory
    3.2.5 Probabilistic features
  3.3 Biological relevance

4 Abstract Units Implementation
  4.1 Pattern presentation
    4.1.1 Non-spiking Pattern Presentation
    4.1.2 Spiking frequency-based Pattern Presentation
    4.1.3 Spiking Poisson-generated Pattern Presentation
  4.2 Learning Rule Implementation
  4.3 Retrieval

5 Hodgkin-Huxley Spiking Implementation in NEURON
  5.1 Cell Model
    5.1.1 Hodgkin-Huxley Model
    5.1.2 Spike Frequency Adaptation
  5.2 Pattern presentation
  5.3 Learning Rule Implementation
    5.3.1 Synaptic Integration
    5.3.2 Bias term
  5.4 Retrieval

6 Results
  6.1 Abstract units
    6.1.1 Learning
    6.1.2 Retrieval
  6.2 Hodgkin-Huxley Spiking Units
    6.2.1 Steady-State Current Discharge
    6.2.2 Learning
    6.2.3 Parameter tuning
    6.2.4 Retrieval
    6.2.5 Spike Timing Dependence

7 Discussion
  7.1 Model Dependencies
    7.1.1 Learning Rule Parameters
    7.1.2 Pattern Variability
    7.1.3 Learning-Inference Paradigm
  7.2 Comparison to other learning rules
    7.2.1 Spiking vs Non-spiking Learning Rule
    7.2.2 Spike-timing dependence and real data
    7.2.3 Sliding threshold and BCM Rule
  7.3 Further Developments and limitations
    7.3.1 Network implementation
    7.3.2 RSNP cells and inhibitory input
    7.3.3 Hypercolumns, basket cell and lateral inhibition
    7.3.4 Parallel computing

8 Conclusion

Bibliography

Appendices

A NMODL files
  A.1 Synapse modelisation
  A.2 A-Type Potassium Channel

B Hodgkin-Huxley Delayed Rectifier Model
  B.1 Voltage Equations
  B.2 Equations for Gating Variables

C NEURON stimulations parameters


Chapter 1

Introduction

1.1 Context

Since Hebb’s theory in 1949, synaptic plasticity (the ability of the synaptic connection between two neurons to change its strength according to a certain conjunction of presynaptic and postsynaptic events) has been thought to be the biological substrate for high-level cognitive functions like learning and memory. This idea is actually much older and was formalized by the Spanish neuroanatomist Santiago Ramón y Cajal in 1894, who suggested “a mechanism of learning that did not require the formation of new neurons”, but proposed that “memories might instead be formed by strengthening the connections between existing neurons to improve the effectiveness of their communication” [29]. Hebb went a step further by proposing his ideas about the existence of a metabolic growth process associating neurons that tend to have correlated firing activity [13].

For the brain to be able to form, store and retrieve memories, as well as learn specific tasks, the biological changes at the synapse level need to be long-lasting. This is called long-term potentiation (LTP) or long-term depression (LTD): a persistent increase or decrease in synaptic strength, held to be the key mechanism underlying learning and memory. The biological mechanisms responsible for long-term potentiation are not exactly known, but specific protein synthesis, second-messenger systems and N-methyl-D-aspartate (NMDA) receptors are thought to play a critical role in its formation [20].

In 1995, Fuster defined memory as “a functional property of each and all of the areas of the cerebral cortex, and thus of all cortical systems”. He distinguishes several types of memories: short-term/long-term, sensory/motor, declarative/non-declarative and individual/phyletic. He proposes that all memory is associative and that its strength depends on the number of associations we make to a specific word or mental object [11]. He introduced several key concepts, such as working memory, as a gateway to long-term memory waiting to be consolidated, and active memory, as a cortical network of neurons whose activity is above a certain baseline. His perception-action cycle, suggesting a constant flow of information between sensory and motor memory, has also proved to be a matter of interest for future experimentation.

More recently, investigations have focused on spike-timing-dependent plasticity (STDP), which refers to synaptic changes sensitive to the exact timing of action potentials between two connected neurons: one refers to pre-post or positively-correlated timing when the presynaptic neuron fires a few milliseconds before the postsynaptic neuron, and to post-pre or negatively-correlated timing when it goes the other way around. STDP has become a popular subject since the experimental work of Bi and Poo [6], who first demonstrated the strong influence of the exact timing of presynaptic and postsynaptic spikes (typically a time window of 20 ms for cultured hippocampal neurons) on synaptic long-term potentiation. Their work with cultures of hippocampal neurons, supported by the work of others, e.g. Rubin et al. 2005 and Mayr et al. 2009 [30, 25], has resulted in the formulation of STDP-type learning rules [27, 9].
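Pair-based rules of this kind are typically written as an exponential window over the spike-time difference. The following is a minimal sketch of that generic form (our own illustration with arbitrary amplitudes and time constant, not the rule developed in this thesis):

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for a spike-time difference dt = t_post - t_pre (ms).
    Pre-before-post (dt > 0) potentiates; post-before-pre (dt < 0) depresses."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau)    # LTP branch
    return -a_minus * math.exp(dt / tau)       # LTD branch

# Changes are large inside the ~20 ms window and negligible far outside it:
dw_close, dw_far = stdp_dw(5.0), stdp_dw(100.0)
```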

One must be aware, however, that these rules are considered rather crude approximations by relevant experimentalists. There is a constant duality between the two possible ways to approach neuroscience: some aim to understand the biological mechanisms at the cell and membrane level, so that they can build models to reproduce them, whereas others aim to reproduce cell behaviour for applications and fit their models to experimental data rather than to theory. Both approaches have their justification and they are likely complementary. However, even though results are emerging, our understanding of the mechanisms of the brain is still partial and a great deal remains to be done.

In this project, we focus on the Bayesian Confidence Propagation Neural Network (BCPNN) model, first studied by Lansner and Ekeberg (1989) [22] and Kononenko (1989) [18]. Its main features are a network architecture directly derived from Bayes’ rule and unit activities representing the probability of stochastic events. The BCPNN model is thoroughly described in Chapter 2.

1.2 Motivations

In 2003, Sandberg et al. proposed that “a possible future extension of the existing BCPNN model would be to implement the model using spiking neurons to further examine its generality and properties, such as the effect of spike synchrony in memory reset and the effects of AHP modulation on network dynamics” [32]. At that time, the model had just been improved from a counter model to a continuously operating incremental learning model. In this respect, the work presented here is a continuation of what has already been done and addresses the need for such a learning rule operating in a spiking context.

Artificial neurons are a very crude approximation of real neurons: given input from other neurons, they generate an output through an activity function. Spiking neurons, however, mimic the behaviour of real neurons: in particular, they exhibit spikes (they “fire” and take a high positive value) only when their potential crosses a threshold, and then only for a very short amount of time. These neurons simulate the all-or-nothing behaviour and action potentials observed in real neurons [20]. The variables attached to them, such as membrane voltage, capacitance and synaptic conductance, have a real biological meaning.

Since large-scale implementations of neural networks are often based on spiking units, it is valuable to have a formulation able to run on-line learning also in large-scale spiking neural networks. The project aims to end up with a network model in which the new online learning rule operates, and to use it to test some of the emergent phenomena. Evaluating the model by comparing it to the original BCPNN rule, to other STDP rules, as well as to some experimental data on LTP [30], is our prime motivation. Because of its specific features (both Hebbian and Bayesian), the BCPNN learning rule can always be used as a reference for other STDP learning rules to be implemented in the future. With regard to the bias term, a comparison can also be made with the threshold in the BCM learning rule, developed in 1982 by Bienenstock, Cooper and Munro [7].

The transformation of the BCPNN learning rule to a spiking neuron environment is somewhat challenging and has never been done before. This opens up the extent of our work tremendously, and the scope of this Master’s thesis has to be limited for the sake of simplicity. We narrow our work to two main objectives. The first, as explained above, is the comparison to other existing learning rules. The second, somewhat more abstract, is to reconcile the probabilistic features of the original BCPNN learning rule with the spike-timing dependent features developed in STDP models (Bi and Poo 1998, Morrison 2008, Clopath 2008) [6, 27, 9]. The new learning rule presented in Chapter 3 is built to take STDP-like features into account, and we aim to fit our model to existing experimental data relating to the spike-timing dependent plasticity window (Bi and Poo 1998) [6] and intrinsic excitability (Jung et al. 2009) [19], following a phenomenological approach to the problem.

A further improvement of our work would be to modify the learning rule so that it could run on parallel computers in a large-scale context. This work is not meant to state decisive results, or to study one specific feature of the BCPNN model exhaustively, but rather to initiate the conversion of the BCPNN model to a spiking unit environment.

1.3 Outline

We will first review, in Chapter 2, the basics of the BCPNN model and its mathematical context, from its most basic form (Naive Bayes Classifier) to more recent ones (Higher Order Model, Recurrent Network). We will also relate the existing implementations (counter model, incremental learning) and their applications. In Chapter 3, the ‘spiking’ version of the learning rule is presented, together with its new features and their biological motivation. The two following chapters contain the core of the thesis: we describe how we implemented the new learning rule with abstract units in MATLAB (Chapter 4) and in a spiking context in NEURON (Chapter 5). The results are presented in Chapter 6: single-synapse learning, the network implementation and the phenomenological approach to fitting STDP data. Dependence on model parameters and comparisons to other existing learning rules are discussed in Chapter 7. Finally, Chapter 8 is dedicated to further developments and to the conclusion.


Chapter 2

The BCPNN Model

2.1 Context and Definitions

Artificial Neural Networks

An artificial neural network (ANN) is a computational model that aims to simulate the structural and/or functional aspects of biological neural networks. It consists of a group of computational units, connected by weighted links through which activation values are transmitted.

The reader can find documentation about ANNs in the literature, and the purpose here is not to discuss neural networks in a general fashion. Still, we think it is valuable in our context to recall the main features of artificial neural networks.

Nodes The functional unit or node of the network is its basic constituting element. Even if it originally had a biological equivalent, such as a neuron or, more recently, a minicolumn, it is an abstract unit, which means that the variables attached to it are artificial and do not have an intrinsic biological meaning. A node i is assigned a random variable xi that can be binary, discrete or continuous. It takes its input from other units xj and generates an output yi.

Activity Function The activity function or transfer function is the function giving the input-output relationship for one node. Common activity functions include linear, thresholded and sigmoid functions. The input-output relationship for one unit i is given by yi = φ(βi + ∑j ωij xj), with the sum running over j = 1, ..., n, where φ is the activity function, ωij the weight between unit i and unit j, xj the activity of unit j, and βi the bias of unit i.

Learning Rule The learning rule is an algorithm that modifies connections between units, the so-called weights, in response to the presentation of an input pattern. It is often the key point of the implementation, because it determines the response of the network to specific input, and hence its applications. Classical learning rules include Perceptron learning, the Delta rule and error backpropagation.

Network Architecture A network can have several topologies. It can be composed of layers (single-layer or multi-layer networks) that communicate in only one direction (feedforward network) or in both directions (backpropagation or recurrent network). Connections between units in the network can be sparse or all-to-all. Networks can include one or several hidden layers (internal computational units that are not accessible from the network interface, but are used to create a specific internal representation of the data).

Input and Output units In a feedforward network, the network receives information from input units and makes an interpretation available at the output units. In a recurrent network, the difference between input and output units is less clear: input consists of an activation of a set of units representing an input pattern, and an output pattern is read from the activity of the units after a phase called relaxation.
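The node input-output relation given above can be sketched in a few lines. This is an illustrative fragment of ours, not code from the thesis; the sigmoid choice and the names are arbitrary:

```python
import math

def unit_output(bias, weights, inputs):
    """y_i = phi(beta_i + sum_j w_ij * x_j), here with a sigmoid phi."""
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-s))

# Zero bias and weights that cancel for these inputs give phi(0) = 0.5:
y = unit_output(0.0, [1.0, -1.0], [1.0, 1.0])
```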

Learning and Retrieval

A network can be used in one of the two following modes: learning or retrieval.

During the learning phase, the network input units are clamped to certain values for a certain amount of time (clamping means that the units are assigned values set by the operator through a controlled process; a set of input units represents an input pattern). During clamping, the learning rule operates, so that the weights are updated and retain the information contained in the pattern that has been presented. In other words, during learning the network adapts to reality (the clamped input pattern) and changes its internal connections to remember it in the future: learning is said to be stimulus-driven.

During the retrieval phase, the weights of the network are assumed to be fixed, keeping the internal structure of the network unchanged. Distorted, incomplete or different patterns than the ones used during learning are presented to the network, and an output pattern is generated. In the case of layered networks, inference is realized by feeding a pattern to the input units and collecting the result at the output units. In other words, the network interprets the input data using its internal representation or knowledge.

For a recurrent network, however, the input pattern is fed to all input units (all units in the network except for the hidden units), and the network starts a phase called relaxation. Relaxation consists of taking a pattern as input and incrementally updating the units’ activities according to an inference rule; this stops when stability is reached, i.e. when the change in the units’ activities becomes very small. When the weight matrix is symmetric, convergence is guaranteed and relaxation always ends in a stable attractor state [16].
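Relaxation as described can be sketched as a simple fixed-point loop. This is our own illustration, using an arbitrary sigmoid inference rule and a synchronous update, not the thesis’s implementation:

```python
import math

def relax(x, W, tol=1e-6, max_steps=1000):
    """Repeatedly update activities x_i = phi(sum_j w_ij * x_j) until the
    change between two steps is below tol, i.e. a stable state is reached."""
    n = len(x)
    for _ in range(max_steps):
        new = [1.0 / (1.0 + math.exp(-sum(W[i][j] * x[j] for j in range(n))))
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x

# A symmetric weight matrix: relaxation settles into a stable attractor state.
W = [[0.0, 2.0], [2.0, 0.0]]
state = relax([0.9, 0.1], W)
```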

For correct knowledge to be acquired, one must learn a pattern (learning phase) and then check whether the pattern has been stored correctly (retrieval). It is important, however, to alternate these two phases, so that the information stored by the network is constantly updated and corrected. One must ensure that the network does not learn its own interpretation of the data, by shutting off the learning phase from time to time.

Hebb’s postulate

Introduced by Donald Hebb in 1949, Hebb’s postulate, also called cell assembly theory, is one of the earliest rules about synaptic plasticity. It has been formulated as follows:

When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased [13].

The theory is often summarized as “cells that fire together, wire together” and is commonly evoked to explain some types of associative learning in which simultaneous activation of cells leads to pronounced increases in synaptic strength. Such learning is known as Hebbian learning. The general idea is that cells or groups of cells that are repeatedly active at the same time will tend to become associated, so that activity in one facilitates activity in the other [1]. Work in the laboratory of Eric Kandel has provided evidence for the involvement of Hebbian learning mechanisms at synapses in the marine gastropod Aplysia californica [21].

Associative Memory

Fuster describes associative memory as “a system of memory, usually constituted by associations between stimuli and reinforcement” [11], as opposed to recognition or episodic memories. However, according to him, association is an attribute of all memories, from the root of their genesis to their evocation. More widespread is the definition of auto-associative and hetero-associative memories as forms of neural networks that enable one to retrieve an entire memory from only a tiny sample of it. Hetero-associative networks can produce output patterns of a different size than that of the input pattern (mapping from a pattern x to a pattern y with a non-square connection matrix W), whereas auto-associative networks work with a fixed pattern size (mapping of the same pattern x with a square connection matrix W).

The Hopfield network (Hopfield 1982 [16]) is the most implemented auto-associative memory network and serves as a content-addressable memory with binary threshold units. Under the restrictions wii = 0 (no unit has a connection with itself) and wij = wji (connections are symmetric), convergence to a local minimum of a certain energy function is guaranteed. During learning, the connection matrix W is modified to allow for attractor dynamics, so that relaxation of the network causes the input pattern to converge towards the closest attractor state.
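The scheme can be made concrete with a small sketch (ours, for illustration only): Hebbian outer-product learning under the two restrictions above, followed by binary-threshold relaxation that pulls a distorted pattern back to the stored attractor.

```python
def train_hopfield(patterns):
    """Outer-product Hebbian learning for +/-1 patterns: w_ij += x_i * x_j,
    with w_ii = 0 and w_ij = w_ji (the restrictions stated above)."""
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def recall(W, x, sweeps=10):
    """Asynchronous binary-threshold updates until the pattern settles."""
    x = list(x)
    for _ in range(sweeps):
        for i in range(len(x)):
            s = sum(W[i][j] * x[j] for j in range(len(x)))
            x[i] = 1 if s >= 0 else -1
    return x

stored = [1, 1, -1, -1, 1, -1]
W = train_hopfield([stored])
restored = recall(W, [1, -1, -1, -1, 1, -1])  # one flipped bit is corrected
```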


2.2 Bayesian Confidence Propagation

2.2.1 Using Neurons as probability estimators

The main idea underlying the BCPNN learning rule is to use neurons as probability estimators. The input and output unit activities represent probabilities. The neuron is used to estimate its probability of firing in a given context, i.e. knowing the information carried by the activities of the other neurons in the network. Confidence propagation relies on the fact that the conditional probability P(yi|x) of a given neuron yi firing given the context x is a better approximation than the a priori probability P(yi). By updating units like this, one propagates the confidence of one unit to the other units in the network.

Figure 2.1: Using Neurons as probability estimators

The BCPNN learning rule is based on a probabilistic view of learning and retrieval; input unit and output unit activities represent, respectively, confidence of feature detection (the input to unit i from unit j is a number between 0 and 1 representing the confidence that xj is a part of this pattern) and posterior probabilities of outcomes (the output of unit j is a number between 0 and 1 representing the probability of the outcome xj given the pattern context).

One drawback of using neurons as probability estimators is that we have to separate the signal. Indeed, observing the absence of an attribute in a given vector is somewhat different from the absence of an observation of this attribute. However, if we map one attribute to only one unit, then the BCPNN model will interpret zero input to this unit as an absence of information on this attribute, and it will compute the a posteriori probabilities of the other units, discarding the input from this unit. To solve this problem, we need to separate the data, i.e. we need to create units for all possible values of an attribute. In the case of binary units, this corresponds to having two units a and ā for attribute A. When no observation is made on this attribute, the network discards input from both of these units.
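A minimal sketch of this separation for one binary attribute (our own helper, not thesis code): two units per attribute make “observed absent” distinguishable from “not observed”.

```python
def encode(value):
    """Map an observation of a binary attribute A to the unit pair (a, a_bar).
    value is True (present), False (absent) or None (no observation made)."""
    if value is None:
        return (0.0, 0.0)   # no information: both units silent, input discarded
    return (1.0, 0.0) if value else (0.0, 1.0)
```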

2.2.2 Derivation of Network Architecture

The Bayesian Confidence Propagation Neural Network (BCPNN) has been developed gradually (Lansner and Ekeberg 1989, Lansner and Holst 1996, Sandberg et al. 2002, Sandberg et al. 2003) [22, 23, 31, 32]. Starting from Bayes’ theorem (equation 2.1), we derive a network architecture, meaning that we identify the terms in our mathematical formulae with weights ωij, biases βj, inputs xi and output unit activities yj. The purpose of the learning phase will then be to update the weights and biases so that their values fit those in the mathematical derivation of the network. According to the complexity of the training set we use, the network architecture can be a single-layer network (see Naive Bayes Classifier), a multi-layer network (see Higher Order Model) or a fully-connected network (see Recurrent Network).

2.2.3 Bayesian-Hebbian Learning

The BCPNN learning rule derived in the next section uses Bayesian weights and biases (equation 2.4). It exploits the statistical properties of the attributes in the training set (frequencies of activation of one attribute $x_i$ and of co-activation of two attributes $x_i$ and $x_j$) in order to estimate the probabilities $P(x_i)$ and $P(x_i, x_j)$ used to update the weights and biases. It also shows Hebbian features, because it reinforces connections between simultaneously active units, weakens connections between units independent from one another, and makes connections inhibitory between anti-correlated units.

As we shall see later in this paper, when applied to a recurrent attractor network, this rule gives a symmetric weight matrix and allows for fixed-point attractor dynamics. The update of the weights in the network resembles what has been proposed as rules for biological synaptic plasticity (Wahlgren and Lansner 2001) [33].

2.3 Gradual Development of the BCPNN model

2.3.1 Naive Bayes Classifier

The Naive Bayesian Classifier (NBC) aims to calculate the probabilities of the attributes $y_j$ given a set $x = (x_1, x_2, \ldots, x_i, \ldots, x_n)$ of observed attributes. Both are assumed to be discrete (for now, we only consider binary inputs). The main assumption in this case is the Independence Assumption, which states that the attributes $x_i$ are independent,
$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i),$$
and conditionally independent given $y_j$,
$$P(x_1, \ldots, x_n \mid y_j) = \prod_{i=1}^{n} P(x_i \mid y_j).$$


CHAPTER 2. THE BCPNN MODEL

Bayes' Theorem is given by the following equation for two random variables $x$ and $y$:
$$P(y \mid x) = \frac{P(x \mid y)\,P(y)}{P(x)} \qquad (2.1)$$

Using this and the Independence Assumption, we can calculate the conditional probability $\pi_j$ of the attribute $y_j$ given the observed attributes $x_i$:
$$\pi_j = P(y_j \mid x) = \frac{P(x \mid y_j)\,P(y_j)}{P(x)} = P(y_j) \prod_{i=1}^{n} \frac{P(x_i \mid y_j)}{P(x_i)} = P(y_j) \prod_{i=1}^{n} \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)}$$

Now, we assume that we only have partial knowledge of the attributes $x_i$. We are given completely known observations $x_i$ for $i \in A \subseteq \{1, \ldots, n\}$ and have no information at all about the attributes $x_k$ for $k \in \{1, \ldots, n\} \setminus A$. Then we get
$$\pi_j = P(y_j \mid x_i,\, i \in A) = P(y_j) \prod_{i \in A} \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)}$$

Then, taking the logarithm of the last expression, we obtain:
$$\log \pi_j = \log P(y_j) + \sum_{i \in A} \log\!\left[\frac{P(x_i, y_j)}{P(x_i)\,P(y_j)}\right] = \log P(y_j) + \sum_{i=1}^{n} o_i \log\!\left[\frac{P(x_i, y_j)}{P(x_i)\,P(y_j)}\right] \qquad (2.2)$$
where the indicator variable $o_i$ equals 1 if $i \in A$ (which means that the $i$th attribute $x_i$ is known) and equals 0 otherwise.

We finally end up with the following equation:
$$\log \pi_j = \beta_j + \sum_{i=1}^{n} \omega_{ij}\, o_i \qquad (2.3)$$
with
$$\omega_{ij} = \log\!\left[\frac{P(y_j, x_i)}{P(y_j)\,P(x_i)}\right], \qquad \beta_j = \log P(y_j) \qquad (2.4)$$

This can be implemented as a single-layer feedforward neural network, with input-layer activations $o_i$, weights $\omega_{ij}$ and biases $\beta_j$. In this way, the single-layer feedforward network calculates the posterior probabilities $\pi_j$ given the input attributes, using an exponential transfer function.

The weights and biases given in equation 2.4 are called Bayesian weights. We can point out the Hebbian character of these weights: $\omega_{ij} \approx 0$ when $x_i$ and $y_j$ are independent (weak connection between independent units); $\omega_{ij} \approx \log(1/p) > 0$ when the units $x_i$ and $y_j$ are strongly correlated, since in this case $P(x_i, y_j) \approx P(x_i) \approx P(y_j) \approx p > 0$ (strong connection between simultaneously active units); and $\omega_{ij} \to -\infty$ when they are anti-correlated, because in this case $P(x_i, y_j) \to 0$ (strong inhibitory connection between anti-correlated units).

The bias term $\beta_j$ gives a measure of the intrinsic excitability of unit $x_j$, as we shall see later in detail. We observe that $\beta_i \to 0$ when $p_i \to 1$, so that the bias term has no effect on the computation when unit $x_i$ is strongly activated, and $\beta_i \to -\infty$ when $p_i \to 0$, thus muting the information carried by unit $x_i$ when it has seldom been activated. This process is democratic in the sense that it gives more importance to the units that have 'a lot to say' and shuts off the ones not taking part in pattern activation, considered irrelevant for learning and inference.
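To make the Hebbian character of these weights concrete, the following sketch estimates the Bayesian weights and biases of equation 2.4 from a small binary training set, treating all units symmetrically as in the recurrent case. The function name, data and the `eps` floor are ours, not from the thesis:

```python
import numpy as np

def bayesian_weights(patterns, eps=1e-6):
    """Estimate the Bayesian weights and biases of equation 2.4 from a
    binary pattern matrix of shape (n_patterns, n_units).
    Illustrative sketch; names and data are not from the thesis."""
    p_i = patterns.mean(axis=0)                      # P(x_i)
    p_ij = (patterns.T @ patterns) / len(patterns)   # P(x_i, x_j)
    w = np.log(np.maximum(p_ij, eps) / np.maximum(np.outer(p_i, p_i), eps))
    beta = np.log(np.maximum(p_i, eps))              # beta_i = log P(x_i)
    return w, beta

# Units 0 and 1 are perfectly correlated; unit 2 is anti-correlated with them
patterns = np.array([[1, 1, 0],
                     [1, 1, 0],
                     [0, 0, 1],
                     [1, 1, 0]])
w, beta = bayesian_weights(patterns)
```

On this toy set $w_{01} = \log(1/0.75) > 0$ (simultaneously active units), while $w_{02}$ comes out strongly negative because the units never co-occur (inhibitory connection).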

2.3.2 Higher Order Bayesian Model

The problem encountered in practical applications is that the Independence Assumption is often violated, because it is too restrictive. The standard way to deal with this, as when facing a non-linearly separable training set, is to introduce a hidden layer with an internal representation in which classes are separable. Here, we use a hidden layer consisting of feature detectors organized in hypercolumns.

Starting from the previous model, we assume independence between all attributes and conditional independence given $y_j$:
$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i) \quad \text{and} \quad P(x_1, \ldots, x_n \mid y_j) = \prod_{i=1}^{n} P(x_i \mid y_j)$$

However, if two variables $x_i$ and $x_j$ are found not to be independent, they can be merged into a joint variable $x_{ij}$, giving
$$P(x_1, \ldots, x_n) = P(x_1) \cdots P(x_{ij}) \cdots P(x_n)$$
and a similar method may be used for the conditional probabilities. This means that in the network we get one unit for each combination of outcomes of the original variables $x_i$ and $x_j$. For example, if two groups of units corresponding to primary features $A = \{a, \bar{a}\}$ and $B = \{b, \bar{b}\}$ are not independent, we insert in their place a group of complex units $AB = \{ab, a\bar{b}, \bar{a}b, \bar{a}\bar{b}\}$ making up a composite feature. The hypercolumn structure thus formed produces a decorrelated representation, in which the Bayesian model is applicable.

We note that all formulae above are unchanged. We have just introduced a hidden layer that increases internal computation, but the external environment is unchanged. The structure of our network now resembles the one in figure 2.2.

This process relies on a measure of independence of the attributes $x_i$ of an input pattern $x$. A partially heuristic method (Lansner and Holst 1996) [23] is to merge two columns if a measure of correlation between them, such as the mutual information, is high:
$$I_{ij} = \sum_{x_i \in X_i,\; x_j \in X_j} P(x_i, x_j) \log\!\left(\frac{P(x_i, x_j)}{P(x_i)\,P(x_j)}\right) \qquad (2.5)$$

A major drawback of this method is that the number of units increases exponentially with their order, i.e. with how many input attributes they combine (Lansner and Holst 1996, Holst 1997) [23, 15].
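The merge criterion of equation 2.5 can be sketched as follows. This is a minimal illustration with made-up joint probability tables; the function name is ours, not thesis code:

```python
import numpy as np

def mutual_information(joint):
    """Mutual information (equation 2.5) from a joint probability table
    P(x_i, x_j); the marginals are obtained by summing rows and columns.
    Sketch only, not code from the thesis."""
    p_i = joint.sum(axis=1, keepdims=True)   # P(x_i)
    p_j = joint.sum(axis=0, keepdims=True)   # P(x_j)
    mask = joint > 0                         # convention: 0 * log 0 = 0
    return float((joint[mask] *
                  np.log(joint[mask] / (p_i @ p_j)[mask])).sum())

# Independent binary attributes: I = 0, no merge needed
indep = np.outer([0.5, 0.5], [0.5, 0.5])

# Perfectly correlated attributes: I = log 2, merge into one composite column
corr = np.array([[0.5, 0.0],
                 [0.0, 0.5]])
```

A merge rule would then compare `mutual_information(...)` against a threshold and replace the two hypercolumns with one composite hypercolumn when it is exceeded.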


Figure 2.2: Architecture of the BCPNN with a hidden unit for internal decorrelated representation

2.3.3 Graded units

Thus far, we have only considered binary inputs. However, it is also valuable that the network handle graded input: for instance, if an attribute is unknown or its value uncertain, the graded input would then be a 'confidence' value between 0 (no) and 1 (yes). This cannot be coded directly as a graded input activity between zero and one, because it would be interpreted as a probability in the BCPNN. Thus we use a kind of soft interval coding to a set of graded values.

Suppose that each attribute $i$ can take $M_i$ different values; $x_{ii'}$ is then a binary variable describing whether the $i$th attribute takes the $i'$th value: $x_{ii'} = 1 \Leftrightarrow x_i = i'$. Making the necessary relabellings in the previous formulae, we get
$$\pi_{jj'} = P(y_{jj'} \mid x_{ik},\, i = 1, \ldots, n) = P(y_{jj'}) \prod_{i=1}^{n} \frac{P(x_{ik}, y_{jj'})}{P(x_{ik})\,P(y_{jj'})}$$
where for each attribute $i \in \{1, \ldots, n\}$ a unique value $x_{ik}$ is known, with $k \in \{1, \ldots, M_i\}$. Similarly it follows that

$$\pi_{jj'} = P(y_{jj'}) \prod_{i=1}^{n} \sum_{i'=1}^{M_i} \frac{P(x_{ii'}, y_{jj'})}{P(x_{ii'})\,P(y_{jj'})}\, o_{ii'}$$
with indicators $o_{ii'} = 1$ if $i' = k$ and zero otherwise. $o_{ii'}$ can be seen as a degenerate probability distribution $o_{X_i}(x_{ii'}) = \delta_{x_{ik}}(x_{ii'}) = P_{X_i}(x_{ii'})$ of the stochastic variable $X_i$, which is zero for all $x_{ii'}$ except for the known value $x_{ik}$ (Sandberg et al. 2002) [31].


Taking the logarithm of the previous expression leads to
$$\log \pi_{jj'} = \log P(y_{jj'}) + \sum_{i=1}^{n} \log\!\left[\sum_{i'=1}^{M_i} \frac{P(x_{ii'}, y_{jj'})}{P(x_{ii'})\,P(y_{jj'})}\, o_{ii'}\right] \qquad (2.6)$$

The corresponding network now has a modular structure. The units $ii'$ in the network, where $i' \in \{1, \ldots, M_i\}$, explicitly representing the values $x_{ii'}$ of $X_i$, may be viewed as a hypercolumn as discussed above. By definition, the units of a hypercolumn $i$ have a normalized total activity $\sum_{i'=1}^{M_i} o_{ii'} = 1$ (the variable $x_i$ can only have one value $k$ at a time).

Transforming these equations to the network setting yields
$$h_{jj'} = \beta_{jj'} + \sum_{i=1}^{n} \log\!\left[\sum_{i'=1}^{M_i} \omega_{ii'jj'}\, o_{ii'}\right] \qquad (2.7)$$
with
$$\omega_{ii'jj'} = \frac{P(y_{jj'}, x_{ii'})}{P(y_{jj'})\,P(x_{ii'})}, \qquad \beta_{jj'} = \log P(y_{jj'}) \qquad (2.8)$$

where $h_{jj'}$ is the support of unit $jj'$, $\beta_{jj'}$ is the bias term and $\omega_{ii'jj'}$ is the weight. $\pi_{jj'} = f(h_{jj'}) = \exp(h_{jj'})$ can be identified as the output of unit $jj'$, representing the confidence (heuristic or approximate probability) that attribute $j$ has value $j'$ given the current context. We also need to normalize the output within each hypercolumn:
$$\hat{\pi}_{jj'} = f(h_{jj'}) = \frac{\exp(h_{jj'})}{\sum_{j'} \exp(h_{jj'})}$$
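As an illustration, the support computation of equation 2.7 followed by the normalization within one target hypercolumn can be sketched as below. Shapes, values and the function name are invented for the example:

```python
import numpy as np

def update_hypercolumn(o, w, beta):
    """One update of the supports h_jj' (equation 2.7) and the normalized
    outputs pi-hat within a single target hypercolumn.

    o:    list of input activation vectors, one per input hypercolumn
    w:    list of weight matrices, w[i][i', j'] = omega_{ii'jj'}
    beta: bias vector beta_jj' for the target hypercolumn
    Sketch with made-up shapes; not the thesis implementation."""
    h = beta.copy()
    for o_i, w_i in zip(o, w):
        h += np.log(o_i @ w_i)      # log of sum_i' omega_ii'jj' o_ii'
    pi = np.exp(h)                  # exponential transfer function
    return pi / pi.sum()            # normalization within the hypercolumn

# Toy case: two input hypercolumns of two units each, target of two units
o = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
w = [np.array([[2.0, 0.5], [0.5, 2.0]]),
     np.array([[0.5, 2.0], [2.0, 0.5]])]
beta = np.log(np.array([0.5, 0.5]))
out = update_hypercolumn(o, w, beta)
```

Both clamped inputs here support the first target value, so `out[0]` ends up larger than `out[1]`, and the outputs sum to one as required for a hypercolumn.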

Figure 2.3: Architecture of the BCPNN with a hidden unit and an additive summation layer for graded input handling


Figure 2.3 shows a 'pi-sigma network', able to handle graded input. The notion of a support unit is used to update the units simultaneously rather than one by one: calculations are first stored in the support units for all units, and the transfer function is then used to update all the units at once.

2.3.4 Recurrent Network

Now, because both the input $o_{ii'}$ and the output $\hat{\pi}_{jj'}$ of the network represent probabilities, we can feed the output back into the network as input, creating a fully recurrent network architecture, which can work as an autoassociative memory. The currently observed probability $o_{ii'} = P_{X_i}(x_{ii'})$ is used as an initial approximation of the true probability of $X_{ii'}$ and serves to calculate a posterior probability, using the learning parameters $\beta_{jj'}$ and $\omega_{ii'jj'}$, which tends to be a better approximation. This is then fed back, and the process is iterated until a consistent state is reached, which is guaranteed because the weight matrix is symmetric. The reader should note that we have now incorporated the $y_{jj'}$ among the $x_{ii'}$, thus dropping the distinction between input and output units.

In the recurrent network, activations can be updated either discretely or continuously. In the discrete case, $\hat{\pi}_{jj'}(t+1)$ is calculated from $\hat{\pi}_{ii'}(t)$, or equivalently $h_{jj'}(t+1)$ from $h_{ii'}(t)$, using one iteration of the update rule
$$h_{jj'}(t+1) = \beta_{jj'} + \sum_{i=1}^{n} \log\!\left[\sum_{i'=1}^{M_i} \omega_{ii'jj'}\, f(h_{ii'}(t))\right] \qquad (2.9)$$

In the continuous case, $h_{jj'}(t)$ is updated according to a differential equation, making the approach towards an attractor state continuous:
$$\tau_c \frac{dh_{jj'}}{dt} = \beta_{jj'} + \sum_{i=1}^{n} \log\!\left[\sum_{i'=1}^{M_i} \omega_{ii'jj'}\, f(h_{ii'}(t))\right] - h_{jj'}(t) \qquad (2.10)$$
where $\tau_c$ is the 'membrane time constant' of each unit. Input to the network is introduced by clamping the activation of the relevant units (representing known events or attributes). As the network is updated, the activation spreads, creating the a posteriori beliefs of the other attribute values.
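A minimal sketch of the continuous relaxation (equation 2.10), integrated with the Euler method, might look like this. The shapes, parameter values and function name are assumptions of ours:

```python
import numpy as np

def relax(h0, w, beta, M, tau_c=10.0, dt=0.5, steps=400):
    """Euler integration of the continuous relaxation (equation 2.10).
    Units are grouped into hypercolumns of size M; w[q, p] is the weight
    omega from unit p onto unit q. Illustrative sketch only."""
    h = h0.copy()
    n = len(h0) // M
    for _ in range(steps):
        pi = np.exp(h.reshape(n, M))
        pi = (pi / pi.sum(axis=1, keepdims=True)).ravel()    # f(h), normalized
        inner = (w * pi).reshape(len(h0), n, M).sum(axis=2)  # sum_i' omega * pi
        drive = beta + np.log(np.maximum(inner, 1e-12)).sum(axis=1)
        h += dt / tau_c * (drive - h)                        # equation 2.10
    return h

# With uniform weights each inner sum equals 1, the log terms vanish,
# and the supports simply relax towards the biases
beta = np.log(np.array([0.7, 0.3, 0.6, 0.4]))
h = relax(np.zeros(4), np.ones((4, 4)), beta, M=2)
```

Clamping would amount to resetting the activations of the known units to their observed values after each step, letting the remaining units relax towards an attractor.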

2.4 BCPNN Learning Implementations

2.4.1 Counter Model

This model has been developed and described in (Lansner and Ekeberg 1989) [22]. The purpose is to collect statistics of unit activity and of co-activity of pairs of units, in order to estimate the probabilities $P(x_i)$ and joint probabilities $P(x_i, x_j)$ used to calculate the weight matrix $W$ and the bias values $\beta_j$. An input pattern consists of a stimulus strength in the range [0, 1] for each unit in the network. Here, the network is entirely 'stimulus-driven' during learning; otherwise the network would first interpret the input and then learn its own interpretation, which is to be avoided. This allows a reduction in computing time during learning, because no time is used to infer from the data (no internal computation).

The basic idea behind the counter model is to estimate the probabilities $P(x_i)$, $P(x_j)$ and $P(x_i, x_j)$ by counting occurrences and co-occurrences in the training set. With an estimate of the form $p = c/Z$, we obtain
$$\beta_i = \log P(x_i) = \log\!\left[\frac{c_i}{Z}\right] \quad \text{and} \quad \omega_{ij} = \log\!\left[\frac{P(x_i, x_j)}{P(x_i)\,P(x_j)}\right] = \log\!\left[\frac{c_{ij}\, Z}{c_i\, c_j}\right] \qquad (2.11)$$
where
$$Z = \sum_\alpha \kappa(\alpha), \qquad c_i = \sum_\alpha \kappa(\alpha)\, \pi_i, \qquad c_{ij} = \sum_\alpha \kappa(\alpha)\, \pi_i \pi_j \qquad (2.12)$$

Here, $\pi_i$ is the output of unit $i$, $\alpha$ is an index over the patterns in the training set, and $\kappa$ is the significance attributed to a certain learning event. It provides a mechanism for over-representing subjectively important learning examples and ignoring unimportant ones. This technique is similar to boosting used in classification, i.e. the over-representation of hard examples in order to increase the accuracy of the classifier. Special care has to be taken when counters come out as zero. When $c_i$ or $c_j$ is zero, $\omega_{ij}$ is also set to zero. If $c_i$ and $c_j$ are both non-zero but $c_{ij}$ is zero, $\omega_{ij}$ is set to a large negative value, $\log(1/Z)$. The same is done for $\beta_i$ when $c_i$ is zero.
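The counter model of equations 2.11 and 2.12, including the zero-counter special cases just described, can be sketched as follows (the function name and the toy data are ours):

```python
import numpy as np

def counter_model(patterns, kappa=None):
    """Counter-model estimate of biases and weights (equations 2.11-2.12),
    including the zero-counter special cases described in the text.
    patterns: (n_patterns, n_units) activities pi in [0, 1];
    kappa: per-pattern significance, defaulting to 1. Sketch only."""
    n = patterns.shape[1]
    kappa = np.ones(len(patterns)) if kappa is None else kappa
    Z = kappa.sum()
    c = kappa @ patterns                             # c_i
    cij = (patterns * kappa[:, None]).T @ patterns   # c_ij
    beta = np.array([np.log(ci / Z) if ci > 0 else np.log(1 / Z) for ci in c])
    w = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if c[i] == 0 or c[j] == 0:
                w[i, j] = 0.0                        # undefined -> neutral
            elif cij[i, j] == 0:
                w[i, j] = np.log(1 / Z)              # never co-active
            else:
                w[i, j] = np.log(cij[i, j] * Z / (c[i] * c[j]))
    return w, beta

patterns = np.array([[1.0, 1.0, 0.0],
                     [1.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
w, beta = counter_model(patterns)
```

Passing a non-uniform `kappa` over-represents the corresponding patterns, mimicking the boosting-like significance weighting described above.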

The counter model provides a simple and fast implementation of BCPNN learning, but when the maximum capacity of the network is reached, catastrophic forgetting occurs (i.e. all memories are lost when the system is over-loaded).

2.4.2 Incremental Learning

In order to avoid catastrophic forgetting, incremental learning using exponentially weighted running averages has been implemented (Sandberg et al. 2002, Sandberg et al. 2003) [31, 32]. The idea is to introduce intrinsic weight decay (forgetting) in the network, so that the system never becomes over-loaded. A time constant $\alpha$ is used to control the time-scale of this weight decay, allowing for short-term working memory behaviour as well as for long-term memory.

A continuously operating network will need to learn incrementally during operation. In order to achieve this, $P(x_{ii'})(t)$ and $P(x_{ii'}, x_{jj'})(t)$ need to be estimated given the information $x(t')$, $t' < t$. The estimate should have the following properties:

1. It should converge towards $P(x_{ii'})(t)$ and $P(x_{ii'}, x_{jj'})(t)$ in a stationary environment.

2. It should give more weight to recent than remote information.

3. It should smooth or filter out noise and adapt to longer trends, in other words to the lower-frequency components of a non-stationary environment.


(1) is the prime constraint: our estimates need to converge to these probabilities because they are needed to compute the Bayesian weights and biases. (2) makes the model operate as a 'palimpsest memory', meaning that recent memories constantly overwrite old ones; thus a pattern has to be reviewed in order not to be forgotten. (3) is a stability constraint in a non-stationary environment. The low-pass filtering operation will be investigated again in Chapter 3.

The incremental Bayesian learning rule proposed here achieves this by approximating $P(x_{ii'})(t)$ and $P(x_{ii'}, x_{jj'})(t)$ with the exponentially smoothed running averages $\Lambda_{ii'}$ of the activity $\hat{\pi}_{ii'}$ and $\Lambda_{ii'jj'}$ of the coincident activity $\hat{\pi}_{ii'}\hat{\pi}_{jj'}$. The continuous-time version of the update and learning rule takes the following form:
$$\tau_c \frac{dh_{ii'}(t)}{dt} = \beta_{ii'} + \sum_{j=1}^{n} \log\!\left[\sum_{j'=1}^{M_j} \omega_{ii'jj'}(t)\, \hat{\pi}_{jj'}(t)\right] - h_{ii'}(t) \qquad (2.13)$$
$$\hat{\pi}_{ii'}(t) = \frac{\exp(h_{ii'})}{\sum_{i'} \exp(h_{ii'})} \qquad (2.14)$$
$$\frac{d\Lambda_{ii'}(t)}{dt} = \alpha\big([(1 - \lambda_0)\hat{\pi}_{ii'}(t) + \lambda_0] - \Lambda_{ii'}(t)\big) \qquad (2.15)$$
$$\frac{d\Lambda_{ii'jj'}(t)}{dt} = \alpha\big([(1 - \lambda_0^2)\hat{\pi}_{ii'}(t)\hat{\pi}_{jj'}(t) + \lambda_0^2] - \Lambda_{ii'jj'}(t)\big) \qquad (2.16)$$
$$\omega_{ii'jj'}(t) = \frac{\Lambda_{ii'jj'}(t)}{\Lambda_{ii'}(t)\,\Lambda_{jj'}(t)} \qquad (2.17)$$
$$\beta_{ii'}(t) = \log(\Lambda_{ii'}(t)) \qquad (2.18)$$

The above probability estimates converge towards the correct values given stationary inputs and sufficiently large time constants. Since the weights of the network depend more on recent than on old data, it appears likely that a Hopfield-like network with the above learning rule would exhibit palimpsest properties.

Special care has to be taken to avoid logarithms of zero values (see Sandberg et al. 2002) [31]. In addition, the parameter $\alpha$ provides a means to control the temporal dynamics of the learning phase (from short-term working memory to long-term memory). It also allows us to switch off learning when the network needs to be used in retrieval mode, allowing for changes in network activity without corresponding weight changes: when $\alpha = 0$, the running averages 'freeze' at their current values.
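A discretized sketch of the running-average updates (equations 2.15 to 2.18), using a simple Euler step and made-up parameter values, might read:

```python
import numpy as np

def incremental_step(pi, lam, lam2, alpha=0.01, lam0=1e-3, dt=1.0):
    """One Euler step of the incremental learning rule (equations
    2.15-2.18). pi: current normalized activities; lam, lam2: running
    averages Lambda_i and Lambda_ij. Illustrative sketch only."""
    lam += dt * alpha * (((1 - lam0) * pi + lam0) - lam)              # (2.15)
    lam2 += dt * alpha * (((1 - lam0**2) * np.outer(pi, pi) + lam0**2)
                          - lam2)                                     # (2.16)
    w = lam2 / np.outer(lam, lam)                                     # (2.17)
    beta = np.log(lam)                                                # (2.18)
    return lam, lam2, w, beta

# Repeatedly presenting one clamped activity pattern drives Lambda
# towards (1 - lambda_0) * pi + lambda_0, its stationary value
pi = np.array([0.8, 0.2])
lam, lam2 = np.full(2, 0.5), np.full((2, 2), 0.25)
for _ in range(2000):
    lam, lam2, w, beta = incremental_step(pi, lam, lam2)
```

The floor `lam0` keeps the averages strictly positive, which is one way to avoid the logarithms of zero mentioned above; setting `alpha = 0` leaves `lam` and `lam2` unchanged, i.e. learning is frozen.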

2.5 Performance Evaluation and Applications

Performance Evaluation

There are many criteria available to evaluate the performance of a model. Of course, no model is better than the others on every level, nor is it designed for every purpose. Nevertheless, in order to be accepted and developed in the future, a model needs to exhibit some basic features: robustness, reasonable execution time and stability are required to consider a model efficient. Here, we present the main criteria we use to evaluate the BCPNN model.

Frequency of correct retrieval This is the most commonly used criterion to evaluate the performance of the network. Feeding a list of input patterns to the network, we want to know how well the network learns them, by counting the occurrences of successfully completed patterns after learning. An important parameter is the age of a pattern, because recent patterns tend to be retrieved more accurately than old ones. The number of patterns, their complexity and their time of presentation have to be taken into account too.

Storage Capacity The storage capacity is the number of patterns that a network can store. The capacity of the Hopfield network has been investigated (Hopfield 1982) [16]. In our counter model, the capacity is fixed, so it is susceptible to catastrophic forgetting, whereas the incremental learner has a capacity dependent on its spontaneous forgetting (short-term memories with fast weight-decay dynamics are protected from catastrophic forgetting because capacity is hardly ever reached, whereas long-term memories are more exposed to it).

Noise Tolerance In reality, patterns fed to the network are always somewhat noisy, and it is important that the attractor dynamics of the network overcome this. To test this, we feed distorted patterns to the network and count the frequency of retrieval of the original ones. A special case is that of competing interpretations, when a mixture of two stored patterns is fed to the network.

Convergence speed The convergence speed of the relaxation of the network is also an important trait of our model. Inference has to be fast enough so that testing patterns does not take too long, and, on the other hand, it has to use time steps small enough not to skip any attractor state with a narrow convergence domain. Convergence slows down substantially for distorted and ambiguous patterns, because they are distant from stable attractors in the attractor space (Lansner and Ekeberg 1989) [22].

Applications

The domain of applications of the Bayesian Confidence Propagation Neural Network is wide. Because of its statistically based method of unsupervised learning, it can be implemented in a series of different contexts. We present some of its applications here.

Classification The BCPNN is primarily designed to evaluate probabilities from a set of observed features or attributes, so it is natural that it is used for classification tasks, which aim to label a pattern and assign it to a corresponding class. The architecture of these networks is single- or multi-layered, depending on the complexity of the data set. The input units correspond to the attributes, and the output units to the class units. BCPNN and classification has been investigated exhaustively (Holst 1997) [15].

Content-addressable memory When used in a recurrent network architecture, the BCPNN model performs quite well as a content-addressable memory. It takes into account the statistical properties of the data and performs better with patterns whose attributes can be considered independent, like pixel grey-levels in an image, letters in a list of words or digits in a list of numbers. The capacity has to be large enough to avoid memory overloading. Because of its associative character, BCPNN memory networks can perform pattern completion (restoring a pattern from only a sample of it) and pattern rivalry (deciding between ambiguous patterns or a mixture of two existing ones). A good example of pattern rivalry is found in optical illusions and ambiguous images.

Pharmacovigilance and Data Mining The BCPNN has been used for highlighting drug–ADR pairs for clinical review in the WHO ADR database as part of the routine signal detection process (Bate et al. 1998, Lindquist et al. 2000). The recurrent BCPNN has also been implemented as a tool for unsupervised pattern recognition; it has been tested on theoretical data and shown effective in finding known syndromes in all data reported for haloperidol in the WHO database (Bate et al. 2001, Orre et al. 2003). More recently, Ahmed et al. revisited Bayesian pharmacovigilance signal detection methods in a multiple-comparison setting (Ahmed et al. 2009).


Chapter 3

A spiking BCPNN Learning Rule

In this chapter, we introduce the new 'spiking' version of the BCPNN learning rule. We give its mathematical formulation and discuss its specific features and how they account for biologically observed phenomena.

In order to have a mapping from the original BCPNN learning rule to its spiking version, we need to match a descriptor of the activity of the biological neurons to the input and output of the abstract units. The most natural choice seems to be the firing frequency, or rate, of a neuron. Thus the range $[0, 1]$ of the units in the non-spiking network will be mapped to a range $[0, f_{max}]$, where $f_{max}$ represents the maximum firing frequency of a neuron.

3.1 Formulation

The version of the learning rule that we are going to implement in a spiking-neuron context has the following form:
$$\frac{dz_i}{dt} = \frac{y_i - z_i}{\tau_i}, \qquad z_i^0 = \frac{1}{M_i} \qquad (3.1)$$
$$\frac{dz_j}{dt} = \frac{y_j - z_j}{\tau_j}, \qquad z_j^0 = \frac{1}{M_j} \qquad (3.2)$$
In this first stage of processing (equations 3.1 and 3.2), we filter the presynaptic and postsynaptic variables $y_i$ and $y_j$, which exhibit a 'spiking-binary' behaviour most of the time, with low-pass filters of respective time constants $\tau_i$ and $\tau_j$ (note that they can be different). The resulting variables $z_i$ and $z_j$ are called the primary synaptic traces. $M_i$ and $M_j$ are the numbers of units in the presynaptic and postsynaptic hypercolumns respectively, and are only used in a network context. In single-synapse learning, we set $M_i = M_j = 10$. The typical range of $\tau_i$ and $\tau_j$ is 5 to 20 ms.

$$\frac{de_i}{dt} = \frac{z_i - e_i}{\tau_e}, \qquad e_i^0 = \frac{1}{M_i} \qquad (3.3)$$
$$\frac{de_j}{dt} = \frac{z_j - e_j}{\tau_e}, \qquad e_j^0 = \frac{1}{M_j} \qquad (3.4)$$
$$\frac{de_{ij}}{dt} = \frac{z_i z_j - e_{ij}}{\tau_e}, \qquad e_{ij}^0 = \frac{1}{M_i M_j} \qquad (3.5)$$
In the second stage of processing (equations 3.3, 3.4 and 3.5), we filter the primary traces $z_i$ and $z_j$ with a low-pass filter of time constant $\tau_e$ (note that it is the same for the three equations). The typical range of $\tau_e$ is 100 to 1,000 ms. The resulting variables $e_i$, $e_j$ and $e_{ij}$ are called the secondary synaptic traces. We note the introduction of a secondary mutual trace $e_{ij}$, which keeps a trace of the mutual activity of $y_i$ and $y_j$ and will later be used to compute $P(x_i, x_j)$. Note that a mutual trace is impossible to obtain at the first stage of processing, since the direct product $y_i y_j$ is zero most of the time. This is because $y_i$ and $y_j$ are 'spiking' variables and thus equal zero except on the occurrence of a spike, so $y_i y_j$ would be non-zero only when $y_i$ and $y_j$ spike at exactly the same time, which almost never happens.

$$\frac{dp_i}{dt} = \kappa\,\frac{e_i - p_i}{\tau_p}, \qquad p_i^0 = \frac{1}{M_i} \qquad (3.6)$$
$$\frac{dp_j}{dt} = \kappa\,\frac{e_j - p_j}{\tau_p}, \qquad p_j^0 = \frac{1}{M_j} \qquad (3.7)$$
$$\frac{dp_{ij}}{dt} = \kappa\,\frac{e_{ij} - p_{ij}}{\tau_p}, \qquad p_{ij}^0 = \frac{1}{M_i M_j} \qquad (3.8)$$
In the third and last stage of processing (equations 3.6, 3.7 and 3.8), we filter the secondary traces $e_i$, $e_j$ and $e_{ij}$ with a low-pass filter of time constant $\tau_p$ (again the same for the three equations). The typical range of $\tau_p$ is 1,000 to 10,000 ms. The resulting variables $p_i$, $p_j$ and $p_{ij}$ are called the tertiary synaptic traces. We also note the presence of a mutual tertiary trace $p_{ij}$, which is a direct approximation of $P(x_i, x_j)$.

$$\beta_i = \begin{cases} \log(\varepsilon) & \text{if } p_i < \varepsilon \\ \log(p_i) & \text{otherwise} \end{cases} \qquad (3.9)$$
$$\omega_{ij} = \begin{cases} \log(\varepsilon) & \text{if } \dfrac{p_{ij}}{p_i p_j} < \varepsilon \\[2mm] \log\!\left(\dfrac{p_{ij}}{p_i p_j}\right) & \text{otherwise} \end{cases} \qquad (3.10)$$
The equations for updating the weights and biases (equations 3.9 and 3.10) are the classical Bayesian weight and bias equations. Note that these equations change a little in the case of 'pi-sigma' higher-order networks with graded input (equations 2.7 and 2.8); because we deal only with binary input, we keep them unchanged. When $p_i$ takes a small value, it is set to a minimum value $\varepsilon$ in order to avoid a logarithm of zero. The same is done when $\frac{p_{ij}}{p_i p_j}$ becomes too small. We also note the presence of the parameter $\kappa$: it is a global 'print-now' signal that regulates the update of the tertiary traces, while leaving the internal structure of the network (primary and secondary traces) unchanged. We will explain its function in further detail later.

The spiking version of the BCPNN learning rule is the set of these ten equations. It relies on three stages of processing that perform the same operation (low-pass filtering) with different temporal dynamics. The parameters that can be controlled are the time constants $\tau_i$, $\tau_j$, $\tau_e$ and $\tau_p$, the initial values of the traces and the print-now signal $\kappa$.
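Putting the ten equations together, a minimal Euler-integrated sketch of the rule for a single synapse might read as follows. Parameter values are picked from the typical ranges quoted above; the function is our illustration, not the NEURON implementation:

```python
import numpy as np

def spiking_bcpnn(spikes_i, spikes_j, dt=1.0, tau_i=10.0, tau_j=10.0,
                  tau_e=200.0, tau_p=2000.0, kappa=1.0,
                  M_i=10, M_j=10, eps=1e-4):
    """Euler integration of the spiking BCPNN rule (equations 3.1-3.10)
    for one synapse. spikes_i, spikes_j: binary pre/post spike trains,
    one entry per dt-millisecond bin. Illustrative sketch only."""
    z_i, z_j = 1/M_i, 1/M_j
    e_i, e_j, e_ij = 1/M_i, 1/M_j, 1/(M_i*M_j)
    p_i, p_j, p_ij = 1/M_i, 1/M_j, 1/(M_i*M_j)
    for y_i, y_j in zip(spikes_i, spikes_j):
        z_i += dt * (y_i - z_i) / tau_i            # primary traces (3.1, 3.2)
        z_j += dt * (y_j - z_j) / tau_j
        e_i += dt * (z_i - e_i) / tau_e            # secondary traces (3.3-3.5)
        e_j += dt * (z_j - e_j) / tau_e
        e_ij += dt * (z_i * z_j - e_ij) / tau_e
        p_i += dt * kappa * (e_i - p_i) / tau_p    # tertiary traces (3.6-3.8)
        p_j += dt * kappa * (e_j - p_j) / tau_p
        p_ij += dt * kappa * (e_ij - p_ij) / tau_p
    beta = np.log(max(p_i, eps))                   # equation 3.9
    w = np.log(max(p_ij / (p_i * p_j), eps))       # equation 3.10
    return w, beta

# Correlated pre/post firing should yield a larger weight than
# independent firing at the same rate (about 20 Hz here)
rng = np.random.default_rng(0)
pre = (rng.random(20000) < 0.02).astype(float)
w_corr, _ = spiking_bcpnn(pre, pre)                           # identical trains
w_ind, _ = spiking_bcpnn(pre, (rng.random(20000) < 0.02).astype(float))
```

Setting `kappa = 0` leaves the tertiary traces, and hence the weight and bias, frozen while the primary and secondary traces keep running, which is exactly the print-now behaviour described above.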

3.2 Features

3.2.1 Synaptic traces as local state variables

The implementation of local synaptic state variables such as synaptic traces in the above learning rule is a common approach in STDP learning rules [27, 25]. These variables are used to keep a trace, or memory, of presynaptic or postsynaptic events such as the occurrence of a spike. In addition, low-pass filtering enables us to manipulate continuous variables rather than 'spiking variables', which is problematic when we want to estimate, for example, a joint probability $P(x_i, x_j)$, since the direct product of two spiking variables is likely to be zero due to the 'impulse' nature of a spike. Indeed, a spike has a very short duration and is often described as a discontinuous variable that is non-zero only on the occurrence of a spike.

Scaling these variables between 0 and 1 is very useful because it makes their quantitative use easier. One can deal with different types of synaptic traces.

Additive trace The additive trace updates the local state variable $x(t)$ by a constant value $A$. The particularity of this trace is that it can exceed 1 when many events occur in a short time. It is implemented by the following equation:
$$\frac{dx}{dt} = -\frac{x}{\tau} + \sum_{t_s} A\, \delta(t - t_s)$$
where $t_s$ denotes the occurrence time of a spike.

Saturated trace The saturated trace resets the local state variable $x(t)$ to a constant value (1 in the equation below). This trace is always in the range $[0, 1]$ and it keeps only the history of the most recent spike, because it is invariably reset to 1 on the occurrence of a spike. It is implemented by the following equation:
$$\frac{dx}{dt} = -\frac{x}{\tau} + \sum_{t_s} (1 - x^-)\, \delta(t - t_s)$$
where $t_s$ denotes the occurrence time of a spike and $x^-$ is the value of $x$ just before the occurrence of the spike.


Proportional trace Here, the local state variable $x(t)$ is updated by a value proportional to its deviation from 1. This trace is always in the range $[0, 1]$ and it realizes a synthesis of the effects of the two traces above: it keeps the value of $x(t)$ close to 1 when many spikes occur in a short time, and the time of the last spike is easy to read off from the exponential decay at time $t$. The proportional trace is the one we use later. It is implemented by the following equation:
$$\frac{dx}{dt} = -\frac{x}{\tau} + \sum_{t_s} k\,(1 - x^-)\, \delta(t - t_s)$$
with $t_s$ and $x^-$ as described above, and $k$ the proportion of the update. Typically we use $k \in [0.5, 0.8]$. Figure 3.1 shows the dynamics of the three different synaptic trace types.

Figure 3.1: Different types of synaptic traces - The upper panel corresponds to a spike train and the lower panel displays the three different synaptic traces: the black, blue and red curves correspond respectively to the additive, saturated and proportional traces
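The three trace types can be compared on the same spike train with a short simulation. Parameter values here are illustrative choices of ours, except $k$, which lies in the quoted range $[0.5, 0.8]$:

```python
import numpy as np

def run_traces(spike_times, T=200, dt=1.0, tau=20.0, A=0.3, k=0.7):
    """Simulate the additive, saturated and proportional traces for the
    same spike train (times in ms). Illustrative parameter values only."""
    add = sat = prop = 0.0
    spikes = set(spike_times)
    out = []
    for t in np.arange(0, T, dt):
        add -= dt * add / tau           # common exponential decay -x/tau
        sat -= dt * sat / tau
        prop -= dt * prop / tau
        if t in spikes:
            add += A                    # additive: jump by A (may exceed 1)
            sat += 1.0 - sat            # saturated: reset to 1
            prop += k * (1.0 - prop)    # proportional: jump towards 1
        out.append((add, sat, prop))
    return np.array(out)

traces = run_traces([10, 12, 14, 16])   # a burst of four spikes
```

During the burst the additive trace climbs above 1, while the saturated and proportional traces remain bounded by 1, reproducing the qualitative behaviour shown in figure 3.1.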

3.2.2 Spike-timing Dependence

The first stage of processing in our learning rule (equations 3.1 and 3.2) creates the primary synaptic traces. These variables, with very fast dynamics, are used as recorders of the spikes: on the occurrence of a spike they are set to a certain value (since we use proportional traces, this value is proportional to the deviation between 1 and the value of the synaptic trace just before the spike) and decay exponentially until another spike occurs. Proportional traces convey two pieces of information: the history of the last spike, by looking at the current decay (if the last spike occurred recently, the trace is steep and decays fast), and the global history of past events (when numerous spikes occur in a short period of time, the trace value comes close to 1).

The dynamics of the primary traces $z_i$ and $z_j$ are controlled by the time constants $\tau_i$ and $\tau_j$. Since these constants can be different, pre-post timing can be promoted over post-pre timing, or the other way around. For instance, if we set $\tau_i = 20$ ms and $\tau_j = 1$ ms, then $z_j$ will decay much faster than $z_i$. If a postsynaptic spike occurs 10 ms after a presynaptic spike, the product $z_i z_j$ will be non-zero shortly after the occurrence of the postsynaptic spike. On the other hand, if a presynaptic spike occurs 10 ms after a postsynaptic spike, the product $z_i z_j$ will still be close to zero because of the fast decay of $z_j$. By setting $\tau_j$ to a small value compared to $\tau_i$, we have given priority to pre-post timing (see figure 3.2).

The values of these two time constants define a spike-timing window (see Bi and Poo 1998 [6]), whose width and symmetry can be controlled by manipulating these constants.

Figure 3.2: Different effects of pre-post and post-pre timing on the primary synaptic traces - The upper panel corresponds to a regular post-pre-post spike train. Since the primary traces have different time constants ($\tau_i$ = 50 ms and $\tau_j$ = 5 ms), pre-post timing is promoted over post-pre timing, because the resulting product $z_i z_j$ (not displayed here) is much larger after pre-post timing than after post-pre timing.

3.2.3 Delayed-Reward Learning

It can be a little puzzling to realize that our learning rule has three stages of processing while it always performs the same operation (low-pass filtering). However, these three filtering procedures serve three very specific and different purposes. As observed in previous models (Bi and Poo 1998, Rubin et al. 2005, Morrison et al. 2008, Mayr et al. 2009) [6, 30, 27, 25], the exact timing between presynaptic and postsynaptic spikes plays a crucial role in LTP. Moreover, a time window of about 20 ms before and after a postsynaptic spike seems to exist, such that no long-lasting change occurs if the delay between spikes is greater than 20 ms.

However, the activity in the network needs to be long-lasting and to reverberate on a much greater time-scale. In the context of delayed-reward learning [28] and reinforcement learning, the reward, which triggers the induction of LTP, occurs with a delay on a time-scale of hundreds of milliseconds to seconds. Worse, this delay is not predictable, so one cannot know when the reward, and hence the actual learning, will take place. In order to solve this problem, we include secondary traces that extend the reverberation of activity in the network.

Then, when a spike occurs, activity is recorded in the primary and secondary traces. After a few hundred milliseconds, the activity has disappeared from the primary traces, but is still reverberating in the secondary traces ei, ej and eij (equations 3.3, 3.4 and 3.5). Thus, if the print-now signal, representing the reward, is set to 1, the secondary traces convey the information and learning can still take place.
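The three-level cascade and the gating role of the print-now signal can be sketched as follows. The first-order Euler filters, the time constants and the spike and reward times below are illustrative assumptions, not the thesis's actual equations 3.1 to 3.8:

```python
def step(val, drive, tau, dt):
    """One Euler step of a first-order low-pass filter: dval/dt = (drive - val)/tau."""
    return val + dt * (drive - val) / tau

dt, tau_z, tau_e, tau_p = 1.0, 10.0, 200.0, 2000.0   # ms; illustrative values
z = e = p = 0.0
z_at_reward = e_at_reward = 0.0
for t in range(1000):
    spike = 1.0 if t in (50, 60, 70) else 0.0        # brief presynaptic burst
    kappa = 1.0 if t == 400 else 0.0                 # delayed print-now (reward) signal
    z = step(z, spike / dt, tau_z, dt)               # primary trace: fast dynamics
    e = step(e, z, tau_e, dt)                        # secondary trace: reverberates
    p = step(p, kappa * e, tau_p, dt)                # tertiary trace: gated by kappa
    if t == 400:
        z_at_reward, e_at_reward = z, e
```

By the time the reward arrives (330 ms after the burst), the primary trace has vanished, but the secondary trace is still non-zero, so the κ-gated tertiary trace still picks up the event.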

Figure 3.3: Temporal dynamics of the different synaptic traces - Thin curves correspond to the primary traces, thicker curves to the secondary ones and bold curves to the tertiary traces. Blue corresponds to presynaptic traces, red to postsynaptic variables and black to mutual traces. The temporal dynamics are slowest for the tertiary traces, which build up and decrease slowly. The combination of these three levels of processing enables us to achieve different goals.

It is important to stress that both of these traces are required if we want to account for the following phenomena: the existence of a spike-timing window on the order of tens of milliseconds (about 20 ms for spike delays) outside of which no significant weight change takes place, and the fact that the reward enhancing the learning process comes with a delay on a time-scale of hundreds of milliseconds. As we will see later, there are biological equivalents to this print-now signal and to the delayed synaptic traces.

Figure 3.3 shows the temporal dynamics of the primary, secondary and tertiary traces for a pattern stimulation followed by no activity.

3.2.4 Long-term Memory

Finally, the third stage of processing (equations 3.6, 3.7 and 3.8) computes synaptic state variables with much slower dynamics. Typically, pi, pj and pij account for long-term memory, meaning that they store events that have been repeated over a set of actions and experiments.

We assume that our learning rule operates in the context of delayed-reward learning, and we take the example of an animal, a rat for instance, presented with several buttons that open doors, behind which some food (reward) is present or absent. The primary traces, with their fast dynamics, record the precise spike timing as activity spreads through the network as a consequence of taking actions (stimulus, button pressing). The secondary traces account for the delayed receipt of the reward, which comes as a delayed result of action-taking. If the rat obtains the reward, the 'print-now signal' is set to 1 and long-term memory is triggered. The tertiary traces are activated when the delayed reward has been obtained several times and the stimulus has been reinforced. This means that pi, pj and pij build up when the activities of the secondary traces have repeatedly been above a certain baseline. Then, reinforcement occurs and memories can be stored.

It is singular, however, that the print-now signal κ shows up at this stage of processing. It could equally have appeared in equations 3.3, 3.4 and 3.5, but the biological equivalent of the print-now signal suggests that the metabolic changes occur even when it is not activated, whereas the weights are overwritten only when the print-now signal is active. Thus, it makes more sense for it to appear right before the weight update.

3.2.5 Probabilistic features

It is important to keep in mind that our spiking version of the BCPNN learning rule is not just another implementation of a pair-based STDP learning rule. Indeed, the state variables that we calculate represent probabilities, and their values have an intrinsic meaning of their own. This is the main reason why feeding graded input to the network is not trivial: the network interprets activities as probabilities. As discussed previously, the input to the units represents the confidence of feature detection and the output represents the posterior probability of the outcome.

In the original counter model, P (xi) and P (xi, xj) were quite easy to approximate by counting occurrences and co-occurrences of the features within the training set. Due to the spiking structure of the input variables yi and yj , it is a bit trickier to evaluate the probabilities P (xi) and P (xi, xj). The use of synaptic traces allows us to create mutual traces eij and pij that convey information about the correlation between spikes.

3.3 Biological relevance

This new version of the BCPNN learning rule shows biological relevance on different levels. The first is the use of synaptic traces, which are thought to have a direct biological meaning. For instance, when a presynaptic spike arrives at a synapse, there is a quantified release of neurotransmitters. According to the nature of the synapse, either the additive trace or the saturated trace might be used: the first when the amount of transmitters is small compared to the synapse size, so that a new spike has an additive effect because enough free receptors are available for synaptic transmission; and the second when the quantity of neurotransmitters released reaches the maximum capacity of the synapse, which means that the synapse saturates all of its available receptors on each presynaptic spike.

Another direct equivalent is the 'print-now signal', which can be seen as the concentration of a memory modulator such as dopamine, thought to have a direct enhancing effect on learning and memory when present in high quantities. The delayed-reward mechanism indeed has a direct biological relevance and has been observed experimentally (Potjans et al. 2009) [28].

As explained before, the mixture of variables with slow and fast temporal dynamics makes sense and fits what has been observed. The concentration of calcium ions at the postsynaptic site is thought to play a key role in synaptic plasticity [30], with much faster dynamics than the protein synthesis governing the transition from early-LTP to late-LTP [9].

Clopath et al. [9] present a model to account for the transition from early- to late-LTP, containing three different phases: Tag, Trigger and Consolidation. A synapse can be in one of the three following states: untagged, tagged for LTP (high state) or tagged for LTD (low state), depending on presynaptic and postsynaptic events. If the total number of tagged synapses exceeds a threshold, a trigger process occurs and opens up for consolidation (long-lasting changes in synaptic efficacy). What is similar in our model is the presence of three different temporal dynamics. The secondary mutual trace eij can be seen as an equivalent of a tagging procedure: if its value stays above a threshold for long enough, metabolic changes such as specific protein synthesis occur, allowing conversion from working memory to long-term permanent memory.


Chapter 4

Abstract Units Implementation

In the next two chapters, we present different implementations of the spiking version of the BCPNN learning rule presented previously. The first implementation consists of abstract units in MATLAB and serves as a gateway towards spiking neuron models in NEURON. For each model, we explain how we present patterns to the cells, implement the learning rule and use the model in retrieval mode.

Due to its ability to handle vectors and matrices, MATLAB serves as a convenient computational tool for building artificial neural networks. Its built-in functions allow a great variety of 2D and 3D graphic displays. One can also import data computed elsewhere into MATLAB and process it as desired.

But MATLAB loses much of its computational power when it has to process data procedurally, which is the case for our differential equations. In our learning rule, we have to update and compute multiple variables at each time step, because we deal with three sets of first-order linear differential equations (equations 3.1 to 3.8). Since these computations cannot be gathered in a matrix and treated in batch fashion, MATLAB is structurally inefficient for our task.

However, we can use it for single-synapse learning (only two units: one presynaptic and one postsynaptic) on reasonable time-scales (between 1,000 ms and 10,000 ms) and exploit its graphical display facilities, which is why we first implemented our learning rule in MATLAB. The aim is qualitative: displaying weights and biases corresponding to different input patterns and giving an insight into the synapse's internal dynamics (the time-courses of the primary, secondary and tertiary traces).

4.1 Pattern presentation

In this section, we explain how we presented patterns to the units; in other words, how input is fed to the network. We have three ways to present patterns: non-spiking, frequency-based spiking and Poisson-generated spiking. Note that throughout the following chapters we focus on single-synapse learning, meaning that we deal with two units (presynaptic and postsynaptic) connected by a single synapse.

4.1.1 Non-spiking Pattern Presentation

As a starting point for our investigations and a reference for our further results, we test our learning rule by feeding patterns in a process similar to what has been done before with non-spiking units (Sandberg et al. 2002) [31]. To achieve this, we clamp the inputs to the presynaptic and postsynaptic units yi and yj to the respective values ki and kj during a presentation time of some tens of milliseconds. The values ki and kj can take binary values or a continuous value in the range [0, 1] (graded input). Patterns are fed to the network sequentially.

For instance, if the set of input patterns we want to learn is (1, 1), (0, 0), (1, 1), (0, 1), (1, 1), then yi will be clamped to the set of values (1, 0, 1, 0, 1) and yj will be clamped to (1, 0, 1, 1, 1). The input variables yi and yj are 'stepped' and discontinuous (see Figure 4.1a). Hence, abstract units are artificial, because no biologically observed variable takes constant values or exhibits such a discontinuous time-course.

The presentation time is important because it needs to be long enough for the primary traces to retain pattern activities (the longer the pattern is seen, the stronger the memory), but it is also valuable to impose some resting time between patterns. Indeed, during each pattern presentation, the network needs to adapt to it and rearrange its internal structure. Between patterns, it needs to rest for a short while, so that the fast-dynamics internal variables return to their baseline. An analogy is that when we learn different things, we always need some adaptation to jump from one thing to another. We will expand on this in the Discussion section.

On the other hand, when we want to teach a concept to our network through a temporal series of patterns, the time-scale of the learning phase needs to be smaller than the dynamics of the long-term memory traces pi, pj and pij , otherwise the synapse forgets what has been fed to it in the past. If the long-term memory time constant τp equals 1 second, then after 5 seconds past events will be discarded, so in this case it does not make sense to have a learning procedure that takes longer than 5 seconds. In a nutshell, learning procedures should not exceed the forgetting time-scale of our long-term memory.

In MATLAB, the function generate_pattern_nonspiking generates a driving input x(t) from a series of parameters: delay, the resting time between pattern presentations; dur, the duration of presentation of one pattern; T, the length of the output; and pattern, a vector containing the values for the driving input x(t). Figure 4.1a shows the input activity of an abstract unit fed with the pattern x = [1, 0, 1, 0, 0.5, 1, 0.25, 0, 0.75].
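A minimal Python analogue of such a generator might look as follows. The argument names mirror the MATLAB parameters described above, but the body is a sketch, not the thesis code:

```python
def generate_pattern_nonspiking(pattern, dur, delay, dt=1.0):
    """Stepped driving input x(t): each pattern value is held for `dur` ms,
    preceded by `delay` ms of rest at zero (hypothetical analogue of the
    thesis's MATLAB helper of the same name)."""
    x = []
    for value in pattern:
        x += [0.0] * int(delay / dt)    # resting period before the pattern
        x += [value] * int(dur / dt)    # clamp the input to the pattern value
    return x

x = generate_pattern_nonspiking([1, 0, 1, 0, 0.5], dur=50, delay=20)
```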

4.1.2 Spiking frequency-based Pattern Presentation

Because of its biological irrelevance, the previous pattern presentation scheme is limited. This time, we try to mimic the 'spiking behaviour' of membrane voltage observed in real experiments. Our spike generation in MATLAB is still artificial, but it is a step closer to imitating spiking behaviour. We build up artificial spiking voltages by setting the input variable yi to 1 on the occurrence of a spike and to zero otherwise. If ts denotes the occurrence time of a spike for unit i, then our input variable yi can be rewritten

yi(t) = ∑ts δ(t − ts)

Pattern presentation to the input units is now based on their firing frequency rather than on a fixed stepped value. The idea is to realise a linear mapping from a value of xi between 0 and 1 (representing the confidence of feature detection developed in previous chapters) to a frequency fi. To achieve this, the value 1 for xi is mapped to a maximum frequency fmax, and other values between 0 and 1 to a directly proportional value in the range [0, fmax] (i.e. 0.5 is mapped to fmax/2, 0.25 to fmax/4, and so on). By doing this, we have created an input filter that transcribes the graded input xi(t) between 0 and 1 into a spiking time-dependent variable yi(t). We will later refer to the stepped value xi(t) as the driving input and to yi(t) as the actual input activity, the first being used only for pattern presentation and the latter to compute the synaptic traces, hence the weights and biases.

An important feature of the frequency-based pattern presentation is that it allows us to easily control the timing between presynaptic and postsynaptic spikes. This offers an implementation possibility when we want to investigate the effects of exact spike timing on the weight modification in our learning rule.

In MATLAB, the function generate_frequency_spiking generates an input activity y(t) from a driving input x(t). The series of parameters is similar to the previous section and includes a value fmax, which corresponds to the maximum output frequency (when x(t) takes the value 1). In order to generate spikes, we discretize the time-scale into intervals of 1 millisecond: when a spike occurs at a specific time t0, the value y(t0) is simply set to 1. Figure 4.1b shows the input activity of an abstract unit fed with the pattern x = [1, 0, 1, 0, 0.5, 1, 0.25, 0, 0.75].
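The frequency-based generation can be sketched in Python as follows. The linear mapping and 1 ms binning follow the description above; the regular placement of spikes and the function body are assumptions:

```python
def generate_frequency_spiking(x_drive, fmax, dur, dt=1.0):
    """Regular spike train y(t): a value x in [0, 1] maps linearly to the
    rate x * fmax (Hz); spikes are 1-ms bins set to 1 at regular intervals.
    Hypothetical analogue of the thesis's MATLAB helper."""
    y = []
    for x in x_drive:
        seg = [0.0] * int(dur / dt)
        if x > 0:
            isi_ms = 1000.0 / (x * fmax)     # inter-spike interval in ms
            t = isi_ms
            while t < dur:
                seg[int(t / dt)] = 1.0       # place a spike in this bin
                t += isi_ms
        y += seg
    return y

y = generate_frequency_spiking([1.0, 0.0, 0.5], fmax=50.0, dur=200.0)
```

With fmax = 50 Hz and 200 ms per pattern, x = 1 yields spikes every 20 ms, x = 0.5 every 40 ms and x = 0 none at all.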

4.1.3 Spiking Poisson-generated Pattern Presentation

We take one more step in the direction of mimicking neural-like data by implementing Poisson spike trains to feed the input units. In the cortex, the timing of successive action potentials is highly irregular, and we can view the irregular inter-spike interval as a random process. This implies that an instantaneous estimate of the spike rate can be obtained by averaging the pooled responses of many individual neurons, but the precise timing of individual spikes conveys little information. The benefit of the Poisson process for spike generation is that it adds randomness and discards the determinism in our simulation (each random seed gives different spike trains). Thus, we focus on the parameters underlying this random process rather than modeling precise coincidences of presynaptic and postsynaptic events.


Figure 4.1: Abstract Units Pattern Presentations corresponding to the pattern x = [1, 0, 1, 0, 0.5, 1, 0.25, 0, 0.75] - (a) Non-Spiking, (b) Spiking Frequency-based, (c) Spiking Poisson-generated.

We assume here that the generation of each spike depends only on an underlying signal r(t), which we refer to as the instantaneous firing rate. It follows that the generation of each spike is independent of all other spikes, which is called the spike independence hypothesis. In addition, we assume that the firing rate r(t) is constant over time (in fact r(t) is updated in steps, but for one pattern we can suppose that r(t) = r). The Poisson process is then said to be homogeneous.

In a Poisson process, the probability that n events occur in ∆t with instantaneous rate r is given by the formula:

P (n spikes during ∆t) = e^(−r∆t) (r∆t)^n / n!   (4.1)

By setting n = 0 and ∆t = τ , we obtain P (next spike occurs after τ) = e^(−rτ), and it follows that

P (next spike occurs before τ) = 1 − e^(−rτ)   (4.2)

One way to implement a Poisson spike train is to use equation 4.2: we generate a random number between 0 and 1, and the inter-spike interval is given by the value of τ that realizes the identity. The drawback of this method is that the spike train has to be created sequentially. We can instead create a whole Poisson spike train at once, as follows.


The average spike count between t1 and t2 can be defined from the instantaneous firing rate by 〈n〉 = ∫[t1, t2] r(t) dt, and for sufficiently small intervals, t1 = t − δt/2 and t2 = t + δt/2, the average spike count can be approximated by 〈n〉 = r(t)δt = rδt under the homogeneous Poisson process hypothesis. Furthermore, when δt is small enough, the average spike count equals the probability of the firing of a single spike:

P (one spike occurs during the interval (t − δt/2, t + δt/2)) = rδt   (4.3)

Now, assuming δt is small enough (usually 1 ms), if we want to create a spike train of arbitrary length T using 4.3, we generate T/δt random numbers pi between 0 and 1. Then, if pi < rδt, we generate a spike at the time corresponding to the index of pi; if not, no spike is generated.

The Poisson spike generation is an intermediate stage towards the NEURON implementations. It allows us to account for random rate-based spike generation. This is valuable because the process is easy to implement and gives us an idea of whether our model responds well to noisy or random data. Later, some noisy spike trains may be added to our data so that it resembles what is observed in vivo.

In MATLAB, the function generate_poisson_spiking generates an input activity y(t) from a driving input x(t). The series of parameters is similar to the previous section, and the rate r is set to the same value as the frequency fmax used before. We stress that the Poisson generation of spike trains is based on a random process: each seed gives a different input activity y(t) for the same driving input x(t), and setting the same seed in two runs makes them identical. Figure 4.1c shows the input activity of an abstract unit fed with the pattern x = [1, 0, 1, 0, 0.5, 1, 0.25, 0, 0.75].
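The vectorised bin-by-bin generation described above (one uniform draw per bin, spike when the draw falls below rδt) can be sketched as follows; the seeding behaviour mirrors the reproducibility remark, but the function itself is an illustrative sketch:

```python
import random

def generate_poisson_spiking(rate_hz, dur_ms, dt_ms=1.0, seed=None):
    """Homogeneous Poisson spike train: one uniform draw per dt bin,
    spike when the draw falls below r*dt (r in spikes per ms).
    Hypothetical analogue of the thesis's MATLAB helper."""
    rng = random.Random(seed)
    r = rate_hz / 1000.0                     # convert Hz to spikes per ms
    return [1.0 if rng.random() < r * dt_ms else 0.0
            for _ in range(int(dur_ms / dt_ms))]

train = generate_poisson_spiking(rate_hz=50.0, dur_ms=100_000, seed=42)
mean_rate = sum(train) / (len(train) / 1000.0)   # empirical rate in Hz
```

Over a long train the empirical rate converges to the requested one, while the same seed reproduces the exact same train.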

4.2 Learning Rule Implementation

In order to solve the differential equations in MATLAB, we used the solver ode45. Its use is quite straightforward, except that it works through function handles, which makes it tricky to control intrinsic equation parameters like the time constant τi or the print-now signal κ. If the implementation of the learning rule follows the set of equations 3.1 to 3.10, a non-negligible phenomenon arises: when spikes are modeled by a sum of unit impulse functions in MATLAB, the solver is likely to miss them, because at each time step it evaluates the derivative at a point using points in its neighbourhood. Not only are the spiking variables highly discontinuous, they are also zero most of the time, which prevents the solver ode45 from detecting any activity.

A solution to this problem is to introduce a 'duration' δt for the spikes (typically δt equals 1 to 2 milliseconds), so that the mathematical model of a spike switches from an impulse function to a rectangular pulse of width δt centered at ts. But in that case, 1/τi is an upper bound for dzi/dt (see equation 3.1), which results in only a small increase of the primary trace zi(t). This propagates to the secondary and tertiary traces, which, as a result, hardly exceed 0.001. This is highly undesirable, because they are supposed to represent probabilities of activation.

To bypass these problems, we decided to split the set of equations 3.1 to 3.10 into two phases. First, we update the primary traces with the help of an auxiliary function generate_primary_trace, which solves equation 4.4.

Zs(t) = zi−(t) + r(1 − zi−(t))  if xi(t) = 1
zi(t) = Zs e^(−(t−ts)/τi)   (4.4)

where ts records the occurrence of the last spike and Zs is updated according to the proportional trace update.
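Equation 4.4 admits an event-driven implementation, since the trace only needs to be re-evaluated at spike times and decays in closed form in between. The sketch below assumes the proportional (saturating) update with gain r; all numeric values are illustrative:

```python
import math

def primary_trace(spike_times, tau, r, t_eval):
    """Event-driven evaluation of the saturating primary trace z_i(t):
    at each spike the trace jumps by r*(1 - z); between spikes it decays
    as Z_s * exp(-(t - t_s)/tau), the closed form of equation 4.4."""
    Zs, ts = 0.0, None
    for t_spike in spike_times:
        z_before = Zs * math.exp(-(t_spike - ts) / tau) if ts is not None else 0.0
        Zs = z_before + r * (1.0 - z_before)    # proportional (saturating) update
        ts = t_spike
    if ts is None or t_eval < ts:
        return 0.0
    return Zs * math.exp(-(t_eval - ts) / tau)  # closed-form decay since last spike

z = primary_trace([10.0, 12.0, 14.0], tau=20.0, r=0.5, t_eval=20.0)
```

Because each spike moves the trace a fraction r of the way towards 1, the trace can never exceed 1, which keeps it interpretable as an activation probability.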

The set of equations 3.3 to 3.8 is solved separately using the solver ode45. Special care has to be taken with the time-step increment in order to find a trade-off between computing time and accuracy. The weight update 3.10 and bias update 3.9 are then straightforward.
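Assuming the weight and bias updates take the standard BCPNN log-ratio form (consistent with the inference equation 4.5 used later, xj = e^(ωij xi + βj)), they can be written directly; the small floor eps below is an implementation detail added to avoid log(0), not something specified in the text:

```python
import math

EPS = 1e-6   # assumed small floor to keep the logarithms finite

def bcpnn_weight_bias(pi, pj, pij):
    """Standard BCPNN updates from the probability estimates:
    w_ij = log(p_ij / (p_i * p_j)),  beta_j = log(p_j)."""
    w = math.log((pij + EPS**2) / ((pi + EPS) * (pj + EPS)))
    beta = math.log(pj + EPS)
    return w, beta

w_ind, _ = bcpnn_weight_bias(0.5, 0.5, 0.25)   # independent units  -> w ~ 0
w_cor, _ = bcpnn_weight_bias(0.5, 0.5, 0.45)   # correlated units   -> w > 0
```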

Finally, it is important to mention that we have implemented 'off-line learning', in the sense that weights and biases are updated independently of each other. Everything happens as if there were no connection at all between the cells. This is not really a constraint during learning; on the contrary, it is rather convenient. What remains to be investigated is when learning should occur and when inference should take over. In our abstract units model, though, the boundary between learning and inference is clear, because they are governed by different formulae used in different contexts.

4.3 Retrieval

If the learning phase is central to our implementation, it is also crucial to check that the stored patterns can be retrieved correctly. The acquired knowledge should be easily retrievable, especially when we use the BCPNN as an auto- or hetero-associative memory.

Thus, in this section, we assume that a learning phase has already occurred and that the weight ωij and bias βj are set. Our goal is to present an incomplete pattern and to check whether the network is able to complete it correctly. Since we only deal with one synapse, input is fed to the presynaptic unit and output is collected at the postsynaptic unit.

Because we have three different pattern presentation schemes in our abstract units model, inference is done in three different fashions. In all cases, however, the retrieval phase aims to realise an input-output mapping from unit i to unit j. Quantitative results are presented in the next chapter; here we focus on the method that enables us to achieve this.

Non-Spiking Inference

This case is the simplest, because the activity of a unit is constant over time (for the duration of one pattern presentation). In other words, because there is no difference between the driving input xi(t) and the input activity yi(t), the input-output mapping is straightforward. Assuming that unit i is fed an input pattern corresponding to the driving input xi = ki, we first compute the support value hj of unit j with hj = ωijxi + βj , and then update the output unit activity with xj = e^hj . Finally, the input-output mapping is realized by equation 4.5.

xj = e^(ωij xi + βj)   (4.5)

In order to produce the input-output relationship curve, we compute the output xj according to equation 4.5 for a set of input values xi regularly spaced between 0 and 1. We end up with an output vector y mapped to an input vector x. Note that the above equation is the same as the equation presented in Chapter 2 (section 2.3.4), with only two units. If the learning phase has been successful, xj is nothing but the posterior probability of unit j given unit i.
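The curve construction can be written directly from equation 4.5; the learned weight and bias values below are hypothetical, chosen so that xi = 1 maps to xj = 1:

```python
import math

def infer_nonspiking(x_i, w_ij, beta_j):
    """Input-output mapping of equation 4.5: x_j = exp(w_ij * x_i + beta_j)."""
    return math.exp(w_ij * x_i + beta_j)

# Hypothetical learned parameters for a strongly correlated pair of units:
w_ij, beta_j = 2.0, -2.0
curve = [infer_nonspiking(x / 10.0, w_ij, beta_j) for x in range(11)]
```

For a positive weight the mapping is monotonically increasing, and with these values it stays in (0, 1], as expected of a posterior probability.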

Spiking Inference

In the Spiking Frequency-based and Poisson-generated pattern presentation schemes, the input activity yi(t) is not constant over time. A value ki of the driving input xi(t) corresponds to a firing frequency fi in one case, and to a firing rate in the other. Thus, inference for one value of xi is not given by a direct calculation as in equation 4.5; instead, it depends on the time-course of the spiking input activity yi(t), governed by the driving input value xi of unit i.

This input activity yi(t) needs to be processed to calculate a corresponding output value xj . In order to map an input value xi to a number xj between 0 and 1, we proceed as follows:

1. We generate a regular spiking input yi(t) with frequency fi (FS) or a Poisson spike train with rate fi (PS), during a time Tinf equal to 5 seconds. The firing frequency or rate obeys fi = xi · fmax with xi ∈ [0, 1].

2. We compute a support activity sj(t) according to the relation

sj(t) = ωij yi(t) + βj

3. The support activity sj(t) is then low-pass filtered by a filter with a high time constant τf and a slow update value k:

dsj/dt = k(sj − βj) − sj/τf

4. We take the exponential of the filtered support activity sj(t).

5. xj is finally set to the mean stationary value of the output activity 〈yj(t∞)〉.
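The five steps above can be sketched end-to-end for the frequency-based case. Where the text is ambiguous (which state variable the filter equation tracks, and the placement of βj before the exponentiation), the code below makes explicit assumptions; the filter constant k = 1/fmax and τf = 500 ms follow the values quoted later in this section:

```python
import math

def infer_spiking(x_i, w_ij, beta_j, fmax=50.0, T=5000.0, dt=1.0, tau_f=500.0):
    """Sketch of the five-step spiking inference with regular (FS) input.
    The filtered state `sbar` and the beta_j offset before the exponential
    are assumptions where the thesis text is ambiguous."""
    k = 1.0 / fmax
    n = int(T / dt)
    # 1. regular spike train with frequency f_i = x_i * fmax
    y = [0.0] * n
    if x_i > 0:
        isi = 1000.0 / (x_i * fmax)
        t = isi
        while t < T:
            y[int(t / dt)] = 1.0
            t += isi
    sbar, out = 0.0, []
    for t in range(n):
        s = w_ij * y[t] + beta_j                          # 2. instantaneous support
        sbar += dt * (k * (s - beta_j) - sbar / tau_f)    # 3. slow low-pass filter
        out.append(math.exp(beta_j + sbar))               # 4. exponentiation
    return sum(out[n // 2:]) / (n - n // 2)               # 5. mean stationary value

x_lo = infer_spiking(0.2, w_ij=2.0, beta_j=-2.0)
x_hi = infer_spiking(0.8, w_ij=2.0, beta_j=-2.0)
```

Higher driving input yields a higher firing frequency, a larger filtered support and therefore a larger output, reproducing the qualitative shape of the non-spiking mapping.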


Figure 4.2: Spiking Inference with abstract units - Different stages of processing

Figure 4.2 shows these different stages of processing. This technique, despite its apparent complexity, gives a good fit to the previous non-spiking mapping. The key step is step 3, where we filter the support activity sj(t). This variable equals the bias term βj when the input unit does not spike, and is set to the value βj + ωij on the occurrence of a spike. When we filter with a specific low-pass filter (high time constant, small update rate), we generate a filtered support activity sj(t), which works as an additive trace. Hence, the value of sj(t) at the end of the stimulation gives a measure of the firing frequency of the cell. The direction of update k·(sj − βj) is proportional to the weight value ωij , which allows negative or positive build-up according to the sign of ωij . Typically, we use τf = 500 ms and k = 1/fmax.

Step 4 is needed to keep the inference equation homogeneous. It is crucial that sj(t) stays in the range (−∞, 0], because we want a value of xj between 0 and 1. This can be controlled either by the value of k or by modifying the filtering equation, as in the case of saturated traces (see Chapter 3). The biological analogy covers steps 3 and 4, because these processes resemble what occurs at the synapse. In short, the filtering accounts for synaptic integration with low release of neurotransmitters and slow degradation, while the exponentiation in step 4 is observed in the current-to-frequency mapping of a cell (the so-called current-discharge relationship).

For the Poisson-generated spike trains, the underlying random process gives a different output at each run. Thus, we have to compute average values, repeating the same inference process over several runs (between 5 and 10). There is a trade-off between discarding the randomness by increasing the number of runs and the computing time of the simulations. It is also important to keep the randomness introduced by the Poisson process, because it accounts for the irregular spiking observed in real neurons.


Chapter 5

Hodgkin-Huxley Spiking Implementation in NEURON

NEURON is a simulation environment for modeling individual neurons and neural networks. It was primarily developed by Michael Hines, John W. Moore, and Ted Carnevale at Yale and Duke. Documentation about NEURON and how to implement models in it is given in the NEURON book [8].

NEURON, together with the object-oriented NMODL language, offers an efficient means to run simulations of highly connected networks of neurons. Built on the paradigm of the C language, it does not suffer from procedural processing of data and uses efficient and fast algorithms to solve differential equations. The computing time of the abstract units model is thereby reduced by a factor of 10.

5.1 Cell Model

5.1.1 Hodgkin-Huxley Model

In 1952, Alan Lloyd Hodgkin and Andrew Huxley proposed a model to explain the ionic mechanisms underlying the initiation and propagation of action potentials in the squid giant axon [14]. They received the Nobel Prize in Physiology or Medicine in 1963 for this work, and the model has since been referred to as the Hodgkin-Huxley model. It describes how action potentials in neurons are initiated and propagated with the help of a set of nonlinear ordinary differential equations that approximate the electrical characteristics of excitable cells such as neurons and cardiac myocytes [2].

The main idea behind the Hodgkin-Huxley formalism is to give an electrical equivalent to each biological component of the cell that plays a role in the transmission of action potentials, which is the support of signaling within the cell. The components of a typical Hodgkin-Huxley model, shown in Figure 5.1, include:

Figure 5.1: Hodgkin-Huxley model of a cell

• A capacitance Cm, representing the lipid bilayer. A cell, considered as a whole, is electrically neutral, but the neighbourhood surrounding the cell membrane is not. Membrane voltage is the consequence of the accumulation of charged particles on both sides of that bilayer, which is impermeable to ions. A typical value for Cm is 1 nF.

• Nonlinear electrical conductances gn(Vm, t), representing voltage-gated ion channels. Their behaviour is described by gating variables that describe open, closed and inactivated states (see Appendix for equations). These conductances are both voltage- and time-dependent, gn(Vm, t), where n denotes a specific ion species. In addition, they exhibit fast dynamics, because they account for the regenerative properties of the cell involved in the propagation of action potentials.

• A linear conductance gleak for passive leak channels: channels that are not ion-selective, are always open, and contribute to the resting membrane potential. A typical value for gleak is 20 µS·cm−2.

• Generators En, describing the electrochemical gradients driving the flow of ions, the values of which are determined from the Nernst potentials of the ionic species of interest.

This model can be extended by modeling ion pumps with the help of current sources (the sodium-potassium pump is responsible for the concentration equilibrium inside and outside the cell). More elaborate models include chloride and calcium voltage-gated currents; however, we only deal here with two ionic currents, sodium and potassium, and one leakage channel.


Further, our cell model contains additional channels (see figure 5.3.2): a slow-dynamics voltage-gated potassium channel accounting for spike-frequency adaptation (see section 5.1.2) and an activity-dependent potassium channel modeling intrinsic excitability (see section 5.3.2).

As a convention, we use I > 0 when ions flow from the outside to the inside of the cell, so that, in the normal cell dynamics, the sodium current takes positive values and the potassium current takes negative values. The voltage equation is given by the relation between the applied current Iapp, the capacitive current Ic and the sum of the ionic and leak currents:

Iapp = Ic + Iion = Cm dVm/dt + INa + IK + Ileak   (5.1)

We see that when Iapp > 0, then dVm/dt > 0 and the membrane voltage becomes more positive (depolarization). The detailed dynamics of the voltage and gating-variable equations are given in the Appendix.

5.1.2 Spike Frequency Adaptation

Spike-frequency adaptation is a type of neural adaptation that plays a key role in the regulation of neuronal firing frequency. It is characterized by an increase of the inter-spike interval when a neuron is current-clamped. Various ionic currents modulating spike generation cause this type of neural adaptation: voltage-gated potassium currents (M-type currents), the interplay of calcium currents and intracellular calcium dynamics with calcium-gated potassium channels (AHP-type currents), and the slow recovery from inactivation of the fast sodium current (Benda et al. 2003) [35]. Spike-frequency adaptation can also account for findings on burst firing (Azouz et al. 2000) [5].

Figure 5.2: Spike-frequency Adaptation : Membrane voltage and state variable p


CHAPTER 5. HODGKIN-HUXLEY SPIKING IMPLEMENTATION IN NEURON

In our model, spike-frequency adaptation is taken into account by adding a slow-dynamics voltage-gated potassium channel. The conductance of this channel is non-linear and depends on the membrane voltage Vm. It is described by an activation variable p, that works in a similar way to an additive synaptic trace (see figure 5.1.2).

Dynamics of the channel are given by the following equations :

gkim(Vm, t) = ḡkim · p (5.2)

dp/dt = (p∞(Vm, t) − p) / τ(Vm, t) (5.3)

Figure 5.1.2 describes the build-up of the trace p and the conductance gkim, which is responsible for the increase of the interspike interval along the stimulation. Because of its slow decay, the delay between stimulations must be much longer than the stimulation itself for the p variable to return to baseline. The slow dynamics of this channel suggest that repeated strong transient stimulation has a stronger effect than long-lasting stimulation.
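The build-up of p can be sketched as a first-order trace in the spirit of equation 5.3, assuming p∞ = 0 at rest and an illustrative additive increment on each spike (the increment size and spike times are assumptions, not model values) :

```python
def adaptation_trace(spike_times, t_stop, dt=0.1, tau=200.0, jump=0.1):
    """Toy build-up of the adaptation variable p: exponential relaxation
    toward p_inf = 0 between spikes, additive increment on each spike."""
    p, trace = 0.0, []
    spikes = set(round(t / dt) for t in spike_times)
    for step in range(int(t_stop / dt)):
        p += dt * (-p / tau)      # slow relaxation (p_inf = 0 at rest)
        if step in spikes:
            p += jump             # activity-driven build-up
        trace.append(p)
    return trace
```

A dense burst of spikes drives p higher than the same number of spikes spread out in time, which is exactly why repeated strong transient stimulation is more effective.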

5.2 Pattern presentation

Now that our cells are no longer modeled by artificial units, but instead by complex spiking Hodgkin-Huxley units, input and output need to be matched to real variables. As presented above, the Hodgkin-Huxley model is based on the relation between the membrane potential Vm and the individual ionic currents Ii. Thus it is natural to feed input to one cell by injecting a certain amount of current Iapp into it and reading the output as the membrane firing frequency f. To achieve this, we will use current electrodes to present patterns to the network : some current is injected inside the cell membrane, which depolarizes it and triggers action potentials, and the membrane voltage is recorded as the difference of potential between two electrodes, one inside and the other outside of the cell (see Kandel 1995 [20] about the current-clamp technique).

So the input-output relationship (which is similar to the activity function in artificial networks) is given by mapping the injected current Iapp to the membrane firing frequency f. The curve giving the firing frequency of one unit versus the injected input current is called the steady-state current discharge. This curve is presented for our units in the next chapter (see figure 6.6). For weak currents, no active firing is triggered (the depolarization induced by the current injection is too small for the membrane to reach the threshold and no action potential is recorded). For currents which are too strong, the Hodgkin-Huxley voltage-gated potassium channels become unable to repolarize the cell and the membrane voltage stabilizes at a supra-threshold value. Thus, we must feed input currents that belong to a range [0, Imax], where the steady-state current discharge is approximately linear.

During learning, we feed input patterns sequentially. The presentation is entirely frequency-based, meaning that an input corresponding to a value between 0 and 1 is mapped to a firing frequency. The current-frequency relationship will be used to find the current-clamp value needed to obtain the right frequency. Let us assume that the set of input patterns we want to learn is (1, 1), (0, 0), (1, 1), (0, 1), (1, 1) ; then unit i must fire with the frequency sequence (fmax, 0, fmax, 0, fmax) and unit j with (fmax, 0, fmax, fmax, fmax). Using the steady-state current discharge curve, we inject the corresponding currents in order to obtain the desired firing frequencies.
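This mapping from pattern values to injected currents can be sketched as follows; the inverse f-I curve inv_fi is a hypothetical linear stand-in for the measured steady-state current discharge :

```python
def currents_for_patterns(patterns_i, f_max, f_to_i):
    """Map binary driving inputs to target frequencies, then to
    current-clamp amplitudes via an inverse f-I function (in nA).
    f_to_i is an assumed, illustrative inverse curve."""
    freqs = [x * f_max for x in patterns_i]
    return [f_to_i(f) for f in freqs]

# hypothetical linear inverse f-I curve: 0 Hz at 0 nA, 55 Hz at 0.5 nA
inv_fi = lambda f: 0.5 * f / 55.0
```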

5.3 Learning Rule Implementation

In this section, we present how weights and biases are represented and updated during learning in the NEURON spiking environment. We use the object-oriented NMODL language to create new mechanisms for simulations in NEURON. The details of the code are given in the Appendix.

5.3.1 Synaptic Integration

Modeling the synapse

In the artificial context the weight ωij between two units quantifies the strength of the connection between them. If ωij is high, then the connection between the two units is strong and they influence one another significantly. On the other hand, if ωij is close to zero, the connection is very weak and the corresponding units behave as if they were not connected. The simplest way to represent this in our spiking context is to map ωij to a synaptic conductance gij between two units i and j. This conductance would be time-dependent and closely related to the presynaptic and postsynaptic events.

So, we create a model of a synapse which has intrinsic properties fulfilling the weight update equation of our spiking learning rule 3.10, and call it a BCPNN Synapse. It is defined as a point-process in NMODL, which means that one can implement as many instances of this mechanism as desired, as long as one specifies a location (a section in NEURON). All local variables associated with the section it has been attached to become available to the point-process (membrane voltage, ionic currents, etc.). As a convention, we will always place a synapse on the postsynaptic cell soma.

Conductance Expression

In our model, the synaptic conductance gij(t) is a product of three quantities :

gij(t) = gmax · gcomp(pi, pj , pij , t) · αi(yi, t) (5.4)

gmax is the maximum conductance of the synapse : it regulates its strength (ability to conduct current) and can temporarily be set to zero if one wants to operate off-line learning.
gcomp(pi, pj , pij , t) is directly computed from the tertiary traces pi, pj and pij similarly to equation 3.10 :

gcomp(pi, pj , pij , t) = log(ε)               if pij/(pi pj) < ε
                          log(pij/(pi pj))     otherwise          (5.5)

αi(yi, t) is a gating variable that allows current to flow through the synapse only after the occurrence of a presynaptic spike and during a period controlled by a time-constant τα. In our implementation we used αi(yi, t) = zi(t) and τα = τi, but these two variables should not be confused, because they do not represent the same biological processes, although assimilating them here is appropriate.
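The weight factor of equation 5.5 amounts to the following computation, a direct transcription with the floor value ε :

```python
import math

def g_comp(p_i, p_j, p_ij, eps=1e-4):
    """Weight factor of the synaptic conductance (equation 5.5):
    log of the correlation ratio p_ij / (p_i * p_j), floored at log(eps)."""
    ratio = p_ij / (p_i * p_j)
    return math.log(eps) if ratio < eps else math.log(ratio)
```

Independent activity (p_ij = p_i p_j) gives a zero weight factor, correlated activity a positive one, and anti-correlated activity a negative one bounded below by log(ε).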

Solving the differential equations

Solving the set of 8 differential equations 3.1 to 3.8 is required to compute gcomp(pi, pj , pij , t). We will use the method cnexp. The 8 traces are defined as STATE variables and the set of equations is implemented in a DERIVATIVE block. In order to avoid the problems we could encounter if we were to implement the equations directly (see section Learning Implementation in the previous chapter), equations 3.1 and 3.2 are implemented by direct update (equations 5.6 and 5.7).

zi(t) is given by :

dzi/dt = −zi/τi          in the absence of presynaptic activity
zi = k(1 − zi⁻)          when a presynaptic spike is detected     (5.6)

Another modification, which is not introduced with abstract units, is added in the spiking context : because we want to investigate spike-timing dependence, we will have to give different values to the parameters τi and τj. However, when the primary traces have different time-constants, it can be demonstrated that this propagates to the secondary and tertiary traces. Indeed, the secondary trace ei quantifies the integral (in terms of surface or area under a curve) of the primary trace zi. When zi(t) and zj(t) have different time-constants, one of the secondary traces becomes much greater than the other, which is equivalent to silencing one unit over the other and is undesirable.

To overcome this we introduce the quotient τi/τj in equations 3.2 and 3.5. This heuristic method allows us to manipulate the time-constants independently without spoiling the activity of the secondary and tertiary traces. Equation 5.7 guarantees that, after the occurrence of a spike, the integral of the two primary traces zi(t) and zj(t) is equal (so that the secondary traces ei(t) and ej(t) stay in the same range), and equation 5.8 corrects the mutual trace eij(t) for the change in time-constants, because its calculation is based on the product ei(t)ej(t).

zj(t) is given by :

dzj/dt = −zj/τj          in the absence of postsynaptic activity
zj = k(τi/τj − zj⁻)      when a postsynaptic spike is detected     (5.7)


deij/dt = ((τi/τj) zi zj − eij) / τe (5.8)

Besides these changes, the learning rule implementation follows the set of equations presented in Chapter 3. It is important to mention that the modification introduced in equation 5.8 makes all secondary traces decay with the same time-constants. Thus no change needs to be added to the tertiary trace equations.
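To see why the τi/τj factor in equation 5.7 keeps the secondary traces in the same range, one can check numerically that a single spike leaves the same area under both primary traces. The sketch below uses simple Euler decay and the direct-update form of equations 5.6–5.7 (the numeric values are illustrative) :

```python
def z_trace(spike_steps, n_steps, dt, tau, k, amp):
    """Primary trace: direct update z = k*(amp - z) on a spike
    (amp = 1 for z_i per eq. 5.6, amp = tau_i/tau_j for z_j per eq. 5.7);
    Euler decay -z/tau otherwise."""
    z, out = 0.0, []
    for step in range(n_steps):
        if step in spike_steps:
            z = k * (amp - z)
        else:
            z += dt * (-z / tau)
        out.append(z)
    return out
```

With τi = 20 ms and τj = 10 ms, the postsynaptic jump is scaled by τi/τj = 2, so one spike contributes area k·τi under both traces.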

Connecting neurons to the synapse

(a) Single-synapse NEURON implementation (b) Multi-synapse NEURON implementation

Figure 5.3: Schematic representation of BCPNN Synapses - The synapse is always placed on the postsynaptic cell and links to the presynaptic cell with a certain delay, threshold and weight. The synapse has a virtual link to the postsynaptic cell accounting for backpropagating action potentials with a short delay of 1 ms.

The remaining difficulty is the method used to ‘detect’ spikes. A BCPNN synapse is a point-process and can only access the local variables of the section it has been attached to : the postsynaptic cell. Since we do not want to use pointers or import pre-calculated data to the synapse, we use NetCon objects to create a connection. A NetCon object is a built-in process that enables one to attach a source of events s (usually a membrane voltage) to a target t. When this source of events crosses a threshold thresh in the positive direction, a weight w is sent to the NET_RECEIVE block of the target, with a delay del, and the code in this block is executed. In our case, the target is the synapse and the presynaptic membrane voltage is the source of events. When the voltage crosses a threshold (typically -20 mV), a positive weight w is sent to the synapse with a certain conduction delay. We did not give any value to this conduction delay, assuming that presynaptic and postsynaptic events arrive at the synapse at the same time. However, our model can account for different delays according to the cell originating the action potential. Typically, a value between 5 and 10 ms is realistic for a conduction delay along the presynaptic axon.

However, postsynaptic spikes also need to be detected in order to update zj(t), as in equation 5.6. This time the postsynaptic membrane voltage is available at the synapse level, but we would need to construct a function detect that checks at each time-step whether a postsynaptic spike has occurred and updates the trace zj to the desired value. This is possible, but the main problem is that it would make us use different methods to detect presynaptic and postsynaptic spiking.

The ‘trick’ used to bypass these problems, shown in figure 5.3, is to create a virtual link from the postsynaptic cell to all of its synapses. Each synapse receives another NetCon object, whose source of events is nothing but the postsynaptic cell itself, with a short delay of about 1 ms, accounting for the backpropagating action potential delay, and a negative weight indicating that the sender is the cell itself. At the synapse level, events coming from the presynaptic and postsynaptic cells are treated similarly : the sign of the weight w is used to recognize the event sender and update only the corresponding primary trace zi(t) or zj(t), according to 5.6.
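The sign dispatch can be sketched as a toy Python analogue of this NET_RECEIVE logic (the parameter values are illustrative, not those of the NMODL code) :

```python
def net_receive(state, w, k=0.8, tau_ratio=2.0):
    """The sign of the NetCon weight w identifies the event sender:
    positive w -> presynaptic spike updates z_i (eq. 5.6);
    negative w -> backpropagated postsynaptic spike updates z_j
    (eq. 5.7, tau_ratio = tau_i / tau_j). state holds the two traces."""
    if w > 0:
        state["z_i"] = k * (1.0 - state["z_i"])
    else:
        state["z_j"] = k * (tau_ratio - state["z_j"])
```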

5.3.2 Bias term

In this section, we present how the bias is represented and updated during learning in the NEURON spiking environment. Again, the NMODL language is used to create a new mechanism for simulations in NEURON. The details of the code are given in the Appendix.

Ion Channel Modeling

In the artificial context, the bias term βi of one unit quantifies its excitability. If βi is close to zero, unit i is normally excitable. However, if βi takes a strong negative value, the activity required to reach the threshold of the transfer function is much higher. The bias term quantifies the intrinsic plasticity of the cell, which is the persistent modification of a neuron's intrinsic electrical properties by neuronal or synaptic activity. A phenomenon called long-term potentiation of intrinsic excitability (LTP-IE) has been demonstrated. It is characterized by strong, transient synaptic or somatic stimulation that tends to produce an increased ability to generate spikes. It is often accompanied by a decrease in the action potential threshold, a shift in the steady-state inactivation curve of activity-dependent potassium channels and a reduction of the after-hyperpolarization phase (AHP) (Xu et al. 2005) [34]. Moreover, Jung et al. report a biphasic downregulation of A-type potassium current : a transient shift in the inactivation curve and a long-lasting reduction of peak A-type current amplitude (Jung et al. 2009) [19].

The assumption here is to implement the bias by an activity-regulated potassium channel : one way to achieve this is to map the bias term βi to a real parameter in our spiking context, namely the conductance gki of an activity-dependent A-type potassium channel attached to unit i. This conductance would be time-dependent and closely related to the cell's past activity : it does not depend on the synaptic events, but only on the past activity of the cell. In NMODL, the A-type potassium channel is defined as a distributed mechanism, intrinsic to a cell. One cell cannot have more than one A-type K+ channel, but we assume that all pyramidal cells will include this A-type potassium channel.

Figure 5.4: Extended Hodgkin-Huxley cell model

Conductance Expression

In our model, the potassium conductance gki is a product of two quantities :

gki(t) = gkmax · gkcomp(pi) (5.9)

gkmax is the maximum conductance of the A-type potassium channel : it regulates the permeability of the channel and can temporarily be set to zero if one wants to discard intrinsic plasticity effects during learning.
gkcomp(pi) is directly computed from the tertiary trace pi according to :

gkcomp(pi) = −log(ε)     if pi < ε
             −log(pi)    otherwise          (5.10)

In fact, equations 3.9 and 5.10 are similar, the only difference being that gkcomp(pi) and βi have opposite signs. The reason for this is that LTP-IE corresponds to a downregulation of the A-type potassium current : the cell is more excitable when less potassium current flows through the channels, because this reduces the after-hyperpolarization phase (AHP). This is consistent with our definition of gkcomp(pi), which decreases when pi increases, meaning that enhanced intrinsic excitability occurs after strong activity in the cell.
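Equation 5.10 can be transcribed directly; the floor value ε plays the same role as for the weights :

```python
import math

def gk_comp(p_i, eps=1e-4):
    """Bias-channel factor (equation 5.10): the negated BCPNN bias
    -log(p_i), capped at -log(eps). High past activity (large p_i)
    lowers the A-type K+ conductance, making the cell more excitable."""
    return -math.log(eps) if p_i < eps else -math.log(p_i)
```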

Solving the differential equation

Now, in order to compute gkcomp(pi), we only need to solve three differential equations (equations 3.1, 3.3 and 3.6). Because we only need the intrinsic traces of cell i, zi(t), ei(t) and pi(t), we do not have to take different actions according to which neuron fires. We will use the method cnexp. The 3 traces are defined as STATE variables and the set of equations is implemented in a DERIVATIVE block.

Equations 3.3 and 3.6 are implemented directly, but the primary trace zi(t) is, once again, implemented by a disjunction of cases : exponential decay when no activity is recorded and direct update when a spike is detected (equation 5.6). The method used to detect membrane voltage firing is radically different this time (see the next section).

Detecting spikes

This time, however, no NetCon object can be used, since only a point-process can be used as a target, whereas our A-type potassium current is a distributed mechanism. The solution used here is to build a function detect, called at each time-step, which checks whether the membrane voltage yi has crossed a threshold. Using a set of flags, we make sure that a value firing turns to 1 on the occurrence of a spike, and goes back to zero on the very next time-step. We also add a counter so that the delay between spike detection and primary trace update can be controlled.
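A minimal sketch of such a detect function, with an optional update-delay counter (the threshold and delay values are illustrative) :

```python
def make_detector(threshold=-20.0, delay_steps=0):
    """Threshold-crossing spike detector: returns 1 only on the time-step
    a positive-going crossing is reported; a countdown list postpones
    the report by delay_steps time-steps."""
    state = {"above": False, "countdown": []}

    def detect(v):
        crossed = v >= threshold and not state["above"]
        state["above"] = v >= threshold
        if crossed:
            state["countdown"].append(delay_steps)
        fired, remaining = 0, []
        for c in state["countdown"]:
            if c == 0:
                fired = 1            # report the spike now
            else:
                remaining.append(c - 1)
        state["countdown"] = remaining
        return fired

    return detect
```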

This solution with a detect function is somewhat more time-consuming, but since we want to model an intrinsic phenomenon, we avoid referencing variables computed elsewhere : all computation has to be done locally. The main reason for this choice is that we want the different learning mechanisms to be independent, because one is a point-process (BCPNNSynapse) and the other is a distributed mechanism (A-TypePotassium). A constant update from values computed in the first to values used in the second might deteriorate simulations. Also, the use of the delay counter enables us to match exactly the values pi(t) computed intrinsically in unit i to the values obtained by computing them at the synapse level using NetCon objects.

In fact, there is only a short delay (about half a millisecond) that can be corrected by changing the initial value of the counter. Our final cell model is shown in figure 5.3.2. It includes two more ion channels than the standard Hodgkin-Huxley model (figure 5.1.1). It can account for spike-frequency adaptation and intrinsic excitability.


5.4 Retrieval

Because NEURON is built to model real neurons, the notion of connection between two units exists constantly, as long as we have connected them via a NetCon object. This is somewhat different from what we have done with abstract units, where everything was done during the learning phase as if neurons were not connected to each other. In this respect, inference consists in feeding some input stimulation to one or several units and recording what we get during the simulation at the output units. In the single-synapse context, a current-clamp is attached to the presynaptic unit i with an input current proportional to the driving input xi. Inference is purely frequency-based, which means that input xi is characterized by its firing frequency fi, and the output activity frequency gives the corresponding output value xj.

However, in the spiking context, inference is part of the simulation : we decide not to stop a running simulation to infer after a learning phase, but to include inference in a continuously operating fashion, which is closer to what can be observed in real neurons. Thus, we need to update a parameter that allows us to switch from the learning mode to the inference mode : this parameter is the print-now signal κ. It can either be read from a file or updated at a predefined time by a specific procedure; when it crosses a threshold value, weights and biases are “frozen” and the retrieval phase takes place. In our implementation we achieve this by updating two parameters, glearn for the synaptic conductance and gklearn for the potassium channel conductance, which are set to the values of gcomp and gkcomp respectively, only when the print-now signal becomes small enough. The update of the conductances g and gk includes a test in the BREAKPOINT block on the value of the print-now signal : updating the conductances according to the traces is done in the learning mode, when κ is large enough; otherwise, in the inference mode, the conductances are computed directly from the stored values of glearn and gklearn.
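The switching logic can be sketched as follows; the threshold value and the dict holding the frozen copy are illustrative stand-ins for the NMODL state :

```python
def conductance(kappa, g_comp_now, frozen, kappa_thresh=0.5):
    """Toy BREAKPOINT-style switch on the print-now signal kappa:
    in learning mode (kappa above threshold) the conductance follows the
    traces and the frozen copy tracks it; in inference mode the frozen
    copy is used, so weights stay fixed during retrieval."""
    if kappa >= kappa_thresh:          # learning mode
        frozen["g_learn"] = g_comp_now
        return g_comp_now
    return frozen["g_learn"]           # inference mode: weights frozen
```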

One thing we must pay attention to is when we want to exhibit the input-output mapping. In this very case, it is important that the presynaptic stimulation is not current-clamp-driven, because for low current values (when xi is close to zero), no spiking occurs, since the action potential threshold is not reached by the membrane voltage. To overcome this problem, we use the built-in process NetStim, which enables us to control the spiking frequency of an input stimulation, even for very low-valued inputs. As a result, the driving input xi is directly mapped to a frequency value fi, according to fi = xi · fmax.


Chapter 6

Results

6.1 Abstract units

6.1.1 Learning

In this section we present the outcome of a learning phase, for the three different pattern presentation schemes. We are only dealing with one presynaptic unit i and one postsynaptic unit j connected by one synapse : this is called single-synapse learning. In order to be able to compare the different methods, the same temporal series of patterns for unit i, on the one hand, and unit j, on the other hand, is presented in all three pattern presentation schemes. This means that the driving inputs xi(t) and xj(t) are the same for the three pattern presentation schemes, but of course the input activities yi(t) and yj(t) differ according to the pattern presentation used (non-spiking, frequency-based spiking or Poisson-spiking).

We aim to show the Hebbian features of our learning rule, so we divide the learning phase (10 seconds) into five sequences (2 seconds each), during which the presynaptic and postsynaptic units exhibit a statistical relation (correlation, anti-correlation, independence). Each 2-second sequence is divided into 10 pattern presentation intervals (200 ms), during which the driving inputs xi(t) and xj(t) take constant values ki and kj, corresponding to the pattern x = [ki, kj]. The learning is composed of the following sequences and the results are presented in figures 6.1, 6.2 and 6.3 :

1. Strong Correlation (between 0 ms and 2,000 ms)
Unit i and unit j are fed the same pattern : xi = xj = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1].

2. Independent Activation (between 2,000 ms and 4,000 ms)
Unit i and unit j are fed independent patterns : xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1].

3. Strong Anti-Correlation (between 4,000 ms and 6,000 ms)
Unit i and unit j are fed anti-correlated patterns : xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = [0, 1, 0, 1, 0, 0, 1, 0, 1, 0].


4. Presynaptic and Postsynaptic Muting (between 6,000 ms and 8,000 ms)
No input is fed to either unit : xi = xj = 0.

5. Presynaptic Activation and Postsynaptic Muting (between 8,000 ms and 10,000 ms)
Unit i is fed a pattern and unit j is mute : xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = 0.
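The five sequences above can be assembled into the full driving input series (one value per 200 ms pattern interval) as follows :

```python
def driving_inputs():
    """Driving inputs x_i, x_j for the five 2-second learning sequences:
    correlation, independence, anti-correlation, both mute, pre-only."""
    base = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]
    indep = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1]
    anti = [1 - v for v in base]
    x_i = base + base + base + [0] * 10 + base
    x_j = base + indep + anti + [0] * 10 + [0] * 10
    return x_i, x_j
```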

Figure 6.1: Abstract Units Learning - Non-Spiking Presentation

In our simulation, τi = τj = 20 ms, τe = 200 ms and τp = 1,000 ms. The print-now signal κ is always equal to 1 and the floor value ε for the tertiary traces is set to 10^−4. Each pattern is presented for 180 milliseconds and there is a relaxation lag of 20 milliseconds between each pattern. The maximum step-size for the ode45 solver is set to 1, the maximum frequency fmax is 55 Hz and the proportional trace update value k is set to 0.8. Finally, a delay is introduced between the detection of a spike and the primary trace update. This delay can be used to account for a conduction delay in synaptic transmission, but in practice, we use it to tune the recording of action potentials in the primary traces between our Hodgkin-Huxley spiking implementation in NEURON and our spiking abstract units in MATLAB (this delay is present because input is fed to Hodgkin-Huxley units by current injection into the cell, which does trigger action potentials, but not instantaneously). In practice, setting this delay to 25 ms gives a good tuning between our models. To discuss the results, we will refer to the three pattern presentation schemes with the following abbreviations : NS for non-spiking presentation, FS for frequency-based spiking and PS for Poisson-spiking.


Figure 6.2: Abstract Units Learning - Spiking Frequency based Presentation

Figure 6.3: Abstract Units Learning - Spiking Poisson-generated Presentation


During the correlation phase (from 0 ms to 2,000 ms), ωij increases rapidly to a high positive value (3.25 at 115 ms for NS, 2.80 at 140 ms for FS, 2.30 at 170 ms for PS) and decays to a lower positive value after 10 patterns (0.73 for NS, 0.82 for FS, 0.72 for PS). This rapid increase is due to the first mutually active pattern, during which the mutual trace pij equals both pi and pj. Later mutual inactivation is the cause of the decay after this high positive peak. The weight stabilizes to a positive value at the end of the sequence, accounting for a strong correlation between the two units. The bias terms βi and βj give a measure of the units' intrinsic activity; they are equal during the whole sequence because xi(t) = xj(t). It is to be noted that the value of -0.77 for βi and βj (at 2,000 ms for NS) is very close to log(0.5), and gives a very good approximation of the logarithm of the global units' activation (1,080 ms over 2,000 ms).

During the independence phase (from 2,000 ms to 4,000 ms), ωij decays gradually to zero (0.14 for NS, 0.22 for FS, 0.08 for PS, at 4,000 ms), accounting for statistical independence between the two units. The bias term values βi and βj stay in the same range as in the previous sequence. Only for NS, however, can we exploit the quantitative value of the bias terms. For FS and PS, their time-courses exhibit the same dynamics but the value range is 2 to 3 times bigger.

During the anti-correlation phase (from 4,000 ms to 6,000 ms), ωij decays linearly to a strong negative value (-1.04 for NS, -1.05 for FS, -1.21 for PS, at 6,000 ms), accounting for the strong anti-correlation between the two units. The linear decay of the weight is characteristic of the logarithmic dependence on the mutual trace pij, which decays exponentially to zero, because the secondary trace eij is zero when the units are never active together. The bias terms, however, are not affected by this, because they only account for the units' individual activation. βi is always higher than βj, because unit i is active more often than unit j.

When both units are mute (from 6,000 ms to 8,000 ms), ωij increases linearly and both βi and βj decrease linearly. The slope of the three curves depends only on the time-constant τp. This is due to the fact that, when both units are silent, the three tertiary traces pi, pj and pij decay exponentially to zero with the same time-constant. As a result, the linear decrease of the bias term compensates exactly the linear increase of the weight, which gives consistent results for inference when no unit has been active during learning. This goes on until one of the traces reaches its floor value ε. Finally, when muting one unit and activating the other (from 8,000 ms to 10,000 ms), ωij reproduces the anti-correlation results by reaching a strong negative value; meanwhile, the silent unit's bias term keeps on decreasing linearly whereas the active unit's bias term goes up again to account for activation.
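The compensation can be checked analytically : if every tertiary trace decays as exp(−t/τp), the weight ωij = log(pij/(pi pj)) gains t/τp per unit time while each bias βi = log(pi) loses t/τp, so their sum is invariant. A one-line sketch of these drifts :

```python
def drift(w0, beta0, t, tau_p):
    """Linear drifts of weight and bias while both units are silent:
    p_ij/(p_i p_j) is multiplied by exp(t/tau_p) (one decay factor
    cancels against two), while p_i itself decays by exp(-t/tau_p)."""
    return w0 + t / tau_p, beta0 - t / tau_p
```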

The learning outputs in these three cases will be compared in the Discussion chapter, where we also have the output of the spiking implementation. However, we can already note two things. First, NS is quantitatively more accurate than FS and PS, simply because the primary traces are strongly influenced by the driving inputs xi(t) and xj(t) : when we take a closer look, we can see that they stabilize to fixed step values. Thus activity reverberates more in the secondary and tertiary traces, giving more accurate values for the weight and biases. Secondly, the time-courses of the weight and biases are extremely similar, even if we use different pattern presentation schemes, meaning that the qualitative output of the learning phase is strongly similar for the three implementations.

6.1.2 Retrieval

The retrieval phase occurs after the learning phase. Whereas there is no interruption between these two phases in NEURON, they are implemented as two separate procedures in MATLAB. We want to create an input-output mapping in order to understand what the outcome of the learning rule would be in a network context. So we assume that the learning phase described in the previous section has already occurred and that we have stored the weights ωij and biases βi and βj in a separate file.

Now, we import the weight ωij and bias βi, βj values at the end of each 2-second sequence of the learning phase (i.e. at times 2,000 ms, 4,000 ms, 6,000 ms, 8,000 ms and 10,000 ms), and we display the output values xj corresponding to a set of input values xi regularly spaced between 0 and 1. Figure 6.4 shows the pre-post mapping xj = f(xi) and the inverse post-pre mapping xi = f−1(xj), for the three different pattern presentation schemes, and figure 6.5 displays the time-courses of the filtered support unit activities yj(t) for FS and PS, for different input values of xi, after each of the first three sequences of the learning phase.

The characteristics of the mapping are common to NS, FS and PS. The left-most point of the mapping (corresponding to xi = 0) corresponds to the exponential of the bias term, which is the tertiary trace pj. The more unit j has been active during learning, the stronger the tertiary trace pj : we can easily see on the left mappings xj = f(xi) that the longer unit j stays mute, the smaller the value of the left-most point. The slope of the curve depends only on the weight ωij : the bigger the amplitude of ωij, the steeper the slope. Moreover, the sign of the weight determines the direction of the curve, hence the fixed points of the mapping. After correlation, the slope is positive and the only fixed point is 1. Conversely, the slope is negative after anti-correlation and the only fixed point is 0. After independence, the mapping takes a constant value equal to the exponential of the bias term, which becomes the only fixed point. Fixed-point dynamics is crucial for network implementation, because retrieval is a process that iterates an inference process until stability is reached.
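A plausible form of the single-synapse mapping consistent with this description is the exponential of the support βj + ωij·xi, clipped to [0, 1]; this exact form is an assumption for illustration, not necessarily the transfer function used in our implementation :

```python
import math

def pre_post_mapping(x_i, w_ij, beta_j):
    """Sketch of the inference mapping x_j = f(x_i): exponential of the
    support beta_j + w_ij * x_i, clipped to [0, 1]. At x_i = 0 the output
    is exp(beta_j) (the tertiary trace), and the slope direction follows
    the sign of w_ij, matching the fixed-point behaviour described above."""
    return min(1.0, math.exp(beta_j + w_ij * x_i))
```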

6.2 Hodgkin-Huxley Spiking Units

6.2.1 Steady-State Current Discharge

Our pattern presentation in NEURON is somewhat more realistic than generating spiking input as we did before. As mentioned earlier, we use current-clamps to feed input to one unit. But because we want to work with frequency-based learning and inference simulations, it is important to have a direct mapping between the amplitude of the current injected into one cell and its firing frequency.


(a) Non-Spiking Mapping xj = f(xi) (b) Non-Spiking Mapping xi = f−1(xj)
(c) Spiking Frequency-based Mapping xj = f(xi) (d) Spiking Frequency-based Mapping xi = f−1(xj)
(e) Spiking Poisson-generated Mapping xj = f(xi) (f) Spiking Poisson-generated Mapping xi = f−1(xj)

Figure 6.4: Abstract Units Pre-post and Post-pre mappings for NS, FS and PS pattern presentations - Dark blue, green and red correspond respectively to inference after the correlation phase (T = 2,000 ms), the independence phase (T = 4,000 ms) and the anti-correlation phase (T = 6,000 ms). Inference after the muting of both units is displayed in light blue (T = 8,000 ms) and, finally, inference after sole activation of the presynaptic unit is shown in pink (T = 10,000 ms). For the spiking mappings, we chose fmax = 55 Hz and Tinf = 1,000 ms (duration of the inference simulation). Also, for the Poisson-spiking inference procedure, we take an average over 50 simulations.


(a) FS inference - Correlation phase (T = 2,000 ms)

(b) PS inference - Correlation phase (T = 2,000 ms)

(c) FS inference - Independence phase (T = 4,000 ms)

(d) PS inference - Independence phase (T = 4,000 ms)

(e) FS inference - Anti-correlation phase (T = 6,000 ms)

(f) PS inference - Anti-correlation phase (T = 6,000 ms)

Figure 6.5: Abstract Units Spiking Inference - Filtered support unit activities yj(t) for FS and PS after different input values xi and different sequences of the learning phase. Blue, green and red curves correspond respectively to the input values xi = 1, 0.5 and 0.25.

This relationship is called the steady-state current discharge. In figure 6.6, we display the steady-state current discharge of our cell model. The input current is injected during 1,000 ms and its amplitude varies between 0 nA and 0.5 nA. We gathered the results for 100 cells, thus giving a precision of 0.005 nA.

Figure 6.6: Steady-state Current Discharge

However, we have to take into account the short-term adaptation features of our cell model. Indeed, this phenomenon makes the inter-spike interval increase when the cell undergoes prolonged stimulation. We display three curves: the mean frequency (bold black curve), which is a count of the action potentials over 1 second; the unadapted frequency (red curve), which is the spiking frequency at the beginning of the spike train (typically, we take the frequency between the first and second spikes); and the adapted frequency, which is the spiking frequency at the end of the spike train (calculated from the inter-spike interval between the last and second-to-last spikes). As expected, the mean frequency is always smaller than the unadapted frequency and bigger than the adapted frequency. For high currents, some “edge effect” occurs in the count of action potentials, which makes the adapted and unadapted frequencies present irregularities.

The first thing we need to underline is the existence of a threshold for the input current, below which no action potential is recorded. Thus, it is somewhat difficult to create low-spiking activities with a current clamp, and we will use a NetStim object to overcome this. Secondly, the curve is almost linear, or can at least be approximated linearly, for low currents. For strong currents however, the curve becomes concave, meaning that strong currents do not produce equally strong frequencies. Finally, we will work with a reference frequency of 30 Hz corresponding to an input current of 0.1 nA.
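The three frequency measures can be read out of a recorded spike train with a minimal Python sketch (the function name and the example train below are ours, not from the thesis code):

```python
def frequency_measures(spike_times_ms, t_window_ms=1000.0):
    """Mean, unadapted and adapted frequency of a spike train:
    mean = spike count over the window, unadapted = 1 / first ISI,
    adapted = 1 / last ISI (ISI = inter-spike interval)."""
    n = len(spike_times_ms)
    mean_hz = n / (t_window_ms / 1000.0)
    if n < 2:
        return mean_hz, None, None
    unadapted_hz = 1000.0 / (spike_times_ms[1] - spike_times_ms[0])
    adapted_hz = 1000.0 / (spike_times_ms[-1] - spike_times_ms[-2])
    return mean_hz, unadapted_hz, adapted_hz

# A hypothetical adapting train: ISIs lengthen from 20 ms to 95 ms.
train = [0, 20, 45, 75, 110, 150, 195, 245, 300, 360,
         425, 495, 570, 650, 735, 825, 920]
print(frequency_measures(train))  # adapted < mean < unadapted
```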

6.2.2 Learning

In order to compare qualitatively and quantitatively the spiking implementation to the abstract units implementation, we present the same learning procedure as the one exposed above: a 10-second simulation, divided into five 2-second sequences: a strong correlation phase (between 0 ms and 2,000 ms), during which both units are fed the same pattern xi = xj = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]; an independent activation phase (between 2,000 ms and 4,000 ms), during which the units are fed the independent patterns xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1]; a strong anti-correlation phase (between 4,000 ms and 6,000 ms), during which the units are fed the anti-correlated patterns xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = [0, 1, 0, 1, 0, 0, 1, 0, 1, 0]; a mute phase (between 6,000 ms and 8,000 ms), characterized by the absence of input to both units, xi = xj = 0; and a phase of presynaptic activation and postsynaptic muting (between 8,000 ms and 10,000 ms), during which unit i is fed a pattern and unit j is mute: xi = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1] and xj = 0.
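The five sequences can be summarized in a small Python sketch (a hypothetical reconstruction of the schedule; the name inputs_at and the 200 ms element duration, i.e. 180 ms stimulation plus 20 ms relaxation per pattern element, are our reading of the protocol):

```python
# The three 10-element patterns used above.
P  = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1]   # pattern fed to unit i
PJ = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1]   # independent pattern for unit j
AP = [0, 1, 0, 1, 0, 0, 1, 0, 1, 0]   # anti-correlated pattern
MUTE = [0] * 10

# (t_start_ms, t_end_ms, pattern for i, pattern for j)
schedule = [
    (0,     2000,  P,    P),      # strong correlation
    (2000,  4000,  P,    PJ),     # independent activation
    (4000,  6000,  P,    AP),     # strong anti-correlation
    (6000,  8000,  MUTE, MUTE),   # both units mute
    (8000, 10000,  P,    MUTE),   # presynaptic activation only
]

def inputs_at(t_ms):
    """Input pair (x_i, x_j) at time t_ms; each pattern element lasts
    200 ms (180 ms presentation + 20 ms relaxation)."""
    for t0, t1, xi, xj in schedule:
        if t0 <= t_ms < t1:
            k = int((t_ms - t0) // 200)
            return xi[k], xj[k]
    return 0, 0

print(inputs_at(100))    # (1, 1): both units driven in the correlated phase
print(inputs_at(4300))   # (0, 1): anti-correlated phase, second element
```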

Pattern presentations here consist of current clamps attached to the cells with different delays according to the pattern they represent, with a duration of 180 ms, a relaxation period of 20 ms and a current amplitude of 0.1 nA. The synaptic traces show the same behaviour as in the previous results, but it is to be noted that the range of values of the tertiary traces is smaller: they do not get greater than 0.2. Also, a delay is observed for the tertiary traces when switching from one sequence to the next. This is due to the slow dynamics of the tertiary traces, whose time-constant τp is set to 1,000 ms. Membrane voltages and synaptic traces are shown in figure 6.7.
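A hedged sketch of the trace cascade producing these results, as a simple Euler integration in Python (first-order dynamics for the z, e and p traces, with the read-outs ωij = log(pij/(pi·pj)) and βi = log(pi); the exact NEURON/NMODL update may differ in detail):

```python
import math

def step(state, x_i, x_j, dt=0.2, tau_z=20.0, tau_e=200.0, tau_p=1000.0,
         kappa=1.0, eps=1e-6):
    """One Euler step of the trace cascade: primary traces z, secondary
    traces e, tertiary traces p (all clamped at the floor value eps)."""
    z_i, z_j, e_i, e_j, e_ij, p_i, p_j, p_ij = state
    z_i += dt * (x_i - z_i) / tau_z
    z_j += dt * (x_j - z_j) / tau_z
    e_i += dt * (z_i - e_i) / tau_e
    e_j += dt * (z_j - e_j) / tau_e
    e_ij += dt * (z_i * z_j - e_ij) / tau_e
    p_i += dt * kappa * (e_i - p_i) / tau_p
    p_j += dt * kappa * (e_j - p_j) / tau_p
    p_ij += dt * kappa * (e_ij - p_ij) / tau_p
    state = [max(v, eps) for v in (z_i, z_j, e_i, e_j, e_ij, p_i, p_j, p_ij)]
    w = math.log(state[7] / (state[5] * state[6]))   # weight omega_ij
    beta_i = math.log(state[5])                      # bias term
    return state, w, beta_i

# Initialization as in the text: 0.01 for individual, 0.0001 for mutual traces.
state = [0.01, 0.01, 0.01, 0.01, 0.0001, 0.01, 0.01, 0.0001]
for _ in range(10000):                 # 2 s of correlated drive, dt = 0.2 ms
    state, w, beta_i = step(state, 1.0, 1.0)
print(w > 0)   # correlated input pushes the weight positive
```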

The synaptic conductance and A-type Potassium-channel conductance time courses are shown in figure 6.8: here, we decide to display only the component gcomp of the synaptic strength. The quantitative value of the effective synaptic conductance g is weighted by gmax (which will be investigated later for tuning) and by the alpha-function αi, restricting synaptic activity to presynaptic stimulation. We will refer to gcomp as the synaptic conductance in this section only. Similarly, the potassium conductance presented here is the computed conductance gkcomp. It needs to be multiplied by a parameter gkmax (whose value is set when tuning the channel) to obtain the effective conductance gk.

The synaptic conductance (analogous to the weight ωij for abstract units) exhibits the same qualitative behaviour. During the learning phase it shows a rapid increase to a high value (2.78 at 130 ms) and stabilizes at a lower positive value (1.15 at 2,000 ms), accounting for strong correlation between the two units. During the independent activation phase, the synaptic conductance decays very slowly towards zero, ending up at a non-negligible positive value (0.48 at 4,000 ms). Anti-correlation of the units leads to a linear decrease of the synaptic conductance towards a strong negative value (-1.00 at 6,000 ms). The negative value for the conductance can be considered unbiological (we discuss this in the Conclusion), but it has the expected effect when we switch to inference mode (inhibitory synapse), which makes our implementation fit our needs if we discard biological resemblance. The silent phase for both units results in a linear increase of the synaptic strength, which is expected, because all tertiary traces decay exponentially to zero with the same time-constant. The last phase is analogous to the anti-correlation phase observed earlier.

(a) Presynaptic and Postsynaptic membrane voltages

(b) Presynaptic and Postsynaptic synaptic traces

Figure 6.7: Spiking Units Pattern Presentation

(a) Synaptic weight modification

(b) A-Type Potassium Channel Conductance

Figure 6.8: Spiking Units Learning

The A-type Potassium-channel conductance computed here is the additive inverse of the bias term in abstract units. The reason is that increased excitability results in a decrease of the A-type Potassium current [19] (Jung et al., 2009). It exhibits simple features: a gradual decrease under stimulation of the unit, and a linear increase when the unit is silent, to balance the linear increase of the synaptic strength. Because unit i is activated more than unit j, its potassium channel conductance is always smaller.

We have set the maximum synaptic conductance gmax to a very low value during learning (1 nS), so that our synapse operates almost in off-line learning conditions. This enables us to discard the inference effect during learning, to prevent the synapse from learning its own interpretation of the stimulus. It is also important to pay attention to the conduction delay of the presynaptic input, because we assume that the postsynaptic potential has a shorter distance to travel to the synapse (backpropagating potential) whereas the presynaptic input needs to travel along the axon. In practice, we impose a presynaptic delay of 1 ms and a postsynaptic delay of 1 ms, assuming that both need the same amount of time to reach the postsynaptic soma. In reality, we might have to account for a conduction delay of about 5-6 ms along the presynaptic axon. The learning parameters are τi = τj = 20 ms, τe = 200 ms, and τp = 1,000 ms. The synaptic traces are initialized to 0.01, except for the mutual traces, which are initialized to 0.0001. The time-step of integration dt = 0.2 ms is significantly smaller than in MATLAB, allowing for finer computation.

6.2.3 Parameter tuning

Potassium channel tuning

The dynamics of the A-type Potassium channel need to be tuned. The idea is to get the same quantitative output for retrieval as with abstract units. In fact, in the spiking context this potassium channel needs to replace the effect of the bias term. Not only does it need to be updated at each time-step, taking a crucial place in the learning rule implementation, but it also needs to account for spontaneous activity, even in the absence of presynaptic activity.

In order to achieve this, we modulate the resting membrane potential of the cell by changing the leakage channel resting potential Eleak. The idea is that when this parameter is set above the action potential threshold, which is around -67 mV, the leak current (which is the only passive current in our model and thus fully controls the passive properties of the cell) will always drive the cell to fire in the absence of an input. In fact, if no input is fed to the unit, the membrane potential cannot reach its resting value, because the leak current will constantly depolarize the cell, thus opening the Hodgkin-Huxley voltage-gated channels and triggering action potentials. However, if the A-type Potassium channel is set to a high conductance (consequent to low activity of the cell), it will inhibit the tendency of the cell to fire, and normal behaviour will be shown.

In order to achieve this, we start with the hypothetical case where pi = 1 (maximum activity) and the cell should fire at its maximum frequency. We first remove our potassium channel (we simply set gkmax to zero) and increase Eleak progressively until we get the same count of action potentials as with a current clamp of 0.1 nA for a 1-second stimulation. We obtain Eleak = -29.0 mV (for a standard value of -70.3 mV).

Now we want our potassium channel to reach its maximum conductance when the cell has not been active at all. A first important step is to set

gk = gkmax · log(pi) / log(ε),

so that gk = gkmax when pi reaches its floor value ε. Then the value assigned to gkmax is chosen by fitting the input-output mapping for spiking units to the NS mapping for abstract units. A gkmax of 54.8 µS·cm−2 gives fairly good results.
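This read-out can be sketched directly in Python (using the values quoted above; the function name is ours):

```python
import math

def gk_from_trace(p_i, gk_max=54.8, eps=1e-6):
    """A-type potassium conductance (in uS/cm^2) from the presynaptic
    tertiary trace: gk = gk_max * log(p_i) / log(eps), so gk = gk_max
    at the floor value eps and gk = 0 at full activity (p_i = 1)."""
    p_i = max(p_i, eps)
    return gk_max * math.log(p_i) / math.log(eps)

print(gk_from_trace(1e-6))  # 54.8: cell has been silent, strong inhibition
print(gk_from_trace(1.0))   # 0.0: maximum activity, channel fully closed
```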

Synaptic conductance tuning

The second parameter to tune, in order to obtain a quantitative fit from non-spiking units to our implementation in NEURON, is the synaptic conductance. The parameter gmax controls the amount of current flowing from one cell to the other during synaptic transmission. Of course, all other parameters involved in synaptic transmission need to be set, and no further changes can be applied to them. We set τi = 20 ms, ε = 10−6, imax = 0.1 nA and Tpres = 180 ms. Then the value assigned to gmax is chosen by fitting the input-output mapping for spiking units to the NS mapping for abstract units. A gmax of 500 pS gives fairly good results.

6.2.4 Retrieval

Now that the model parameters have been tuned to mimic the results obtained with abstract units, we can use the spiking implementation in retrieval mode to create an input-output mapping from one cell to the other. As explained above, this is done by attaching a current-clamp electrode to the presynaptic unit i, to achieve a firing frequency fi proportional to the driving input xi. We extrapolate the input current needed by looking at the steady-state current discharge presented above. The duration of the stimulation Tstim is set to 1,000 ms.
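This extrapolation step can be sketched in Python by inverting a tabulated current-discharge curve (the table values below are illustrative, not measured data; only the reference point 0.1 nA → 30 Hz comes from the text):

```python
# Hypothetical (current nA, frequency Hz) samples of the f-I curve,
# with a firing threshold below ~0.05 nA and saturation near f_max.
F_I_TABLE = [
    (0.00, 0.0), (0.05, 0.0), (0.07, 15.0), (0.10, 30.0),
    (0.20, 45.0), (0.35, 52.0), (0.50, 55.0),
]

def current_for_rate(f_target_hz):
    """Clamp amplitude for a target rate, by linear interpolation between
    table points; returns None if the rate is outside the table."""
    for (i0, f0), (i1, f1) in zip(F_I_TABLE, F_I_TABLE[1:]):
        if f0 <= f_target_hz <= f1 and f1 > f0:
            return i0 + (i1 - i0) * (f_target_hz - f0) / (f1 - f0)
    return None

f_max = 55.0
x_i = 0.5
print(current_for_rate(x_i * f_max))  # current needed for 27.5 Hz
```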

As opposed to the abstract units implementation, we can only get a spiking output (no anti-spikes or stepped values), and this simplifies our purpose: the output xj is simply set to the firing frequency fj (divided by fmax) of the postsynaptic cell. It is to be noted that retrieval can only be performed in one direction: because of the αi function, postsynaptic input alone will not trigger synaptic transmission. Thus, by adding an alpha-function, we have given an orientation to the link between neurons in our biological network. Of course, there can be backpropagating connections, but at a given synapse, activity can only be triggered by the presynaptic cell.

Also, we have made the choice to perform learning and inference during the same stimulation, because this seems closer to what can be observed in real neurons. The print-now signal κ is used as a flag to switch from learning mode to inference mode. Typically, 10−6 is chosen as the limit between these two cases. Figure 6.9 presents the input-output mapping for our NEURON implementation.

The curve exhibits the same qualitative behaviour as what has been observed with abstract units: correlated units exhibit a positive slope, accounting for excitatory input from the presynaptic cell to its target. Anti-correlated units show the opposite behaviour, and independent units show no significant synaptic transmission. When the postsynaptic cell gradually turns silent, its intrinsic excitability decreases, resulting in a downward shift of the input-output relationship. As opposed to the abstract units, it makes no sense to show the inverse mapping, because our synapse is oriented for pre-post transmission.

Figure 6.9: Spiking Units implementation Input-Output mapping - Dark blue, green and red correspond respectively to inference after the correlation phase (T = 2,000 ms), the independence phase (T = 4,000 ms) and the anti-correlation phase (T = 6,000 ms). Inference after the muting of both units is displayed in light blue and, finally, inference after sole activation of the presynaptic unit is shown in pink.

Some fine tuning remains to be investigated, especially since the expressions of these conductances are products of terms depending on the values of the tertiary traces. Also, using current clamps is somewhat incomplete, because it misses some data points at low firing frequencies. Indeed, the current discharge for these values shows a threshold and an exponential relationship between injected current and resulting firing frequency. An alternative can be found by using NetStim objects to control the exact spike train of the presynaptic cell during the simulation. We are confident that further investigation could give a precise fit of the curves for both abstract units and spiking units.

6.2.5 Spike Timing Dependence

In this section, we want to investigate the spike-timing dependence of the new spiking version of the BCPNN learning rule. It is interesting to challenge the BCPNN model, which was designed at a time when spike-timing dependent plasticity rules were not yet investigated, with some classical STDP procedures. Our simulation reproduces the work of Bi and Poo [6] on the influence of pre-post and post-pre timing on the change in synaptic strength (measured by the percentage of change of EPSC amplitude) in cultured hippocampal rat cells. In this work, they have shown the existence of a narrow spike-timing window with a width of 40 ms, where post-pre timing (within 20 ms) triggered LTD and pre-post timing (within 20 ms) triggered LTP.

We apply the same procedure to our spiking units: a repeated low-frequency (1 Hz) stimulation with transient couples of pre-post or post-pre spikes during 1 minute. Because of limited computational resources, we modified this procedure as follows: we feed short transient stimulations to the presynaptic and postsynaptic units at a frequency of 1 Hz, during 30 seconds. This short-lived stimulation produces a single spike in each of the cells, and the timing between these spikes is controlled exactly. We measure the absolute weight change gcomp after the repeated stimulation.

As mentioned before, we want to be able to set τi and τj to different values, so we use the updated version of the learning rule allowing us to do so (equations 5.7 and 5.8). Typically we will use τi = 10 ms and τj = 2 ms, to promote pre-post timing over post-pre timing. It is important to note that the stimulation frequency must be low enough to allow the secondary traces to decay to zero. We typically have τe = 200 ms, which is adapted to a stimulation of one spike per second. Finally, τp can be increased from 1,000 ms up to 10,000 ms to fully integrate the 30-second procedure. Stimulation is given by strong short-lived (15 ms) current pulses (0.1 nA). We mention that for this task only, we have muted the adaptation current for spike-frequency adaptation and the A-type potassium channel current, because they affect the inter-spike interval, whereas we want to investigate exact timing between spikes.
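The pairing protocol can be sketched end-to-end in Python (a deliberately simplified model, ours rather than the NEURON code: a spike resets the primary trace to 1, all traces follow first-order dynamics, and the weight change is read out as the change in log(pij/(pi·pj))):

```python
import math

def weight_change(dt_ms, tau_i=10.0, tau_j=2.0, tau_e=200.0, tau_p=10000.0,
                  n_pairs=30, period_ms=1000.0, h=0.5):
    """Bi-and-Poo-style pairing: one pre/post spike pair per second for
    30 s; the post spike fires dt_ms after the pre spike (negative dt_ms
    = post first). Returns the weight change after the protocol."""
    z_i = z_j = 0.0
    e_i = e_j = p_i = p_j = 1e-2
    e_ij = p_ij = 1e-4
    w0 = math.log(p_ij / (p_i * p_j))        # initial weight (= 0 here)
    t, t_end = 0.0, n_pairs * period_ms
    while t < t_end:
        phase = t % period_ms
        if abs(phase - 200.0) < h / 2:            z_i = 1.0  # pre spike
        if abs(phase - (200.0 + dt_ms)) < h / 2:  z_j = 1.0  # post spike
        z_i -= h * z_i / tau_i
        z_j -= h * z_j / tau_j
        e_i += h * (z_i - e_i) / tau_e
        e_j += h * (z_j - e_j) / tau_e
        e_ij += h * (z_i * z_j - e_ij) / tau_e
        p_i += h * (e_i - p_i) / tau_p
        p_j += h * (e_j - p_j) / tau_p
        p_ij += h * (e_ij - p_ij) / tau_p
        t += h
    return math.log(p_ij / (p_i * p_j)) - w0

# Pre-post pairing is promoted over post-pre with tau_i > tau_j,
# and widely separated spikes give LTD.
print(weight_change(+5.0) > weight_change(-5.0))
print(weight_change(100.0) < 0.0)
```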

Figure 6.10: Spike-timing dependence window, τi = 10 ms and τj = 2 ms

Figure 6.10 shows results for a spike-timing window width of 200 ms. We will later compare our results to the experimental data of Bi and Poo. As expected, our simulations show a strong increase in synaptic strength (LTP) for correlated spike timing (the curve takes a strong positive value of 3.09 when ∆t = 0 ms) and a strong decrease in synaptic strength (LTD) for uncorrelated spike timing (the curve takes a strong negative value when ∆t = 100 ms). The curve is not symmetric, however, because we have decided to promote pre-post timing over post-pre timing. The time-window for LTP in the post-pre direction is smaller (−6 < ∆t < 0 ms), whereas the time-window for LTP in the pre-post direction is 5 times bigger (0 < ∆t < 30 ms).

It also exhibits a linear decay of the synaptic strength, which can be explained by looking at the time-courses of the primary traces: the mutual trace eij depends on the product zizj, thus on the area shared under the zi and zj curves. It can be shown that this area decreases exponentially with the spike timing between presynaptic and postsynaptic units. Because the individual traces pi and pj are independent of the spike timing, the weight change depends only on the logarithm of the mutual trace pij, which decays exponentially with the spike timing, thus resulting in a linear decrease on both sides of the curve.
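This argument can be made explicit in a small sketch (an illustrative simplification, where τ denotes the effective decay constant of the product of the primary traces):

```latex
p_{ij} \;\propto\; \int_0^{\infty} z_i(t)\, z_j(t)\, \mathrm{d}t
\;\propto\; e^{-\Delta t/\tau}
\quad\Longrightarrow\quad
\Delta w_{ij} \;=\; \log\frac{p_{ij}}{p_i\, p_j} \;\approx\; C - \frac{\Delta t}{\tau}
```

with C independent of ∆t, so that the weight change decreases linearly in |∆t| on each side of the window.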

No correlation-driven change occurs when the timing between spikes is so long that the product zizj is constantly zero (which is what happens on the edges of the curve). Thus, LTD is triggered in the absence of correlated timing. It is a specificity of our learning rule that a decrease in synaptic weight occurs under repeated uncorrelated stimulation (whereas when no input is presented, spontaneous weight increase occurs). The LTD level on the edges of the curve can be controlled by adding some additive noise to the mutual traces (either at the level of eij or pij). This enables us to control the spontaneous level of weight decrease when the spike timing is uncorrelated.

Chapter 7

Discussion

7.1 Model Dependencies

In this section, we investigate the parameters which influence our model. We do not present a quantitative investigation, but we stress the aspects which deserve special care from the investigator.

7.1.1 Learning Rule Parameters

Time-constants

The time-constants play a crucial role in our learning rule. Not only do they individually control the synaptic trace dynamics, allowing us to promote one specific phenomenon over another, but they also need to be considered as a whole set of factors, because they exhibit a strong interdependence.

Depending on the maximum stimulation frequency (fmax = 50 Hz here) used in our model, the time-constants τi and τj must be updated. In fact, they are designed not to exceed 20 ms, because they account for fast spiking dynamics. As we have shown, we can promote pre-post timing over post-pre timing by increasing one time-constant over the other. We can also control the spike-timing window, thus controlling the triggering of LTP or LTD. Recent investigations on STDP [27] include all these kinds of local synaptic variables. According to their number and complexity, we can try to reproduce results in triplet or quadruplet models.

The secondary and tertiary time-constants τe and τp control the long-term dynamics of our model. As explained before, the secondary traces account for delayed-reward mechanisms and the tertiary traces act as a long-term memory. By adjusting these two time-constants we can switch from a fast-operating working memory to a slow-dynamics long-term memory. The range of τp is adaptable to the specificity of the learning task (from 1 second in fast learning sequences to 10 seconds or more for long-lasting spike-timing dependence investigation).

Floor value and Initialization

The ε value is a lower bound for all synaptic traces: they cannot get smaller than this value. In our model we set ε = 10−6. It is important that this value is tuned together with the initialization of the synaptic traces, which is 0.01 for the individual traces and 0.0001 for the mutual traces, because they must not reach the floor value too fast. Decreasing the floor value allows us to limit the spontaneous linear weight increase and bias decrease observed in the absence of input. By setting ε, we can model a standard level of additive noise, or baseline, accounting for irregular activity from other neurons connected to the cell.

Primary synaptic trace type

The primary traces zi(t) and zj(t), controlled by the time-constants τi and τj, operate as a fast-dynamics memory, keeping a trace of the spiking activity of the cell. According to the type of synaptic trace we choose (additive, saturated or proportional), we can decide to give priority to the timing of recent spikes or to the frequency over a certain time-scale.

The additive trace presented in Chapter 3 is used to model synapses where synaptic integration exhibits additive behaviour, meaning that on the occurrence of every new spike, a fixed quantity of neurotransmitter is released into the synaptic cleft and adds up to the synaptic resources already present at the postsynaptic site. In our model, we implement the additive trace with a low time-constant when we want to have a measure of the number of spikes occurring in a given time period (see the Inference section in Chapter 5).

The saturated trace, on the other hand, discards the past history of the cell when a new spike occurs. It is always updated to the same value, so that we can always retrieve the occurrence of the last spike from the value of the trace, at any time. The saturated trace accounts for synapses including a small number of receptors, which get saturated on the occurrence of each spike. The neurotransmitter concentration then decays gradually at a certain speed, which is equivalent to the time-constant of the trace in our model.

We have used proportional traces for zi(t) and zj(t), which gives a compromise between these two methods. We mention, however, that switching to saturated traces would promote the spike-timing features of the learning rule, whereas additive traces would stress the amount of spiking over a given period of time.
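The three trace variants can be contrasted in a short Python sketch (update_trace is a name we introduce; the update rules are simplified first-order versions of the traces described above):

```python
def update_trace(z, spiked, dt, tau, mode="proportional", z_spike=1.0):
    """One update of a primary trace under one of three variants:
    additive:     each spike adds a fixed increment (counts spikes)
    saturated:    each spike resets the trace (keeps last-spike timing)
    proportional: relaxation towards the input, as used for z_i and z_j"""
    if mode == "additive":
        z += z_spike if spiked else 0.0
        z -= dt * z / tau
    elif mode == "saturated":
        z = z_spike if spiked else z - dt * z / tau
    elif mode == "proportional":
        z += dt * ((z_spike if spiked else 0.0) - z) / tau
    return z

# Two spikes in quick succession: the additive trace accumulates them,
# the saturated one keeps only the last.
za = zs = 0.0
for t in range(100):                 # dt = 1 ms, spikes at 10 ms and 12 ms
    spiked = t in (10, 12)
    za = update_trace(za, spiked, 1.0, 20.0, "additive")
    zs = update_trace(zs, spiked, 1.0, 20.0, "saturated")
print(za > zs)   # the additive trace ends up larger
```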

Time of presentation

Another important parameter is the time of presentation of each pattern, combined with the resting time between patterns. The longer a pattern is seen, the stronger the memory. But our learner needs some adaptation before jumping to another pattern, because the synaptic traces need to relax for some time between events.

Ideally, we would like to impose a lag corresponding to a decay of the secondary traces (4τe) between each learning sequence, so that we ensure that the network activity is not polluted by its previous knowledge. A good example of what we want to avoid appears in the NEURON implementation: because the independence sequence occurs right after the strong-correlation sequence, the synaptic conductance does not decay to its baseline. To solve this, we can impose a resting period, or the learning sequence itself can be extended.

The resting time between patterns allows us to present multiple patterns without introducing undesirable behaviour. But this repeated resting time acts on the global activity of the cell, and therefore on the values of the tertiary traces pi, pj and pij. This needs to be taken into account when computing the activity of a cell or comparing it to the value of the tertiary traces.

7.1.2 Pattern Variability

Testing our implementation on noisy data is something that can be investigated. It seems likely that the learning rule is robust to noisy input, for two reasons. The first one is that our Poisson-generated pattern presentation is based on a random process, and this gives fairly good results. Thus, the exact spike timing between presynaptic and postsynaptic units is not crucial during learning, but rather the probability of firing of one cell, which is based on the firing rate in our Poisson generation. The second reason is that we are dealing with spiking units, whose synaptic traces will not be affected by noise. Indeed, as long as the membrane does not fire action potentials, additive noise influencing the resting membrane potential will be discarded in our learning implementation. However, if the noise is strong enough to trigger action potentials, stability is not guaranteed; but if we run a simulation with reasonable stimulation frequencies and small enough values for the primary time-constants τi and τj for a long time, we should be able to discard random noise affecting the traces.

The fact that input activities represent the confidence of feature detection allows us to feed graded input patterns to the units, representing the relative confidence. If an attribute value has been observed for sure, the corresponding unit is fed with input 1. However, if the attribute value of a pattern is less reliable (because its observation or recognition is not guaranteed), we can feed the corresponding unit with a weak input, giving more weight to the bias term, which represents the a priori probability of a certain attribute. This allows us to use the spiking learning rule as a classifier for ambiguous patterns. It can also be used in a recurrent network to perform pattern completion or pattern reconstruction.

7.1.3 Learning-Inference Paradigm

The boundary between when our brain learns and integrates stimuli from its environment, and when it infers from its acquired knowledge and gives its own interpretation of the data, is a difficult question. There is a fine line between inference and learning in a continuously operating network, whereas the distinction is completely clear in an off-line learner.

The main question we have to address is: do we learn and infer in a sequential fashion, as when we learn a set of words or numbers? Or do we copy the knowledge acquired after a specific task onto another region of the brain where we process retrieval? In our work, we make the first assumption, and we decide to switch from learning mode to inference mode in a sequential manner, but within the same simulation. It would be valuable to investigate the difference between these two paradigms. The print-now signal κ allows us to bridge the gap between inference and learning. Indeed, when we set it to a really low value, we can assume that the network is still learning, but the dynamics of the learning are tremendously extended, so that inference can operate in the meantime.

7.2 Comparison to other learning rules

In this section, we discuss the comparison of our learning rule to other existing learning rules. An important motivation for the development of this spiking BCPNN learning rule has been that we would be able to compare it not only to the non-spiking BCPNN version, but also to some spike-timing dependent plasticity learning rules. We expand on three comparisons: the evolution of weights and biases during learning, in order to compare the two versions of the BCPNN learning rule; the spike-timing dependence window, to compare with real LTP data from Bi and Poo [6]; and finally a discussion of the analogy with the BCM rule [7].

7.2.1 Spiking vs Non-spiking Learning Rule

Figure 7.1 displays the time-courses of the weight ωij and presynaptic bias βi for abstract units learning (Green: NS, Blue: FS, Red: PS), and the time-courses of the synaptic conductance gij and the presynaptic A-Type Potassium channel conductance gki for the NEURON implementation (Light Blue).

First, we want to compare quantitatively the evolution of the weight ωij in the abstract units implementation. As expected, the three curves exhibit the very same dynamics, one on top of the other. The curve corresponding to the non-spiking pattern presentation (NS) is above the spiking frequency-based pattern presentation (FS), which is itself always above the curve corresponding to the Poisson-generated pattern presentation (PS). This can be explained by the pattern presentation scheme: the longer the input activities yi(t) and yj(t) are set to a certain value, the more it reverberates into the traces. So for our non-spiking presentation, the mutual trace eij has time to be updated to a strong value, which enables a strong update value for the tertiary traces, and hence the weights. For FS, the pattern is not printed as strongly as for NS, but the exact timing of the spikes between two units with the same input makes it fit closely to the NS results.

It is striking, however, how closely these three curves reproduce the same behaviour, meaning that the option we chose for pattern presentation was relevant. The PS implementation displayed here corresponds to a single run. It would be valuable to exhibit an average over several runs, but it already gives a very good quantitative fit to the data, meaning that our spiking learning rule is robust and does not suffer from noisy input (at least the exact spike timing does not play a key role in the learning paradigm).

Figure 7.1: Spiking/Non-Spiking Learning Comparison

(a) Weight ωij and Synaptic Conductance gij Time-courses

(b) Presynaptic bias βi and Potassium Channel Conductance gki Time-courses

The synaptic conductance time-course resulting from the NEURON implementation is a little different from the abstract units' curves. During the early part of the correlation phase, corresponding to the presentation of the first pattern (between 0 and 200 ms), it sticks to the FS curve, which is explained by the fact that they are both frequency-based implementations firing at the same frequency. However, the curve later seems to exhibit slower dynamics than the three others. This is due to the slow potassium current accounting for spike-frequency adaptation: in the early part of the learning phase the cell fires at 55 Hz, but spike-frequency adaptation later results in a firing frequency of 30 Hz for the rest of the stimulation. We note the very same linear increase when learning mute inputs (between 6,000 ms and 8,000 ms), whose slope is in all cases determined by the value of the tertiary time-constant τp.

The bias term and potassium conductance time-courses are easier to interpret, because they are simply a display of the presynaptic tertiary trace pi(t) on a logarithmic scale. We mention, however, that we display here the additive inverse of the potassium channel conductance to make the comparison easier. Once again, the NEURON curve fits tightly to the FS curve and gradually departs from it.

The effect of spike-frequency adaptation is even more visible here, because the value of pi(t) is directly dependent on the firing frequency. For mute inputs (between 6,000 ms and 8,000 ms) we note the same slope for all four curves, set by the time constant τp. This negative slope is the exact counterpart of the positive one for the weight, and this is what guarantees stability during inference after such a phase.
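The linear log-scale decay during the mute phase can be checked directly: once the e-trace has relaxed to zero, pi(t) decays exponentially with time constant τp, so log pi(t) falls with slope −1/τp. A small numerical sketch (illustrative, not simulation output):

```python
import math

# Mute input: the e-trace has decayed to ~0, so p(t) relaxes exponentially
# toward zero with time constant tau_p and log(p) falls linearly.
tau_p = 1000.0  # ms
dt = 1.0
p = 0.5
log_p = [math.log(p)]
for _ in range(2000):
    p += dt * (0.0 - p) / tau_p
    log_p.append(math.log(p))

slope = (log_p[-1] - log_p[0]) / 2000.0  # per ms; expect about -1/tau_p
```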

7.2.2 Spike-timing dependence and real data

We present here the comparison between our spike-timing procedure simulation results and the real neural data on cultured hippocampal neurons obtained by Bi and Poo [6]. Figure 7.2 shows the results obtained in both cases.

Our curve shows qualitative similarities with the real data: the existence of a spike-timing window triggering Long-Term Potentiation, centered on zero spike timing; the promotion of pre-post over post-pre timing in triggering an increase in synaptic strength; and Long-Term Depression on the edges of the curve, corresponding to uncorrelated timing between spikes. Even if the slopes of the two curves cannot be compared quantitatively, because we have a percentage change in current on the one hand and an absolute synaptic conductance change on the other, we stress that qualitatively the curves show the same behaviour, at least for spike timings close to zero.

The BCPNN curve exhibits LTP for strongly correlated spike timings and LTD,


Figure 7.2: Spike-timing Dependence Comparison

(a) Original data from Bi and Poo - 1998 (b) BCPNN Spike-timing dependence window

as soon as the spike-timing magnitude becomes too large. This is somewhat different from the STDP rules, which exhibit LTP, LTD, and no significant synaptic change when no precise pre-post or post-pre timing is recorded. In the data from Bi and Poo, a decrease of about 40% in EPSC amplitude, accounting for LTD, occurs for negative spike timings (−20 < ∆t < 0 ms).

We tried, first by introducing noise into the traces and second by updating their floor value, to obtain LTD only for small negative spike timings and no significant change at the far edges of the curves, but our attempts were unsuccessful. We conclude that the BCPNN learning rule needs to include an active process to prevent LTD for strongly negative spike timings, because the sole interplay of the time constants τi and τj is structurally insufficient.
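The shape of the window discussed here can be probed with a simplified pairing protocol. This is an illustrative Python reading of the rule in Appendix A (one pre and one post spike, z-traces incremented by r, weight read out as log(pij/(pi·pj))), not the NEURON code itself:

```python
import math

def bcpnn_dw(delta_t, T=3000.0, dt=0.1, r=0.8,
             tau_z=20.0, tau_e=200.0, tau_p=1000.0):
    """One presynaptic spike at t = 1000 ms and one postsynaptic spike at
    1000 + delta_t ms drive the z -> e -> p cascade; the returned value is
    the weight log(p_ij / (p_i p_j)) at the end of the run."""
    t_pre, t_post = 1000.0, 1000.0 + delta_t
    zi = zj = 0.0
    ei = ej = eij = pi = pj = pij = 1e-4
    for n in range(int(T / dt)):
        t = n * dt
        if abs(t - t_pre) < dt / 2:
            zi += r * (1.0 - zi)      # presynaptic spike
        if abs(t - t_post) < dt / 2:
            zj += r * (1.0 - zj)      # postsynaptic spike
        zi -= dt * zi / tau_z
        zj -= dt * zj / tau_z
        ei += dt * (zi - ei) / tau_e
        ej += dt * (zj - ej) / tau_e
        eij += dt * (zi * zj - eij) / tau_e
        pi += dt * (ei - pi) / tau_p
        pj += dt * (ej - pj) / tau_p
        pij += dt * (eij - pij) / tau_p
    return math.log(pij / (pi * pj))

w_near = bcpnn_dw(5.0)    # nearly coincident pair: strong potentiation
w_far = bcpnn_dw(200.0)   # weakly correlated pair: much weaker change
```

The sketch reproduces the central LTP window (strong positive change for nearly coincident spikes, a much weaker change for large |∆t|), but like the full rule it has no separate mechanism restricting LTD to small negative timings.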

7.2.3 Sliding threshold and BCM Rule

The BCM rule [7] refers to the theory of synaptic modification first proposed by Elie Bienenstock, Leon Cooper and Paul Munro in 1982 to account for experiments measuring the selectivity of neurons in primary sensory cortex and its dependency on neuronal input. It is characterized by a rule expressing synaptic change as a Hebb-like product of the presynaptic activity and a nonlinear function φ(yj, θM) of the postsynaptic activity yj(t). For low values of the postsynaptic activity (yj < θM), φ is negative, and for yj > θM, φ is positive.

The rule is stabilized by allowing the modification threshold θM to vary as a super-linear function of the previous activity of the cell. Unlike traditional methods of stabilizing Hebbian learning, this "sliding threshold" provides a mechanism for incoming patterns, as opposed to converging afferents, to compete. The BCM rule is notable for its biological relevance and was proposed to account for the development of neuron selectivity in the visual cortex. Several improvements of this learning rule have been proposed by Intrator and Cooper in 1992 [17], and Law and


Cooper in 1994 [3]. A detailed exploration can be found in the book Theory of Cortical Plasticity [10].

There is a strong analogy between the bias term in the BCPNN learning rule and the "sliding threshold" in the BCM rule: both depend on the past activity of the cell, and both define a threshold between Long-Term Potentiation and Long-Term Depression. It would be extremely valuable to implement the BCM rule for abstract units and compare the evolution of the bias term in the BCPNN model to the evolution of the threshold θM in the BCM context. Since this rule accounts for biologically observed phenomena, we could substantially improve our model by modifying the bias term to mimic a BCM-like threshold adaptation.
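To make the analogy concrete, a minimal BCM update can be written as follows. The parameter values are hypothetical; φ(y, θ) = y(y − θ) and a threshold sliding toward the recent average of y² are one common reading of the rule, not the thesis's implementation:

```python
def bcm_step(w, x, y, theta, eta=0.01, tau_theta=100.0, dt=1.0):
    """One Euler step of a minimal BCM rule: dw = eta * x * phi(y, theta)
    with phi(y, theta) = y * (y - theta), and a sliding threshold that
    relaxes toward y^2 (a super-linear function of recent activity)."""
    dw = eta * x * y * (y - theta)
    dtheta = dt * (y ** 2 - theta) / tau_theta
    return w + dw, theta + dtheta

# Below threshold the synapse depresses, above it potentiates, mirroring
# the role the BCPNN bias term plays as an LTP/LTD crossover point.
w_dep, _ = bcm_step(w=1.0, x=1.0, y=0.5, theta=1.0)
w_pot, _ = bcm_step(w=1.0, x=1.0, y=1.5, theta=1.0)
```

After a period of high activity the threshold slides upward, which is the stabilizing mechanism the text describes.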


7.3 Further developments and limitations

In this section, we propose a series of possible developments for the spiking BCPNN learning rule. Most of them stem from the need to overcome intrinsic limitations of our model.

7.3.1 Network implementation

All the work presented in this project deals only with two units and single-synapse learning, which limits the consistency of our results. This is especially true in single-synapse inference mode, which is somewhat artificial with only two units, because retrieval is meant to receive input from a set of units, not from a single one. Only then can we see the strengthening of a connection between correlated attributes or groups of attributes. The fact that our setup is reduced to a pair of units gives little generality to our inference results. Therefore, a crucial development would be to introduce the learning rule in a network context.

The architecture of the network is also a matter of concern. Since biological neural networks are very sparsely connected, a trade-off must be found between increasing the size of the network (which makes the computational complexity grow exponentially) and the degree of connectivity between the neurons. When implemented in a recurrent architecture, the learning rule exhibits lateral inhibition between units that are not active together, so when a given unit is silent it receives inhibitory input from almost every unit in the network. This has to be taken care of during further development, but introducing the new learning rule in a fully connected network is a thorny task.

7.3.2 RSNP cells and inhibitory input

For the sake of biological plausibility, we need to investigate further what happens when the synaptic conductance g(pi, pj, pij) in our model becomes negative. This arises when the function gcomp(pi, pj, pij) takes negative values, due to a strong anti-correlation between two units. In our model, this does not create any major implementation issue: if the synaptic conductance becomes negative, everything happens as if current were flowing through the synaptic cleft in the opposite direction. The resulting effect on the membrane voltage is similar to that of an inhibitory input.

However, this is biologically unrealistic, because we have stated that our synapses have an orientation: information passes from the presynaptic cell to the postsynaptic cell. To overcome this problem and stick to what is observed in real neurons, we propose an implementation of BCPNN synapses with RSNP cells.

Each unit is now composed of a pair of cells: an RSNP cell, and a pyramidal cell receiving an inhibitory connection from the former. The behaviour of the RSNP cell and its corresponding pyramidal cell is complementary: if the first is active,


(a) Single-synapse - RSNP cells (b) Multi-synapse - RSNP cells

Figure 7.3: BCPNN Synapses implementing RSNP cells and inhibitory connections

the other is silent, and vice versa. Both the postsynaptic cell and its corresponding RSNP cell take input from other presynaptic pyramidal cells, but when the computed conductance becomes negative, the pyramidal-pre/pyramidal-post synaptic conductance is set to zero and the pyramidal-pre/RSNP-post connection becomes active, triggering IPSPs (Inhibitory Post-Synaptic Potentials) in the postsynaptic pyramidal cell. Figure 7.3 shows a proposal for this implementation in single- and multi-synapse contexts.
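The sign-dependent routing can be summarised in a few lines. This is a hypothetical helper sketching the proposal, not part of the NMODL files:

```python
def route_conductance(g_comp, g_max=500.0):
    """Split a computed BCPNN conductance into non-negative excitatory and
    RSNP-mediated inhibitory parts (in pS): a positive g_comp drives the
    pyramidal-pre/pyramidal-post synapse, while a negative one is delivered
    as inhibition through the RSNP cell instead."""
    if g_comp >= 0.0:
        return g_max * g_comp, 0.0
    return 0.0, g_max * (-g_comp)

exc, inh = route_conductance(-0.5)  # anti-correlated pair -> IPSP pathway
```

Both pathways thus carry only non-negative conductances, which restores the orientation of the synapse.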

7.3.3 Hypercolumns, basket cell and lateral inhibition

In order to account for the hypercolumnar structure, it would be valuable to introduce the learning rule in a network composed of several minicolumns grouped in a hypercolumn. This modular structure has been observed in the cat visual cortex, and its attractor dynamics were implemented by Lundqvist et al. in 2006 [24]. The main idea is that only one neuron is active within a minicolumn (which imposes lateral inhibition between units in a minicolumn). Connections between pyramidal cells are thus very long, and they are the only ones that enter or leave the hypercolumn.

A new type of cell, the basket cell, is needed to implement lateral inhibition within a hypercolumn: indeed, in each minicolumn one unit is sensitive to a specific value of one attribute or feature (shape orientation, colour); thus, as mentioned earlier, it is important that this attribute or feature takes only one value. In order to achieve this, we include basket cells in each hypercolumn (the number of basket cells equalling the number of pyramidal cells per minicolumn). Each basket cell receives excitatory connections from the pyramidal cell in each minicolumn corresponding to the specific feature, and gives an inhibitory input to the RSNP cells corresponding


to these pyramidal cells. Such an implementation guarantees stability for graded input, but also dramatically increases the computing time.
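Functionally, the basket-cell circuit implements a winner-take-all over the units coding the values of one feature. Reduced to abstract activities, the intended steady state is simply (an idealised sketch, ignoring the circuit dynamics):

```python
def hypercolumn_wta(activities):
    """Idealised outcome of basket-cell lateral inhibition in a hypercolumn:
    the most active unit (one value of the coded feature) stays on, all the
    others are silenced through their RSNP cells."""
    winner = max(range(len(activities)), key=lambda k: activities[k])
    return [1.0 if k == winner else 0.0 for k in range(len(activities))]
```

Exactly one unit per hypercolumn remains active, so each feature takes a single value, as required above.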

7.3.4 Parallel computing

Dealing with complex units such as minicolumns and implementing a large

number of auxiliary cells for each unit may increase the computational time dramatically. To be able to run long simulations with complex units and networks, parallel computing is a precious tool. With the NEURON simulation environment it is easy to distribute network models and complex models of single neurons over multiple processors and achieve nearly linear speedup [26]. The speedup provided by a parallel implementation of the presented learning rule would make it possible to include it in a large-scale network of biologically detailed neurons and to investigate the effects emerging at the network level. Some modification of the given code might be needed, however.


Chapter 8

Conclusion

In this Master's thesis project, we have presented and implemented an adaptation of the BCPNN learning rule for spiking units. The BCPNN model has been developed thoroughly over the last thirty years and has been found relevant in many domains, such as classification tasks (Holst 1997) [15], a Hebbian working memory model (Sandberg 2003) [32], and pharmacovigilance and data mining (Lindquist et al. 2000). In all of these works there has been a strong motivation to have a version of this learning rule operating with spiking units. We propose here an implementation in the NEURON language, based on a mapping from the Bayesian weights to a synaptic conductance, and from the bias term to an A-type activity-dependent potassium channel conductance.

Our work presents results on single-synapse learning and inference. Instead of testing our learning rule in a network context, we have focused on fine-tuning the cell parameters in order to lay the foundations for the further development and testing of this learning rule in specific tasks. This work can be extended in many directions, but we think the results presented here are already of interest. Not only had no spiking version of the BCPNN learning rule ever been implemented, but the comparison to STDP-like real data enlarges the scope of our work. This opens the gate to hybrid learning rules that reconcile spike-timing-dependent and probabilistic features at once in a learning rule operating with spiking units.

We are confident that continuing this work will prove rewarding in the years to come. Though it is based on a rather old mathematical theory, the use of Bayesian-Hebbian networks still proves valuable for many tasks. Trying to bridge the gap between a phenomenological approach and the theoretical study of the brain, one must seriously consider adapting existing algorithms to the new simulation environments. This kind of approach might allow us to unify our knowledge, in order to address what we do not know in a more altruistic and efficient manner.


Bibliography

[1] http://en.wikipedia.org/wiki/Hebbian_theory

[2] http://en.wikipedia.org/wiki/Hodgkin_Huxley_model

[3] http://www.scholarpedia.org/article/BCM_rule

[4] Martin Stemmler, Notes on Information Maximization in Single Neurons, http://www.klab.caltech.edu/ stemmler/

[5] Azouz R, Gray CM (2000) Dynamic spike threshold reveals a mechanism for synaptic coincidence detection in cortical neurons in vivo, Proc Natl Acad Sci USA 97:8110-8115

[6] Bi G-Q, Poo M-M (1998) Synaptic Modifications in Cultured Hippocampal Neurons: Dependence on Spike Timing, Synaptic Strength and Postsynaptic Cell Type, J. Neurosci. 18(24): 10464-72.

[7] Bienenstock E, Cooper L, Munro P (1982) Theory for the Development of Neuron Selectivity: Orientation Specificity and Binocular Interaction in Visual Cortex, The Journal of Neuroscience Vol. 2, No. 1 (January 1982) 32-48

[8] Carnevale N, Hines M (2006) The NEURON Book, New York, Cambridge University Press.

[9] Clopath C, Ziegler L, Vasilaki E, Busing L, Gerstner W (2008) Tag-Trigger-Consolidation: A Model of Early and Late Long-Term Potentiation and Depression, PLoS Comput Biol 4(12): e1000248. doi:10.1371/journal.pcbi.1000248

[10] Cooper L, Intrator N, Blais B, Shouval H (2004) Theory of Cortical Plasticity, World Scientific, New Jersey.

[11] Fuster J. M. (1995) Memory in the Cerebral Cortex, Cambridge, Massachusetts, The MIT Press.

[12] Gerstner W, Kistler W (2002) Spiking Neuron Models: Single Neurons, Populations, Plasticity, New York, Cambridge University Press.


[13] Hebb D.O. (1949) The Organization of Behavior, New York, Wiley

[14] Hodgkin A, Huxley A (1952) A quantitative description of membrane current and its application to conduction and excitation in nerve, J. Physiol. 117:500-544

[15] Holst A (1997) The Use of a Bayesian Neural Network Model for Classification Tasks, Dissertation, Department of Numerical Analysis and Computer Science, Royal Institute of Technology, Stockholm, Sweden

[16] Hopfield J. J. (1982) Neural networks and physical systems with emergent collective computational abilities, Proc. Nat. Acad. Sci. (USA) 79, 2554-2558.

[17] Intrator N, Cooper L (1992) Objective Function Formulation of the BCM Theory, Neural Networks (5) 3-17

[18] Kononenko I. (1989) Bayesian Neural Networks, Biological Cybernetics Journal Vol. 61, pp. 361-370.

[19] Jung S-C, Hoffman D (2009) Biphasic Somatic A-Type K+ Channel Downregulation Mediates Intrinsic Plasticity in Hippocampal CA1 Pyramidal Neurons, PLoS ONE 4(8): e6549. doi:10.1371/journal.pone.0006549

[20] Kandel E, Schwartz J, Jessell T (1995) Essentials of Neural Science and Behavior, Appleton and Lange, Norwalk, Connecticut

[21] Antonov I, Antonova I, Kandel E, Hawkins R (2003) Activity-Dependent Presynaptic Facilitation and Hebbian LTP Are Both Required and Interact during Classical Conditioning in Aplysia, Neuron 37(1): 135-147, doi:10.1016/S0896-6273(02)01129-7

[22] Lansner A, Ekeberg O (1989) A One-Layer Feedback Artificial Neural Network with a Bayesian Learning Rule, International Journal of Neural Systems Vol. 1, No. 1 (1989) 77-87

[23] Lansner A, Holst A (1996) A Higher Order Bayesian Neural Network with Spiking Units, International Journal of Neural Systems Vol. 7, No. 2 (May 1996) 115-128

[24] Lundqvist M, Rehn M, Djurfeldt M, Lansner A (2006) Attractor dynamics in a modular network model of neocortex, Network: Computation in Neural Systems, Volume 17, Issue 3 (September 2006), 253-276.

[25] Mayr C, Partzsch J, Schuffny R (2009) Rate and Pulse-Based Plasticity Governed by Local Synaptic State Variables


[26] Migliore M, Cannia C, Lytton W, Markram H, Hines M (2006) Parallel network simulations with NEURON, Journal of Computational Neuroscience 21:110-119.

[27] Morrison A, Diesmann M, Gerstner W (2008) Phenomenological Models of Synaptic Plasticity based on Spike Timing, Biol Cybern 98:459-478, doi:10.1007/s00422-008-0233-1

[28] Potjans W, Morrison A, Diesmann M (2009) A Spiking Neural Network Model of an Actor-Critic Learning Agent, Neural Computation 21, 301-339.

[29] Ramón y Cajal S (1894) The Croonian Lecture: La Fine Structure des Centres Nerveux, Proceedings of the Royal Society of London 55: 444-468. doi:10.1098/rspl.1894.0063

[30] Rubin J, Gerkin C, Bi G-Q, Chow C (2005) Calcium Time Course as a Signal for Spike-Timing-Dependent Plasticity, J. Neurophysiol. 93:2600-2613.

[31] Sandberg A, Lansner A, Petersson K-M, Ekeberg O (2002) Bayesian attractor networks with incremental learning, Network: Computation in Neural Systems 13(2): 179-194.

[32] Sandberg A, Lansner A, Tegner J (2003) A working memory model based on fast Hebbian learning, Network: Computation in Neural Systems 14: 789-802.

[33] Wahlgren N, Lansner A (2001) Biological evaluation of a Hebbian-Bayesian learning rule, Neurocomputing 38-40: 433-438.

[34] Xu J, Kang N, Jiang L, Nedergaard M, Kang J (2005) Activity-Dependent Long-Term Potentiation of Intrinsic Excitability in Hippocampal CA1 Pyramidal Neurons, The Journal of Neuroscience Vol. 25, No. 7 (February 16, 2005) 1750-1760, doi:10.1523/JNEUROSCI.4217-04.2005


Appendix A

NMODL files

A.1 Synapse Model

File BCPNNSynapse.mod: Synapse implementation for the BCPNN learning rule

NEURON {
    POINT_PROCESS BCPNNSyn
    RANGE e, i, g, gmax, gcomp, glearn
    RANGE Tau_i, Tau_j, Te, Tp, eps, float, r, K
    RANGE zi, zj, ei, ej, eij, pi, pj, pij
    NONSPECIFIC_CURRENT i
}

UNITS {
    (S)  = (siemens)
    (pS) = (picosiemens)
    (mV) = (millivolt)
    (mA) = (milliamp)
}

PARAMETER {
    Tau_i = 20.0 (ms) <1e-9,1e9>
    Tau_j = 20.0 (ms) <1e-9,1e9>
    Te = 200.0 (ms) <1e-9,1e9>
    Tp = 1000.0 (ms) <1e-9,1e9>
    eps = 1e-6
    r = 0.8
    e = 0 (mV)
    gmax = 500 (pS)
    glearn = 0
    K = 1 <1e-9,1e9>
}


ASSIGNED {
    g (pS)
    gcomp
    v (mV)
    i (nA)
}

STATE {
    zi zj
    ei ej eij
    pi pj pij
}

INITIAL {
    zi = 0.01
    zj = 0.01
    ei = 0.01
    ej = 0.01
    eij = 0.0001
    pi = 0.01
    pj = 0.01
    pij = 0.0001
}

BREAKPOINT {
    SOLVE state METHOD cnexp
    gcomp = g_comp(pi, pj, pij)
    if (K < eps) {
        g = gmax * glearn * zi
    } else {
        g = gmax * gcomp * zi
    }
    i = 1e-6*g*(v - e)
}

DERIVATIVE state {
    zi' = -zi/Tau_i
    zj' = -zj/Tau_j
    ei' = (zi - ei)/Te
    ej' = (zj - ej)/Te
    eij' = ((Tau_j/Tau_i)*zi*zj - eij)/Te
    pi' = K*((ei - pi)/Tp)
    pj' = K*((ej - pj)/Tp)
    pij' = K*((eij - pij)/Tp)
}


NET_RECEIVE (weight) {
    if (weight >= 0) {
        zi = zi + r*(1 - zi)
    } else {
        zj = zj + r*((Tau_i/Tau_j) - zj)
    }
}

FUNCTION g_comp(pi, pj, pij) {
    if (pi < eps) { pi = eps }
    if (pj < eps) { pj = eps }
    if (pij < eps*eps) { pij = eps*eps }
    if ((pij/(pi*pj)) < eps) {
        g_comp = log(eps)
    } else {
        g_comp = log(pij/(pi*pj))
    }
}
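The clamping logic of FUNCTION g_comp can be mirrored in Python for testing outside NEURON. This is an illustrative translation, not one of the model files:

```python
import math

def g_comp(pi, pj, pij, eps=1e-6):
    """Mirror of the NMODL g_comp: floor the traces at eps (eps^2 for the
    mutual trace) and return log(p_ij / (p_i p_j)), clamped at log(eps)."""
    pi = max(pi, eps)
    pj = max(pj, eps)
    pij = max(pij, eps * eps)
    return math.log(max(pij / (pi * pj), eps))

w = g_comp(0.1, 0.1, 0.02)  # pij = 2 * pi * pj, so w = log(2)
```

The floors keep the logarithm defined for freshly initialised or fully decayed traces.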

A.2 A-Type Potassium Channel

File ATypePotassium.mod: A-type potassium current for intrinsic excitability

NEURON {
    SUFFIX ka
    USEION k READ ek WRITE ik
    RANGE gk, gkbar, gcomp, glearn, i
    RANGE Tau, Te, Tp, eps, float, r, K
    RANGE z, e, p, thresh, delay
}

UNITS {
    (S)  = (siemens)
    (uS) = (microsiemens)
    (mV) = (millivolt)
    (mA) = (milliamp)
}

PARAMETER {
    Tau = 20.0 (ms) <1e-9,1e9>
    Te = 200.0 (ms) <1e-9,1e9>
    Tp = 1000.0 (ms) <1e-9,1e9>
    eps = 1e-6
    r = 0.8
    gkbar = 54.8 (uS/cm2)
    thresh = -20 (mV)
    delay = 7
    glearn = 0
    K = 1 <1e-9,1e9>
}

ASSIGNED {
    v (mV)
    ek (mV)
    ik (mA)
    i (mA)
    gk (S/cm2)
    gcomp
    firing
    up
    time
    counter
    ready
}

STATE {
    z
    e
    p
}

BREAKPOINT {
    SOLVE states METHOD cnexp
    gcomp = g_comp(p)
    if (K < eps) {
        gk = gkbar * glearn
    } else {
        gk = gkbar * gcomp
    }
    i = 1e-6*gk*(v - ek)
    ik = i
}

DERIVATIVE states {
    detect(v)
    if (firing == 1) { z = z + r*(1 - z) }
    z' = -z/Tau
    e' = (z - e)/Te
    p' = (e - p)/Tp
}

INITIAL {
    z = 0.01
    e = 0.01
    p = 0.01
    up = 0
    firing = 0
    time = 0
    counter = 0
    ready = 0
}

FUNCTION g_comp(p) {
    if (p < eps) {
        g_comp = 1
    } else {
        g_comp = log(p)/log(eps)
    }
}

PROCEDURE detect(v (mV)) {
    if (v > thresh && up == 0) {
        counter = delay
        up = 1
        ready = 1
    }
    if (ready == 1 && counter > 0) {
        counter = counter - 1
    }
    if (ready == 1 && counter <= 0) {
        firing = 1
        ready = 0
        time = t
    }
    if (t > time) {
        firing = 0
    }
    if (v < thresh) {
        up = 0
    }
}


Appendix B

Hodgkin-Huxley Delayed Rectifier Model

B.1 Voltage Equations

Each ionic current is given by $V_m = \frac{1}{g_i} I_i + E_i$, so $I_i = g_i(V_m - E_i)$, and it follows that

\[ I_{ion} = I_{leak} + I_{Na} + I_K = g_{leak}(V_m - E_{leak}) + g_{Na}\, m^3 h\, (V_m - E_{Na}) + g_K\, n^4 (V_m - E_K) \]

with $E_{leak} = -70.3$ mV, $E_{Na} = +55$ mV, $E_K = -75$ mV, $g_{leak} = 20.5$ µS/cm², $g_{Na} = 60.0$ mS/cm² and $g_K = 5.1$ mS/cm². The final voltage equation is given by

\[ I_{app} = C_m \frac{dV_m}{dt} + g_{leak}(V_m - E_{leak}) + g_{Na}\, m^3 h\, (V_m - E_{Na}) + g_K\, n^4 (V_m - E_K) \tag{B.1} \]

B.2 Equations for Gating Variables

This presentation of the Hodgkin-Huxley formalism follows [4]. The gating variables $m, h, n, a$ and $b$ that control the flow of current through the voltage-dependent conductances obey the equations

\[ \frac{dm}{dt} = \phi \left[ \alpha_m(V_m)(1 - m) - \beta_m(V_m)\, m \right] \]

\[ \frac{dh}{dt} = \phi \left[ \alpha_h(V_m)(1 - h) - \beta_h(V_m)\, h \right] \]

\[ \frac{dn}{dt} = \frac{\phi}{2} \left[ \alpha_n(V_m)(1 - n) - \beta_n(V_m)\, n \right] \]

\[ \tau_a(V_m)\frac{da}{dt} = a_\infty(V_m) - a \qquad \text{and} \qquad \tau_b(V_m)\frac{db}{dt} = b_\infty(V_m) - b \]


where φ = 3.8 is a temperature factor reflecting the difference between the 6.3 °C of the original Hodgkin-Huxley experiments and the 18.5 °C of the Connor and Stevens crustacean experiments.

\[ \alpha_m(V_m) = \frac{0.1\,(V_m + 29.7)}{1 - \exp\!\left[-\frac{V_m + 29.7}{10}\right]} \qquad \beta_m(V_m) = 4 \exp\!\left[-\frac{V_m + 54.7}{10}\right] \]

\[ \alpha_h(V_m) = 0.07 \exp\!\left[-\frac{V_m + 48}{20}\right] \qquad \beta_h(V_m) = \frac{1}{1 + \exp\!\left[-\frac{V_m + 18}{10}\right]} \]

\[ \alpha_n(V_m) = \frac{0.1\,(V_m + 45.7)}{1 - \exp\!\left[-\frac{V_m + 45.7}{10}\right]} \qquad \beta_n(V_m) = 0.125 \exp\!\left[-\frac{V_m + 55.7}{80}\right] \]

\[ a_\infty(V_m) = \left[ \frac{0.0761 \exp\!\left[\frac{V_m + 94.22}{31.84}\right]}{1 + \exp\!\left[\frac{V_m + 1.17}{38.93}\right]} \right]^{1/3} \qquad b_\infty(V_m) = \left[ 1 + \exp\!\left[\frac{V_m + 53.3}{14.54}\right] \right]^{-4} \]

\[ \tau_a(V_m) = 0.3632 + \frac{1.158}{1 + \exp\!\left[\frac{V_m + 55.96}{20.12}\right]} \qquad \tau_b(V_m) = 1.24 + \frac{2.678}{1 + \exp\!\left[\frac{V_m + 50}{16.027}\right]} \]

This choice of somatic spiking conductances allows spiking to occur at arbitrarily low firing rates, as is typically observed in cortical cells.
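As a numerical sanity check, the steady-state sodium activation $m_\infty(V_m) = \alpha_m(V_m)/(\alpha_m(V_m) + \beta_m(V_m))$ can be evaluated from the rate functions above (a sketch using the constants as quoted in this appendix; the temperature factor cancels in the ratio):

```python
import math

def alpha_m(v):
    # 0.1 (v + 29.7) / (1 - exp(-(v + 29.7)/10)), v in mV
    return 0.1 * (v + 29.7) / (1.0 - math.exp(-(v + 29.7) / 10.0))

def beta_m(v):
    # 4 exp(-(v + 54.7)/10)
    return 4.0 * math.exp(-(v + 54.7) / 10.0)

def m_inf(v):
    """Steady-state activation m_inf = alpha_m / (alpha_m + beta_m)."""
    return alpha_m(v) / (alpha_m(v) + beta_m(v))
```

As expected for the sodium activation gate, m∞ rises sigmoidally with depolarisation.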


Appendix C

NEURON simulation parameters


Pyramidal cell layer 2/3
    nseg        1
    diameter    61.4 µm
    L           61.4 µm
    Ra          150.0 Ω·cm

Slow-dynamics K channel
    gKim        0.07 mS/cm²
    EKim        -90 mV
    τmax,im     2269 ms

HH sodium channel
    gNa         0.06 S/cm²
    ENa         50 mV

HH potassium channel
    gK          0.0051 S/cm²
    EK          -90 mV

Leak channel
    gleak       0.0205 mS/cm²
    Eleak       -70.3 mV

BCPNN synapse
    τi          20 ms
    τj          20 ms
    τe          200 ms
    τp          1000 ms
    ε           10^-6
    r           0.8
    gmax        500 pS

A-type K channel
    τ           20 ms
    τe          200 ms
    τp          2000 ms
    ε           10^-6
    r           0.8
    gk          54.8 µS
    delay       7 ms

Table C.1: Parameters describing the one-compartment cell modeling a layer 2/3 pyramidal neuron, the included ion channels and the BCPNN synapse. Ra stands for the axial resistivity and L for the length of the section. The cell has a leakage conductance, a voltage-gated sodium channel, a voltage-gated potassium channel, an activity-dependent potassium channel with slow dynamics and an A-type potassium channel.

