Learning and memory in recurrent neural networks
Nicolas Brunel
Departments of Neurobiology and Physics, Duke University
Outline

1. Introduction: The Hebbian/Attractor Neural Network scenario
2. A brief overview of the relevant neurobiology
3. The Hopfield approach
4. The Gardner approach
5. Open questions
Learning and memory: The Hebbian scenario

• External stimuli ⇒ changes in neuronal activity (stimulus A)
• Changes of activity ⇒ changes in synaptic connectivity (stimulus A)
• Changes in synaptic connectivity ⇒ changes in neuronal activity/dynamics
• Another stimulus (stimulus B) triggers activity in a distinct subset of neurons...
• ...and will also lead to changes in connectivity

⇒ Synaptic connectivity = superposition of traces left by external inputs
Questions

• What is/are the synaptic plasticity ('learning') rule(s)?
• What are the statistics of synaptic connectivity?
• How does synaptic plasticity affect network dynamics?
• What is the storage capacity of networks?
Outline

1. Introduction: The Hebbian/Attractor Neural Network scenario
2. A brief overview of the relevant neurobiology
3. The Hopfield approach
4. The Gardner approach
5. Open questions
Cerebral cortex - feedforward flow of information
Cerebral cortex - neurons, synapses

• Neuron density ∼ 10^5/mm³
• Neurons divided in two main classes:
  – Excitatory (80%): mostly pyramidal cells
  – Interneurons (20%): very diverse
• Synapse density ∼ 10^9/mm³ (10,000 synapses per neuron)
• Approximately half of the synapses are local (< 1 mm), the other half long-range
• Strong recurrent connectivity
Cortical microcircuit - connectivity

• Anatomy: potentially fully connected network at short spatial scales (the dendrite of a neuron 'touches' the axon of any other neighboring neuron with probability close to 1)
• Electrophysiology: connection probabilities ∼ 0.1 (E-E), 0.5 (E-I, I-E, I-I) at short distances (< 100 µm)
Motifs: pairs

• Bidirectional connections are overrepresented
• Partially symmetric synaptic connectivity
Motifs: triplets

Synaptic connectivity is plastic
What controls synaptic plasticity?

• Spike timing
• Firing rate
• Post-synaptic membrane potential
• Neuromodulators (e.g. dopamine)
• Structural plasticity on longer time scales
Electrophysiological recordings in behaving animals

Delayed match-to-sample (DMS) task

Selective persistent activity in DMS tasks in monkeys

• Fuster and Jervey 1981: ITC, colors
• Miyashita 1988, etc.: ITC, abstract patterns
• Nakamura and Kubota 1995: ITC, PRC, ERC, TPC, pictures
• Consistent with attractor dynamics

Evidence for attractor dynamics in mice

• Perturbation experiments: responses consistent with attractor dynamics
Outline

1. Introduction: The Hebbian/Attractor Neural Network scenario
2. A brief overview of the relevant neurobiology
3. The Hopfield approach
4. The Gardner approach
5. Open questions
Attractor networks: the Hopfield model (1982)

• Fully connected network of N binary neurons (S_i(t) = ±1);
• Neuron dynamics (at zero temperature):

  S_i(t+1) = \mathrm{sign}\Big( \sum_j J_{ij} S_j(t) \Big)

• How to store p i.i.d. random patterns ξ^μ_i = ±1 (with prob. 0.5/0.5) as fixed-point attractors?
• Use a 'Hebbian' synaptic connectivity matrix

  J_{ij} = \frac{1}{N} \sum_\mu \xi^\mu_i \xi^\mu_j
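A minimal numerical sketch of this model (all sizes and the noise level are illustrative choices, not from the talk): build the Hebbian matrix from random ±1 patterns, run the zero-temperature dynamics from a corrupted pattern, and check retrieval via the overlap.

```python
# Hopfield model sketch: Hebbian J, synchronous sign dynamics, overlap check.
import numpy as np

rng = np.random.default_rng(0)
N, p = 500, 20                            # network size, number of patterns
xi = rng.choice([-1, 1], size=(p, N))     # i.i.d. +-1 patterns, prob 0.5/0.5

J = (xi.T @ xi) / N                       # J_ij = (1/N) sum_mu xi_i xi_j
np.fill_diagonal(J, 0.0)                  # no self-coupling

def update(S, J, steps=50):
    """Iterate S_i(t+1) = sign(sum_j J_ij S_j(t)) until a fixed point."""
    for _ in range(steps):
        S_new = np.sign(J @ S)
        S_new[S_new == 0] = 1             # break ties deterministically
        if np.array_equal(S_new, S):
            break
        S = S_new
    return S

# Start from pattern 0 with 10% of the bits flipped and check retrieval.
S0 = xi[0] * rng.choice([1, -1], size=N, p=[0.9, 0.1])
S = update(S0, J)
print("overlap with pattern 0:", (S @ xi[0]) / N)   # ~1 if retrieved
```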
Attractors in the Hopfield model

In the large N limit:

• All patterns are attractors of the dynamics with prob. 1 if p < N/(4 log N)
• A given pattern is an attractor of the dynamics with prob. 1 if p < N/(2 log N) (Weisbuch and Fogelman-Soulié 1985, McEliece et al 1987)
• There exist stable fixed points close to the stored patterns if p < α_c N, where α_c = 0.138 (Amit, Gutfreund, Sompolinsky 1985, using the replica method)
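A hedged numerical check of these results (sizes illustrative): measure the fraction of stored patterns that are exact fixed points as the load α = p/N grows. Per the bounds above, this fraction starts to drop around p ∼ N/(2 log N), well below α_c N, where attractors close to (but not identical with) the patterns still exist.

```python
# Fraction of stored patterns that remain exact fixed points, vs load alpha.
import numpy as np

rng = np.random.default_rng(1)
N = 400
for alpha in (0.05, 0.10, 0.138, 0.20):
    p = int(alpha * N)
    xi = rng.choice([-1, 1], size=(p, N))
    J = (xi.T @ xi) / N
    np.fill_diagonal(J, 0.0)
    # Row mu of xi @ J.T is the local field h_i = sum_j J_ij xi_j^mu.
    stable = np.mean(np.all(np.sign(xi @ J.T) == xi, axis=1))
    print(f"alpha={alpha:.3f}: fraction of exact fixed points = {stable:.2f}")
```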
Phase diagram

• The phase diagram can be obtained using the replica method (Amit, Gutfreund and Sompolinsky 1985)
• Order parameters quantifying the quality of retrieval of memories: overlaps with the stored patterns

  m^\mu = \frac{1}{N} \sum_i S_i\, \xi^\mu_i

• Basins of attraction of memories are enormous at small α, and vanish at maximal capacity
The long road towards more realistic models

• Models with 0/1 neurons and sparse memories, ξ^μ_i = 0, 1 with prob. 1-f, f (Tsodyks and Feigel'man 1988; see the sketch after this list):

  J_{ij} = \frac{1}{f(1-f)N} \sum_\mu (\xi^\mu_i - f)(\xi^\mu_j - f)

  α_c ∼ 1/(2 f log(1/f)) in the sparse (f → 0) coding limit. Information stored per synapse increases with decreasing f, up to 1/(2 log 2) ≈ 0.72 bits per synapse when f → 0.

• Models with highly diluted asymmetric connectivity (Derrida et al 1987):

  J_{ij} = \frac{c_{ij}}{cN} \sum_\mu \xi^\mu_i \xi^\mu_j, \quad c_{ij} = 1, 0 \text{ with probability } c, 1-c, \quad c \to 0

  α_c = p_max/C = 2/π ≈ 0.64 in the sparse connectivity limit.

• 'Palimpsest' models (continuously learning new patterns while forgetting old ones; Mézard et al 1986):

  J_{ij} = \frac{1}{N} \sum_\mu \lambda^\mu\, \xi^\mu_i \xi^\mu_j

  Decrease in capacity: the price to pay for being able to learn continuously.

• Models with discrete synapses: modest decrease in capacity (0.14 → 0.1 for binary synapses; Sompolinsky 1986)

• Models with discrete synapses and one-shot learning ⇒ drastic decrease in capacity (p ∼ √N; Tsodyks 1990, Amit and Fusi 1994), unless memories are sparse or synapses have a large number of states

• Models with E-I separation and spiking neurons (Amit and Brunel 1997, Brunel and Wang 2001): perform as ANNs provided specific conditions on connectivity are satisfied
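An illustrative sketch of the first model in the list above, the Tsodyks-Feigel'man covariance rule for sparse 0/1 patterns; the threshold readout and all parameter values are stand-ins chosen for the demonstration.

```python
# Sparse 0/1 memories with J_ij = 1/(f(1-f)N) sum_mu (xi_i - f)(xi_j - f).
import numpy as np

rng = np.random.default_rng(2)
N, p, f = 1000, 50, 0.05
xi = (rng.random((p, N)) < f).astype(float)   # sparse 0/1 patterns

J = ((xi - f).T @ (xi - f)) / (f * (1 - f) * N)
np.fill_diagonal(J, 0.0)

# One-step check on pattern 0: active units receive input ~ (1 - f),
# inactive units ~ -f, so an intermediate threshold separates them.
theta = 0.5                                   # illustrative threshold
retrieved = (J @ xi[0] > theta).astype(float)
hit = (retrieved * xi[0]).sum() / xi[0].sum()
print("fraction of pattern-0 active units correctly on:", hit)
```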
Inferring learning rules from in vivo data

Inferior temporal cortex (ITC)

• ITC: where perception meets memory (Miyashita 1993)
• Last stage of purely visual processing
• Cells selective to complex visual features (faces, objects, etc.)
• Stores long-term visual memories (lesion studies)
• Persistent activity has been observed in DMS tasks (Fuster, Miyashita, Desimone, ...)
Electrophysiological recordings in ITC of awake monkeys

• How does neuronal activity change as an initially novel stimulus becomes progressively familiar?
• Data from Woloszyn and Sheinberg (2012): 125 novel / 125 familiar images used per session
• Focus on the average visual response in the [75 ms, 200 ms] interval
Distribution of average visual responses for novel/familiar stimuli

• Mean(Familiar) < Mean(Novel) for most neurons
• Best(Familiar) > Best(Novel) for most 'putative' E neurons
Questions

• Can we infer learning rule(s) from changes in firing rate distributions?
• Can learning rule(s) inferred from data generate attractors in recurrent networks?
How do distributions of firing rates evolve with learning?

• Take a rate model with N ≫ 1 neurons, each described by a firing rate r_i;
• Total input to neuron i:

  h_i = I_i + \frac{1}{N} \sum_j J_{ij}\, r_j

• Firing rate r_i = φ(h_i)
• When a novel stimulus is shown, r_i = v_i, where v_i is drawn from P_nov(v)
• This induces changes in synaptic connectivity: J_ij → J_ij + ΔJ(v_i, v_j)
• We assume ΔJ(v_i, v_j) = f(v_i) g(v_j)
• What is the new distribution of rates for the (now familiar) stimulus?
How do distributions of rates evolve with learning?

• Firing rate for the (now familiar) stimulus: r_i = v_i + δv_i
• For small ΔJ, δv, the change in total input due to learning is

  \delta h_i \approx \frac{1}{N} \sum_j \Delta J_{ij}\, v_j + \frac{1}{N} \sum_j J_{ij}\, \delta v_j \approx \underbrace{f(v_i)\, \overline{g(v)\,v}}_{\text{neuron-specific}} + \underbrace{w\, \delta\bar{v}}_{\text{global}}

• The change in input δh_i depends on the visual response v_i through f
  ⇒ change in the rate of neuron i
  ⇒ changes in the distribution of firing rates
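A quick numerical check of the neuron-specific term (the ΔJ contribution only; the global J δv term is not simulated). The rate distribution and the f, g factors below are illustrative stand-ins.

```python
# Check delta_h_i = (1/N) sum_j DeltaJ_ij v_j = f(v_i) * mean(g(v) v).
import numpy as np

rng = np.random.default_rng(3)
N = 2000
v = rng.lognormal(mean=1.0, sigma=0.5, size=N)   # illustrative novel rates

f = lambda r: np.tanh(r - r.mean())              # post-synaptic dependence
g = lambda r: r - r.mean()                       # pre-synaptic dependence

dJ = np.outer(f(v), g(v)) / N                    # DeltaJ_ij = f(v_i) g(v_j)
dh = dJ @ v                                      # change in total input
dh_pred = f(v) * np.mean(g(v) * v)               # neuron-specific prediction
print("max |dh - prediction|:", np.abs(dh - dh_pred).max())   # ~0
```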
Example: covariance learning rule

• Covariance learning rule:

  \Delta J(v_i, v_j) = \alpha\, (v_i - \bar{v})(v_j - \bar{v})

• Changes in total inputs:

  \delta h_i \approx \frac{1}{N} \sum_j \Delta J_{ij}\, v_j + \frac{1}{N} \sum_j J_{ij}\, \delta v_j \approx \alpha\, (v_i - \bar{v})\, \mathrm{Var}(v)

• Linear f-I curve: no change in mean firing rates, δv̄ = 0
• Individual neurons change their rates according to δv_i = α (v_i - v̄) Var(v)
• Broadening of the distribution by a factor 1 + α Var(v)
• With supra-linear f-I curves: increase in mean firing rate
• Covariance rule inconsistent with data
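A numerical illustration of the broadening factor, assuming a linear f-I curve and an illustrative Gaussian rate distribution:

```python
# With delta_v_i = alpha (v_i - vbar) Var(v), the distribution broadens
# by exactly 1 + alpha Var(v) while the mean stays put.
import numpy as np

rng = np.random.default_rng(4)
N, alpha = 100000, 0.3
v = rng.normal(5.0, 1.0, size=N)                 # novel-stimulus rates
v_fam = v + alpha * (v - v.mean()) * v.var()     # rates after learning
print("means (novel, familiar):", v.mean(), v_fam.mean())
print("std ratio:", v_fam.std() / v.std(), "predicted:", 1 + alpha * v.var())
```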
Inferring transfer function

• Infer the transfer function φ from:
  – the empirical distribution of rates for novel stimuli, P_nov(v);
  – an assumed Gaussian distribution of inputs h for novel stimuli, with v = φ(h)
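A minimal sketch of this inference step, assuming rank preservation: quantiles of a standard Gaussian input are matched to quantiles of the measured rate distribution. The rates below are synthetic stand-ins for the ITC data.

```python
# Quantile matching: phi(h_q) = v_q for matching ranks q.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
v_nov = np.sort(rng.lognormal(1.0, 0.8, size=2000))  # stand-in for P_nov

q = (np.arange(v_nov.size) + 0.5) / v_nov.size       # midpoint ranks
h_q = norm.ppf(q)                                    # Gaussian input quantiles
phi = lambda h: np.interp(h, h_q, v_nov)             # empirical transfer fn

print("phi(0) vs median rate:", phi(0.0), np.median(v_nov))
```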
Transfer functions of ITC neurons

• Supra-linear transfer functions, consistent with model and real neurons in the fluctuation-driven regime
Inferring learning rule

• Goal: infer the plasticity rule ΔJ(v_i, v_j) = f(v_i) g(v_j) from P_nov(v) and P_fam(v)
• Assumptions:
  – Stationarity (currently familiar stimuli had, when they were novel, the same distribution as currently novel stimuli)
  – The learning rule preserves rank
• With these assumptions, it is possible to infer f(v), the dependence of the rule on the post-synaptic firing rate, from P_nov(v) and P_fam(v)
• g(v) is undetermined, but it should be such that

  \int g(v)\, P_{nov}(v)\, dv = 0, \qquad \int g(v)\, v\, P_{nov}(v)\, dv > 0
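A simplified sketch of the rank-preservation argument, on synthetic data: if learning preserves rank, the rate change at quantile q is v_fam(q) - v_nov(q), and f is read off (up to the scale set by g) from this quantile-to-quantile map. The full analysis maps rates back through φ⁻¹ to inputs before extracting f; here the comparison is kept in rate space for brevity.

```python
# Rate change at fixed rank, proportional to f under the stated assumptions.
import numpy as np

rng = np.random.default_rng(6)
v_nov = np.sort(rng.lognormal(1.0, 0.8, size=2000))   # synthetic novel rates
v_fam = np.sort(rng.lognormal(0.9, 0.9, size=2000))   # synthetic familiar rates

dv_at_rank = v_fam - v_nov     # quantile map: change at each rank
print("rate change at low/median/high novel rates:",
      dv_at_rank[100], dv_at_rank[1000], dv_at_rank[-100])
```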
Learning rules of individual ITC neurons

Simplest model that quantitatively describes the data:

• Hebbian rule in approximately half of E→E synapses, whose dependence on post-synaptic firing rate is non-linear and biased towards depression
• No plasticity in synapses involving I neurons
Conclusions

• Inferred the post-synaptic dependence of the learning rule from in vivo data
• Data consistent with Hebbian plasticity in E neurons, no plasticity in I neurons
• Firing rate dependence is consistent with a BCM rule
• Sparsening of representations in ITC
• Simple readout for stimulus familiarity (average network activity)

Lim, McKee, Woloszyn, Amit, Freedman, Sheinberg and Brunel (Nat. Neurosci. 2015)
Dynamics of networks with learning rules inferred from data

• Data consistent with a non-linear Hebbian rule whose post-synaptic dependence is dominated by depression
• Does such a rule lead to attractor dynamics?
• What is the storage capacity of such a rule?
Associative memory model constrained by data

• Generate p i.i.d. Gaussian input patterns ξ^μ_i
  ⇒ with an appropriate transfer function φ, the distribution of firing rates automatically matches the data;
• Learning rule ΔJ_ij = f(r^μ_i) g(r^μ_j) inferred from data;
• Final connectivity matrix:

  J_{ij} = \frac{c_{ij}}{cN} \sum_{\mu=1}^{p} f(r^\mu_i)\, g(r^\mu_j)
Associative memory model with analog patterns and non-linear Hebbian rule

• N neurons, whose firing rates obey

  \tau \frac{dr_i}{dt} = -r_i + \phi\Big( I_i + \sum_{j \ne i} J_{ij}\, r_j \Big)

• p random uncorrelated Gaussian input patterns ξ^μ_i ∼ N(0, 1)
• Connectivity matrix

  J_{ij} = \frac{c_{ij}}{cN} \sum_{\mu=1}^{p} f\big(\phi(\xi^\mu_i)\big)\, g\big(\phi(\xi^\mu_j)\big)

  where c_{ij} is an Erdős-Rényi 'structural' connectivity matrix (c_{ij} = 1 with prob. c ≪ 1), and g should be such that ∫Dx g(φ(x)) = 0, ∫Dx g(φ(x)) φ(x) > 0.
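An illustrative simulation of this model. The sigmoidal φ, f, g below are stand-ins (not the fitted ITC functions), with g built from f by the zero-mean offset required above; the overlap with pattern 1 serves as the retrieval diagnostic.

```python
# Rate network with nonlinear Hebbian connectivity built from analog patterns.
import numpy as np

rng = np.random.default_rng(7)
N, p, c, tau, dt = 2000, 10, 0.2, 1.0, 0.1

phi = lambda x: 1.0 / (1.0 + np.exp(-(x - 1.0)))               # transfer fn
f = lambda r: 10.0 / (1.0 + np.exp(-20.0 * (r - 0.6))) - 2.5   # post factor

xi = rng.normal(size=(p, N))            # Gaussian input patterns
r_pat = phi(xi)                         # firing rates on each pattern

# g = f shifted so that <g(phi(x))> = 0 under x ~ N(0,1) (Monte Carlo).
g_pat = f(r_pat) - np.mean(f(phi(rng.normal(size=100000))))
J = (rng.random((N, N)) < c) * (f(r_pat).T @ g_pat) / (c * N)
np.fill_diagonal(J, 0.0)

r = r_pat[0] + 0.1 * rng.standard_normal(N)   # start near stored pattern 0
for _ in range(300):                          # Euler integration
    r = r + (dt / tau) * (-r + phi(J @ r))
print("overlap m with pattern 0:", np.mean(g_pat[0] * r))   # m > 0: retrieval
```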
Transfer functions and learning rules inferred from data

[Figure: fitted transfer function φ and post-synaptic dependence f of the learning rule.]

• Fit both φ and f by sigmoidal functions, for all neurons with significant 'Hebbian' plasticity rules;
• Take g as a sigmoidal function with threshold and gain identical to f, and offset set such that ∫Dx g(φ(x)) = 0
• Simulate and analyze the dynamics of a network with median parameters
Mean-field theory

• Can the network retrieve a stored pattern (i.e. converge to an attractor that is correlated with the pattern)?
• Define order parameters

  m = \Big\langle \frac{1}{N} \sum_i g(\xi^1_i)\, r_i \Big\rangle \quad \text{(overlap with the pattern)}

  \sigma^2 = \Big\langle \frac{1}{N^2} \sum_{\mu>1, j} f^2(\xi^\mu_i)\, g^2(\xi^\mu_j)\, r_j^2 \Big\rangle \quad \text{(quenched noise due to other patterns)}

• In the limits N → ∞, N ≫ cN ≫ 1, p ∼ cN, the order parameters are given by the mean-field equations

  m = \int D\xi\, Dz\; g(\xi)\, \phi\big( f(\xi)\, m + \sigma z \big)

  \sigma^2 = \alpha \int D\xi\, f^2(\xi) \int D\xi\, g^2(\xi) \int D\xi\, Dz\; \phi^2\big( f(\xi)\, m + \sigma z \big)

  where α = p/(cN).

• Retrieval states: solutions with m > 0;
• Storage capacity: the largest α for which retrieval states exist.
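A sketch of a numerical solution of these equations by fixed-point iteration, using Gauss-Hermite quadrature for the Gaussian integrals. It reuses the stand-in φ and f of the simulation sketch above (in the equations, f(ξ) and g(ξ) are shorthand for f(φ(ξ)), g(φ(ξ))).

```python
# Solve the two mean-field equations for (m, sigma) at a few loads alpha.
import numpy as np

phi = lambda x: 1.0 / (1.0 + np.exp(-(x - 1.0)))
f = lambda r: 10.0 / (1.0 + np.exp(-20.0 * (r - 0.6))) - 2.5

x, w = np.polynomial.hermite_e.hermegauss(80)   # nodes/weights for N(0,1)
w = w / w.sum()
f_vals = f(phi(x))
g_vals = f_vals - np.sum(w * f_vals)            # enforce <g(phi(xi))> = 0

def mf_solve(alpha, m=0.5, sigma=0.1, iters=500):
    """Fixed-point iteration on the overlap m and the quenched noise sigma."""
    for _ in range(iters):
        args = np.add.outer(f_vals * m, sigma * x)   # f(xi) m + sigma z grid
        r = phi(args)
        m = np.einsum('i,j,ij->', w * g_vals, w, r)
        s2 = alpha * np.sum(w * f_vals**2) * np.sum(w * g_vals**2) \
             * np.einsum('i,j,ij->', w, w, r**2)
        sigma = np.sqrt(s2)
    return m, sigma

for alpha in (0.01, 0.1, 0.5):
    m, sigma = mf_solve(alpha)
    print(f"alpha={alpha}: m={m:.3f}, sigma={sigma:.3f}")
```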
Learning rules inferred from data lead to attractor dynamics and delay period activity
Storage capacity

• Storage capacity for median parameters is close to 0.6;
• Close to the optimal capacity (α_max ∼ 0.8) in the space of sigmoidal functions f and g.
• The optimal learning rule in this space: both f and g are step functions with high thresholds ⇒ Tsodyks-Feigel'man model
Transition to chaos at strong coupling

• Increasing coupling strength leads to chaotic retrieval states
• Similar to chaotic states in simpler asymmetric rate models (Sompolinsky et al 1988, Tirozzi and Tsodyks 1991)
• Reproduces the strong irregularity and diversity of temporal profiles of activity seen in delay periods in PFC
Conclusions

• A network model with the distribution of patterns and the learning rule inferred from data exhibits attractor dynamics
• The learning rule inferred from data is close to optimal in terms of storage capacity (in the space of Hebbian learning rules with sigmoidal dependence on pre- and post-synaptic rates)
• The transition to chaos at sufficiently strong coupling leads to strong irregularity and diversity of temporal profiles of activity in the delay period, similar to observations in PFC

Pereira and Brunel (Neuron 2018)
Outline

1. Introduction: The Hebbian/Attractor Neural Network scenario
2. A brief overview of the relevant neurobiology
3. The Hopfield approach
4. The Gardner approach
5. Open questions
Gardner approach (1988)

• Instead of focusing on specific learning rules, study the space of all possible connectivity matrices that store a given set of fixed-point attractors, with a given robustness level
• In the case of networks of binary neurons, each neuron has to solve its own perceptron problem: find a hyperplane separating two sets of patterns, those in which it should be active/inactive
• Compute the typical volume of the subspace of solutions using the replica method (Gardner 1988);
• The storage capacity is obtained when this volume goes to zero;
• This storage capacity is an upper bound valid for all possible learning rules.
• Alternative derivation using the cavity method (Mézard 1989)
Storage capacity of the perceptron

• How many random associations can a perceptron with N binary inputs learn, in the large N limit?
• Answer: p = αN, where α stays finite in the large N limit.
  Unconstrained weights, f_0 = 0.5, κ = 0 ⇒ α = 2 (Cover 1965, Gardner 1988)
• Tradeoff between capacity and robustness (Gardner 1988)
• Sign-constrained synapses ⇒ α = 1 (Amit et al 1989)
• Capacity increases with decreasing output coding level, but information content decreases (Gardner 1988)
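A hedged numerical illustration of the Cover/Gardner result α_c = 2 (unconstrained weights, f_0 = 0.5, κ = 0): random ±1 associations are linearly separable iff the constraints y_μ (w · x_μ) ≥ 1 are feasible (the norm of w is free, so margin 1 is equivalent to margin > 0), which a linear program tests exactly. Finite N smears the transition.

```python
# Fraction of separable instances vs load alpha, via an LP feasibility test.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(8)

def separable(N, p):
    X = rng.choice([-1.0, 1.0], size=(p, N))
    y = rng.choice([-1.0, 1.0], size=p)
    res = linprog(c=np.zeros(N), A_ub=-y[:, None] * X, b_ub=-np.ones(p),
                  bounds=(None, None), method="highs")
    return res.status == 0            # 0 = feasible point found

N = 100
for alpha in (1.5, 1.9, 2.1, 2.5):
    frac = np.mean([separable(N, int(alpha * N)) for _ in range(10)])
    print(f"alpha={alpha}: separable fraction = {frac:.1f}")
```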
Questions

• What is the distribution of synaptic weights in the volume of solutions?
• Other statistics of the synaptic connectivity matrix, like joint distributions of pairs of weights, motifs?
• Compare with available data
Set-up

• Fully connected network of N ≫ 1 binary excitatory neurons:

  S_i(t+1) = \Theta\Big( \sum_j J_{ij}\, S_j(t) - \theta \Big), \qquad J_{ij} \ge 0 \ \text{for all } i, j

• Constraints on synaptic connectivity: the network should have a large number (p ∼ O(N)) of fixed-point attractor states S_i = ξ^μ_i (stable representations of external stimuli)
• Each attractor state is a random binary pattern with coding level f:

  P(\xi^\mu_i = 1) = f, \qquad P(\xi^\mu_i = 0) = 1 - f

• Robustness level κ (measures the size of the basin of attraction of each attractor)
Constraints

• Define the 'stabilities' Δ^μ_i as

  \Delta^\mu_i = \frac{2\xi^\mu_i - 1}{\sqrt{N}} \Big( \sum_j J_{ij}\, \xi^\mu_j - \theta \Big)

• The subspace of solutions is defined by

  \Delta^\mu_i \ge \kappa \ \text{for all } i, \mu, \qquad J_{ij} \ge 0 \ \text{for all } i \ne j

• Goal: compute the distributions of the Δs and the Js
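A minimal numerical sketch of this single-neuron problem, with illustrative parameters: find one admissible non-negative weight vector with a linear program and report the stabilities and the fraction of weights pinned at zero. Note that minimizing Σ_j J_j returns a corner of the solution space, which exaggerates sparsity relative to the typical (Gardner-volume) solution.

```python
# One neuron's sign-constrained margin problem, solved as an LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(9)
N, p, f, theta, kappa = 400, 40, 0.2, 40.0, 0.3   # illustrative values

xi = (rng.random((p, N)) < f).astype(float)       # 0/1 patterns, level f
sgn = 2.0 * (rng.random(p) < f) - 1.0             # (2 xi_i^mu - 1) outputs

# Constraints: sgn_mu (xi_mu . J - theta) / sqrt(N) >= kappa, J >= 0.
A = -(sgn[:, None] * xi) / np.sqrt(N)
b = -(kappa + sgn * theta / np.sqrt(N))
res = linprog(c=np.ones(N), A_ub=A, b_ub=b,
              bounds=[(0, None)] * N, method="highs")
if res.status != 0:
    raise RuntimeError("constraints infeasible for these parameters")
J = res.x
delta = sgn * (xi @ J - theta) / np.sqrt(N)
print("min stability:", delta.min())              # >= kappa by construction
print("fraction of zero weights:", np.mean(J < 1e-9))
```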
Computing statistics of connectivity using the cavity method

• Follow Mézard (1989)
• Assume that we have already learned p patterns, in a network of N neurons;
• Add one pattern and learn it; compute the distribution of 'stabilities' of this pattern. This distribution is a function of the distribution of synaptic weights.
• Add one synaptic weight: compute the distribution of this new weight, as a function of the distribution of stabilities.
• This leads to self-consistent equations for the parameters of both distributions.
• Add n ≥ 2 synaptic weights: compute joint distributions of weights / probabilities of motifs.
Add a new pattern, and learn it

• Add a new randomly drawn pattern ξ⃗;
• Consider neuron i, and define ξ = 2ξ_i - 1
• Associated stability:

  \Delta = \frac{\xi}{\sqrt{N}} \Big( \sum_j J_{ij}\, \xi_j - N\theta \Big)

• The distribution of Δ over the space of weights satisfying the previously learned patterns is Gaussian, with moments

  h = \frac{\xi}{\sqrt{N}} \sum_j \langle J_{ij} \rangle\, (\xi_j - f) + \xi f M

  \sigma_h^2 = \frac{1}{N} \sum_j \big( \langle J_{ij}^2 \rangle - \langle J_{ij} \rangle^2 \big) \big[ \xi_j (1 - 2f) + f^2 \big]

• Learning the pattern = removing from the space of weights those for which Δ < κ ⇒ leads to a truncated Gaussian,

  P(\Delta, h) = \frac{1}{\sqrt{2\pi}\,\sigma_h}\; \frac{ \exp\!\Big( -\tfrac{1}{2} \big( \tfrac{\Delta - h}{\sigma_h} \big)^2 \Big) }{ H\!\big( \tfrac{\kappa - h}{\sigma_h} \big) }\; \Theta(\Delta - \kappa)

  where H(x) = \int_x^\infty e^{-t^2/2}\, dt / \sqrt{2\pi} is the Gaussian tail function.
Averaging over the distribution of patterns

• Averaging over the distribution of patterns gives:

  \langle h \rangle = \xi f M, \qquad \big\langle (h - \xi f M)^2 \big\rangle = q f(1-f), \qquad \sigma_h^2 = f(1-f)(Q - q)

  where

  \frac{1}{N} \sum_j \langle J_{ij}^2 \rangle = Q, \qquad \frac{1}{N} \sum_j \langle J_{ij} \rangle^2 = q

• Pattern-averaged distribution of stabilities:

  P(\Delta) = \sum_{\xi = \pm 1} p_\xi \int \frac{dh}{\sqrt{2\pi\, q f(1-f)}} \exp\!\left( -\frac{1}{2} \left( \frac{h - \xi f M}{\sqrt{q f(1-f)}} \right)^2 \right) P(\Delta, h)
Distribution of stabilities at maximal capacity

• At maximal capacity, the space of weights satisfying the constraints shrinks to zero.
• In this limit, σ_h → 0 and q → Q.
• The truncated Gaussian distributions P(Δ, h) converge to delta functions, centered either on κ (if h < κ) or on h (if h > κ).
• The pattern-averaged distribution becomes:

  P(\Delta) = \sum_{\xi} p_\xi \left[ \frac{ \exp\!\left( -\frac{1}{2} \left( \frac{\Delta - \xi f M}{\sqrt{f(1-f)\,Q}} \right)^2 \right) }{ \sqrt{2\pi\, f(1-f)\, Q} }\; \Theta(\Delta - \kappa) + H(\tau_\xi)\; \delta(\Delta - \kappa) \right]

  (Kepler and Abbott 1988; Krauth et al 1988)
Adding one synapse and computing its distribution

• Add a single input, j = 0, to neuron i, with an associated weight J_0. The values of the patterns on this new input are ξ^μ_0, μ = 1, ..., p.
• This changes slightly the stability of all the patterns: Δ^μ → Δ^μ + ε^μ, where

  \epsilon^\mu = \frac{\xi^\mu}{\sqrt{N}}\; J_0\, (\xi^\mu_0 - f)

• The distribution of weights J_0 satisfying all p constraints is

  Q(J_0) \propto \int \prod_\mu P(\Delta^\mu, h^\mu)\, d\Delta^\mu\; \Theta(\Delta^\mu + \epsilon^\mu - \kappa)\; \Theta(J_0)

• In the large N limit:

  Q(J_0) \propto \exp\!\Big( (a - c)\, J_0 - \frac{b}{2}\, J_0^2 \Big)\; \Theta(J_0)

  with

  a = \frac{1}{\sqrt{N}} \sum_\mu \xi^\mu\, (\xi^\mu_0 - f)\, P(\kappa, h^\mu)

  b = \frac{1}{N} \sum_\mu \big[ \xi^\mu_0 (1 - 2f) + f^2 \big] \left( P(\kappa, h^\mu)^2 + \frac{\partial P}{\partial \Delta}(\kappa, h^\mu) \right)

  Again, a truncated Gaussian...
Averaging (again) over the distribution of patterns

• Compute the moments of a and b over the distribution of patterns:

  \sigma_a^2 = \alpha f(1-f) \sum_{\xi} p_\xi \int Du\; \frac{1}{\sigma_h^2}\, \frac{G\big(a_\xi(u)\big)^2}{H\big(a_\xi(u)\big)^2}

  b = \alpha f(1-f) \sum_{\xi} p_\xi \int Du\; \frac{1}{\sigma_h^2} \left[ \frac{G\big(a_\xi(u)\big)^2}{H\big(a_\xi(u)\big)^2} - \frac{G\big(a_\xi(u)\big)}{H\big(a_\xi(u)\big)}\, a_\xi(u) \right]

  a_\xi(u) = \frac{\kappa - \xi f M + u \sqrt{q f(1-f)}}{\sqrt{f(1-f)(Q - q)}}

  where G is the standard Gaussian density and H its tail function.

• The averaged distribution of weights is

  Q(J_0) = \int Du\; \frac{ \exp\!\Big( -\frac{b}{2} J_0^2 + J_0 (\bar{a} + u \sigma_a) \Big)\; \Theta(J_0) }{ \int_0^{+\infty} dw\, \exp\!\Big( -\frac{b}{2} w^2 + w (\bar{a} + u \sigma_a) \Big) }
Maximal capacity

• When q → Q, both a and b diverge as 1/(Q - q);
• Again, the truncated Gaussians converge to delta functions.

P(J_ij) at maximal capacity:

  P(J_{ij}) = S\, \delta(J_{ij}) + \frac{1}{\sqrt{2\pi}\, \sigma_W} \exp\!\left[ -\frac{1}{2} \left( \frac{J_{ij}}{\sigma_W} + J_0(S) \right)^2 \right] \Theta(J_{ij})
[Figure: P(J) at maximal capacity (probability density vs synaptic weight) for robustness ρ = 0, 1, 2, 3; connection probability vs robustness parameter ρ for f = 0.5, 0.1, 0.01.]
Robust storage implies sparse connectivity

• Increasing robustness means increasing the distance to the pattern-associated hyperplanes;
• But not from the J_ij = 0 hyperplanes
• As a result, robust solutions are concentrated on a large fraction of the J_ij = 0 hyperplanes

[Figure: schematic of the solution space in the (w_1, w_2) plane with margin κ.]
Distribution of weights below capacity

[Figure: P(w) vs w/⟨w⟩, log scale.]
Data: intracellular recordings in cortical slices

Recordings by Jesper Sjöström (Sjöström et al 2001, Song et al 2005)
Distribution of synaptic weights: experiment vs theory

• A large fraction of zero-weight synapses is consistent with data:
  – Anatomy: nearby pyramidal cells are potentially fully connected (Kalisman et al 2005)
  – Electrophysiology: nearby pyramidal cells have a connection probability of ∼ 10% (Mason et al 1991, Markram et al 1997, Sjöström et al 2001, Holmgren et al 2003)
  ⇒ Large fraction of zero-weight ('potential' or 'silent') synapses.

[Figure: theoretical P(J) for ρ = 0, 1, 2, 3 and connection probability vs ρ (f = 0.5, 0.1, 0.01), alongside the experimental histogram of EPSP amplitudes (number of synapses vs EPSP amplitude, mV); Sjöström et al 2001, Song et al 2005.]
Two-neuron connectivity

• Calculation of the joint distribution P(J_ij, J_ji) using the cavity method;
• Leads to a truncated, correlated 2-D Gaussian, with delta functions on the J_ij = 0 and J_ji = 0 axes (and a product of two delta functions at J_ij = J_ji = 0).
• Over-representation of bidirectionally connected pairs of neurons, compared to a random uncorrelated network with the same connection probability

[Figure: joint density of (w_ij, w_ji); excess of reciprocal connections vs robustness parameter ρ, for f = 0.5, 0.1, 0.01.]
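An illustrative companion to this calculation: store the same binary patterns in every neuron of a small network by solving each neuron's sign-constrained margin problem (as in the earlier LP sketch), then compare the reciprocal-connection probability with the chance level p_conn². A linear program returns one corner of the solution space rather than a typical Gardner-volume sample, so the numbers are only indicative; a ratio above 1 indicates over-representation of reciprocal pairs.

```python
# Reciprocity statistics of an LP-constructed attractor network (illustrative).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(10)
N, p, f, theta, kappa = 80, 10, 0.2, 8.0, 0.2

xi = (rng.random((p, N)) < f).astype(float)
J = np.zeros((N, N))
for i in range(N):
    idx = np.arange(N) != i                       # inputs j != i
    sgn = 2.0 * xi[:, i] - 1.0                    # neuron i's target outputs
    A = -(sgn[:, None] * xi[:, idx]) / np.sqrt(N - 1)
    b = -(kappa + sgn * theta / np.sqrt(N - 1))
    res = linprog(c=np.ones(N - 1), A_ub=A, b_ub=b,
                  bounds=[(0, None)] * (N - 1), method="highs")
    if res.status == 0:
        J[i, idx] = res.x

conn = (J > 1e-9)
off = ~np.eye(N, dtype=bool)
p_conn = conn[off].mean()
p_recip = (conn & conn.T)[off].mean()
print("connection prob:", p_conn, "reciprocal prob:", p_recip)
print("excess over random:", p_recip / p_conn**2)
```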
Two-neuron connectivity: experiment vs theory

[Figure: reciprocal connection probability vs connection probability.]

• Markram et al (1997), rat L5 somatosensory cortex: 3x random;
• Song et al (2005), rat L5 visual cortex: 4x random;
• Wang et al (2006): rat PFC, 4x random; rat visual cortex, 2x random;
• Lefort et al (2009), mouse barrel cortex: ∼ random
Distributions of weights: bidirectional vs unidirectional

[Figure: probability density (1/mV) vs EPSP amplitude (mV) for bidirectional and unidirectional connections, experiment vs theory.]
Conclusions

• Maximizing information storage drives a large fraction (> 0.5) of synapses to zero
• Increasing robustness increases sparseness
• Networks storing fixed-point attractors have an intermediate degree of symmetry
• Networks storing sequences are asymmetric (no overrepresentation of bidirectional connections)
• Theoretical distributions of synaptic weights match data recorded in cortex
• Consistent with optimality of information storage in cortex
Outline

1. Introduction: The Hebbian/Attractor Neural Network scenario
2. A brief overview of the relevant neurobiology
3. The Hopfield approach
4. The Gardner approach
5. Open questions
Open questions

• Apply the Gardner approach to more realistic networks
• Generalize the Gardner approach: attractor states can be at a finite distance from the patterns (not identical to them)
• How close to the Gardner bound can we get with purely unsupervised learning?
• Information storage at larger scales?
• Storage of time-varying patterns?
• Mechanisms for storing multiple items in working memory?
• Networks with discrete synapses and on-line learning: which learning algorithms maximize capacity?
• Suppose we can measure the full matrix J_ij (the 'connectome'). Can we infer stored memories from the connectome?