More than Moore with Neuromorphic Computing Architectures
Colloquium GSI Darmstadt, December 19, 2017
Karlheinz Meier
Ruprecht-Karls-Universität Heidelberg
[email protected] @brainscales
"On computable numbers, with an application to the Entscheidungsproblem", published 1937 in Proceedings of the London Mathematical Society
„We may compare a man in the process of computing a real number with a machine which is only capable of a finite number of conditions“
The Turing Machine
Modelled after a human executing a set of computational instructions:
1. Limited size of internal storage → finite number of states
2. Infinite size of external storage, but only a fraction can be used at a given time t
1. + 2. uniquely determine the state of the machine at time t+1
Each Turing machine uniquely represents an algorithm
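The two ingredients listed above (a finite set of internal states and an unbounded tape read one cell at a time) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `run_turing_machine`, the transition-table layout and the bit-inverting example machine `INVERT` are assumptions made for this example, not something taken from the talk.

```python
from collections import defaultdict


def run_turing_machine(transitions, tape, start_state, halt_state,
                       blank="_", max_steps=10_000):
    """Run a single-tape Turing machine and return the final tape content.

    transitions maps (state, symbol) -> (next_state, symbol_to_write, move),
    where move is "L" or "R". State and symbol under the head at time t
    uniquely determine the configuration at time t+1.
    """
    # Unbounded tape: cells that were never written read as blank.
    cells = defaultdict(lambda: blank, enumerate(tape))
    state, head, steps = start_state, 0, 0
    while state != halt_state:
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        next_state, write, move = transitions[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
        state = next_state
        steps += 1
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)


# Hypothetical example machine: invert every bit, halt at the first blank.
INVERT = {
    ("flip", "0"): ("flip", "1", "R"),
    ("flip", "1"): ("flip", "0", "R"),
    ("flip", "_"): ("halt", "_", "R"),
}

result = run_turing_machine(INVERT, "0110", "flip", "halt")
print(result)
```

The transition table plays the role of the algorithm: a different table is a different machine.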
Realizing the Turing Machine: The von Neumann Architecture
– Data and instructions stored in memory
– Content of memory addressable by location
– Instructions executed sequentially unless order is explicitly modified
– Memory and computation physically separated
[Figure: von Neumann architecture with memory (data, instructions), computation and control. © IBM]
Realizing the von Neumann Architecture: Claude Shannon, from algebra to logic circuits
[Figure credits: © H. Eisenreich, TU Dresden; E. coli; © XILINX Inc.]
Realizing Shannon's Gates: Physics takes over - Digital is really Analog
Limits of Technology Scaling
[Figure: performance, transistors, thread performance, clock frequency, power (watts) and number of cores versus year, with the time axis extended to 2020, 2025 and 2030; technology nodes 180 nm, 65 nm, 28 nm, 10 nm. Figure courtesy of Kunle Olukotun, Lance Hammond, Herb Sutter, and Burton Smith]
K. Boahen, 2017
No more progress from smaller transistors
New ARCHITECTURES suddenly interesting!
Non-von Neumann
Non-Turing
The Brain – Extreme Matter
10^11 nodes (neurons)
10^15 connections (synapses)
Dynamic long-range and short-range interactions
Open system driven by external world
Time scales from milliseconds to years
Stochastic on the microscopic level
Assets of brain computing
• Energy efficiency
• Compactness
• Fault tolerance
• Speed
• Configuration and learning replace programming
• Scalability
Larry Smarr, Calit
Conventional computing is moving away from the brain
Why brain-inspired computing?
• Future computing based on biological information processing
• Understanding biological information processing
Two fundamentally different modeling approaches:
• NUMERICAL MODEL (Turing): represents model parameters as binary numbers
• PHYSICAL MODEL (not Turing): represents model parameters as physical quantities → voltage, current, charge (like the biological brain)
Can be combined to form a hybrid system
Need model system to test ideas
Hermann v. Helmholtz (1821-1894), Julius Bernstein (1839-1917)
Santiago Ramón y Cajal (1852-1934)
Individual cells in the brain are spatially separated constituents
“interaction over a distance” and “spatial and temporal integration”
[Figure scale bar: 5 µm]
Basics of neural communication
[Figure labels: neurons, synapses, action potential (“spike”), output spike, neuron threshold voltage, membrane voltage]
• Threshold for non-linear response leading to all-or-none law (spikes)
• Neurons integrate over space and time - characteristic time constants
• Temporal correlation is important
• Mixed-signal system: action potential ↔ membrane voltage
Ahrens et al., Nature Methods 10, 413–420 (2013)
W. W. Norton
Some Electrical Quantities of a real Neuron Membrane
Membrane capacitance C
Ion channel conductance $g_i = 1/R_i$
Membrane voltage U
Ion channel current $I_i$
U, I and g are functions of time in an operating network! Current theories and modelling are treating these quantities only (few exceptions)
Moving charges Q
$\vec{j}_{\mathrm{drift}} = \sigma \cdot \vec{E} = -\sigma \cdot \mathrm{grad}(\Phi)$
$\vec{j}_{\mathrm{diffusion}} = -D \cdot \mathrm{grad}(n)$
Hodgkin-Huxley 1952: Describing the non-linearity
Markram, 2015
PhD, Mihai Petrovici, 2015
Leaky-integrate-and-fire (LIF)
Diesmann, Proceedings of the 4th Biosupercomputing Symposium, Tokyo, 2012
K-Computer, RIKEN Lab, 12.6 MW. Processor-to-neural-cell ratio 1:20.000
Simulation speed 1.520:1 compared to biological real-time
N. Gogtay et al., PNAS 101 (21): 8174-8179, 2004
15 years of wiring up the adolescent brain during development
Slow dynamics
Time Scales
                      Nature + Real-time   Simulation
Causality Detection   10^-4 s              0.1 s
Synaptic Plasticity   1 s                  1000 s
Learning              Day                  1000 Days
Development           Year                 1000 Years
12 Orders of Magnitude
Evolution             > Millennia          > 1000 Millennia
> 15 Orders of Magnitude
Energy Scales
[Figure 5: efficiency scale from 10^0 J (1 Joule) through 10^-4 J (0.1 milliJoule), 10^-8 J (10 nanoJoule) and 10^-10 J (0.1 nanoJoule) down to 10^-14 J (10 femtoJoule). From: HBP project report]
Computational primitive: energy used for a synaptic transmission
10-14 orders of magnitude difference for „the same thing“
How much does a Neural Computation cost?
FROM TOP TO BOTTOM (Human Brain)
20 W total power, equally shared
100 billion neurons firing at 1 Hz → 10^-10 Joule per action potential
10^15 synapses transmitting at 1 Hz → 10^-14 Joule per synaptic transmission
FROM BOTTOM TO TOP
Approx. 10^9 ATP molecules to be hydrolyzed for action potential
Approx. 10^5 ATP molecules to be hydrolyzed for synaptic transmission (D. Attwell and S. B. Laughlin)
Obtain 10^-19 Joule (approx. 1 eV) per ATP molecule (Bray, Dennis. Cell Movements. New York: Garland, 1992)
10^-10 Joule (100.000 fJ = 0.1 nJ) per action potential
10^-14 Joule (10 fJ) per synaptic transmission
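The top-down and bottom-up estimates above can be checked against each other with a few multiplications. The sketch below uses only the numbers quoted on the slide; the function and variable names are chosen for this example.

```python
import math


def power_watts(n_units, rate_hz, energy_per_event_j):
    """Power drawn by n_units, each producing events at rate_hz."""
    return n_units * rate_hz * energy_per_event_j


def energy_from_atp(n_atp, energy_per_atp_j=1e-19):
    """Bottom-up energy of one event from the number of ATP molecules hydrolyzed."""
    return n_atp * energy_per_atp_j


# Top-down: 20 W shared equally between action potentials and synapses.
spike_power = power_watts(1e11, 1.0, 1e-10)    # neurons
synapse_power = power_watts(1e15, 1.0, 1e-14)  # synapses
total_power = spike_power + synapse_power

# Bottom-up: ATP counts times energy per ATP molecule.
spike_energy = energy_from_atp(1e9)
synapse_energy = energy_from_atp(1e5)

print(f"total power: {total_power:.1f} W")
print(f"action potential: {spike_energy:.1e} J, synaptic transmission: {synapse_energy:.1e} J")
```

Both routes give about 0.1 nJ per action potential and about 10 fJ per synaptic transmission, which is why the two halves of the slide agree.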
Metal (100 nm times 100 nm)
Insulator (oxide) < 20 atom layers
Semiconductor
2 Volts
„Switching“ of a MOS transistor (½CU²): approximately 0.5 fJ
Synaptic transmission: approximately 10 fJ
20 CMOS transistors
Electronics vs. biology on the device level - not a big difference!
It's not the devices, it's the ARCHITECTURE and the COMPUTATIONAL MODEL
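The ½CU² switching energy can be estimated by treating the gate as a parallel-plate capacitor. The sketch below is an order-of-magnitude check only: the relative permittivity of 3.9 (silicon dioxide) and the 0.25 nm thickness per atom layer are assumptions introduced here, not values given on the slide.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity in F/m


def switching_energy_j(side_m, oxide_thickness_m, voltage_v, eps_r=3.9):
    """Energy 1/2 * C * U^2 of a square parallel-plate gate capacitor.

    eps_r = 3.9 assumes a silicon dioxide insulator (assumption of this sketch).
    """
    capacitance_f = EPSILON_0 * eps_r * side_m ** 2 / oxide_thickness_m
    return 0.5 * capacitance_f * voltage_v ** 2


# Slide values: 100 nm x 100 nm gate, 2 V.
# Assumed: 20 atom layers of roughly 0.25 nm each, i.e. about 5 nm of oxide.
energy_j = switching_energy_j(100e-9, 20 * 0.25e-9, 2.0)
print(f"switching energy: {energy_j * 1e15:.2f} fJ")
```

With these assumptions the result is a fraction of a femtojoule. A thinner oxide (the slide says fewer than 20 atom layers) raises the capacitance and moves the estimate toward the approximately 0.5 fJ quoted above.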
The Brain in a Computer? Computers like Brains?
[Figure: layered network with input layer X(l=1), inner layers X(l=2), X(l=3), ..., output layer X(l=N), connected by weights W(l=1), W(l=2), ..., W(l=N); example output "CAT"]
Artificial Neuronal Networks ignore time evolution...
Here: local, no recurrency, feed-forward
Pairs of neurons connected by weights
Neuron performs integration (summing)
[Figure: single neuron with inputs $x_i$ and weights $w_i$, $i = 1, \dots, I$]
$u = \sum_{i=1}^{I} w_i x_i, \qquad y = f(u)$
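The weighted-sum neuron and the layered, feed-forward arrangement described on this slide can be sketched as follows. The choice of `tanh` as the activation f and the example weights are assumptions made for illustration; the slide leaves f unspecified.

```python
import math


def neuron(x, w, f=math.tanh):
    """Single artificial neuron: u = sum_i w_i * x_i, y = f(u)."""
    u = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return f(u)


def feed_forward(x, layers, f=math.tanh):
    """Propagate x through a list of weight matrices, one per layer.

    Each matrix is a list of rows; row j holds the input weights of neuron j.
    There is no recurrence and no notion of time: each layer is evaluated once.
    """
    for weight_matrix in layers:
        x = [neuron(x, row, f) for row in weight_matrix]
    return x


# Hypothetical 3-2-1 network.
layers = [
    [[0.5, -0.5, 1.0], [1.0, 1.0, 0.0]],  # inner layer, 2 neurons
    [[1.0, -1.0]],                        # output layer, 1 neuron
]
output = feed_forward([1.0, 2.0, 3.0], layers)
print(output)
```

Note that the computation is a pure function of the input; this is the sense in which such networks ignore time evolution.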
Learning Example: Supervised
[Figure labels: labelled input data, actual output, desired output, deviation]
Jump start: Strategic Network – supervised learning – predict human moves – database of existing matches: 160.000 matches, 30 million positions
Policy Network – reinforcement learning – network self-matches: 128.000.000 matches
Value Network – combination of first 2 steps – 30 million self-matches
One year learning time, 0.5 MW
Energy: 183 MWh
Excessive training samples
Learning is slow and expensive
Application is fast
https://upload.wikimedia.org/wikipedia/commons/0/0a/Temporal_summation.JPG
What is time (spiking...) good for?
• Sparse information coding by time correlations
• Short-term spike-based synaptic plasticity (STP)
• Spike-timing-dependent plasticity (STDP)
• Temporal noise (stochasticity) based computing
• Energy efficiency
• Computational advantages
Brain-inspired or brain-derived or neuromorphic computing
Digital
• Discrete values of physical variables
• Computation by Boolean algebra
• One wire, one bit of information
• Signal restored after gate
Analog
• Continuous values of physical variables
• Computation by component physics
• One wire, many bits of information
• Signal not restored after stage
Nature / mixed-signal
• Local analogue computation
• Binary communication by spikes
• Signal restoration
Large-scale Neuromorphic Computing – compare
• Commodity microprocessors: SpiNNaker, HBP (soft binary code)
• Custom fully digital: TrueNorth, IBM (hard binary code)
• Custom mixed-signal: BrainScaleS, HBP (physical model)
Anything in common?
+ Massively parallel (close to perfect weak scaling)
+ Asynchronous communication
+ Configurability
- Limited flexibility and complexity in neural models
COMPLEMENTARITY OF APPROACHES ESSENTIAL!
HBP Neuromorphic Computing Concepts
PHYSICAL MODEL SYSTEM
Local analog computing with 4 million neurons and 1 billion synapses – binary, asynchronous communication – x10000 accelerated emulation
Location: Heidelberg (Germany)
MANY-CORE NUMERICAL MODEL SYSTEM
0.5 – 1 million ARM processors – address-based, small packet, asynchronous communication – real-time simulation
Location: Manchester (UK)
Physical Model System: Continuous Time Integrating Neural Cell Membrane (+ non-linearity)
$C_m \frac{dV}{dt} = -g_{leak} (V - E_{leak})$
[Circuit: membrane capacitance $C_m$, leak resistance $R = 1/g_{leak}$, leak potential $E_{leak}$, membrane voltage $V(t)$]
               g_leak [S]   C_m [F]
Biology (*)    10^-8        10^-10
VLSI           10^-6        10^-13
(*) Brette/Gerstner, J. Neurophysiology, 2005
„Time“ is imposed by internal physics, not by external control
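For the leak-only membrane equation above, the characteristic time is the time constant τ = C_m / g_leak. The sketch below computes it for the two rows of the table and checks a simple forward-Euler integration against the analytic exponential relaxation. The starting voltage, leak potential and step count are illustrative assumptions.

```python
import math


def time_constant_s(c_m, g_leak):
    """Membrane time constant tau = C_m / g_leak in seconds."""
    return c_m / g_leak


def relax(v0, e_leak, c_m, g_leak, t_end, n_steps=10_000):
    """Forward-Euler integration of C_m dV/dt = -g_leak (V - E_leak)."""
    dt = t_end / n_steps
    v = v0
    for _ in range(n_steps):
        v += dt * (-g_leak * (v - e_leak)) / c_m
    return v


tau_biology = time_constant_s(1e-10, 1e-8)  # table row "Biology"
tau_vlsi = time_constant_s(1e-13, 1e-6)     # table row "VLSI"

# Assumed illustrative voltages: start at -50 mV, leak potential -65 mV.
v_after_one_tau = relax(-0.050, -0.065, 1e-10, 1e-8, t_end=tau_biology)
v_analytic = -0.065 + 0.015 * math.exp(-1.0)

print(f"tau biology: {tau_biology * 1e3:.1f} ms, tau VLSI: {tau_vlsi * 1e9:.0f} ns")
```

With the order-of-magnitude values in the table, the biological membrane relaxes in about 10 ms and the VLSI membrane in about 100 ns. No clock enters the calculation: the time scale follows from the component values alone.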
$c_m \frac{dV}{dt} = -g_{leak} (V - E_l) + \sum_k p_k g_k (V - E_x) + \sum_l p_l g_l (V - E_i)$
$p_{k,l}(t)$: exponential onset and decay (post-synaptic potential shape)
$g_{k,l}$: 0 to $g_{max}$ (“weights”)
Effective membrane time constant $c_m / g_{total}$ is time-dependent
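The statement that the effective time constant c_m / g_total depends on time can be illustrated with a membrane driven by one excitatory synapse. This is a simplified sketch under stated assumptions: a single input spike, a synaptic time course p(t) with instantaneous onset and exponential decay (the slide describes an exponential onset as well), a sign convention in which every conductance pulls V toward its own reversal potential, and illustrative parameter values not taken from the talk.

```python
import math


def simulate(t_end=0.1, dt=1e-5, c_m=1e-10, g_leak=1e-8, e_leak=-0.065,
             e_exc=0.0, g_max=2e-8, tau_syn=5e-3, t_spike=0.02):
    """Forward-Euler trace of (t, V, tau_eff) for one excitatory input spike."""
    v = e_leak
    trace = []
    for step in range(int(round(t_end / dt))):
        t = step * dt
        # Synaptic time course: 0 before the spike, exponential decay after it.
        p = math.exp(-(t - t_spike) / tau_syn) if t >= t_spike else 0.0
        g_syn = p * g_max
        dv_dt = (-g_leak * (v - e_leak) - g_syn * (v - e_exc)) / c_m
        v += dt * dv_dt
        tau_eff = c_m / (g_leak + g_syn)  # effective membrane time constant
        trace.append((t, v, tau_eff))
    return trace


trace = simulate()
peak_v = max(v for _, v, _ in trace)
min_tau = min(tau for _, _, tau in trace)
rest_tau = trace[0][2]
final_v = trace[-1][1]
print(f"peak V: {peak_v * 1e3:.1f} mV, tau_eff from {rest_tau * 1e3:.1f} ms down to {min_tau * 1e3:.2f} ms")
```

While the synapse is open the total conductance rises, so the membrane responds faster; once the input has decayed, the time constant returns to c_m / g_leak.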
A New VLSI Model of Neural Microcircuits Including Spike Time Dependent Plasticity, Johannes Schemmel, Karlheinz Meier, Eilif Muller, Proceedings of the 2004 International Joint Conference on Neural Networks (IJCNN'04), IEEE Press, pp. 1711-1716, 2004
Implementation example with synaptic inputs and neuron non-linearity
Mixed-signal: analog cores, binary communication
Physical model, local analogue computing, binary continuous-time communication
Wafer-scale integration of 200.000 neurons and 50.000.000 synapses on a single 20 cm wafer
Short-term and long-term plasticity, 10.000 times faster than real-time
[Figure: modelling workflow with the stages biological databases; theory; model building (cells, network, plasticity); formal network description; technology mapping; technology routing; hardware platform constraints; simulation and verification; neuromorphic substrate]
Configuration space: 40 MB for a full wafer
Challenge and Opportunity: Variability
Marder, Taylor, Nature Neuroscience 14, Nr 2, 2011
Pyloric rhythm of the crustacean stomatogastric ganglion
20.000.000 model networks created with 17 random cell parameters, fixed connectivity (Neuron)
400.000 networks found with „identical (degenerate)“ timing behaviour in measured biological range
Sensitivity of single parameters within „degenerate“ solutions
Marder, Taylor, Nature Neuroscience 14, Nr 2, 2011
Variability has to be at the right place...
Hardware-In-the-Loop
Millions of parameters
- network topology
- neuron sizes and parameters
- synaptic strengths
What for?
- Calibration
- Learning
- Environment
- Data
Separated?
[Figure: closed loop between a conventional computer (data or simulated environment, learning mechanisms) and the neuromorphic machine (physical model emulation); exchanged signals: action of machine on data or environment, state of data or environment, reward (penalty)]
Sebastian Schmitt et al., accepted IJCNN 2017
Feed-forward, rate-based, 4-layer spiking network
MNIST classification on a physical model machine: performance before and after hardware in-the-loop learning
https://arxiv.org/abs/1703.01909
MNIST classification on a physical model machine: neuronal firing activity after hardware in-the-loop learning
[Figure labels: input, 2x hidden, label]
https://arxiv.org/abs/1703.01909
Schmuker, M. et al., "A neuromorphic network for generic multivariate data classification." Proceedings of the National Academy of Sciences (2014): 201303053.
3-layer spiking neuron network derived from insect olfactory system
L I: Receptor neurons
L II: Decorrelation through lateral inhibition (glomeruli)
L III: Association (soft WTA through strong inhibitory populations)
Supervised learning: synaptic projections from layer 2 to layer 3
Example for insect-brain-derived circuit
Schmuker, M. et al., "A neuromorphic network for generic multivariate data classification." Proceedings of the National Academy of Sciences (2014): 201303053.
Neuronal firing activity before and after learning
Application in generic multivariate data classification
T. Pfeil, A.-C. Scherzer, J. Schemmel and K. Meier, Neuromorphic Learning towards Nano Second Precision, Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN). Dallas, TX, USA: IEEE Press, 2013, pp. 869-873.
On-chip: Spike-Timing-Dependent Plasticity
Boltzmann Machines
Networks of symmetrically connected stochastic nodes k
State of nodes described by vector of binary random variables $z_k \in \{0, 1\}$
Probability for state vector converges to a target Boltzmann distribution (energy function)
WHAT FOR? Learn internal stochastic model of input space – generate or discriminate
Stochastically modulated firing probability
Deterministic firing
M. A. Petrovici, J. Bill, I. Bytschok, J. Schemmel, and K. M.: Stochastic inference with spiking neurons in the high-conductance state. Physical Review E 94, 2016
Stochasticity from
• External noise source
• Internal noise source
• Ongoing network activity
Petrovici, Mihai A., et al. "Stochastic inference with deterministic spiking neurons." arXiv preprint arXiv:1311.3211
Buesing, Lars, et al. "Neural dynamics as sampling: a model for stochastic computation in recurrent networks of spiking neurons." PLoS Comput. Biol 7.11 (2011): e1002211.
Spiking LIF neurons as a 2-state (binary) system
Neural computability condition
A network of spiking neurons draws samples from the joint distribution p if the membrane potentials of the neurons follow [equation]
Corresponds to a logistic activation function of the neuron
PhD, Mihai Petrovici; BA, Luziwei Leng, 2015
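The link between a logistic activation function and sampling from a Boltzmann distribution can be illustrated with abstract binary units. The sketch below is plain Gibbs sampling, not a model of LIF neuron dynamics: each unit k switches on with probability σ(u_k), where u_k = b_k + Σ_j W_kj z_j, and the visited states are compared with the exact distribution p(z) ∝ exp(Σ_{i<j} W_ij z_i z_j + Σ_i b_i z_i). The weights, biases and function names are assumptions chosen for this example.

```python
import itertools
import math
import random


def logistic(u):
    return 1.0 / (1.0 + math.exp(-u))


def exact_boltzmann(W, b):
    """Exact state probabilities by enumerating all 2^n binary states."""
    n = len(b)
    weights = {}
    for z in itertools.product((0, 1), repeat=n):
        pair_term = sum(W[i][j] * z[i] * z[j]
                        for i in range(n) for j in range(i + 1, n))
        weights[z] = math.exp(pair_term + sum(b[i] * z[i] for i in range(n)))
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}


def gibbs_sample(W, b, n_sweeps, seed=0):
    """Update units one at a time with a logistic switching probability."""
    rng = random.Random(seed)
    n = len(b)
    z = [0] * n
    counts = {}
    for _ in range(n_sweeps):
        for k in range(n):
            u = b[k] + sum(W[k][j] * z[j] for j in range(n) if j != k)
            z[k] = 1 if rng.random() < logistic(u) else 0
        key = tuple(z)
        counts[key] = counts.get(key, 0) + 1
    return {key: c / n_sweeps for key, c in counts.items()}


# Hypothetical symmetric weights and biases for three units.
W = [[0.0, 1.0, -0.5],
     [1.0, 0.0, 0.8],
     [-0.5, 0.8, 0.0]]
b = [-0.2, 0.1, 0.0]

exact = exact_boltzmann(W, b)
sampled = gibbs_sample(W, b, n_sweeps=50_000)
max_error = max(abs(exact[z] - sampled.get(z, 0.0)) for z in exact)
print(f"largest deviation between sampled and exact probabilities: {max_error:.4f}")
```

Only local information (the states of the connected units) enters each update, which is what makes this scheme attractive for physical networks.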
Learning specific input distributions by adjusting LOCAL interactions
- Clamp visible units to value of particular pattern – reach thermal equilibrium
- Increment interaction between any 2 nodes that are both on
- Run network freely and sample from stored probability distribution
- Infer from clamped input
Free running, „Dreaming“: generative
Inferring: input incompatible with 0
Discriminative
Energy Scales: energy used for a synaptic transmission
Filling the Gap
- Typically 10.000.000 times more energy efficient than state-of-the-art HPC (comparable model)
- 10.000 times less efficient than biology
From: HBP project report
Time Scales
                      Nature + Real-time   Simulation          Accelerated Model
Causality Detection   10^-4 s              0.1 s               10^-8 s
Synaptic Plasticity   1 s                  1000 s              10^-4 s
Learning              Day                  1000 Days           10 s
Development           Year                 1000 Years          3000 s
12 Orders of Magnitude
Evolution             > Millennia          > 1000 Millennia    > Months
> 15 Orders of Magnitude
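The accelerated-model column follows from dividing the biological time scale by the acceleration factor. The sketch below assumes the factor of 10.000 quoted for the physical model system and a 365-day year; the dictionary keys are labels chosen for this example.

```python
import math

SPEEDUP = 1e4  # acceleration factor of the physical model system
SECONDS_PER_DAY = 86_400.0

biological_s = {
    "causality detection": 1e-4,
    "synaptic plasticity": 1.0,
    "learning (1 day)": SECONDS_PER_DAY,
    "development (1 year)": 365 * SECONDS_PER_DAY,
}


def accelerated(seconds, speedup=SPEEDUP):
    """Wall-clock time needed on hardware that runs speedup times faster."""
    return seconds / speedup


accelerated_s = {name: accelerated(t) for name, t in biological_s.items()}
for name, t in accelerated_s.items():
    print(f"{name}: {t:.3g} s")
```

A day of biological learning shrinks to under ten seconds and a year of development to under an hour, matching the rounded entries in the table.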
[Figure: chip block diagram with four processing elements, router, SerDes links, MCU, memory interface and shared memory]
Today: Working prototypes
2020: Operational systems
Goal: learning cognitive machines
SpiNNaker-2
• 4-core Quad Processing Element
• 25 GIPS/W on a single die
• Floating point precision
• True random numbers
BrainScaleS-2
• Flexible local learning
• On-the-fly network reconfiguration
• Structured neurons
• Dendritic computation
Next generation of NM computing in the HBP
Final Thoughts
• After 10 years of development the BrainScaleS large-scale physical hardware system is being commissioned and delivers first results
• Fully non-Turing, physical model computing can solve established machine learning tasks
• 2nd generation physical model systems start to offer very advanced accelerated local learning capabilities and exploitation of dendritic computation
Goal: Build a continuously learning cognitive machine