Large-scale Neural Modeling in MapReduce and Giraph

Presenter: Shuo Yang, Graduate Programs in Software, University of St. Thomas
Co-author: Nicholas D. Spielman, Neuroscience Program, University of St. Thomas
Special thanks: Bhabani Misra, PhD (Graduate Programs in Software), Jadin C. Jackson, PhD (Department of Biology), and Bradley S. Rubin, PhD (Graduate Programs in Software), University of St. Thomas


DESCRIPTION

Using MapReduce and Giraph to model large-scale neural networks


Page 1

(Title slide: title, presenter, co-author, and acknowledgments.)

Page 2

Why Hadoop & What is Hadoop

Why not supercomputers?
  - Expensive
  - Limited access
  - Limited scalability

Why Hadoop?
  - Runs on commodity hardware
  - Scalable
  - Full-fledged ecosystem & community
  - Open-source implementation of MapReduce, based on Java

MapReduce Model

[Diagram: a client submits a MapReduce job; input data stored in HDFS is split across Map tasks, the intermediate map output is passed to Reduce tasks, and the final output is written back to HDFS.]
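The flow in the diagram can be sketched in a few lines of Python (the implementations discussed in this talk are Java on Hadoop; this in-memory stand-in only illustrates the model, using the classic word-count example):

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Minimal in-memory sketch of the MapReduce model: map each record
    to (key, value) pairs, group by key (the sort & shuffle step), then
    reduce each group of values."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):    # Map phase
            grouped[key].append(value)       # Sort & shuffle (grouping)
    return {key: reducer(key, values)        # Reduce phase
            for key, values in grouped.items()}

# Classic word-count usage:
docs = ["hadoop runs on commodity hardware", "hadoop is scalable"]
counts = map_reduce(
    docs,
    mapper=lambda doc: [(word, 1) for word in doc.split()],
    reducer=lambda word, ones: sum(ones),
)
```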

Page 3

Neural Model (Izhikevich model)

[Figure: each neuron receives input currents I1, I2, ..., In from its neighbors, sums them (∑I), updates its membrane potential (∆v), and sends currents to all of its neighbors through the synaptic weight matrix. A raster plot of simulation results shows Neuron ID (0-2500) against Time Step (0-1000).]

Page 4

(Same figure as Page 3.)

This is a graph structure: neurons are vertices and synaptic connections are weighted edges.
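The Izhikevich model updates each neuron with v' = 0.04v² + 5v + 140 − u + I and u' = a(bv − u), resetting v ← c and u ← u + d when v reaches the 30 mV spike threshold. A minimal Python sketch using forward-Euler steps; the regular-spiking parameter values (a=0.02, b=0.2, c=−65, d=8) are the standard published defaults, an assumption since the slide does not give them:

```python
def izhikevich_step(v, u, I, a=0.02, b=0.2, c=-65.0, d=8.0, dt=1.0):
    """One Euler step of the Izhikevich neuron model.
    v: membrane potential (mV); u: recovery variable;
    I: summed input current from neighbors (the ∑I in the figure)."""
    v = v + dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
    u = u + dt * a * (b * v - u)
    fired = v >= 30.0                # spike threshold
    if fired:
        v, u = c, u + d              # reset after a spike
    return v, u, fired

# Drive a neuron at rest with a constant current until it spikes.
v, u = -65.0, -65.0 * 0.2
fired_at = None
for t in range(1000):
    v, u, fired = izhikevich_step(v, u, I=10.0)
    if fired:
        fired_at = t
        break
```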

Page 5

Basic MapReduce Implementation

[Diagram: each mapper reads a neuron (N1, N2, N3) and its local structure from HDFS (the initial input, or the input from the previous job). The mapper emits the neuron's output currents keyed by their destination neuron, together with the neuron's own structure. After sort & shuffle, each reducer sums the currents destined for one neuron (sum currents to N1), updates that neuron (update N1), and writes it back to HDFS.]

Page 6

(Same diagram as Page 5.)

Problems:
  - Synaptic currents are sent directly to the reducers without local aggregation.
  - The graph structure is shuffled in each iteration.
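One simulation timestep of the basic implementation might look like the following sketch (the real system is a Hadoop job in Java; the three-neuron network, the weights, and the simplified state update are all illustrative):

```python
from collections import defaultdict

# Hypothetical 3-neuron network matching the diagram: each neuron sends a
# current (its last output scaled by the synaptic weight) to its neighbors.
graph = {
    "N1": {"state": -65.0, "out": 1.0, "syn": {"N2": 0.5, "N3": 0.5}},
    "N2": {"state": -65.0, "out": 1.0, "syn": {"N1": 0.5, "N3": 0.5}},
    "N3": {"state": -65.0, "out": 1.0, "syn": {"N1": 0.5, "N2": 0.5}},
}

def mapper(nid, neuron):
    # Emit the neuron's own structure so it survives the shuffle ...
    yield nid, ("STRUCT", neuron)
    # ... and one current per outgoing synapse -- no local aggregation.
    for dst, w in neuron["syn"].items():
        yield dst, ("CURRENT", neuron["out"] * w)

def reducer(nid, values):
    neuron, total = None, 0.0
    for tag, payload in values:
        if tag == "STRUCT":
            neuron = payload            # the shuffled graph structure
        else:
            total += payload            # sum currents to this neuron
    neuron["state"] += total            # stand-in for the real update rule
    return neuron

shuffled = defaultdict(list)            # sort & shuffle
for nid, neuron in graph.items():
    for key, value in mapper(nid, neuron):
        shuffled[key].append(value)
new_graph = {nid: reducer(nid, vals) for nid, vals in shuffled.items()}
```

Note that both the currents and the full structure of every neuron pass through the shuffle, which is exactly the inefficiency the next two slides address.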

Page 7

In-Mapper Combining (IMC, introduced by Lin & Schatz)

[Diagram: as in the basic implementation, but each mapper aggregates the currents destined for the same neuron locally before emitting them, so far fewer key-value pairs pass through sort & shuffle; the reducers then update N1, N2, and N3 as before.]

Page 8

(Same diagram as Page 7.)

The graph structure is still shuffled!
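The IMC pattern can be sketched as a mapper that buffers partial sums and emits them only when it closes (names, weights, and record shapes are illustrative; the actual implementation is Java on Hadoop):

```python
from collections import defaultdict

class IMCNeuronMapper:
    """Sketch of in-mapper combining: partial sums of synaptic currents
    are accumulated in a per-mapper buffer and emitted once, in close(),
    instead of one key-value pair per synapse."""
    def __init__(self):
        self.partial = defaultdict(float)   # destination neuron -> summed current
        self.emitted = []

    def map(self, nid, neuron):
        # The graph structure is still emitted (and still shuffled!) ...
        self.emitted.append((nid, ("STRUCT", neuron)))
        # ... but currents to the same destination are combined locally.
        for dst, w in neuron["syn"].items():
            self.partial[dst] += neuron["out"] * w

    def close(self):
        for dst, total in self.partial.items():
            self.emitted.append((dst, ("CURRENT", total)))
        return self.emitted

# One mapper handling two neurons that both feed N3:
m = IMCNeuronMapper()
m.map("N1", {"out": 1.0, "syn": {"N3": 0.5}})
m.map("N2", {"out": 1.0, "syn": {"N3": 0.5}})
pairs = m.close()
```

The two currents destined for N3 leave this mapper as a single combined pair, cutting shuffle traffic; the structure records are untouched, which is the remaining problem the slide points out.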

Page 9

Schimmy (introduced by Lin & Schatz)

[Diagram: mappers emit only the synaptic currents; the graph structure is not shuffled. Each reducer remotely reads its partition of the graph structure from HDFS, merges it with the incoming currents, sums the currents to each neuron (sum currents to N1), updates the neuron (update N1), and writes it back to HDFS.]

Page 10

(Same diagram as Page 9.)

Problems:
  - Remote reading from HDFS.
  - The graph structure is read and written in each iteration.
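The Schimmy reduce side can be sketched as a merge of the read-only graph partition with the shuffled currents (a plain dict stands in for the HDFS partition file; names and the simplified update rule are illustrative):

```python
from collections import defaultdict

# Stand-in for one reducer's graph partition stored on HDFS; in Schimmy
# the reducer reads this partition remotely and merges it with the
# shuffled messages, so the structure never goes through sort & shuffle.
hdfs_partition = {
    "N1": {"state": -65.0, "syn": {"N2": 0.5, "N3": 0.5}},
    "N2": {"state": -65.0, "syn": {"N1": 0.5, "N3": 0.5}},
}

def schimmy_reduce(partition, shuffled_currents):
    """Merge the read-only graph partition (partitioned the same way as
    the shuffle output) with the summed currents, then update each neuron."""
    updated = {}
    for nid, neuron in partition.items():            # remote read from "HDFS"
        total = sum(shuffled_currents.get(nid, []))  # currents from the shuffle
        neuron = dict(neuron, state=neuron["state"] + total)  # placeholder update
        updated[nid] = neuron                        # written back to "HDFS"
    return updated

currents = defaultdict(list)
currents["N1"].extend([0.5, 0.5])   # currents emitted by the mappers
currents["N2"].append(0.5)
new_partition = schimmy_reduce(hdfs_partition, currents)
```

The structure stays out of the shuffle, but every iteration still pays for the remote read and the full rewrite of the partition.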

Page 11

(Same diagram as Page 9.)

Observation: the graph structure is read-only!

Page 12

Mapper-side Schimmy

[Diagram: each mapper keeps its partition of the read-only graph structure locally and emits only the currents and the bare neuron state (N1, N2, N3); the structure never passes through sort & shuffle. Reducers sum the currents to each neuron, update it, and write only the updated neuron state back to HDFS.]
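A sketch of the mapper-side variant, under the interpretation shown in the diagram: the mapper holds its graph partition locally across iterations, and only neuron state and currents ever reach the shuffle (all names and values illustrative; the real implementation is Java on Hadoop):

```python
from collections import defaultdict

# Because the structure is read-only, it can live with the mapper on its
# own node; neither the structure nor a remote HDFS read is needed on
# the reduce side.
local_partition = {            # stays with this mapper across iterations
    "N1": {"syn": {"N2": 0.5, "N3": 0.5}},
    "N2": {"syn": {"N1": 0.5, "N3": 0.5}},
    "N3": {"syn": {"N1": 0.5, "N2": 0.5}},
}

def mapper(states):
    """states: neuron id -> (state, last output current), read from HDFS."""
    for nid, (state, out) in states.items():
        yield nid, ("STATE", state)                   # bare state only
        for dst, w in local_partition[nid]["syn"].items():
            yield dst, ("CURRENT", out * w)

def reducer(nid, values):
    state = total = 0.0
    for tag, payload in values:
        if tag == "STATE":
            state = payload
        else:
            total += payload                          # sum currents to nid
    return state + total                              # placeholder update

shuffled = defaultdict(list)
states = {"N1": (-65.0, 1.0), "N2": (-65.0, 1.0), "N3": (-65.0, 1.0)}
for key, value in mapper(states):
    shuffled[key].append(value)
new_states = {nid: reducer(nid, vals) for nid, vals in shuffled.items()}
```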

Page 13

Drawbacks of graph algorithms in MapReduce
  - Non-intuitive and hard to implement.
  - Iterative algorithms are not efficiently expressed.
  - Not optimized for large numbers of iterations.

[Diagram: every iteration is a separate MapReduce job: input is read from HDFS, mappers write intermediate files for the reducers, and output is written back to HDFS, so each pass pays a job-startup penalty and disk penalties on both sides.]
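The per-iteration penalty can be made concrete with a toy driver that chains one "job" per iteration through files (local temp files stand in for HDFS, and the step function is a placeholder for a full map+reduce pass):

```python
import json
import os
import tempfile

def run_job(in_path, out_path, step):
    """One 'MapReduce job': read input from disk, compute, write output."""
    with open(in_path) as f:
        data = json.load(f)                  # "input from HDFS"
    data = step(data)                        # map + reduce, collapsed
    with open(out_path, "w") as f:
        json.dump(data, f)                   # "output to HDFS"

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "iter0.json")
with open(path, "w") as f:
    json.dump({"N1": -65.0}, f)              # initial input

# The driver chains one job per iteration; every iteration re-reads and
# re-writes the full state from disk -- the disk penalty on the slide.
for i in range(3):
    nxt = os.path.join(workdir, f"iter{i + 1}.json")
    run_job(path, nxt, step=lambda d: {k: v + 1.0 for k, v in d.items()})
    path = nxt

with open(path) as f:
    final = json.load(f)
```

In a real cluster each `run_job` call would also pay JVM and task-scheduling startup costs, which is what makes thousands of short simulation timesteps expensive in plain MapReduce.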

Page 14

Giraph

[Diagram: the graph (N1, N2, N3 and their synapses) is loaded from HDFS once, processed in memory across supersteps separated by synchronous barriers, and the results are written back to HDFS at the end.]

  - Iterative graph processing system
  - Powers Facebook graph search
  - Highly scalable
  - Based on the BSP (Bulk Synchronous Parallel) model
  - Runs as a mapper-only job on Hadoop
  - In-memory computation
  - "Think like a vertex"
  - More intuitive APIs

Page 15

(Same slide as Page 14, with "Think like a vertex" replaced by "Think like a NEURON".)
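A "think like a neuron" compute function in the style of Giraph's BSP model might look like this Python stand-in (Giraph itself is Java; the class and method names here only mimic its vertex-centric API, and the unit output current is a placeholder for the real spike output):

```python
from collections import defaultdict

class NeuronVertex:
    """A Giraph-style vertex: each superstep it sums incoming currents,
    updates its state in memory, and sends currents along its synapses."""
    def __init__(self, vid, state, syn):
        self.id, self.state, self.syn = vid, state, syn

    def compute(self, messages, outbox):
        total = sum(messages)               # ∑I from neighbors
        self.state += total                 # placeholder for the Izhikevich update
        for dst, w in self.syn.items():     # currents to all neighbors
            outbox[dst].append(1.0 * w)     # 1.0 = placeholder output current

def superstep(vertices, inbox):
    """One BSP superstep: all vertices compute, then a synchronous
    barrier; messages become visible only in the next superstep."""
    outbox = defaultdict(list)
    for v in vertices.values():
        v.compute(inbox.get(v.id, []), outbox)
    return outbox                           # delivered after the barrier

vertices = {
    "N1": NeuronVertex("N1", -65.0, {"N2": 0.5, "N3": 0.5}),
    "N2": NeuronVertex("N2", -65.0, {"N1": 0.5, "N3": 0.5}),
    "N3": NeuronVertex("N3", -65.0, {"N1": 0.5, "N2": 0.5}),
}
inbox = {}
for _ in range(2):                          # the graph stays in memory throughout
    inbox = superstep(vertices, inbox)
```

Unlike the MapReduce variants, nothing is re-read from or re-written to HDFS between iterations; the graph is loaded once and only messages move between supersteps.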

Page 16

[Chart: comparison of the running time of each iteration.]

Page 17

[Chart: comparison of speeds for a 40 ms simulation; relative differences across implementations: 6%, 0%, -11%, -48%, -64%, -91%.]

Page 18

Conclusion
  - Hadoop is capable of modeling large-scale neural networks.
  - Building on IMC and Schimmy, our Mapper-side Schimmy improves MapReduce graph algorithms where the graph structure is read-only.
  - Vertex-centric approaches such as Giraph showed superior performance. However:
      - The number of iterations must be specified as a global variable.
      - Computation is limited by the memory available per node.
      - Giraph is not widely adopted by industry.

Page 19

(Closing slide: repeats the title and contributors from Page 1.)

Page 20

[Chart: comparison of speeds, 20 ms to 40 ms simulation.]