Real Processing in the Memory with Memristive Memory Processing Unit

Shahar Kvatinsky
Viterbi Faculty of Electrical Engineering
Technion – Israel Institute of Technology

July 2019




The External Memory Wall Problem
von Neumann (Architecture) Bottleneck

[Figure: performance vs. year]


And a Huge Energy Bottleneck

A. Pedram, S. Galal, S. Richardson, S. Kvatinsky, and M. Horowitz, “Dark Memory and Accelerator-Rich System Optimization in the Dark Silicon Era,” IEEE Design & Test, April 2017

Operation (16-bit operand)    Energy/Op (45 nm)    Cost (vs. Add)
Add operation                 0.18 pJ              1X
Load from on-chip SRAM        11 pJ                61X
Send to off-chip DRAM         640 pJ               3,556X

>1000X more energy to go to off-chip memory
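The "Cost (vs. Add)" column can be reproduced from the energy column. The following is a minimal sketch that only redoes the arithmetic of the table; the dictionary and function names are illustrative and not part of the talk.

```python
# Energy per 16-bit operation at 45 nm, in picojoules (values copied from the table).
ENERGY_PJ = {
    "Add operation": 0.18,
    "Load from on-chip SRAM": 11.0,
    "Send to off-chip DRAM": 640.0,
}


def cost_vs_add(operation: str) -> int:
    """Energy of `operation` relative to one add, rounded to the nearest integer."""
    return round(ENERGY_PJ[operation] / ENERGY_PJ["Add operation"])


for name in ENERGY_PJ:
    print(f"{name}: {cost_vs_add(name)}X")
```

Rounding 11 / 0.18 and 640 / 0.18 gives the 61X and 3,556X figures listed in the table.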


Moving Computation to Memory


Processing “In-Memory” (PIM)
Reducing Data Movement

Prior Art

[Figure: prior PIM designs, from the 90's to recent work. Labels: Configuration PIM machine; Active Pages; SA connected to SIMD pipeline; Micron Automata Memory Processor; Ambit. Block diagram labels: CPU, CMOS Processing Units (PUs), Memory (DRAM).]

H. S. Stone, “A Logic-in-Memory Computer,” IEEE Transactions on Computers, January 1970

M. Gokhale et al., “Processing in memory: the Terasys massively parallel PIM array,” Computer, 1995

M. Oskin et al., “Active pages: A computation model for intelligent memory,” Comput. Archit. News, 1998

D. Elliott et al., “Computational ram: Implementing processors in memory,” IEEE Des. Test, 1999

P. Dlugosch et al., “An Efficient and Scalable Semiconductor Architecture for Parallel Automata Processing,” IEEE TPDS, 2014

V. Seshadri et al., “Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology,” MICRO 2017

Data transfer is still required to/from DRAM and PUs


Alleviating the Data Transfer Problem

[Figure: memory and logic, with three places to compute. Panel labels: "Computation using memory cells", "Computation using peripheral circuit", "Computation using logic blocks". Classification labels: IN-, NEAR-, OUT-.]

Dlugosch et al., IEEE TPDS, 2014
Oskin et al., Comput. Archit. News, 1998
Elliott et al., IEEE Des. Test, 1999
Gokhale et al., Computer, 1995
Li et al., DAC, 2016
Seshadri et al., MICRO 2017
Aga et al., HPCA 2017
Eckert et al., ISCA 2018
Kvatinsky et al., TCAS-II, 2014
Talati et al., TNANO 2016

J. Reuben, R. Ben Hur, N. Wald, N. Talati, A. Haj Ali, P.-E. Gaillardon, and S. Kvatinsky, "Memristive Logic: A Framework for Evaluation and Comparison", PATMOS 2017


Real Computing within the Memory
Beyond von Neumann Architecture

[Figure: block diagram. Labels: Memory, Control Unit, Arithmetic/Logic Unit, Input Device, Output Device, CPU, Memory Processing Unit (MPU).]


Memristors
Emerging Nonvolatile Memory Technologies

• STT MRAM
• Resistive RAM (RRAM)
• Phase Change Memory (PCM)


Memristor – Memory Resistor
Resistor with Varying Resistance

[Figure: memristor symbol and switching behavior. Labels: Current, Voltage, Decrease resistance, Increase resistance.]

Low resistive state (RON, LRS)
High resistive state (ROFF, HRS)
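The idea of a resistor whose resistance moves between RON and ROFF depending on the current through it can be sketched with a toy model. This is an illustrative assumption, not a device model from the talk: the class name, the linear interpolation between the two states, and every numeric constant below are hypothetical.

```python
from dataclasses import dataclass

R_ON = 1e3    # low resistive state (LRS) in ohms; hypothetical value
R_OFF = 1e6   # high resistive state (HRS) in ohms; hypothetical value


@dataclass
class ToyMemristor:
    """Two-terminal device whose resistance depends on the charge that has passed through it."""

    state: float = 0.0      # 0.0 means fully R_OFF, 1.0 means fully R_ON
    mobility: float = 1e4   # state change per coulomb; hypothetical constant

    @property
    def resistance(self) -> float:
        # Linear interpolation between the two extreme resistances (an assumption).
        return R_ON * self.state + R_OFF * (1.0 - self.state)

    def apply_current(self, current: float, duration: float) -> None:
        # Positive current decreases the resistance, negative current increases it.
        self.state += self.mobility * current * duration
        self.state = min(1.0, max(0.0, self.state))


m = ToyMemristor()
m.apply_current(1e-3, 0.2)    # drive toward the low resistive state
low = m.resistance
m.apply_current(0.0, 1.0)     # no current: the state is retained (nonvolatile)
kept = m.resistance
m.apply_current(-1e-3, 0.2)   # reverse current drives back to the high resistive state
high = m.resistance
```

In this sketch the state saturates at the two extremes, which is what lets the device store one bit as either RON or ROFF.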


Attractive for Memory!

• CMOS compatible
• Rad hard
• Dense
• Nonvolatile
• Fast
• Low power
• High endurance


We Want to Compute within the Memristive Crossbar Memory

[Figure: memristive crossbar array. Labels: Bitlines, Wordlines.]


Write Operation in the Memristive Memory

[Figure: write operation in the crossbar. Labels: Voltage Driver, Vwrite, Ground.]


Read Operation in the Memristive Memory

[Figure: read operation in the crossbar. Labels: Voltage Driver, Vread (< Vwrite), Sense Amplifier, Iout.]
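The write and read slides can be summarized in a small behavioral sketch, assuming an idealized crossbar with no sneak-path currents. The class name, the resistance values, the read voltage, and the sense-amplifier reference current are all hypothetical; the mapping of RON to logic '1' follows the convention the talk uses for MAGIC.

```python
R_ON, R_OFF = 1e3, 1e6          # hypothetical LRS / HRS resistances in ohms
V_READ = 0.2                    # hypothetical read voltage, chosen below the write voltage
I_REF = V_READ / (10 * R_ON)    # hypothetical sense-amplifier reference current


class ToyCrossbar:
    """Idealized memristive crossbar: each cell is modelled only by its resistance."""

    def __init__(self, rows: int, cols: int) -> None:
        self.cells = [[R_OFF] * cols for _ in range(rows)]

    def write(self, row: int, col: int, bit: int) -> None:
        # Models only the end result of driving Vwrite across the selected cell.
        self.cells[row][col] = R_ON if bit else R_OFF

    def read(self, row: int, col: int) -> int:
        # Apply Vread (too small to change the state) and sense the output current.
        i_out = V_READ / self.cells[row][col]
        return 1 if i_out > I_REF else 0


xbar = ToyCrossbar(rows=2, cols=2)
xbar.write(0, 1, 1)
stored = [[xbar.read(r, c) for c in range(2)] for r in range(2)]
```

Keeping Vread below Vwrite is what makes the read non-destructive in this sketch: the cell resistance is sensed but never modified by `read`.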


Processing In-Memory with Memristors
Logic Families

• MAGIC (Kvatinsky et al., TCAS-II 2014)
• IMPLY (Borghetti et al., Nature 2010)
• Unipolar logic (Amrani et al., VLSI-SoC 2016)

J. Reuben, R. Ben Hur, N. Wald, N. Talati, A. Haj Ali, P.-E. Gaillardon, and S. Kvatinsky, "Memristive Logic: A Framework for Evaluation and Comparison", PATMOS 2017


MAGIC – Memristor Aided LoGIC
Example of MAGIC NOR

IN1    IN2    NOR
0      0      1
0      1      0
1      0      0
1      1      0

RON = Logic ‘1’
ROFF = Logic ‘0’

[Figure: MAGIC NOR circuit and its operation. Annotations: Initialize OUT to RON; ROFF >> RON; << VG; > VG/2; Increase resistance.]

S. Kvatinsky, D. Belousov, S. Liman, G. Satat, N. Wald, E. G. Friedman, A. Kolodny, and U. C. Weiser, "MAGIC – Memristor Aided LoGIC," IEEE TCAS II, Nov. 2014
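The annotations on this slide can be checked with a voltage-divider sketch, assuming the two input memristors are in parallel and that pair is in series with the output memristor. The resistances, the gate voltage V_G, and the switching threshold V_T_OFF are hypothetical values chosen so that V_G/2 exceeds the threshold; the function names are illustrative.

```python
R_ON, R_OFF = 1e3, 1e6   # hypothetical resistances in ohms, with R_OFF >> R_ON
V_G = 1.0                # hypothetical gate voltage applied to the inputs
V_T_OFF = 0.4            # hypothetical threshold for switching OUT to R_OFF (below V_G / 2)


def resistance(bit: int) -> float:
    # R_ON encodes logic '1', R_OFF encodes logic '0'.
    return R_ON if bit else R_OFF


def output_voltage(in1: int, in2: int) -> float:
    """Voltage across OUT (initialized to R_ON) when V_G is applied."""
    r1, r2 = resistance(in1), resistance(in2)
    r_inputs = r1 * r2 / (r1 + r2)          # the two inputs in parallel
    return V_G * R_ON / (r_inputs + R_ON)   # voltage divider with OUT at R_ON


def magic_nor(in1: int, in2: int) -> int:
    r_out = R_ON                            # step 1: initialize OUT to R_ON (logic '1')
    if output_voltage(in1, in2) > V_T_OFF:  # step 2: apply V_G
        r_out = R_OFF                       # OUT resistance increases, giving logic '0'
    return 1 if r_out == R_ON else 0


truth_table = {(a, b): magic_nor(a, b) for a in (0, 1) for b in (0, 1)}
```

With both inputs at ROFF the voltage on OUT is far below V_G, so OUT stays at RON. With at least one input at RON the voltage on OUT exceeds V_G/2, so OUT switches to ROFF, which reproduces the NOR truth table.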


Real MAGIC

Jung et al., Nano Research, July 2017

Bae et al., Nano Letters, Oct. 2017

Our lab (HfOx based)


Lab Demonstration

TaOx memristors
Fabricated by Vikas Rana (Julich)
Tested by Barak Hoffer (Technion)
(Unpublished)


MAGIC NOR in a Crossbar

[Figure: MAGIC NOR mapped onto crossbar cells. Labels: VG, VG, IN1, IN2, OUT.]


MAGIC NOR in a Memristive Memory

[Figure: MAGIC NOR performed inside a memristive memory array. Labels: VG, IN1, IN2, OUT.]

N. Talati, S. Gupta, P. Mane, and S. Kvatinsky, “Logic Design within Memristive Memories Using MAGIC,” IEEE Transactions on Nanotechnology, July 2016


Hierarchy of Logical Functions

[Figure: hierarchy of functions built on MAGIC NOR. Labels: MAGIC NOR; NOT, COPY, OR, NAND, XOR; ADD, SUB, MUL, DIV; SQRT, POW, LOG; Convolution, Matrix multiplication.]
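Because NOR is functionally complete, the lower levels of such a hierarchy can be sketched in software using NOR as the only primitive. The function names and the ripple-carry adder below are illustrative assumptions, not the gate mappings used in the mMPU.

```python
def nor(a: int, b: int) -> int:
    """The single primitive: 1 only when both inputs are 0."""
    return 1 - (a | b)


def not_(a: int) -> int:
    return nor(a, a)


def copy(a: int) -> int:
    return not_(not_(a))


def or_(a: int, b: int) -> int:
    return not_(nor(a, b))


def and_(a: int, b: int) -> int:
    return nor(not_(a), not_(b))


def nand(a: int, b: int) -> int:
    return not_(and_(a, b))


def xor(a: int, b: int) -> int:
    return nor(nor(a, b), and_(a, b))


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum, carry_out), built only from the NOR-derived gates above."""
    partial = xor(a, b)
    return xor(partial, carry_in), or_(and_(a, b), and_(carry_in, partial))


def add(x: int, y: int, width: int) -> int:
    """Ripple-carry ADD on `width`-bit operands; the result wraps modulo 2**width."""
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result
```

For example, `add(5, 3, 4)` evaluates to 8 using NOR operations only; larger functions such as MUL can then be composed from repeated ADD and shift steps.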


mMPU µArchitecture

[Figure: mMPU block diagram. Labels: Memristive Memory, Row Control, Column Control, mMPU Controller; vectors a0, a1, a2, ..., an and b0, b1, b2, ..., bn.]

R. Ben-Hur and S. Kvatinsky, "Memory Processing Unit for In-Memory Processing," NANOARCH 2016


mMPU µArchitecture

[Figure: mMPU block diagram. Labels: Memristive Memory, Row Control, Column Control, mMPU Controller; vectors a0, a1, a2, ..., an, b0, b1, b2, ..., bn, and c0, c1, c2, ..., cn.]

R. Ben-Hur and S. Kvatinsky, "Memory Processing Unit for In-Memory Processing," NANOARCH 2016
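One way to read the a, b, and c vectors is as columns of the memory array, with each row holding one element of each vector and c computed from a and b in every row by the same in-memory operation. The sketch below assumes that layout and uses NOR as the operation; the function name, the column indices, and the software loop standing in for the parallel hardware step are all illustrative assumptions.

```python
def nor(a: int, b: int) -> int:
    return 1 - (a | b)


def row_parallel_nor(memory: list[list[int]], col_a: int, col_b: int, col_c: int) -> None:
    """Apply one NOR step to every row; in hardware all rows would switch together."""
    for row in memory:
        row[col_c] = nor(row[col_a], row[col_b])


a = [0, 0, 1, 1]
b = [0, 1, 0, 1]

# Each row stores (a_i, b_i, c_i); the c column starts at logic '1' (R_ON),
# matching the MAGIC initialization step.
memory = [[a_i, b_i, 1] for a_i, b_i in zip(a, b)]

row_parallel_nor(memory, col_a=0, col_b=1, col_c=2)
c = [row[2] for row in memory]
```

In this model the number of operation steps does not depend on the vector length, since every row is processed by the same applied voltages.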


Issues Involved in mMPU System

[Figure: system diagram. Labels: CPU, mMPU System, mMPU Controller, mMPU, ?]

• Memory Design
• Periphery Design
• mMPU Controller Design and Optimization
• Programming Model
• Software
• Applications


mMPU Performance Potential

A. Haj-Ali et al., “IMAGING - In-Memory AlGorithms for Image processING,” IEEE TCAS I, December 2018

A. Haj-Ali et al., “Not in Name Alone: a Memristive Memory Processing Unit for Real In-Memory Processing,” IEEE Micro, October 2018

S. Li et al., “Pinatubo: A Processing-in-Memory Architecture for Bulk Bitwise Operations in Emerging Non-volatile Memories,” DAC 2016

M. Imani et al., “Ultra-Efficient Processing In-Memory for Data Intensive Applications,” DAC 2017


mMPU – Huge Potential

• Memristors enable non-von Neumann machines to overcome the memory wall

• mMPU: real processing in memory, a new computing paradigm

• Our aim is to develop a working end-to-end mMPU system


Thanks!
