
Storage Class Memory Architecture for Energy Efficient Data Centers

Bruce Childers, Sangyeun Cho, Rami Melhem, Daniel Mossé, Jun Yang, Youtao Zhang

Computer Science Department, University of Pittsburgh

Server power consumption

[Figure: server power in watts (axis 0 to 3,500) for a small and a large configuration, with processors and memory shown separately; labeled values (1,614W) and (2,972W). (Lefurgy et al., ’03)]

Challenges with DRAM

• Power wall
– Large fractions of system power consumed in DRAM
• Cost wall
– Memory accounts for a major fraction of overall server cost
• Scaling wall
– DRAM scaling becomes harder and harder
• Higher speed (bandwidth) means faster clocking
• Larger size = increase of loading (on buses) and refresh overheads (power & performance)

New non-volatile memory to the rescue

[Figure: US patents granted for MRAM, FRAM, and PCM (PRAM). (Lam, VLSI-TSA ’08)]

1. Non-volatile
2. Byte-addressable
3. Acceptable performance
4. Good scaling potential

* Subject to write endurance limit
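To make the endurance caveat concrete, the sketch below computes an idealized upper bound on device lifetime: capacity times per-cell endurance, divided by the sustained write rate. It assumes writes are spread perfectly evenly over all cells. The function name and every numeric input are hypothetical, chosen only to show the arithmetic; none of them come from the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def ideal_lifetime_seconds(capacity_bytes: float,
                           endurance_writes: float,
                           write_rate_bytes_per_s: float) -> float:
    """Upper bound on lifetime when every cell wears at the same rate."""
    if write_rate_bytes_per_s <= 0:
        raise ValueError("write rate must be positive")
    return capacity_bytes * endurance_writes / write_rate_bytes_per_s


# Hypothetical inputs (assumptions, not measured values).
capacity = 16 * 2**30      # 16 GiB of PCM
endurance = 1e8            # writes each cell tolerates (assumed)
write_rate = 1 * 2**30     # 1 GiB/s of sustained write traffic (assumed)

years = ideal_lifetime_seconds(capacity, endurance, write_rate) / SECONDS_PER_YEAR
print(f"ideal lifetime: {years:.1f} years")
```

Real lifetimes fall short of this bound whenever a few hot lines absorb most of the writes, which is why buffering and wear leveling matter.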

Agenda

• Storage class memory architecture
• Industry progress
• Our vision
• Some research questions

Storage class memory architecture

[Diagram: L1 and L2 caches ($), SmartMem-ctrl, DRAM, PCM-Small, and PCM-Large]

• DRAM: PCM is slow and write-endurance limited, so we need DRAM buffering
• PCM-Small: PCM working memory; a better species (e.g., SLC)?
• PCM-Large: PCM “storage” space; maybe equivalent to PCM-Small, or maybe slower and larger (e.g., MLC)?
• SmartMem-ctrl: “smart memory controller” to handle the different technologies: cache management, wear leveling, error handling (ECC, sparing), trim, and low-level scheduling
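The DRAM buffering role described above can be sketched as a small write-back cache in front of PCM. This is a minimal model under stated assumptions, not the controller design from the slides: the class name `HybridMemory`, the LRU replacement policy, and the line-granularity interface are all hypothetical. It shows only how repeated writes to hot lines are absorbed in DRAM and reach PCM once, on eviction.

```python
from collections import OrderedDict


class HybridMemory:
    """DRAM write-back buffer (LRU, assumed policy) in front of a PCM array."""

    def __init__(self, dram_lines: int):
        self.dram_lines = dram_lines
        self.dram = OrderedDict()   # line address -> (data, dirty flag)
        self.pcm = {}               # line address -> data
        self.pcm_writes = 0         # writes that actually reach PCM

    def _evict_if_full(self):
        # Make room for one new line; only dirty victims cost a PCM write.
        if len(self.dram) >= self.dram_lines:
            addr, (data, dirty) = self.dram.popitem(last=False)  # LRU victim
            if dirty:
                self.pcm[addr] = data
                self.pcm_writes += 1

    def write(self, addr, data):
        if addr in self.dram:
            self.dram.move_to_end(addr)      # mark as most recently used
        else:
            self._evict_if_full()
        self.dram[addr] = (data, True)

    def read(self, addr):
        if addr in self.dram:
            self.dram.move_to_end(addr)
            return self.dram[addr][0]
        data = self.pcm.get(addr, 0)         # unwritten lines read as 0
        self._evict_if_full()
        self.dram[addr] = (data, False)      # clean copy, no PCM write on eviction
        return data


mem = HybridMemory(dram_lines=4)
for i in range(1000):
    mem.write(i % 2, i)                      # two hot lines, 1000 writes
print("PCM writes after hot loop:", mem.pcm_writes)
```

In this model the hot loop never touches PCM because both lines stay resident in DRAM. A real controller would also need the wear leveling, error handling, and scheduling that the slide lists.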

Prior work & findings

• Memory energy savings
– Sizable savings of 20% to 90% [Zhou et al., ’09, Park et al., ’11]
– At a manageable performance hit of about 5%
• Hardware wear leveling is feasible [Qureshi et al., ’09, Seong et al., ’10]
• Other system implications
– Fast system on and off [Doh et al., ’09]
– Single-level data store [Venkataraman et al., ’11]
– Rapid checkpointing [Dong et al., ’09]
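As an illustration of why hardware wear leveling is considered feasible, the sketch below implements a start-gap style rotation. One spare physical line (the gap) moves one position after every fixed number of writes, so a logical line slowly migrates across all physical lines, and two registers are enough to translate addresses. The slide does not say which schemes the cited works use, so treat this as an assumed example of the general technique. The class name, the parameters, and the per-line wear counters are hypothetical and exist only for demonstration.

```python
class StartGapMemory:
    """N logical lines stored in N + 1 physical lines with a rotating gap."""

    def __init__(self, num_lines: int, gap_interval: int):
        self.n = num_lines
        self.gap_interval = gap_interval        # writes between gap moves
        self.cells = [0] * (num_lines + 1)      # one spare physical line
        self.wear = [0] * (num_lines + 1)       # per-line write counts (demo only)
        self.start = 0                          # rotation offset register
        self.gap = num_lines                    # gap position register
        self.writes_since_move = 0

    def _physical(self, logical: int) -> int:
        pa = (logical + self.start) % self.n
        return pa + 1 if pa >= self.gap else pa  # skip over the gap line

    def _move_gap(self):
        if self.gap == 0:
            # Wrap around: the last physical line moves into slot 0.
            self.cells[0] = self.cells[self.n]
            self.wear[0] += 1
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # The line just below the gap moves into the gap.
            self.cells[self.gap] = self.cells[self.gap - 1]
            self.wear[self.gap] += 1
            self.gap -= 1

    def write(self, logical: int, data):
        pa = self._physical(logical)
        self.cells[pa] = data
        self.wear[pa] += 1
        self.writes_since_move += 1
        if self.writes_since_move == self.gap_interval:
            self.writes_since_move = 0
            self._move_gap()

    def read(self, logical: int):
        return self.cells[self._physical(logical)]


mem = StartGapMemory(num_lines=8, gap_interval=4)
for i in range(10_000):
    mem.write(3, i)                             # hammer a single logical line
print("max writes to any physical line:", max(mem.wear))
```

Without remapping, one physical line would take all 10,000 writes. With the rotation, the hot logical line visits every physical line, at the cost of one extra line copy per gap move.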

Industry progress: Samsung

• Lee et al. ISSCC ’07, Lee et al. JSSC ’08
– 512Mb @90nm
– Diode switch design
– 266MB/s read
– 4.64MB/s write (x16)
• Techinsights decap ’10
– 512Mb @60nm?
– Diode switch design
– Believed to be a tech.-migrated design
• Chung et al. ISSCC ’11
– 1Gb @58nm
– LPDDR2-N
– “Write skewing”
– 6.4MB/s write
– “DCWI” (~Flip-N-Write)
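The slide likens “DCWI” to Flip-N-Write. The general Flip-N-Write idea can be sketched as follows: before writing a word, compare the new data with what is already stored, and if storing the inverted data would change fewer cells, store the inverse and set a per-word flip bit. This is a software model of that idea only; it makes no claim about how the Samsung chip implements DCWI. The 16-bit word size and the function names are assumptions.

```python
WORD_BITS = 16                      # assumed word size
MASK = (1 << WORD_BITS) - 1


def popcount(x: int) -> int:
    return bin(x).count("1")


def flip_n_write(stored_word: int, stored_flip: int, new_data: int):
    """Return (word_to_store, flip_bit, cells_changed) for one word write."""
    inverted = ~new_data & MASK
    # Count changed cells for each encoding, including the flip bit itself.
    plain_cost = popcount(stored_word ^ new_data) + (1 if stored_flip != 0 else 0)
    inv_cost = popcount(stored_word ^ inverted) + (1 if stored_flip != 1 else 0)
    if inv_cost < plain_cost:
        return inverted, 1, inv_cost
    return new_data, 0, plain_cost


def decode(stored_word: int, stored_flip: int) -> int:
    """Recover the logical data from the stored word and its flip bit."""
    return (~stored_word & MASK) if stored_flip else stored_word


word, flip, changed = flip_n_write(0x0000, 0, 0xFFFF)
print(f"stored=0x{word:04X} flip={flip} cells changed={changed}")
```

Because the two costs always sum to `WORD_BITS + 1`, the cheaper encoding never changes more than half of the cells (data bits plus flip bit). That bound on cells programmed per word is what helps write power and endurance.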

Industry progress: Numonyx (Micron)

• Early access program (2009)
– “Alverstone” (OMNEO)
– 128Mb @90nm
– TR switch design
– 40MB/s read (?)
– <1MB/s write (?)
• Numerous press releases (slated for mass production in 2011)
– “Bonelli”
– 1Gb @45nm
– 1.8V I/O
• (2011~2012?)
– “Imola” and “Mandello”
– 2Gb & 4Gb @45nm
– 1.2V & 1.8V I/O
– LPDDR2-NVM & DDR3-NVM

(Servalli, IEDM ’09)

Our vision

• Drastically reduce the power needed for TB-capacity main memory
• Cross-cutting, holistic system design
– With heterogeneous resources, management tasks are best handled by collaboration of layers

– MemVisor

Research questions (infra)

• PCM has the potential to beat DRAM in terms of capacity and power…
– But what about performance? How much performance is “good enough” for key applications?
• What cross-layer information is critical for MemVisor?
– What are appropriate interfaces?
• Can we predictively allocate different amounts of DRAM and PCM to a virtual machine?
– Hardware and software support?

Research questions (application)

• How can we best utilize persistency in memory?
– Extension of storage? How?
– New algorithms and data structures?
• PCM provides “storage” that is orders of magnitude faster than HDDs
– Any changes needed in OS? DBMS?

• New algorithms that work synergistically with the underlying hardware and system layers for longer lifetime and higher reliability?

Storage Class Memory Architecture for Energy Efficient Data Centers

www.cs.pitt.edu/PCM
