Storage Class Memory Architecture for Energy Efficient Data Centers
Bruce Childers, Sangyeun Cho, Rami Melhem, Daniel Mossé, Jun Yang, Youtao Zhang
Computer Science Department, University of Pittsburgh
Server power consumption
[Figure: bar chart of power (Watts, axis 0 to 3,500) drawn by Processors and Memory in a Small Config. and a Large Config. server; labeled values (1,614W) and (2,972W) (Lefurgy et al., ’03)]
Challenges with DRAM
• Power wall
– Large fractions of system power consumed in DRAM
• Cost wall
– Memory accounts for a major fraction of overall server cost
• Scaling wall
– DRAM scaling becomes harder and harder
• Higher speed (bandwidth) means faster clocking
• Larger size = increase of loading (on buses) and refresh overheads (power & performance)
New non-volatile memory to the rescue
[Figure: US patents granted for MRAM, FRAM, and PCM (PRAM) (Lam, VLSI-TSA ’08)]
1. Non-volatile
2. Byte-addressable
3. Acceptable performance
4. Good scaling potential
* Subject to write endurance limit
Agenda
• Storage class memory architecture
• Industry progress
• Our vision
• Some research questions
Storage class memory architecture
[Diagram: L1 $$ and L2 $$ caches, SmartMem-ctrl, DRAM, PCM-Small, PCM-Large]
PCM is slow and write endurance limited; we need DRAM buffering
This is PCM working memory; a better species (e.g., SLC)?
This is PCM “storage” space; maybe equivalent to PCM-Small or maybe slower and larger (e.g., MLC)?
“Smart mem. controller” to handle diff. technologies; cache mgmt, wear leveling, error handling (ECC, sparing), trim & low-level scheduling
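The DRAM buffering idea in the notes above can be illustrated with a toy model. This is a minimal sketch, not the architecture's actual controller: the class name `DramBufferedPcm`, the LRU write-back policy, and the line-granularity interface are all assumptions made for illustration. It only shows why a small DRAM buffer can absorb repeated writes before they reach write-endurance-limited PCM.

```python
from collections import OrderedDict


class DramBufferedPcm:
    """Toy model: a small write-back LRU DRAM buffer in front of PCM.

    Hypothetical illustration only; policy and names are assumptions.
    """

    def __init__(self, dram_lines):
        self.dram_lines = dram_lines
        self.dram = OrderedDict()  # line address -> (value, dirty flag)
        self.pcm = {}              # line address -> value
        self.pcm_writes = 0        # writes that actually reach PCM

    def _evict_if_full(self):
        # Evict the least recently used line; write it back only if dirty.
        if len(self.dram) > self.dram_lines:
            addr, (value, dirty) = self.dram.popitem(last=False)
            if dirty:
                self.pcm[addr] = value
                self.pcm_writes += 1

    def write(self, addr, value):
        # Writes are absorbed by DRAM and marked dirty.
        self.dram[addr] = (value, True)
        self.dram.move_to_end(addr)
        self._evict_if_full()

    def read(self, addr):
        if addr in self.dram:
            self.dram.move_to_end(addr)
            return self.dram[addr][0]
        # Miss: fetch from PCM (unwritten lines read as 0) and cache clean.
        value = self.pcm.get(addr, 0)
        self.dram[addr] = (value, False)
        self._evict_if_full()
        return value

    def flush(self):
        # Write every dirty line back to PCM, then empty the buffer.
        for addr, (value, dirty) in self.dram.items():
            if dirty:
                self.pcm[addr] = value
                self.pcm_writes += 1
        self.dram.clear()


mem = DramBufferedPcm(dram_lines=4)
for i in range(1000):
    mem.write(i % 2, i)  # 1000 writes to two hot lines
mem.flush()
print("writes issued: 1000, writes reaching PCM:", mem.pcm_writes)
```

In this toy run the two hot lines stay resident in DRAM, so only the final flush touches PCM. A real controller would also have to handle wear leveling, ECC, and scheduling, which the sketch leaves out.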
Prior work & findings
• Memory energy savings
– Sizable savings of 20~90% [Zhou et al., ’09, Park et al., ’11]
– At a manageable performance hit of ~5% or so
• Hardware wear leveling feasible [Qureshi et al., ’09, Seong et al., ’10]
• Other system implications
– Fast system on and off [Doh et al., ’09]
– Single-level data store [Venkataraman et al., ’11]
– Rapid checkpointing [Dong et al., ’09]
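To make the hardware wear leveling finding above concrete, here is a minimal sketch of a rotation scheme with one spare line, two registers, and a periodically moving gap, loosely modeled on start-gap style wear leveling. The slides do not describe the mechanisms of the cited papers, so the class name `StartGapMemory`, the `gap_interval` parameter, and all sizes are assumptions for illustration rather than the cited designs.

```python
class StartGapMemory:
    """Toy wear leveling: n logical lines rotate over n + 1 physical lines.

    Hypothetical illustration; names and parameters are assumptions.
    """

    def __init__(self, n_lines, gap_interval):
        self.n = n_lines
        self.cells = [0] * (n_lines + 1)  # one spare physical line
        self.wear = [0] * (n_lines + 1)   # write count per physical line
        self.start = 0                    # rotation offset
        self.gap = n_lines                # physical index of the empty line
        self.interval = gap_interval      # demand writes between gap moves
        self.writes_since_move = 0

    def _physical(self, logical):
        # Rotate by start, then skip over the gap line.
        pa = (logical + self.start) % self.n
        if pa >= self.gap:
            pa += 1
        return pa

    def _move_gap(self):
        if self.gap == 0:
            # Wrap around: the last physical line moves into slot 0.
            self.cells[0] = self.cells[self.n]
            self.wear[0] += 1
            self.gap = self.n
            self.start = (self.start + 1) % self.n
        else:
            # Shift the neighbor into the gap; the gap moves down by one.
            self.cells[self.gap] = self.cells[self.gap - 1]
            self.wear[self.gap] += 1
            self.gap -= 1

    def write(self, logical, value):
        pa = self._physical(logical)
        self.cells[pa] = value
        self.wear[pa] += 1
        self.writes_since_move += 1
        if self.writes_since_move == self.interval:
            self.writes_since_move = 0
            self._move_gap()

    def read(self, logical):
        return self.cells[self._physical(logical)]


mem = StartGapMemory(n_lines=8, gap_interval=4)
for la in range(8):
    mem.write(la, 100 + la)
for i in range(1000):
    mem.write(0, i)  # hammer a single hot logical line
print("max wear on any physical line:", max(mem.wear))
```

Although one logical line receives all 1000 writes, the moving gap spreads them across the physical lines, at the cost of one extra line copy every `gap_interval` writes.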
Industry progress: Samsung
• Lee et al. ISSCC ’07, Lee et al. JSSC ’08
– 512Mb @90nm
– Diode switch design
– 266MB/s read
– 4.64MB/s write (x16)
• Techinsights decap ’10
– 512Mb @60nm?
– Diode switch design
– Believed to be a tech.-migrated design
• Chung et al. ISSCC ’11
– 1Gb @58nm
– LPDDR2-N
– “Write skewing”
– 6.4MB/s write
– “DCWI” (~Flip-N-Write)
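The slide likens “DCWI” to Flip-N-Write without giving details, so the following is only a sketch of the general flip-and-write idea, not the chip's actual circuit: compare the new word with what is stored, and write either the word or its complement (recorded in an extra flip bit), whichever changes fewer cells. The 16-bit word size and the function names are assumptions for illustration.

```python
WORD_BITS = 16                    # assumed word size for this sketch
MASK = (1 << WORD_BITS) - 1


def popcount(x):
    """Number of set bits in x."""
    return bin(x).count("1")


def flip_n_write(stored, flip, new_data):
    """Choose the cheaper encoding of new_data.

    stored, flip: current cell contents and flip bit.
    Returns (new_stored, new_flip, bits_changed), where bits_changed
    counts data cells plus the flip bit that must be reprogrammed.
    """
    # Cost of storing new_data as-is (flip bit becomes 0).
    cost_plain = popcount(stored ^ new_data) + (1 if flip != 0 else 0)
    # Cost of storing the complement (flip bit becomes 1).
    inverted = ~new_data & MASK
    cost_inverted = popcount(stored ^ inverted) + (1 if flip != 1 else 0)
    if cost_inverted < cost_plain:
        return inverted, 1, cost_inverted
    return new_data, 0, cost_plain


def decode(stored, flip):
    """Recover the logical word from the cells and the flip bit."""
    return (~stored & MASK) if flip else stored


stored, flip = 0x0000, 0
stored, flip, changed = flip_n_write(stored, flip, 0xFFFF)
print("cells:", hex(stored), "flip bit:", flip, "bits changed:", changed)
```

Because the two costs always add up to `WORD_BITS + 1`, the cheaper choice never reprograms more than `WORD_BITS // 2` bits, which is what makes the scheme attractive when writes are slow and wear the cells.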
(Servalli, IEDM ’09)
Industry progress: Numonyx (Micron)
• Early access program (2009)
– “Alverstone” (OMNEO)
– 128Mb @90nm
– TR switch design
– 40MB/s read (?)
– <1MB/s write (?)
• Numerous press releases (slated for MP in 2011)
– “Bonelli”
– 1Gb @45nm
– 1.8V I/O
• (2011~2012?)
– “Imola” and “Mandello”
– 2Gb & 4Gb @45nm
– 1.2V & 1.8V I/O
– LPDDR2-NVM & DDR3-NVM
Our vision
• To drastically reduce the power needed for TB-capacity main memory
• Cross-cutting, holistic system design
– With heterogeneous resources, management tasks are best handled by collaboration of layers
– MemVisor
Research questions (infra)
• PCM has the potential to beat DRAM in terms of capacity and power…
– But what about performance? How much performance is “good enough” for key applications?
• What cross-layer information is critical for MemVisor?
– What are appropriate interfaces?
• Can we predictively allocate different amounts of DRAM and PCM to a virtual machine?
– Hardware and software support?
Research questions (application)
• How can we best utilize persistency in memory?
– Extension of storage? How?
– New algorithms and data structures?
• PCM provides “storage” that is orders of magnitude faster than HDDs
– Any changes needed in OS? DBMS?
• New algorithms that work synergistically with the underlying hardware and system layers for longer lifetime and higher reliability?
www.cs.pitt.edu/PCM