Upload
anaya-colford
View
223
Download
2
Embed Size (px)
Citation preview
NC STATE UNIVERSITY
Retention-Aware Placement in DRAM (RAPID): Software Methods for
Quasi-Non-Volatile DRAM
Retention-Aware Placement in DRAM (RAPID): Software Methods for
Quasi-Non-Volatile DRAM
Center for Embedded Systems Research (CESR)Department of Electrical & Computer Engineering
North Carolina State University
Ravi K. Venkatesan
Stephen Herr, Eric Rotenberg
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/20062
DRAM in Next-Generation Mobile DevicesDRAM in Next-Generation Mobile Devices
Feature-rich next-generation mobile devices
Increase in memory requirements
DRAM displacing SRAM– Samsung SGH i700 Triband Windows Smartphone : 32 MB DRAM
– Motorola i930 cell phone has 32 MB DRAM
Dram Makers Prep Multichip Cell Phone Memories. ElectroSpec, 1/3/2003.Innovative Mobile Products Require Early Access to Optimized DRAM Solutions. APosition Paper by Samsung Semiconductor, Inc., 2003.
DRAM continuously drains battery even in standby– Periodic refresh needed to preserve stored information
Need to reduce DRAM refresh power
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/20063
DRAM – CellsDRAM – Cells
Word Line
Bit
Lin
e
Sense Amp
from dummy cell
DRAM capacitor
500 ms
50 s
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/20064
Retention Time MeasurementRetention Time Measurement
ISSI DRAM Chip (IS42S16800A)
128 Mb
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/20065
Retention Time Characterization AlgorithmRetention Time Characterization Algorithm
10 trials are performed for– Writing all 1s– Writing all 0s
Retention time for each page = minimum of the 20 trials
Disable refresh
Write all 1s to pagei
wait(T)
Read pagei
Is patternall 1s?
T = T + ΔT
Retention Time(pagei) = T - ΔT
YES
NO
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/20066
0
200
400
600
800
0 10000 20000 30000 40000 50000
Retention time (ms)
Nu
mb
er
of
pa
ge
sDRAM – Retention Time DistributionDRAM – Retention Time Distribution
worst page retention time
(500 ms)
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/20067
0
20
40
60
80
100
0 10000 20000 30000 40000 50000
Refresh period (ms)
% U
sa
ble
DR
AM
Cumulative Retention Time DistributionCumulative Retention Time Distribution
99% DRAM usable with 3s refresh period
6X longer than worst page(500 ms at 25°C)
46X longer than default(64 ms at 70°C)
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/20068
0
100
200
300
400
500
0 5000 10000 15000 20000
Retention time (ms)
Nu
mb
er
of
pa
ge
s
DRAM – Retention Time Distribution 45°CDRAM – Retention Time Distribution 45°C
worst page retention time
(320 ms)
0
20
40
60
80
100
0 5000 10000 15000 20000
Refresh period (ms)
% U
sab
le D
RA
M
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/20069
DRAM – Retention Time Distribution 70°CDRAM – Retention Time Distribution 70°C
worst page retention time
(150 ms)
0
200
400
600
800
1000
0 1000 2000 3000 4000 5000 6000
Retention time (ms)
Nu
mb
er
of
pa
ge
s
0
20
40
60
80
100
0 2000 4000 6000
Refresh period (ms)
% U
sa
ble
DR
AM
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200610
DRAM Refresh Approaches DRAM Refresh Approaches
Three classes of hardware techniques: Conventional
– Refresh period based on worst cell
– Worst-case temperature
Temperature-Compensated Refresh (TCR)– Refresh period based on worst cell
– Dynamically adjusts for temperature
Better than worst-case design– Multiple refresh periods (refresh period per block of cells)
– Can be coupled with TCR
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200611
RAPID (Retention-Aware Placement in DRAM)RAPID (Retention-Aware Placement in DRAM)
Novel software-only approach
KEY IDEA: Allocate longer-retention pages before allocating shorter-
retention pages Single refresh period selected based on shortest-
retention page among populated pages, rather than shortest-retention page overall
NOTE: All pages are still refreshed Extended refresh period safe only for populated pages OK for overall correctness
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200612
RAPIDRAPIDConventionalConventional RAPID-1RAPID-1
Data Objects
A, B, C, D
Program Sequence
Allocate A
Allocate B
Allocate C
Allocate D
Free B
Free D
A
B
C
D
0.5s
37s
3s
0.5s
41s
50s
22s
1s
A
B
C
D
3sRefresh Period
A
B
C
D
50s41s37s22s
RAPID-2RAPID-2
A
B
C
D
DC
50s41s37s22s
RAPID-3RAPID-3 RetentionTime
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200613
RAPID - RecapRAPID - Recap
Novel software-only approach– Can exploit off-the-shelf DRAMs to reduce refresh power
RAPID - 1– Eliminate outlier pages
RAPID - 2– RAPID - 1 and– Allocate longer-retention pages before allocating shorter-
retention pages
RAPID - 3– RAPID - 2 and– Continuously reconsolidate data to longest-retention
pages possible
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200614
Coupling RAPID with Off-the-shelf DRAMsCoupling RAPID with Off-the-shelf DRAMs
SELF-REFRESH
DRAM issues refresh commands internally at a refresh period that is fixed or based on a particular temperature range
Typically not programmable
Low power
Used in standby operation
AUTO-REFRESH
Regularly timed refresh commands issued by memory controller
Typically programmable
High power
Used in active mode
RAPID requires a single refresh period that can be adjusted
Refresh options in off-the-shelf DRAMs:
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200615
Coupling RAPID with Off-the-shelf DRAMs (cont.)Coupling RAPID with Off-the-shelf DRAMs (cont.)
Neither refresh option is fully satisfactory– Self-refresh not programmable– Auto-refresh programmable, but power-inefficient
Key observation– DRAMs allow enabling/disabling self-refresh via
configuration register
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200616
Coupling RAPID with Off-the-shelf DRAMs (cont.)Coupling RAPID with Off-the-shelf DRAMs (cont.)
Solution– Set up a periodic timer interrupt (INT1)
• Period = RAPID refresh period
– Set up another interrupt (INT2) • Triggers 64 ms (self-refresh period) after INT1
– INT1 triggered: • Enable self-refresh
– INT2 triggered:• Disable self-refresh
64 ms 64 ms 64 ms
INT1 INT2 INT1 INT2 INT1 INT2
10s 10sRAPID refresh period
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200617
RAPID – ImplementationRAPID – Implementation
INACTIVE LIST (free-list of inactive physical pages)
....
Conventional RAPID-1
RAPID-1: 1% of outlier pages are excluded from Inactive List
Modifications to routines which allocate and deallocate physical pages in memory
...
HEAD
TAIL
HEAD
TAIL
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200618
RAPID – Implementation (cont.)RAPID – Implementation (cont.)
INACTIVE LIST (free-list of inactive physical pages)
RAPID-2 and RAPID-3
RAPID-2 and RAPID-3Single Inactive List transformed into Multiple Inactive Lists
.....
7.7 – 345.3 – 40.6 40.6 – 35.950 – 45.3seconds
Refresh Period = 45.3sRefresh Period = 40.6sRefresh Period = 35.9s
RAPID-1
...
H
T
... ...
... ...
H
T
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200619
Related WorkRelated Work
Characterization[Hamamoto et al. 1998, IEEE Tran Electron Devices] On Retention Time Distribution of DRAM
Dual Period Refresh [Yanagisawa 1988, US Patent] Semiconductor Memory
Multiperiod Refresh[Kim and Papaefthymiou 2001, 2003, IEEE Tran VLSI Systems] Block-Based Multiperiod Refresh
Multiperiod Refresh + Selective[Ohsawa et al. 1998, ISPLED] Optimizing DRAM refresh Count for Merged DRAM/Logic LSIs
Placement + Single Refresh Period + Selective[Kai et al. 2002, US Patent] Semiconductor Circuit and Method of Controlling the Same
[Le et al. 2005, US Patent] Bank Address Mapping According to Bank Retention Time in DRAM
[Kawasaki et al. 2005, US Patent] DRAM and Refresh method thereof
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200620
Evaluation MethodologyEvaluation Methodology
Active Mode (5%), Idle Mode (95%)
Simulated timeline = 24 hours
Divide 24-hour timeline into time slices
Probability [time slice = active period] = 5%
Probability [time slice = idle period] = 95%
Inject random number of allocations/frees in active period– Vary DRAM utilization across timeline
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200621
Custom Hardware TechniquesCustom Hardware Techniques
HW-Multiperiod (HW-M)– Each DRAM page is refreshed at a tailored refresh period, that is
a multiple of shortest refresh period
HW-Multiperiod-Occupied (HW-M-O)– Same as HW-Multiperiod– Refresh only currently occupied pages
HW-Ideal (HW-I)– Each DRAM page is refreshed at its own tailored refresh period
HW-Ideal-Occupied (HW-I-O)– Same as HW-Ideal– Refresh only currently occupied pages
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200622
0
0.2
0.4
0.6
0.8
1
TC
R
R-1
R-2
R-3
HW
-M
HW
-I
HW
-M-O
HW
-I-O
no
rma
lize
d e
ne
rgy
w.r
.t. T
CR
70°C
0
0.2
0.4
0.6
0.8
1
TC
R
R-1
R-2
R-3
HW
-M
HW
-I
HW
-M-O
HW
-I-O
no
rma
lize
d e
ne
rgy
w.r
.t. T
CR
45°C
0
0.2
0.4
0.6
0.8
1
TC
R
R-1
R-2
R-3
HW
-M
HW
-I
HW
-M-O
HW
-I-O
no
rma
lize
d e
ne
rgy
w.r
.t. T
CR
25°C
RAPID – Refresh Energy SavingsRAPID – Refresh Energy Savings
83-95% energy savings at 25°C
Similar energy savings across all temperatures
Nearly as effective as custom hardware techniques
83% 93% 95%
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200623
0
0.2
0.4
0.6
0.8
1
1.2
TC
R
R-1
R-2
R-3E
ne
rgy
Co
ns
um
pti
on
(m
W h
ou
rs)
70°C
3.46
0
0.2
0.4
0.6
0.8
1
1.2
TC
R
R-1
R-2
R-3E
ne
rgy
Co
ns
um
pti
on
(m
W h
ou
rs)
45°C
1.62
0
0.2
0.4
0.6
0.8
1
1.2
TC
R
R-1
R-2
R-3
En
erg
y C
on
sum
pti
on
(m
W h
ou
rs)
25°C
RAPID vs. TCRRAPID vs. TCR
Worst-case non-temperature adjusted R-1 yields same or better energy than TCR
R-1 simpler, cost-effective alternative to TCR
R-2 and R-3 yield much higher energy savings than TCR
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200624
0
0.04
0.08
0.12
0.16
0.2
R-1
R-2
R-3
HW
-M
HW
-I
HW
-M-O
HW
-I-OE
ner
gy
Co
nsu
mp
tio
n (
mW
ho
urs
)
25% utilization (average)
0
0.04
0.08
0.12
0.16
0.2
R-1
R-2
R-3
HW
-M
HW
-I
HW
-M-O
HW
-I-OE
ner
gy
Co
nsu
mp
tio
n (
mW
ho
urs
)
50% utilization (average)
0
0.04
0.08
0.12
0.16
0.2
R-1
R-2
R-3
HW
-M
HW
-I
HW
-M-O
HW
-I-OE
ner
gy
Co
nsu
mp
tio
n (
mW
ho
urs
)
75% utilization (average)
Refresh Energy – Different DRAM UtilizationsRefresh Energy – Different DRAM Utilizations
R-1, HW-M and HW-I constant R-2 and R-3 yield more energy
savings than HW-M and HW-I as utilization decreases
R-2 and R-3 approach energy of HW-I-O
HW-I-O not retention-aware
NC STATE UNIVERSITY
Ravi Venkatesan © 2006 HPCA-12 02/14/200625
Summary – RAPIDSummary – RAPID
RAPID reduces refresh power to vanishingly small levels– Quasi-non-volatile DRAM
Software-only technique– No custom DRAM support required– Can exploit off-the-shelf DRAM
Approaches energy levels of idealized techniques, which require hardware support