Energy-Efficient Cache Design Using
Variable-Strength Error-Correcting Codes
Alaa R. Alameldeen, Ilya Wagner, Zeshan Chishti, Wei Wu, Chris Wilkerson, Shih-Lien Lu
Intel Corporation
2011010631
Presenter : Gyu Seong, Kang
[email protected] ( 2 )
Contents
Background
Variable Strength ECC
Three types of VS-ECC
Cache characterization
Cache operation in low-voltage mode
ECC overhead
Simulation
Conclusion
Background
Energy efficiency is the main design concern
Reducing the supply voltage is one of the most effective methods
This approach is limited by process variation
The minimum operating voltage, 'Vccmin'
Failures in cache memory determine the minimum voltage
Background
The probability of a multi-bit failure in a cache line is significantly lower than that of zero or one failures.
Probability of a single-bit failure (pBitFail) and probability of e failures in a 64B cache line
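The relation between pBitFail and the probability of e failures in a line can be illustrated with a binomial model. This is a minimal sketch that assumes bit failures are independent; the function names and the example pBitFail value are hypothetical and are not taken from the slides.

```python
from math import comb

LINE_BITS = 512  # 64B cache line = 512 bits


def prob_failures(e: int, p_bit_fail: float, n: int = LINE_BITS) -> float:
    """Probability that exactly e of n bits fail (assumes independent failures)."""
    return comb(n, e) * p_bit_fail ** e * (1.0 - p_bit_fail) ** (n - e)


def prob_at_least(e: int, p_bit_fail: float, n: int = LINE_BITS) -> float:
    """Probability that e or more of n bits fail."""
    return 1.0 - sum(prob_failures(k, p_bit_fail, n) for k in range(e))


if __name__ == "__main__":
    p = 1e-4  # hypothetical pBitFail, for illustration only
    for e in range(4):
        print(f"P({e} failures) = {prob_failures(e, p):.3e}")
    print(f"P(2 or more failures) = {prob_at_least(2, p):.3e}")
```

Under this model, lines with two or more failing bits are far rarer than lines with zero or one, which is the observation the slide relies on.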
Background
Error correcting code (ECC)
  Recovers errors using additional parity bits
  The number of parity bits is proportional to the number of correctable errors
With ECC, the operating voltage can be reduced even further, resulting in lower power consumption.
Selective high-strength protection of a few cache lines
  SECDED (single error correcting, double error detecting)
  Multi-bit ECC for lines with one or more failures
Probability of a single set persistent failure in a 16-way cache with DECTED or VS-ECC
Variable strength ECC
Based on the number of failing bits
  Different strengths for different cache lines
    SECDED
    Multi-bit ECC
Additional tag information to distinguish cache line class
Three types of VS-ECC are proposed
  VS-ECC with a fixed number of regular and extended ECC
  VS-ECC with line disable + fixed number of regular and extended ECC
  VS-ECC with a variable number of correction bits (1 to 4)
Variable strength ECC
VS-ECC with a fixed number of regular and extended ECC
  Extended ECC bit
    SECDED: set to 0
    Multi-bit ECC: set to 1
  Additional parity data is stored in the Extended ECC array.
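The per-line Extended ECC bit can be pictured as a small data structure. This is a minimal sketch, not the hardware layout: the class and method names are hypothetical, and the values of 16 ways and 4 extended ECC fields per set are taken from the later characterization and simulation slides.

```python
from dataclasses import dataclass, field
from typing import List

WAYS_PER_SET = 16          # 16-way L2 cache (simulation setup slide)
EXT_ECC_SLOTS_PER_SET = 4  # 4 eECC fields per set (characterization slide)


@dataclass
class Line:
    e_bit: int = 0  # 0: protected by SECDED, 1: protected by multi-bit ECC


@dataclass
class CacheSet:
    lines: List[Line] = field(
        default_factory=lambda: [Line() for _ in range(WAYS_PER_SET)]
    )

    def extended_ways(self) -> List[int]:
        """Ways whose parity data lives in the Extended ECC array."""
        return [way for way, line in enumerate(self.lines) if line.e_bit == 1]

    def is_protectable(self) -> bool:
        """True if the set has enough extended ECC fields for its E-bit lines."""
        return len(self.extended_ways()) <= EXT_ECC_SLOTS_PER_SET
```

For example, a set in which ways 0 to 3 have the E-bit set is still protectable, while a fifth E-bit line exceeds the fixed number of extended fields.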
Variable strength ECC
VS-ECC with line disable + fixed number of regular and extended ECC
  Extended ECC bit
    SECDED: 0
    Multi-bit ECC: 1
  Additional parity data is saved in the Extended ECC array
  Disable bit
    Disables a cache line with more than two persistent failures in the low-voltage mode
    Better soft error coverage
Variable strength ECC
VS-ECC with a variable number of correction bits (1 to 4)
  Number of ECC blocks
    SECDED to 4EC5ED
  Pointer to Extended ECC block
    ECC data address for the Extended ECC block
Variable strength ECC
Cache lines need to be classified for low-voltage mode
  4 eECC fields per cache set
    Only 4 lines can be active and contain protected data.
    The rest of the cache is inactive and under test.
Cache characterization
  Reset all the E-bits and valid bits for inactive cache blocks.
  Write back all the dirty data.
  Reduce the processor voltage to the target Vccmin.
  Associate the 4 eECC fields with the first 4 ways.
  Deactivate the rest of the ways (a loss of 75% of cache capacity).
  Use the multi-bit ECC encoder/decoder for read/write operations during characterization.
Variable strength ECC
Memory test for the inactive region
  Uses a traditional memory test method
    Write a pre-defined pattern and read it back
  If a bit failure is detected during the test
    Multi-bit failure
      Set the E-bit
    Single-bit failure
      Write its location into the line's tag array
      If a single-bit failure occurs again in the same line later in the test, compare its location with the one stored in the tag
        Match: the line uses SECDED
        Mismatch: multi-bit failure, set the E-bit
  The test continues until 5 or more E-bits are set to 1 or the algorithm completes.
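The per-line classification rule above can be sketched as follows. This is a simplified illustration under stated assumptions: each test pass is represented as a list of failing bit positions for one line, and the names LineState, record_test_pass, and classify are hypothetical. The stopping condition on the number of E-bits is not modeled.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class LineState:
    e_bit: int = 0                  # 1 once the line needs multi-bit ECC
    fail_pos: Optional[int] = None  # single-bit failure location kept in the tag


def record_test_pass(line: LineState, failing_bits: List[int]) -> None:
    """Update one line with the failing bit positions seen in one test pass."""
    if line.e_bit or not failing_bits:
        return  # already classified as multi-bit, or no failure in this pass
    if len(failing_bits) > 1:
        line.e_bit = 1                   # multi-bit failure in a single pass
    elif line.fail_pos is None:
        line.fail_pos = failing_bits[0]  # first single-bit failure: record location
    elif line.fail_pos != failing_bits[0]:
        line.e_bit = 1                   # second, different location: multi-bit


def classify(passes: Iterable[List[int]]) -> str:
    """Return the protection class of a line after all test passes."""
    line = LineState()
    for failing_bits in passes:
        record_test_pass(line, failing_bits)
    return "multi-bit ECC" if line.e_bit else "SECDED"
```

For instance, classify([[7], [], [7]]) keeps the line on SECDED because the same bit fails both times, while classify([[7], [9]]) sets the E-bit.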
Variable strength ECC
Cache characterization for VS-ECC-Variable
  Same characterization as VS-ECC-Fixed
  Additional step to determine the exact number of error bits
    ¼ of the cache is tested at a time
    Requires higher testing accuracy
Cache characterization for VS-ECC-Disable
  Same characterization as VS-ECC-Fixed
  Functions correctly with lower testing accuracy
Variable strength ECC – Operation
Flow chart of cache operation in low voltage mode for VS-ECC-Fixed
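The flow chart itself is not reproduced in this text. As a rough illustration of the idea that the E-bit selects the decoder on a hit, the sketch below uses the latencies listed in the simulation setup (12-cycle L2 hit, 1 cycle for SECDED, 15 cycles for 4EC5ED). The function names and the fraction of extended lines are hypothetical.

```python
L2_HIT_CYCLES = 12        # L2 hit latency from the simulation setup
SECDED_DECODE_CYCLES = 1  # SECDED decode latency
EXT_DECODE_CYCLES = 15    # 4EC5ED decode latency


def hit_latency(e_bit: int) -> int:
    """Hit latency in cycles: the E-bit selects which decoder the line uses."""
    decode = EXT_DECODE_CYCLES if e_bit else SECDED_DECODE_CYCLES
    return L2_HIT_CYCLES + decode


def average_hit_latency(frac_extended: float) -> float:
    """Average hit latency if a fraction of hits go to E-bit lines (assumed input)."""
    return (1.0 - frac_extended) * hit_latency(0) + frac_extended * hit_latency(1)


if __name__ == "__main__":
    print(hit_latency(0), hit_latency(1))
    print(average_hit_latency(0.25))  # 0.25 is an illustrative value only
```

Only lines marked with the E-bit pay the longer decode latency, so the average cost depends on how often those lines are hit.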
ECC Overhead
Binary BCH code
  Parity bits for 64B (2^9 = 512b) data
    10 bits per 1-bit correction
    Additional 1 bit for detection
  SECDED
    10 bits + 1 bit
    1-cycle latency for decoding
  4EC5ED
    40 bits + 1 bit
    15-cycle latency for decoding
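The check-bit counts above follow a simple rule: 10 bits per corrected error plus 1 bit for detection. The sketch below reproduces that arithmetic and also shows why 10 bits per correction is needed for 512 data bits, using the standard binary BCH bound that a code over GF(2^m) has length at most 2^m - 1 and needs at most m parity bits per corrected error. The function names are hypothetical.

```python
DATA_BITS = 512     # 64B cache line
DETECTION_BITS = 1  # one extra bit for detection, per the slide


def bch_field_degree(data_bits: int, t: int) -> int:
    """Smallest m such that data plus m*t parity bits fit in 2^m - 1 bits."""
    m = 1
    while (1 << m) - 1 < data_bits + m * t:
        m += 1
    return m


def check_bits(t: int, data_bits: int = DATA_BITS) -> int:
    """Check bits for t-error correction plus detection of t+1 errors."""
    return bch_field_degree(data_bits, t) * t + DETECTION_BITS


if __name__ == "__main__":
    for name, t in [("SECDED", 1), ("DECTED", 2), ("4EC5ED", 4)]:
        bits = check_bits(t)
        print(f"{name}: {bits} check bits, {100.0 * bits / DATA_BITS:.1f}% overhead")
```

With 512 data bits, m = 9 is too small because 2^9 - 1 = 511, so each corrected error costs 10 bits, which matches the 10 + 1 and 40 + 1 figures on the slide.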
Simulation setup
Cycle-accurate, execution-driven IA32 simulator
  OOO model based on Intel Core i7, 2 GHz
  32 KB, 8-way set-associative icache, dcache
  2MB, 16-way set-associative L2 cache
  64-byte cache line
Simulation setup
ECC configuration
  Baseline: all SECDED
    12-cycle L2 hit + 1 cycle for SECDED
  Fixed-strength ECC
    DECTED: additional 1 cycle for ECC
    4EC5ED: additional 15 cycles for ECC
  MS-ECC [Chishti et al., MICRO 2009]
    4-bit error correction per segment (64 bits)
    1MB 8-way L2 cache with 1-cycle latency (Data : ECC = 1:1)
  VS-ECC-Fixed: 12 x SECDED + 4 x 4EC5ED
  VS-ECC-Disable: 12 x SECDED + 4 x 4EC5ED
  VS-ECC-Variable: 16 x SECDED + 12 x 10-bit ECC block
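To compare the ECC storage implied by these configurations, the sketch below totals the check bits per 16-way set using the sizes from the ECC overhead slide (10 bits per corrected error plus 1 detection bit). It is an illustrative calculation only: it ignores tag-side state such as E-bits, disable bits, and pointers, and the dictionary and function names are hypothetical.

```python
BITS_PER_CORRECTION = 10  # per corrected error, for a 512-bit line
DETECTION_BITS = 1
WAYS = 16                 # 16-way L2 cache


def check_bits(t: int) -> int:
    """Check bits for a code that corrects t errors and detects t+1."""
    return BITS_PER_CORRECTION * t + DETECTION_BITS


# Each configuration is a list of (count, bits per field) pairs for one set.
CONFIGS = {
    "Baseline (all SECDED)": [(WAYS, check_bits(1))],
    "DECTED": [(WAYS, check_bits(2))],
    "4EC5ED": [(WAYS, check_bits(4))],
    "VS-ECC-Fixed": [(12, check_bits(1)), (4, check_bits(4))],
    "VS-ECC-Variable": [(16, check_bits(1)), (12, BITS_PER_CORRECTION)],
}


def per_set_check_bits(name: str) -> int:
    """Total ECC check bits stored per cache set for a configuration."""
    return sum(count * bits for count, bits in CONFIGS[name])


if __name__ == "__main__":
    for name in CONFIGS:
        print(f"{name}: {per_set_check_bits(name)} check bits per set")
```

Under these assumptions, VS-ECC-Fixed and VS-ECC-Variable need the same number of check bits per set, well below a uniform 4EC5ED cache.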
Simulation result – Reliability
Failure probability as a function of supply voltage for different configurations
Simulation result – Energy efficiency
[Figure: Normalized Vccmin, frequency, power, and EPI for Baseline, DECTED, 4EC5ED, MS-ECC, VS-ECC-Fixed, VS-ECC-Variable, and VS-ECC-Disable]
Baseline Vccmin: 830 mV
Baseline frequency: 2000 MHz
Conclusion
Low supply voltage condition
  A few cache lines exhibit multi-bit failures
  The majority of lines exhibit zero or one errors
Variable-strength error correcting codes
  Selective high-strength protection of a few cache lines
    Lines with no failure: SECDED to cover soft errors
    Lines with persistent failures: 4EC5ED
  Additional bit to support multi-bit ECC control
  Three types of VS-ECC are proposed
    VS-ECC-Fixed
    VS-ECC-Variable
    VS-ECC-Disable
VS-ECC
  Avoids significant decreases in cache capacity
  Incurs minimal additional area overhead
  VS-ECC-Disable performs even better than the previously published MS-ECC in terms of power and EPI
Q & A ?