A Delay-efficient Radiation-hard Digital Design Approach Using Code Word State Preserving (CWSP) Elements
Charu Nagpal, Rajesh Garg, Sunil P. Khatri
Department of Electrical and Computer Engineering,
Texas A&M University, College Station, TX
2
Outline
Motivation and Introduction
Previous Work
  Nicolaidis et al. (CWSP based)
Our Approach
  System Level
  Circuit Level
  Radiation protection analysis
  Maximum glitch width
Experimental Results
Conclusions and Future Work
3
Why radiation-hardening?
Historically used mainly for space and military electronics
  Higher levels of radiation in space and combat environments
Terrestrial electronics are also becoming vulnerable
  Shrinking feature size and supply voltage
  Reduced capacitances mean less charge is required to flip a node voltage
This has resulted in renewed interest in radiation-hardened circuit design
4
Effects of radiation particle strike
Neutrons, protons and heavy cosmic ions
Ions strike diffusion regions
  Deposit charge
  Result in a voltage spike at the output of a gate
This can flip the state of a memory cell (SEU)
For a combinational gate, this spike (SET) may cause an incorrect value to be sampled by the latches or flip-flops of the design
5
Modeling a radiation-strike
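Charge deposited (Q, in pC) at a node is given by
$$Q = 0.01036 \cdot L \cdot t$$
where: L is the Linear Energy Transfer (MeV-cm²/mg)
       t is the depth of the collection volume (µm)
The resulting current pulse is traditionally described as
$$I_{seu}(t) = \frac{Q}{\tau_\alpha - \tau_\beta}\left(e^{-t/\tau_\alpha} - e^{-t/\tau_\beta}\right)$$
where: τα is the collection time constant
       τβ is the ion track establishment constant
[Figure: Iseu (µA) versus time (ns) for Q = 100fC, τα = 200ps and τβ = 50ps]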
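The current pulse on this slide can be explored numerically. The sketch below assumes the commonly used double-exponential form, I(t) = Q/(τα - τβ) · (exp(-t/τα) - exp(-t/τβ)), with the slide's values Q = 100fC, τα = 200ps and τβ = 50ps. The function name `i_seu` and the integration settings are illustrative choices, not taken from the slides.

```python
import math

def i_seu(t, q, tau_alpha, tau_beta):
    """Strike current in amperes at time t (s) for deposited charge q (C)."""
    if t < 0:
        return 0.0
    scale = q / (tau_alpha - tau_beta)
    return scale * (math.exp(-t / tau_alpha) - math.exp(-t / tau_beta))

# Values quoted on the slide
Q = 100e-15          # 100 fC
TAU_ALPHA = 200e-12  # collection time constant, 200 ps
TAU_BETA = 50e-12    # ion track establishment constant, 50 ps

# The pulse peaks where its time derivative is zero.
t_peak = (TAU_ALPHA * TAU_BETA / (TAU_ALPHA - TAU_BETA)) * math.log(TAU_ALPHA / TAU_BETA)
i_peak = i_seu(t_peak, Q, TAU_ALPHA, TAU_BETA)

# Trapezoidal integration over 5 ns; the result should recover Q.
STEPS = 50_000
dt = 5e-9 / STEPS
samples = [i_seu(k * dt, Q, TAU_ALPHA, TAU_BETA) for k in range(STEPS + 1)]
total_charge = dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))

print(f"peak {i_peak * 1e6:.1f} uA at {t_peak * 1e12:.1f} ps, "
      f"charge {total_charge * 1e15:.2f} fC")
```

Integrating the pulse back to Q is a quick sanity check that the prefactor Q/(τα - τβ) normalizes the waveform correctly.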
6
Previous Work
Classification of techniques for radiation-tolerant design:
Device level
  M. Takai et al., "Soft error susceptibility and immune structures in dynamic random access memories (DRAM's) investigated by nuclear microprobes," IEEE Trans. Nucl. Sci., Feb 1996
Circuit level
  Zhou et al., "Transistor sizing for radiation hardening," Proc. Int. Reliability Physics Symp., 2004
System level
  S. Mitra et al., "Robust system design with built-in soft-error resilience," IEEE Computer, Feb 2005
Redundancy-based techniques (TMR)
  Hardware redundancy – space dimension
  Temporal redundancy – time dimension
7
CWSP Element
This paper is based on the use of Code Word State Preserving elements
  M. Nicolaidis, "Time redundancy based soft-error tolerance to rescue nanometer technologies," VLSI Test Symposium, April 1999
What is a CWSP element and how does it work?
[Figure: an inverter (input a, output out) next to the CWSP element of an inverter (inputs a, a*); a NAND2 gate (inputs a, b) next to the CWSP element of NAND2 (inputs a, b, a*, b*)]
8
How does the CWSP element work?
[Figure: original circuit (latch/FF, combinational block with k logic gates, OUT) and the protected circuit (latch/FF, combinational block with k-1 logic gates, CWSP of the kth gate, delay δ, signals P and P*, output CW OUT); timing diagram of P, P* and CW OUT at t1, t2, t3]
Delay of 2δ + DCWSP - Dk in the functional path
  where DCWSP: delay of the CWSP element
        Dk: delay of the kth gate
Use only one type of CWSP element, that of an inverter
  Reduces speed degradation due to body effect
  Avoids the need to have a library of CWSP gates
  However, the complement of the combinational function needs to be implemented
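The glitch-filtering idea behind the CWSP element of an inverter can be illustrated with a small behavioural model. This is a sketch under stated assumptions, not the transistor-level circuit: time is discretized into steps, P* is taken to be a copy of P delayed by δ steps, and the element drives NOT(P) when P and P* agree and holds its previous output otherwise. The names `cwsp_inverter` and `simulate` are hypothetical.

```python
def cwsp_inverter(a, a_star, prev_out):
    """Drive NOT(a) when both input copies agree; otherwise preserve the state."""
    if a == a_star:
        return 1 - a
    return prev_out

def simulate(signal, delta):
    """Feed P and its delta-step delayed copy P* through the CWSP inverter.

    signal: list of 0/1 samples of P, one per time step.
    delta:  delay of the P* copy, in time steps (assumed model of the delay δ).
    """
    out = []
    prev = 1 - signal[0]  # assume the element starts settled on the first sample
    for i, p in enumerate(signal):
        p_star = signal[i - delta] if i >= delta else signal[0]
        prev = cwsp_inverter(p, p_star, prev)
        out.append(prev)
    return out

DELTA = 5

# A glitch exactly DELTA steps wide never overlaps its delayed copy.
narrow = [0] * 40
for i in range(10, 10 + DELTA):
    narrow[i] = 1
filtered = simulate(narrow, DELTA)

# A wider glitch overlaps its delayed copy and leaks through.
wide = [0] * 40
for i in range(10, 10 + DELTA + 3):
    wide[i] = 1
leaked = simulate(wide, DELTA)

print("narrow glitch filtered:", all(v == 1 for v in filtered))
print("wide glitch leaks:", 0 in leaked)
```

In this model a glitch no wider than δ is never seen on P and P* at the same time, so the output holds its value; a wider glitch overlaps its own delayed copy and propagates.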
9
Our Approach - Architecture
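[Figure: original circuit (LOGIC driving an FF, with D, clk and OUT) and the hardened circuit with LOGIC, Modified FF, δ delay, CWSP of the inverter, CW FF, EQ FF, GLB FF and a 1/0 MUX; signals D, OUT, P, P*, CW, CW*, EQ, EQGLB, EQGLBF, clk, clk_d]
Our Approach – Abstract Level
[Figure: LOGIC and FF (D, clk, OUT) with two added blocks, one that detects a radiation strike and one that calculates the correct value]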
10
Our Approach - Timing
[Figure: timing diagram of clk, D, P, P*, OUT, CW, EQ, clk_d, EQGLB and EQGLBF with δ intervals marked, next to the hardened circuit (Modified FF, δ delay, CWSP of the inverter, CW FF, EQ FF, GLB FF, 1/0 MUX)]
The probability of more than one radiation strike in 2 consecutive cycles is 10^-10. This is exploited by introducing the MUX shown.
11
Our Approach – Circuit details
[Figure: hardened circuit with LOGIC, Modified FF, δ delay, CWSP of the inverter, CW FF, EQ FF, GLB FF and a 1/0 MUX; signals D, OUT, P, P*, CW, CW*, EQ, EQGLB, EQGLBF, clk, clk_d]
12
Our Approach – Circuit details
Circuit level design – Modified FF
[Figure: transistor-level schematic of the Modified FF, with signals D, CW*, EQGLB, clk and OUT]
13
Our Approach – Circuit details
Architecture level design
Show the low-level design for each component
[Figure: hardened circuit with LOGIC, Modified FF, δ delay, CWSP of the inverter, CW FF, EQ FF, GLB FF and a 1/0 MUX, together with the transistor-level CWSP element (signals P, P*, CW; labels POLY2, POLY2P)]
14
Our Approach – Circuit details
EQ, EQGLB and EQGLBF
[Figure: circuits for EQ, EQGLB and EQGLBF; labels CW, OUT, EQ, EQ FF, clk_d, GLB FF, clk, EQGLB, EQGLBF and a 1/0 MUX]
15
Analysis of radiation-hardening
Analyze radiation strikes at various nodes
[Figure: hardened circuit with LOGIC, Modified FF, δ delay, CWSP of the inverter, CW FF, EQ FF, GLB FF and a 1/0 MUX; signals D, OUT, P, P*, CW, CW*, EQ, EQGLB, EQGLBF, clk, clk_d]
16
Maximum glitch width
DMIN constraint
[Figure: hardened circuit with Modified FF, δ delay, CWSP of the inverter, CW FF, EQ FF and a 1/0 MUX; signals D, OUT, P, P*, CW, CW*, EQ, EQGLB, EQGLBF, clk, clk_d]
The inputs of the CWSP element should be stable for at least 2δ to harden against a glitch of width δ on any of the inputs of the CWSP element
17
Maximum glitch width
DMAX constraint
[Figure: hardened circuit with Modified FF, δ delay, CWSP of the inverter, CW FF, EQ FF and a 1/0 MUX; signals D, OUT, P, P*, CW, CW*, EQ, EQGLB, EQGLBF, clk, clk_d]
If there is a radiation strike at the D input of the modified FF, CW* should be ready one setup time before the next positive edge of the system clock 'clk'
18
Experimental Setup
Circuit simulation is performed in SPICE
  65nm BPTM model card is used
  VDD = 1V
  VTN = |VTP| = 0.22V
Radiation strike was modeled as a current source (τα: collection time constant, τβ: ion track establishment constant)
  Q = 100fC, τα = 200ps and τβ = 50ps
  Q = 150fC, τα = 200ps and τβ = 50ps
LGsynth93 and ISCAS85 benchmark circuits
19
DMIN and DMAX constraints
Using τα = 200ps, τβ = 50ps
  For Q = 100fC, δ = 500ps
  For Q = 150fC, δ = 600ps
∆ = 405ps
  For Q = 100fC, DMIN ≥ 1000ps
  For Q = 150fC, DMIN ≥ 1200ps
  For Q = 100fC, DMAX ≥ 1405ps
  For Q = 150fC, DMAX ≥ 1605ps
So, any design with DMIN > 1000ps and DMAX > 1405ps can be protected against glitches up to 500ps wide (for Q = 100fC, τα = 200ps and τβ = 50ps)
For designs with DMIN < 1000ps and DMAX < 1405ps, protect up to:
$$\delta = \min\left(\frac{D_{MIN}}{2}, \frac{D_{MAX} - \Delta}{2}\right)$$
Brayton et al.: delay balancing is done in industrial designs, DMIN = 80% of DMAX
[Figure: gate output voltage (V) versus time (ns)]
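The numbers on this slide are consistent with two constraints, DMIN ≥ 2δ and DMAX ≥ 2δ + ∆. The sketch below encodes that reading, which is an inference from the quoted values rather than a formula stated on the slide, and inverts it to estimate the widest glitch a faster design can tolerate. The function names are hypothetical, and the 80% ratio between DMIN and DMAX is the delay-balancing assumption quoted on the slide.

```python
BIG_DELTA_PS = 405.0  # the value ∆ quoted on the slide

def required_delays(glitch_ps, big_delta_ps=BIG_DELTA_PS):
    """Return the (DMIN, DMAX) lower bounds in ps for a glitch of width glitch_ps.

    Assumes DMIN >= 2*delta and DMAX >= 2*delta + ∆, inferred from the slide's numbers.
    """
    return 2 * glitch_ps, 2 * glitch_ps + big_delta_ps

def max_protected_glitch(dmin_ps, dmax_ps, big_delta_ps=BIG_DELTA_PS):
    """Widest glitch (ps) that satisfies both assumed constraints."""
    return min(dmin_ps / 2, (dmax_ps - big_delta_ps) / 2)

# Reproduce the thresholds quoted for Q = 100fC and Q = 150fC.
print(required_delays(500.0))   # DMIN and DMAX bounds for delta = 500 ps
print(required_delays(600.0))   # DMIN and DMAX bounds for delta = 600 ps

# A design with DMAX = 1230.12 ps and DMIN = 80% of DMAX.
dmax = 1230.12
print(round(max_protected_glitch(0.8 * dmax, dmax), 2))
```

With DMIN at 80% of DMAX, the DMAX term is the binding one for the small designs considered here, so the tolerable glitch width is set by (DMAX - ∆)/2.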
20
Area overhead
Area overhead calculation
[Figure: hardened circuit with LOGIC, Modified FF, δ delay, CWSP of the inverter, CW FF, EQ FF, GLB FF and a 1/0 MUX, annotated to separate the parts needed for every flip-flop in the circuit from the parts needed as one copy for the entire circuit]
21
Delay overhead
Delay for the unprotected circuit
  DMAX + Tsetup + TCO = DMAX + 40ps + 69ps
Delay for the protected circuit
  DMAX + Tsetup + TCO + Dinput_load = DMAX + 38ps + 76ps + 6.5ps
Dinput_load is the increase in delay of the combinational circuit output (due to the increased load on the D input of the modified flip-flop of our design)
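The two delay expressions can be turned into a short calculation. The sketch below uses the setup, clock-to-output and input-load values quoted on this slide; the function names are hypothetical, and the sample DMAX values are taken from the results tables later in the deck.

```python
def clock_period_ps(dmax_ps, t_setup_ps, t_co_ps, d_input_load_ps=0.0):
    """Critical delay in ps: DMAX + Tsetup + TCO (+ Dinput_load when hardened)."""
    return dmax_ps + t_setup_ps + t_co_ps + d_input_load_ps

def delay_overhead_pct(dmax_ps):
    """Percentage delay increase of the protected circuit over the unprotected one."""
    unprotected = clock_period_ps(dmax_ps, 40.0, 69.0)
    protected = clock_period_ps(dmax_ps, 38.0, 76.0, 6.5)
    return 100.0 * (protected - unprotected) / unprotected

for name, dmax in [("alu2", 1624.54), ("C6288", 5141.06)]:
    print(name, round(delay_overhead_pct(dmax), 2))
```

The absolute penalty is a fixed 11.5ps (120.5ps versus 109ps of flip-flop and load delay), so the relative overhead shrinks as DMAX grows.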
22
Experimental Results
Q = 150fC, τα = 200ps and τβ = 50ps
DMIN ≥ 1200ps, DMAX ≥ 1605ps

Circuit  Area          Area        Area       Dmax     Delay         Delay       Delay
         (Unhardened)  (Hardened)  Ovh. (%)   (ps)     (Unhardened)  (Hardened)  Ovh. (%)
alu2     28.25         37.29       32.00      1624.54  1733.54       1745.04     0.66
alu4     53.88         65.88       22.27      1700.28  1809.28       1820.78     0.64
apex2    399.67        404.28      1.15       2069.55  2178.55       2190.05     0.53
C3540    97.83         130.53      33.43      1931.05  2040.05       2051.55     0.56
C6288    223.59        271.09      21.24      5141.06  5250.06       5261.56     0.22
seq      421.60        473.53      12.32      2936.80  3045.80       3057.30     0.38
C7552    187.68        347.62      85.23      2472.79  2581.79       2593.29     0.45
C880     36.15         74.78       106.83     1692.80  1801.80       1813.30     0.64
Average                            39.31                                         0.51
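As a cross-check on the area columns, the overhead percentages can be recomputed from the unhardened and hardened areas listed in this table. This is only an arithmetic sketch; because the listed areas are rounded to two decimals, individual rows can differ from the table in the second decimal place.

```python
# (unhardened area, hardened area) per circuit, copied from the Q = 150fC table
AREAS = {
    "alu2": (28.25, 37.29),
    "alu4": (53.88, 65.88),
    "apex2": (399.67, 404.28),
    "C3540": (97.83, 130.53),
    "C6288": (223.59, 271.09),
    "seq": (421.60, 473.53),
    "C7552": (187.68, 347.62),
    "C880": (36.15, 74.78),
}

def area_overhead_pct(unhardened, hardened):
    """Percentage area increase of the hardened circuit."""
    return 100.0 * (hardened - unhardened) / unhardened

overheads = {name: area_overhead_pct(u, h) for name, (u, h) in AREAS.items()}
average = sum(overheads.values()) / len(overheads)

for name, pct in overheads.items():
    print(f"{name:6s} {pct:7.2f}")
print(f"average {average:.2f}")
```

The small circuits dominate the average: a fixed per-flip-flop cost is a large fraction of a design such as C880 but nearly negligible for apex2.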
23
Experimental Results
Q = 100fC, τα = 200ps and τβ = 50ps
DMIN ≥ 1000ps, DMAX ≥ 1405ps

Circuit  Area          Area        Area       Dmax     Delay         Delay       Delay
         (Unhardened)  (Hardened)  Ovh. (%)   (ps)     (Unhardened)  (Hardened)  Ovh. (%)
alu2     28.25         36.38       28.78      1624.54  1733.54       1745.04     0.66
alu4     53.88         64.66       20.02      1700.28  1809.28       1820.78     0.64
apex2    399.67        403.82      1.04       2069.55  2178.55       2190.05     0.53
C1908    43.66         77.01       76.38      1562.65  1671.65       1683.15     0.69
C3540    97.83         127.19      30.02      1931.05  2040.05       2051.55     0.56
C6288    223.59        266.23      19.07      5141.06  5250.06       5261.56     0.22
C7552    187.68        331.22      76.48      2472.79  2581.79       2593.29     0.45
C880     36.15         70.83       95.91      1692.80  1801.80       1813.30     0.64
seq      421.60        468.22      11.06      2936.80  3045.80       3057.30     0.38
C5315    152.17        315.63      107.42     1475.91  1584.91       1596.41     0.73
dalu     65.59         87.00       32.63      1489.09  1598.09       1609.59     0.72
Average                            45.34                                         0.56
24
Experimental Results
For DMIN < 1000ps and DMAX < 1405ps, protect up to the pulse width listed:

Circuit    Area          Area        Area       Dmax     Delay         Delay       Delay      Pulse
           (Unhardened)  (Hardened)  Ovh. (%)   (ps)     (Unhardened)  (Hardened)  Ovh. (%)   Width (ps)
apex3      139.13        208.59      49.93      1230.12  1339.12       1350.62     0.86       412.56
b11_opt_C  55.43         104.70      88.90      1270.95  1379.95       1391.45     0.83       432.97
C1355      46.01         88.65       92.67      1012.19  1121.19       1132.69     1.03       303.60
C432       15.12         24.58       62.54      1385.39  1494.39       1505.89     0.77       490.19
C499       46.01         88.65       92.67      1012.19  1121.19       1132.69     1.03       303.60
ex5p       178.18        264.90      48.67      1195.08  1304.08       1315.58     0.88       395.04
k2         88.53         151.36      70.97      1170.34  1279.34       1290.84     0.90       382.67
apex1      111.43        174.26      56.39      982.90   1091.90       1103.40     1.05       288.95
ex4p       17.59         24.40       38.66      630.38   739.38        750.88      1.56       112.69
Average                              66.82                                         0.99
25
Conclusions and Future Work
With the proposed approach:
  For Q = 150fC (100fC), τα = 200ps and τβ = 50ps: delay overhead 0.51 (0.56)%, area overhead 39.31 (45.34)%
  For circuits with DMIN < 1000ps or DMAX < 1405ps: protect against correspondingly narrower glitches
  Delay overhead < 1%, suited to high-speed, delay-critical applications
Combine the proposed approach with gate sizing
  Attenuate glitch width using sizing
  δ is then smaller, so DMIN and DMAX are smaller as well

Approach           Area Ovh.  Delay Ovh.  Protection
Proposed           45.34%     0.56%       100%
Mohanram et al.    42.95%     2.80%       90%
Nicolaidis et al.  17.65%     28.65%      100%
26
Thank You!