CMP238: Projeto e Teste de Sistemas VLSI
Marcelo Lubaszewski
Lecture 3 - Testing
PPGC - UFRGS
2005/I
Lecture 3 - Fault Simulation and Fault Injection
• Fault simulation
• Fault simulation algorithms
– Serial
– Parallel
– Deductive
– Concurrent
• Random fault sampling
• Fault injection
– ASIC
– FPGAs
Problem and Motivation
• Fault simulation problem: Given
– A circuit
– A sequence of test vectors
– A fault model
• Determine
– Fault coverage: the fraction (or percentage) of modeled faults detected by the test vectors
– The set of undetected faults
• Motivation
– Determine test quality and, in turn, product quality
– Find undetected fault targets to improve tests
Fault simulator in a VLSI Design Process
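[Figure: flowchart. The verified design netlist, the verification input stimuli, the modeled fault list, and the test vectors feed the fault simulator. The simulator removes tested faults from the fault list, and a test compactor deletes vectors. If the fault coverage is low, a test generator adds vectors; if it is adequate, the process stops.]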
Fault Simulation Scenario
• Circuit model: mixed-level
– Mostly logic with some switch-level for high-impedance (Z) and bidirectional signals
– High-level models (memory, etc.) with pin faults
• Signal states: logic
– Two (0, 1) or three (0, 1, X) states for purely Boolean logic circuits
– Four states (0, 1, X, Z) for sequential MOS circuits
• Timing:
– Zero-delay for combinational and synchronous circuits
– Mostly unit-delay for circuits with feedback
Fault Simulation Scenario (continued)
• Faults:
– Mostly single stuck-at faults
– Sometimes stuck-open, transition, and path-delay faults; analog circuit fault simulators are not yet in common use
– Equivalence fault collapsing of single stuck-at faults
– Fault dropping: a fault once detected is dropped from consideration as more vectors are simulated; fault dropping may be suppressed for diagnosis
– Fault sampling: a random sample of faults is simulated when the circuit is large
Fault Simulation Algorithms
• Serial
• Parallel
• Deductive
• Concurrent
Serial Algorithm
• Algorithm: Simulate the fault-free circuit and save the responses. Repeat the following steps for each fault in the fault list:
– Modify the netlist by injecting one fault
– Simulate the modified netlist, vector by vector, comparing responses with the saved responses
– If a response differs, report fault detection and suspend simulation of the remaining vectors
• Advantages:
– Easy to implement; needs only a simulator, less memory
– Most faults, including analog faults, can be simulated
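The serial algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the netlist is hard-coded as a small circuit in which b fans out to c and d, e = AND(a, c), f = NOT(d), and g = OR(e, f), with g as the only primary output. That topology is an assumption chosen for the sketch, and the function names are hypothetical.

```python
# Assumed circuit (hypothetical netlist for this sketch):
# b fans out to c and d; e = AND(a, c); f = NOT(d); g = OR(e, f).
LINES = ["a", "b", "c", "d", "e", "f", "g"]


def simulate(vector, fault=None):
    """Evaluate the circuit for one vector; fault is (line, stuck_value) or None."""
    values = {}

    def drive(line, value):
        # Inject the single stuck-at fault by overriding the driven value.
        if fault is not None and fault[0] == line:
            value = fault[1]
        values[line] = value

    drive("a", vector["a"])
    drive("b", vector["b"])
    drive("c", values["b"])
    drive("d", values["b"])
    drive("e", values["a"] & values["c"])
    drive("f", 1 - values["d"])
    drive("g", values["e"] | values["f"])
    return values["g"]  # g is the only primary output


def serial_fault_simulation(vectors, faults):
    """Return {fault: index of the first detecting vector}."""
    good = [simulate(v) for v in vectors]  # fault-free responses, saved once
    detected = {}
    for fault in faults:
        for index, vector in enumerate(vectors):
            if simulate(vector, fault) != good[index]:
                detected[fault] = index
                break  # suspend the remaining vectors for this fault
    return detected


all_faults = [(line, v) for line in LINES for v in (0, 1)]
detected = serial_fault_simulation([{"a": 1, "b": 1}], all_faults)
coverage = len(detected) / len(all_faults)
print(sorted(detected), f"coverage = {coverage:.2f}")
```

Every fault costs a full re-simulation of the circuit, which is the repeated computation that the other algorithms try to avoid.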
Serial Algorithm (Cont.)
• Disadvantage: Much repeated computation; CPU time prohibitive for VLSI circuits
• Alternative: Simulate many faults together
[Figure: the test vectors drive the fault-free circuit and the circuits with faults f1, f2, ..., fn; a comparator on each faulty circuit answers "f1 detected?", "f2 detected?", ..., "fn detected?".]
Parallel Fault Simulation
• Exploits inherent bit-parallelism of logic operations on computer words
• Storage: one word per line for two-state simulation
• Multi-pass simulation: Each pass simulates w-1 new faults, where w is the machine word length
• Speed up over serial method ~ w-1
• Not suitable for circuits with timing-critical and non-Boolean logic
Parallel Fault Sim. Example
[Figure: example circuit with lines a to g. Each line carries a 3-bit word; faults c s-a-0 and f s-a-1 are injected, and the output word shows that c s-a-0 is detected.]
• Bit 0: fault-free circuit
• Bit 1: circuit with c s-a-0
• Bit 2: circuit with f s-a-1
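A minimal Python sketch of the bit-parallel idea follows. It assumes the example circuit is b fanning out to c and d, e = AND(a, c), f = NOT(d), g = OR(e, f), which is consistent with the faults named in the example but is not given as a netlist in the slides. Helper names such as `inject` and `replicate` are hypothetical.

```python
# Assumed circuit: b fans out to c and d; e = AND(a, c); f = NOT(d); g = OR(e, f).
WIDTH = 3                 # bit 0: fault-free, bit 1: c s-a-0, bit 2: f s-a-1
MASK = (1 << WIDTH) - 1
FAULTS = {1: ("c", 0), 2: ("f", 1)}  # bit position -> (line, stuck value)


def replicate(value):
    """Copy one logic value into every bit of the word."""
    return MASK if value else 0


def inject(line, word):
    """Force the stuck value into the bit positions whose fault sits on this line."""
    for bit, (fault_line, stuck) in FAULTS.items():
        if fault_line == line:
            if stuck:
                word |= 1 << bit
            else:
                word &= ~(1 << bit)
    return word & MASK


def parallel_simulate(a_value, b_value):
    w = {}
    w["a"] = inject("a", replicate(a_value))
    w["b"] = inject("b", replicate(b_value))
    w["c"] = inject("c", w["b"])
    w["d"] = inject("d", w["b"])
    w["e"] = inject("e", w["a"] & w["c"])   # one AND evaluates all three circuits
    w["f"] = inject("f", ~w["d"] & MASK)
    w["g"] = inject("g", w["e"] | w["f"])
    return w


def fmt(word):
    """Print bit 0 first, matching the order used in the slide."""
    return " ".join(str((word >> i) & 1) for i in range(WIDTH))


words = parallel_simulate(1, 1)
good_g = replicate(words["g"] & 1)        # broadcast the fault-free output
diff = (words["g"] ^ good_g) & MASK
detected = [FAULTS[bit] for bit in sorted(FAULTS) if (diff >> bit) & 1]
for line in "abcdefg":
    print(line, fmt(words[line]))
print("detected:", detected)
```

Each logic operation processes the fault-free circuit and both faulty circuits at once, which is where the roughly w-1 speed-up over the serial method comes from.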
Deductive Fault Simulation
• One-pass simulation
• Each line k contains a list Lk of faults detectable on k
• Following simulation of each vector, fault lists of all gate output lines are updated using set-theoretic rules, signal values, and gate input fault lists
• PO fault lists provide detection data
• Limitations:
– Set-theoretic rules difficult to derive for non-Boolean gates
– Gate delays are difficult to use
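Deductive Fault Simulation Example
[Figure: example circuit with lines a to g, annotated with signal values and fault lists.]
$L_a = \{a_0\}$, $L_b = \{b_0\}$, $L_c = \{b_0, c_0\}$, $L_d = \{b_0, d_0\}$, $L_f = \{b_0, d_0, f_1\}$
$L_e = L_a \cup L_c \cup \{e_0\} = \{a_0, b_0, c_0, e_0\}$
$L_g = (L_e \cap \overline{L_f}) \cup \{g_0\} = \{a_0, c_0, e_0, g_0\}$ (faults detected by the input vector)
Notation: $L_k$ is the fault list for line k; $k_n$ is the s-a-n fault on line k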
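The set-theoretic rules can be illustrated with a short Python sketch. It assumes the example circuit is b fanning out to c and d, e = AND(a, c), f = NOT(d), g = OR(e, f); this topology is inferred from the fault lists of the example rather than stated in the slides. The helper names `own` and `gate_list` are hypothetical.

```python
def own(line, value):
    """The line's own detectable fault: stuck at the opposite of its current value."""
    return {f"{line}{1 - value}"}


def gate_list(inputs, controlling):
    """Propagate fault lists through an AND (controlling=0) or OR (controlling=1) gate.

    inputs is a list of (signal value, fault list) pairs.
    """
    at_c = [lst for val, lst in inputs if val == controlling]
    not_c = [lst for val, lst in inputs if val != controlling]
    if not at_c:
        # No input holds the controlling value: any input fault flips the output.
        return set().union(*not_c)
    # Otherwise a fault must flip every controlling input and no other input.
    result = set.intersection(*at_c)
    for lst in not_c:
        result = result - lst
    return result


def deductive_lists(a, b):
    # Fault-free signal values of the assumed circuit.
    c = d = b
    e = a & c
    f = 1 - d
    g = e | f
    L = {}
    L["a"] = own("a", a)
    L["b"] = own("b", b)
    L["c"] = L["b"] | own("c", c)          # fanout branch inherits the stem list
    L["d"] = L["b"] | own("d", d)
    L["e"] = gate_list([(a, L["a"]), (c, L["c"])], controlling=0) | own("e", e)
    L["f"] = L["d"] | own("f", f)          # an inverter passes its input list through
    L["g"] = gate_list([(e, L["e"]), (f, L["f"])], controlling=1) | own("g", g)
    return L


lists = deductive_lists(1, 1)
for line in "abcdefg":
    print(f"L{line} =", sorted(lists[line]))
```

For the vector a = 1, b = 1 the list at the primary output g contains the detected faults, all obtained in a single pass over the circuit.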
Concurrent Fault Simulation
• Simulation of the fault-free circuit and only those parts of the faulty circuit that differ in signal states from the fault-free circuit.
• A list per gate containing copies of the gate from all faulty circuits in which this gate differs. Each list element contains the fault ID, gate input and output values, and internal states, if any.
• All events of fault-free and all faulty circuits are implicitly simulated.
• Faults can be simulated in any modeling style or detail supported in simulation (offers the most flexibility).
• Faster than other methods, but uses the most memory.
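The per-gate lists of faulty copies can be sketched as follows. This is a deliberately simplified, single-vector illustration and not an event-driven simulator: a real concurrent simulator processes events across vectors and typically keeps a gate's local faults in its list even when they do not diverge. The circuit (c = BUF(b), d = BUF(b), e = AND(a, c), f = NOT(d), g = OR(e, f)) and all names are assumptions made for the sketch.

```python
GATES = [  # (output, type, inputs) in topological order; fanout branches as buffers
    ("c", "BUF", ["b"]),
    ("d", "BUF", ["b"]),
    ("e", "AND", ["a", "c"]),
    ("f", "NOT", ["d"]),
    ("g", "OR", ["e", "f"]),
]
EVAL = {
    "BUF": lambda v: v[0],
    "NOT": lambda v: 1 - v[0],
    "AND": lambda v: v[0] & v[1],
    "OR": lambda v: v[0] | v[1],
}


def concurrent_simulate(vector, faults):
    good = dict(vector)
    diverged = {}    # line -> {fault: faulty value}, only where it differs from good
    gate_lists = {}  # gate output -> {fault: (faulty inputs, faulty output)}
    for pi in vector:
        diverged[pi] = {f: f[1] for f in faults if f[0] == pi and f[1] != good[pi]}
    for out, kind, ins in GATES:
        good_in = [good[i] for i in ins]
        good[out] = EVAL[kind](good_in)
        # Candidates: faults that diverge on an input, plus faults local to this gate.
        candidates = {f for i in ins for f in diverged[i]}
        candidates |= {f for f in faults if f[0] == out}
        gate_lists[out], diverged[out] = {}, {}
        for fault in candidates:
            bad_in = [diverged[i].get(fault, good[i]) for i in ins]
            bad_out = fault[1] if fault[0] == out else EVAL[kind](bad_in)
            if bad_in != good_in or bad_out != good[out]:
                gate_lists[out][fault] = (bad_in, bad_out)  # keep a faulty gate copy
            if bad_out != good[out]:
                diverged[out][fault] = bad_out              # fault effect propagates
    return good, gate_lists, diverged


faults = [(line, v) for line in "abcdefg" for v in (0, 1)]
good, gate_lists, diverged = concurrent_simulate({"a": 1, "b": 1}, faults)
print("copies at g:", sorted(gate_lists["g"]))
print("detected at g:", sorted(diverged["g"]))
```

Only gates whose faulty copy differs from the fault-free gate are stored, so the work per fault is limited to the region where the fault effect is visible.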
Conc. Fault Sim. Example
[Figure: example circuit with lines a to g. Each gate is shown with its fault-free values and its list of faulty gate copies, for faults a0, b0, c0, d0, e0, f1, and g0.]
Conc. Fault Sim. Example
[Figure: the same circuit while an input changes 1 -> 0; the affected fault-free and faulty values are marked 1 -> 0.]
Conc. Fault Sim. Example
[Figure: the same circuit after the input change, with faulty gate copies for faults a1, b0, c0, d0, e1, f1, and g1.]
Fault Sampling
• A randomly selected subset (sample) of faults is simulated.
• Measured coverage in the sample is used to estimate fault coverage in the entire circuit.
• Advantage: Saving in computing resources (CPU time and memory.)
• Disadvantage: Limited data on undetected faults.
Motivation for Sampling
• Complexity of fault simulation depends on:
– Number of gates
– Number of faults
– Number of vectors
• Complexity of fault simulation with fault sampling depends on:
– Number of gates
– Number of vectors
Random Sampling Model
[Figure: a population of all faults, detected and undetected, with a fixed but unknown coverage; the sample is drawn from it by random picking.]
Np = total number of faults (population size)
C = fault coverage (unknown)
Ns = sample size, Ns << Np
c = sample coverage (a random variable)
Example
A circuit with 39,096 faults has an actual fault coverage of 87.1%. The measured coverage in a random sample of 1,000 faults is 88.7%. The sampling formula gives an estimate of 88.7% ± 3%. CPU time for sample simulation was about 10% of that for all faults.
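The ± 3% range can be checked numerically. The sketch below does not reproduce the lecture's own formula, which is not included in this text; it assumes the standard three-sigma normal approximation for a sampled proportion and ignores the finite-population correction, which is reasonable when Ns << Np. The function name is hypothetical.

```python
import math


def three_sigma_half_width(sample_coverage, sample_size):
    """3-sigma half-width of the coverage estimate (normal approximation, assumed)."""
    sigma = math.sqrt(sample_coverage * (1.0 - sample_coverage) / sample_size)
    return 3.0 * sigma


x = 0.887   # measured coverage in the sample
ns = 1000   # sample size
half_width = three_sigma_half_width(x, ns)
low, high = x - half_width, x + half_width
print(f"estimate: {x:.1%} +/- {half_width:.1%}  (range {low:.1%} to {high:.1%})")
```

The actual coverage of 87.1% falls inside the resulting range. Because the half-width shrinks with the square root of the sample size, quadrupling the sample only halves the uncertainty.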
Fault Injection
• ASIC (general)
– performed by simulation, or
– by emulation to speed up the process
• FPGAs
– directly into the bitstream
Fault Injection
General (ASIC): performed by simulation, or by emulation to speed up the process
• When should a fault be inserted?
[Figure: Fault Injection Control Block. A pseudo-random number generator (an 8-bit pseudo-random register with bits [0], [4], [5], [7] marked) supplies the time, location, and bit of each fault. The Fault Injection Block, driven by clock and reset, produces the fault injection enable signals (Enable_registers, Enable_memory), the fault mask, the memory location, and Reset_core for the application core.]
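As a rough illustration of how a pseudo-random register could choose the time, location, and bit of a fault, here is a Python sketch of an 8-bit LFSR. Treating [0], [4], [5], [7] as the feedback tap positions, the shift direction, and the mapping of bytes to (time, location, bit) are all assumptions; the slides do not specify them, and no claim is made about the sequence length.

```python
TAPS = (0, 4, 5, 7)  # assumed feedback tap positions of the 8-bit register


def lfsr_step(state):
    """Advance the 8-bit LFSR by one shift (right shift, feedback into bit 7)."""
    feedback = 0
    for tap in TAPS:
        feedback ^= (state >> tap) & 1
    return ((state >> 1) | (feedback << 7)) & 0xFF


def next_byte(state):
    """Shift eight times so consecutive draws do not share bits."""
    for _ in range(8):
        state = lfsr_step(state)
    return state


def next_injection(state, time_window, memory_size):
    """Return the new state and a (time, location, bit) triple for one fault."""
    state = next_byte(state)
    time = state % time_window
    state = next_byte(state)
    location = state % memory_size
    state = next_byte(state)
    bit = state % 8
    return state, (time, location, bit)


state = 0x5A  # any non-zero seed; an all-zero state would never change
for _ in range(3):
    state, (time, location, bit) = next_injection(state, time_window=200, memory_size=256)
    print(f"inject at cycle {time}, location {location:#04x}, mask {1 << bit:#04x}")
```

The one-hot value `1 << bit` plays the role of the fault mask: it selects which bit of the chosen register or memory word is flipped.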
• Where should a fault be inserted?
Device Under Test Core Registers
[Figure: core registers with CLK, RST, and EN inputs; a mask and the FI_EN signal are added to each register. A second diagram adds Hamming Code and Hamming Decode blocks around the register.]
• Special paths are required in the high-level description to ensure the full controllability of a fault insertion.
• No major performance penalties are observed.
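A behavioural sketch of such a special path is given below in Python. It assumes that the fault is injected by XOR-ing the stored value with the mask when FI_EN is asserted, and that the priority is RST, then FI_EN, then EN; neither detail is stated in the slides, and the function name is hypothetical.

```python
def register_next(d, q, en, fi_en, mask, rst=0, width=8):
    """Next value of a register that has an extra fault-injection path.

    d: data input, q: current stored value, mask: bits to flip when fi_en is set.
    Priority (assumed): rst > fi_en > en > hold.
    """
    full = (1 << width) - 1
    if rst:
        return 0
    if fi_en:
        return (q ^ mask) & full  # flip the stored bits selected by the mask
    if en:
        return d & full           # normal load
    return q                      # hold


q = 0x0F
q = register_next(d=0x12, q=q, en=0, fi_en=1, mask=0x01)  # inject a single bit flip
print(f"after injection: {q:#04x}")
q = register_next(d=0x12, q=q, en=1, fi_en=0, mask=0x00)  # normal operation resumes
print(f"after load:      {q:#04x}")
```

With FI_EN low the register behaves exactly like the original one, which is consistent with the observation that the added path costs little.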
Device Under Test Core Memory
[Figure: dual-port memory with port A signals WEA, ENA, RSTA, CLKA, ADDRA, DIA, DOA and port B signals WEB, ENB, RSTB, CLKB, ADDRB, DIB, DOB. Fault injection uses the FI_memory location, mask, and FI_enable_memory signals. A second diagram adds Hamming Code and Hamming Decode blocks.]
• Dual-port memories allow high flexibility with little extra logic.
• Good fault model approximation: the faults can be injected in the memory at any time (all micro-controller states) without perturbing the normal operation.
Monitor Block
• Observability of an error in response to a fault injected in the DUT.
• What is the best method to analyze the results?
• Assumption: application results will be stored in the internal memory, implemented using BlockRam in the Virtex FPGA.
• Two main tasks are performed by the monitor block:
– Clear all the internal memory in order to avoid accumulation of faults;
– Read all the internal memory to check whether there is an error.
Fault Injection Platform
[Figure: inside the Virtex FPGA, the fault injection block drives the fault injection soft-core, which runs next to a "golden" soft-core; both receive clock and reset (Reset_core), and each has a monitor block exchanging data and status. The resulting loss-of-sequence and error signals are observed with the Chipscope Analyzer on a computer over JTAG.]
• A "golden" chip approach was used to observe the faults in the system.
• Comparing:
– Data memory: error
– Application_end signal at the end: loss of sequence
[Figure: timing diagram of the signals Reset, Reset_MEM, Read_MEM, Clear_MEM, the fault injection signals, Reset_DUT, Application_end, loss of sequence, and Error across the phases Clear Memory, Application, and Read Memory, with the time range for fault injection marked.]
Case Study: 8051 Fault Injection Experiment
[Figure: map of the application memory from 0h to 100h, with regions labeled variables, Matrix 1, Matrix 2, and Matrix Result, boundary addresses 0Ah, 0Bh, 1Fh, 2Fh, 52h, 53h, and 76h, and the memory contents shown as binary words.]
• Registers: datapath, control block, state machine (16 registers)
• Memory: 256 bytes of application memory (RAM)
• 8051 application: 6x6 matrix multiplication
– Multiplication is performed by successive additions and left shifts.
• According to the TIME:
– a fault in a memory location < 52h can generate an error or a loss of sequence.
Case Study: 8051 Fault Injection Experiment: Virtex Platform
• The fault injection system and the DUT were implemented on a Virtex FPGA platform.
• Thousands of faults were injected in a few seconds.
• The emulation in the FPGA showed a ~120,000 times acceleration.
• Component: Virtex XCV300 - 240p
• The 8051-like micro-controller runs at 10 MHz.
• Fault injection, standard 8051: 28% of LUTs
• Fault injection, hardened 8051: 30% of LUTs
Case Study: 8051 Fault Injection Experiment
Chipscope Analyzer
• Results of thousands of injected faults were analyzed in the computer by the Chipscope Analyzer software (ISE).
Fault Injection
FPGAs
(Directly into the bitstream)
Introduction
• Digital designs synthesized in SRAM-based FPGAs are sensitive to upsets in the FPGA customization cells.
[Figure: a design (clk; terms E1E2, E1E3, E2E3) is synthesized into customization bits …10101011110001101010… held in memory cells (M). An upset is a bit-flip, giving …10101011100001101010…, which appears as a fault in the implemented design.]
Motivation
• A way to test the reliability of a (fault-tolerant) design is to perform fault injection, simulating the effects of SEUs, and see how the design responds.
• The estimated time for injecting one bit fault in an FPGA bitstream is around 1 to 3 seconds.
• The main steps of the fault injection are:
– read the bit file
– flip one bit
– write the new bit file with the correct CRC
– download it to the board through the SelectMAP configuration mode interface
– check the output signal
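The first three steps can be sketched in Python as plain file manipulation. This is only an illustration: it treats the bit file as an opaque byte string, numbers bits MSB-first within each byte (an assumption), and deliberately leaves out the CRC update, the SelectMAP download, and the output check, which are device- and tool-specific. The function names and file paths are hypothetical.

```python
def flip_bit(bitstream: bytes, bit_index: int) -> bytes:
    """Return a copy of the bitstream with exactly one bit inverted."""
    if not 0 <= bit_index < 8 * len(bitstream):
        raise IndexError("bit_index is outside the bitstream")
    data = bytearray(bitstream)
    byte_index, offset = divmod(bit_index, 8)
    data[byte_index] ^= 0x80 >> offset  # MSB-first bit numbering (assumed)
    return bytes(data)


def write_faulty_bitfile(in_path, out_path, bit_index):
    """Read a bit file, flip one bit, and write the modified copy.

    The CRC is NOT recomputed here; a real flow must update it (or otherwise
    handle the CRC check) before the file is downloaded to the board.
    """
    with open(in_path, "rb") as src:
        original = src.read()
    with open(out_path, "wb") as dst:
        dst.write(flip_bit(original, bit_index))


example = bytes([0b10101011, 0b11000110])
faulty = flip_bit(example, 9)
print(f"{example[1]:08b} -> {faulty[1]:08b}")
```

Flipping the same bit again restores the original contents, so a campaign can walk through the bit positions one at a time without keeping many file copies.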
Fault Injection Cost Analysis: XCV300 Analysis
• CLB rows x CLB columns: 32 x 48
• Total of 1,536 CLBs: 1,327,104 CLB bits, 31 days!
• Total of 470 CLBs used: 406,080 CLB bits, 9.5 days!
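These durations can be reproduced with a short calculation. The sketch assumes 2 seconds per injected bit, the midpoint of the 1 to 3 second range quoted earlier; that value is an assumption chosen because it matches the figures above, not a number stated on this slide.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def campaign_days(num_bits, seconds_per_bit=2.0):
    """Time to inject one fault per bit, in days (2 s per bit is assumed)."""
    return num_bits * seconds_per_bit / SECONDS_PER_DAY


total_clbs = 32 * 48                      # 1,536 CLBs in the XCV300
bits_per_clb = 1_327_104 // total_clbs    # CLB configuration bits per CLB
used_bits = 470 * bits_per_clb            # only the CLBs used by the design

all_days = campaign_days(1_327_104)
used_days = campaign_days(used_bits)
print(f"{bits_per_clb} bits per CLB")
print(f"all CLB bits:  {all_days:.1f} days")
print(f"used CLB bits: {used_days:.1f} days")
```

Restricting the campaign to the CLBs that the design actually uses cuts the run time roughly in proportion to the number of bits.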
CLB View in the FPGA Editor
• Which bit in the bitstream corresponds to this connection?
• Which bit in the bitstream corresponds to the logic?
Virtex Bitstream
[Figure: layout of the Virtex bitstream by frame and bit, with labels Frame (from 0), Bit, CLB, BRAM, and IOB, indices 0 to 47, and a 586-bit annotation. Caption: What does this bit do?]
CLB View in the FPGA Editor
• Which bit in the bitstream corresponds to this connection? Bit: ?, frame: ?
• Which bit in the bitstream corresponds to the logic? Bit: ?, frame: ?
• CLBclass Tool, CLB Map
SEU Effects on FPGAs I
• Bit flip in a single line (bit: 03, frame: 29)
• Fault effect:
– Open circuit
SEU Effects on FPGAs II
• Bit flip in a hex line mux (bit: 01, frame: 32)
• Fault effect:
– Open circuit
– Short circuit
[Figure: fault injection flow between the bitstream (1,663,200 bits) and the XCV300 240p (DUT), with 3 modes.]
• Read: … 011001010101000001100001…
• Step 1: load … 011000010101000001100001…
• Step 2: load … 0110011010101000001100001…
• Step 3: reset flip-flops … 011001100101000001100001…
• Error flag; list of bits that cause an error: major address, frame, byte in frame, bit (bit location in the bitstream)
• Fault injection tool used at Xilinx
• Fault injection tool (under development at UFRGS)
Fault Injection Time Reduction
Bits to be tested of the 1,536 CLBs

Bits tested      Number of bits   Estimated time   Time saved (%)
All bits         1,327,104        31 days          -
LUTs             98,304           1.1 days         96.45
GRM              638,976          7.4 days         76.13
Single matrix    466,944          5.4 days         82.58
Hex matrix       172,032          2 days           93.55

• Responsible for the majority of the errors due to SEUs: open circuits combined with short circuits
• One can choose where to inject the faults! This can reduce the fault injection time.