D_160 / MAPLD - 2004 Burke 1 Fault Tolerant State Machines Gary Burke, Stephanie Taft Jet Propulsion Laboratory, California Institute of Technology

Fault Tolerant State Machines

Gary Burke, Stephanie Taft

Jet Propulsion Laboratory, California Institute of Technology

Reasons for Fault Tolerant State Machines

• Reliable designs are essential for Flight systems

• The state machine needs to be tolerant of single event upsets

State Machines

• A state machine is a sequential machine that, when built into an FPGA or ASIC, controls the sequencing of actions in the digital logic

• The current state of a machine is held in a state register, which is updated on a clock

• The next value of the state register (next state) is derived from the current state and the inputs

• Outputs from the state machine are decoded from the state register and can also be combined with the inputs
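
As a rough software model of the structure described above (a sketch only, not the authors' hardware description), the state register, next-state logic, and output decode can be written as follows. The function names and the two-consecutive-ones example are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Iterable, List


def run_state_machine(
    next_state: Callable[[int, int], int],
    output: Callable[[int, int], int],
    inputs: Iterable[int],
    reset_state: int = 0,
) -> List[int]:
    """Simulate a clocked state machine, one input sample per clock."""
    state = reset_state  # models the state register
    outputs = []
    for x in inputs:
        # Outputs are decoded from the state register, here combined with the input.
        outputs.append(output(state, x))
        # The next state is derived from the current state and the input,
        # then loaded into the state register on the clock edge.
        state = next_state(state, x)
    return outputs


# Hypothetical example: flag the second of two consecutive 1s on the input.
def two_ones_next(state: int, x: int) -> int:
    return 1 if x == 1 else 0  # state 1 means "the previous input was 1"


def two_ones_out(state: int, x: int) -> int:
    return 1 if (state == 1 and x == 1) else 0


print(run_state_machine(two_ones_next, two_ones_out, [1, 1, 0, 1, 1, 1]))
```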

State Machine Encoding

• Each distinct state of a state machine is represented by a unique binary code

• Encoding is the assignment of binary codes to states

Different Methods of Encoding States

• Binary

– The simplest encoding method, in which each state is given the next available binary number in sequence

• One Hot

– The number of bits in the code is equal to the number of states

– Each encoded state has just 1 bit in the encoded word set to 1 (the rest are 0)
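
The two encodings above can be generated mechanically. The following is a minimal sketch; the helper names `binary_encoding` and `one_hot_encoding` are illustrative and do not come from the presentation.

```python
from typing import List


def binary_encoding(num_states: int) -> List[str]:
    """Give each state the next binary number, using the fewest bits that fit."""
    width = max(1, (num_states - 1).bit_length())
    return [format(i, f"0{width}b") for i in range(num_states)]


def one_hot_encoding(num_states: int) -> List[str]:
    """Use one bit per state; exactly one bit is 1 in each code."""
    return [format(1 << i, f"0{num_states}b") for i in range(num_states)]


print(binary_encoding(4))   # ['00', '01', '10', '11']
print(one_hot_encoding(4))  # ['0001', '0010', '0100', '1000']
```

For 16 states this gives a 4-bit binary state register and a 16-bit one-hot state register.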

Different Methods of Encoding States Continued

• Hamming Distance of 2 (H2)

– Compared to Binary encoding, Hamming 2 uses one extra bit to ensure all codes are separated by a Hamming distance of 2

– It will take 2 changes in the state register to reach another known state

• Hamming Distance of 3 (H3)

– This extension of Hamming distance of 2 encoding uses additional bits to ensure all codes are separated by a Hamming distance of 3

– It will take 3 changes in the state register to reach another known state
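
The slides do not say how the H2 and H3 codes were constructed. The sketch below assumes two standard constructions that match the register widths reported in the synthesis results (5 and 7 flip-flops for 16 states): an even-parity bit appended to the binary code for H2, and Hamming check bits for H3. All function names are hypothetical.

```python
from itertools import combinations
from typing import List


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two code words differ."""
    return bin(a ^ b).count("1")


def min_distance(codes: List[int]) -> int:
    """Smallest Hamming distance between any two distinct codes."""
    return min(hamming_distance(a, b) for a, b in combinations(codes, 2))


def h2_encoding(num_states: int) -> List[int]:
    """Assumed construction: binary code plus one even-parity bit."""
    return [(i << 1) | (bin(i).count("1") & 1) for i in range(num_states)]


def h3_encoding(num_states: int) -> List[int]:
    """Assumed construction: binary code plus Hamming check bits."""
    k = max(1, (num_states - 1).bit_length())  # data bits
    r = 1
    while (1 << r) < k + r + 1:                # check bits needed
        r += 1
    n = k + r
    # Data bits occupy the positions (1-based) that are not powers of two.
    data_positions = [p for p in range(1, n + 1) if p & (p - 1)]
    codes = []
    for i in range(num_states):
        word, syndrome = 0, 0
        for bit, pos in enumerate(data_positions):
            if (i >> bit) & 1:
                word |= 1 << (pos - 1)
                syndrome ^= pos
        # Check bits sit at power-of-two positions and cancel the syndrome.
        for j in range(r):
            if (syndrome >> j) & 1:
                word |= 1 << ((1 << j) - 1)
        codes.append(word)
    return codes


print(min_distance(list(range(16))))    # plain binary: 1
print(min_distance(h2_encoding(16)))    # 2
print(min_distance(h3_encoding(16)))    # 3
```

With a minimum distance of 2, a single bit flip always lands on an unused code; with a minimum distance of 3, the corrupted word is also still closest to the code it came from.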

Synthesis

• To check the overhead of each of the state machines, they were individually synthesized

• Finite state machine optimization is turned off

• A clock frequency of 50 MHz is used

• Target device is a Xilinx Spartan 2, speed grade 6

• Error injection circuitry is not included

Synthesis Results

Encoding    State Machine Size   # Slice Flip Flops   # of 4 input LUTs   Clock Period (ns)   Max Synthesized Frequency (MHz)   Minimum Period (ns)
Binary       4                    2                     7                  20                  272.1                              3.7
Binary       8                    3                    15                  20                  178.8                              5.6
Binary      12                    4                    25                  20                  129.6                              7.7
Binary      16                    4                    38                  20                  122.1                              8.2
Binary      24                    5                    50                  20                  109.6                              9.1
Binary      32                    5                    96                  20                   94.5                             10.6
One Hot      4                    4                    10                  20                  238.2                              4.2
One Hot      8                    8                    20                  20                  194.8                              5.1
One Hot     12                   12                    31                  20                  173.0                              5.8
One Hot     16                   16                    41                  20                  148.9                              6.7
One Hot     24                   24                    63                  20                  148.9                              6.7
One Hot     32                   32                   237                  20                   68.6                             14.6
Hamming 2    4                    3                     8                  20                  226.6                              4.4
Hamming 2    8                    4                    22                  20                  133.5                              7.5
Hamming 2   12                    5                    41                  20                  124.5                              8.0
Hamming 2   16                    5                    49                  20                  117.8                              8.5
Hamming 2   24                    6                    84                  20                   91.5                             10.9
Hamming 2   32                    6                   107                  20                   87.3                             11.5
Hamming 3    4                    5                    15                  20                  162.8                              6.1
Hamming 3    8                    6                    42                  20                  117.4                              8.5
Hamming 3   12                    7                    55                  20                  105.0                              9.5
Hamming 3   16                    7                    71                  20                  102.6                              9.8
Hamming 3   24                    9                    91                  20                   88.7                             11.3
Hamming 3   32                    9                   137                  20                   83.5                             12.0

Four Bit State Encoding

[Bar chart "4 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]

Eight Bit State Encoding

[Bar chart "8 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]

Twelve Bit State Encoding

[Bar chart "12 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]

Sixteen Bit State Encoding

[Bar chart "16 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]

Twenty-Four Bit State Encoding

[Bar chart "24 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]

Thirty-Two Bit State Encoding

[Bar chart "32 Bit State Encoding": # of Slice Flip Flops, # of Four Input LUTs, and Clock Period (ns) for Binary, One Hot, Hamming 2, and Hamming 3]

Fault Injection Test

• A test circuit is generated with an example of each state machine executing the same task, plus a reference state machine

• The task chosen requires a 16-state state machine, to detect a 16-bit pattern in a serial input stream

• An error generator injects faults into all state machines except the reference state machine
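
To make the test task concrete, here is a minimal software sketch of a 16-state detector for a 16-bit serial pattern. The key pattern, the restart-on-mismatch rule, and the function name are all assumptions; the slides do not give the actual key or transition table.

```python
from typing import List, Sequence

# Hypothetical 16-bit key pattern (not taken from the presentation).
KEY = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]


def detect(stream: Sequence[int], key: Sequence[int] = KEY) -> List[int]:
    """Return the clock indices at which the last key bit has just matched."""
    state = 0  # number of key bits matched so far: 0..15, i.e. 16 states
    hits = []
    for t, bit in enumerate(stream):
        if bit == key[state]:
            if state == len(key) - 1:
                hits.append(t)  # pattern detected on this clock
                state = 0
            else:
                state += 1
        else:
            # Simplification: restart matching, keeping this bit if it
            # could be the first bit of a new pattern.
            state = 1 if bit == key[0] else 0
    return hits


print(detect([0] * 5 + KEY + KEY))
```

A reference copy of this machine that never receives faults provides the expected output against which the faulted machines are compared.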

Error Injection Test Continued

• The outputs of each state machine are compared to the reference output

• A set of counters tallies the comparison outputs

• 2 types of failure are logged for each state machine:

– Failure to detect pattern

– False detection of pattern (false-positive)

Error Injection Test Continued

• Non-key patterns are 1-bit different from the key pattern, to increase the likelihood of a false match

• Error rate can vary; it is set to 1:199 clocks in the example

• Errors are weighted by distributing them pseudo-randomly over 16 bits. A state machine with a word size of n receives n/16 of the total faults

• Synchronous fault injection is before the state register

• Asynchronous fault injection is after the state register

• All results are from actual implementation of the test circuits in a Spartan 2 FPGA
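
The n/16 weighting can be illustrated with a short simulation. This is a sketch under assumptions: each fault event picks one of 16 bit positions, and a machine is hit only if that position falls inside its state register. Whether the real error generator shares one bit position across all machines is not stated in the slides. The word sizes are the 16-state flip-flop counts from the synthesis results; the function name and seed are hypothetical.

```python
import random
from typing import Dict

# State register widths for the 16-state machines (from the synthesis results).
WORD_SIZES = {"Binary": 4, "One Hot": 16, "Hamming 2": 5, "Hamming 3": 7}


def inject_faults(num_events: int, seed: int = 0) -> Dict[str, int]:
    """Count how many fault events land inside each machine's state register."""
    rng = random.Random(seed)
    hits = {name: 0 for name in WORD_SIZES}
    for _ in range(num_events):
        bit = rng.randrange(16)  # pseudo-random position over 16 bits
        for name, width in WORD_SIZES.items():
            if bit < width:
                # In hardware this bit of the state word would be inverted.
                hits[name] += 1
    return hits


events = 16000
for name, count in inject_faults(events).items():
    print(f"{name:10s} {count / events:.3f} of all faults")
```

With enough events the fractions approach 4/16, 16/16, 5/16, and 7/16, so the wider encodings are exposed to proportionally more faults.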

Error Rate – Synchronous Faults

[Bar chart "Synchronous (rate=199)": errors per pattern (axis 0 to 0.1) for Binary, 1-Hot, H2, and H3; series: single, false-pos single, double, false-pos double]

Error Rate – Asynchronous Faults

[Bar chart "Asynchronous (rate=199)": errors per pattern (axis 0 to 0.02) for Binary, 1-Hot, H2, and H3; series: single, false-pos single, double, false-pos double]

Error Rate – Asynchronous Pulse Faults

[Bar chart "Pulse (rate=199)": errors per pattern (axis 0 to 0.018) for Binary, 1-Hot, H2, and H3; series: single, false-pos single, double, false-pos double]

Results: Binary Encoding

• Lowest resources used

• Second fastest speed after One Hot

– Fastest for small number of states

• Second-most sensitive to errors

• Generates false-positive errors, i.e. reports false pattern matches

Results: One Hot Encoding

• No false-positive errors (single faults)

• Fastest speed except for small number of states and large number of states

• Uses more resources than Binary

• Inefficient for large number of states

• Worst fault tolerance of all encodings tested

• Has 2x the error rate of binary encoding

Results: Hamming Distance of 2 (H2) Encoding

• No false-positive errors (single faults)

• Better Fault Tolerance than Binary

• More resources needed than One Hot, except for large number of states

Results: Hamming Distance of 3 (H3) Encoding

• Zero single-fault errors

– Immune to synchronous and asynchronous errors

• Lowest double-fault errors

• Most resources used (*)

– ~2x binary encoding

• Slowest speed (*)

(*) Except for large number of states
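
One way to see why a distance-3 code can ride through any single bit flip is that the corrupted word is still closer to its original code than to any other code. The sketch below checks this exhaustively for a small hypothetical 4-state, 5-bit code set. It assumes the next-state logic treats a corrupted word as its nearest valid code, which the slides do not describe, and it is not the authors' implementation.

```python
# Hypothetical 4-state code set with minimum Hamming distance 3.
CODES = [0b00000, 0b00111, 0b11001, 0b11110]
WIDTH = 5


def nearest_state(word: int) -> int:
    """Index of the valid code closest (in Hamming distance) to the given word."""
    return min(range(len(CODES)), key=lambda s: bin(CODES[s] ^ word).count("1"))


def survives_all_single_faults() -> bool:
    """True if every single-bit flip of every code still decodes to that code."""
    return all(
        nearest_state(code ^ (1 << bit)) == state
        for state, code in enumerate(CODES)
        for bit in range(WIDTH)
    )


print(survives_all_single_faults())  # every single fault is recovered
print(nearest_state(0b00011))        # a double fault on state 0 decodes as state 1
```

The second call shows the limit of the scheme: two flipped bits can move the word closer to a different code, which is consistent with double faults still producing some errors.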

Summary

• Binary encoding will give unpredictable results when faults are injected, generating false-positive errors in the pattern matching example

• One Hot encoding provides false-positive protection, but at the cost of considerably more errors

• Hamming 2 encoded state machines will provide significantly better fault tolerance at a cost of about 25% more resources than binary

• Hamming 3 encoded state machines give excellent fault tolerance but at a ~2x increase in resources
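
The resource figures quoted in the summary can be checked against the synthesis results. The short calculation below uses only the 16-state row (flip-flops and 4-input LUTs) and expresses each encoding relative to Binary; the variable and function names are illustrative.

```python
from typing import Dict, Tuple

# (slice flip-flops, 4-input LUTs) for the 16-state machines, from the synthesis results.
RESOURCES_16: Dict[str, Tuple[int, int]] = {
    "Binary": (4, 38),
    "One Hot": (16, 41),
    "Hamming 2": (5, 49),
    "Hamming 3": (7, 71),
}


def overhead_vs_binary(resources: Dict[str, Tuple[int, int]]) -> Dict[str, Tuple[float, float]]:
    """Return (flip-flop ratio, LUT ratio) of each encoding relative to Binary."""
    base_ff, base_lut = resources["Binary"]
    return {
        name: (ff / base_ff, lut / base_lut)
        for name, (ff, lut) in resources.items()
    }


for name, (ff_ratio, lut_ratio) in overhead_vs_binary(RESOURCES_16).items():
    print(f"{name:10s} flip-flops x{ff_ratio:.2f}  LUTs x{lut_ratio:.2f}")
```

For 16 states this gives about 1.25x the flip-flops and 1.29x the LUTs for Hamming 2, and about 1.75x and 1.87x for Hamming 3, in line with the figures of about 25% and ~2x above.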