Early Dependability Assessment of FPGA-Based Space Applications Using Formal Verification Khaza Anuarul Hoque Mentor: Dr. Taylor T. Johnson Dept. of Computer Science & Engineering, University of Texas at Arlington Thanks to Otmane Ait Mohamed (Concordia Univ., Canada) & Yvon Savaria (Polytechnique Montreal, Canada) for supervising the project

Early Dependability Assessment of FPGA-Based

Space Applications Using Formal Verification

Khaza Anuarul Hoque

Mentor: Dr. Taylor T. Johnson

Dept. of Computer Science & Engineering,

University of Texas at Arlington

Thanks to Otmane Ait Mohamed (Concordia Univ., Canada) & Yvon Savaria (Polytechnique Montreal, Canada) for supervising the project

Outline

Motivation

FPGA and SEUs

Proposed methodology

Design options analysis

DAL analysis and scrub optimization

TMR partitioning optimization

Lessons learned

Summary

Future directions

Motivation: Cosmic Radiation

Motivation (cont.)

Background: FPGAs and SEUs

SRAM-based FPGAs

(+) Cheaper than Rad-hard FPGAs

(+) Better performance than other types of FPGAs

(+) On-field programmability

(-) Susceptible to cosmic-radiation-induced SEUs

(-) Mitigation required:

- Redundancy (such as TMR)

- Scrubbing/Reconfiguration

SEU Estimation

Three main ways to analyze SEUs:

1. Hardware testing (beam testing/laser testing)

(+) Most realistic and accurate

(-) Requires finished implementation

(-) May damage the device

(-) Costly

2. Fault injection (emulation/simulation)

(+) Less accurate than hardware testing but still very useful

(-) Test time grows with possible test cases

SEU Estimation (cont.)

3. Analytical techniques

(+) Better controllability, quick estimation of SEUs

(+) No risk of damaging the device

(+) Estimation at early design stage

(-) Can be relatively less accurate in some cases

An early estimation of SEUs helps to:

- Build a more reliable design

- Adopt the required mitigation strategy

- Reduce the design time

- Reduce the design cost

Objectives

To propose a methodology for early SEU analysis on SRAM-based FPGA designs for aerospace applications:

Based on a formal verification technique (probabilistic model checking)

Design options analysis

- What is the trade-off among dependability, performability, and area?

- Which design option should a designer choose?

Scrub optimization and DAL analysis

- Is the adopted mitigation enough?

- Can the scrub frequency be optimized?

Early assessment of TMR partitioning to increase reliability

- Can we find the number of partitions early?

Formal Verification

The application of rigorous, mathematics-based techniques to establish the correctness of computerised systems

Main techniques:

- Model checking

- Equivalence checking

- Theorem proving

Many properties other than correctness are important

Quantitative requirements:

- “How reliable is my car’s Bluetooth network?”

- “How efficient is my phone’s power management policy?”

- “How secure is my bank’s web-service?”

Probabilistic model checking is a formal verification technique for modelling and analysing systems that exhibit probabilistic behaviour

Probabilistic Model Checking

Image credit: David Parker, Probabilistic Model Checking, Michaelmas 2011

Probabilistic Model Checking (cont.)

Models: variants of Markov chains:

- Discrete-time Markov chains (DTMCs)

- Continuous-time Markov chains (CTMCs)

- Markov decision processes (MDPs)

Property specifications:

- PCTL, CSL, PCTL*, LTL

Transient and steady-state analysis: P=? [ F[t,t] oper ]

- Instantaneous availability of the system, i.e. the probability that the system is in a specific state at time instant t

Timing and ordering of events: P=? [ !fail_B U[3600,7200] fail_B ]

- Probability that component B fails for the first time during the second hour of operation

Reward-based properties: R{"oper"}=? [ C<=t ]

- Expected cumulative operational time of the system in the time interval [0, t]
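To make the three property classes concrete, here is a minimal Python sketch that evaluates them in closed form for the simplest possible model: one component that alternates between "operational" and "failed". The two-state model, the rate names `lam` (failure) and `mu` (repair), and the function names are assumptions made for illustration; they are not taken from the slides, and a real design would be analysed with the model checker rather than by hand.

```python
import math

# Illustrative two-state CTMC (an assumption, not the model from the slides):
#   operational --lam--> failed,   failed --mu--> operational

def availability(lam: float, mu: float, t: float) -> float:
    """Instantaneous availability: probability of being operational at time t,
    starting operational (the quantity asked for by P=? [ F[t,t] oper ])."""
    s = lam + mu
    return mu / s + (lam / s) * math.exp(-s * t)

def first_failure_in_window(lam: float, t1: float, t2: float) -> float:
    """Probability that the first failure falls in [t1, t2] for an
    exponentially distributed time to failure with rate lam
    (the quantity asked for by P=? [ !fail U[t1,t2] fail ])."""
    return math.exp(-lam * t1) - math.exp(-lam * t2)

def expected_up_time(lam: float, mu: float, t: float) -> float:
    """Expected cumulative operational time in [0, t]: the integral of the
    availability over [0, t] (the quantity asked for by R{"oper"}=? [ C<=t ])."""
    s = lam + mu
    return (mu / s) * t + (lam / s**2) * (1.0 - math.exp(-s * t))

if __name__ == "__main__":
    lam, mu = 1e-4, 1e-2  # hypothetical rates per second
    print(availability(lam, mu, 3600.0))
    print(first_failure_in_window(lam, 3600.0, 7200.0))
    print(expected_up_time(lam, mu, 7200.0))
```

For larger models these closed forms no longer exist, which is where numerical transient analysis in a probabilistic model checker takes over.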

Proposed Methodology

Design Options Analysis

Analyze design options with respect to

- Reliability

- Availability

- Safety

- Performability-area tradeoff

Example: Throughput: 0.33. Component failed! Solution: rescheduling. Throughput after rescheduling: 0.25.

Design Option Analysis Methodology

[Figure: methodology flowchart. Boxes: dataflow graph, configuration, characterization library, mitigation(s), fault coverage, rewards, PRISM model, properties, PRISM model checker, results.]

Markov Modeling Example

Table: Characterization library

Scrub Optimization and DAL Analysis

DO-254:

- Baseline of required design flow steps for airborne components

- Five levels of compliance, commonly known as Design Assurance Levels (DALs)

- A failure condition on flight control system (that may lead to a catastrophic event) ≠ A failure condition on the entertainment system (even though it could spoil your day!)

- Engineers designing to level A or B face a much more rigorous test, verification, and documentation process than for levels C, D, or E.

Can a lower scrub frequency (to reduce power consumption) still meet the DAL requirement?

DAL Analysis Methodology

[Figure: DAL analysis flowchart. Boxes: high-level description, CDFG extraction, resource estimation, characterization library, number of total essential bits, mitigation strategy, failure & scrub parameter, PRISM model (Erlang distribution), DAL properties, PRISM MC, and the decision "DAL met?" with Yes (Finish) and No branches.]

Table: Design Assurance Levels (Level, Classification, Failure condition description, Pb/h)

Level A, Catastrophic: Failure conditions that would prevent continued safe flight and landing. Pb/h < 10^-9 (extremely improbable).

Level B, Hazardous / Severe-Major: Large reduction in safety margins or functional capabilities, physical distress or higher workload such that the flight crew could not be relied on to perform their tasks accurately or completely, or adverse effects on occupants including serious or potentially fatal injuries to a small number of those occupants. Pb/h < 10^-7 (extremely remote).

Level C, Major: Significant reduction in safety margins or functional capabilities, a significant increase in flight crew workload or in conditions impairing flight crew efficiency, or discomfort to occupants, possibly including injuries. Pb/h < 10^-5 (remote).

Level D, Minor: Slight reduction in safety margins or functional capabilities, a slight increase in flight crew workload, such as routine flight plan changes, or some inconvenience to occupants. Pb/h < 10^-3 (probable).

Level E, No Effect: Failure conditions that do not affect the operational capability of the aircraft or increase flight crew workload. Pb/h: -
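The "DAL met?" decision can be sketched as a simple threshold check against the Pb/h column of the table above. The Python below is only an illustration: the function names are hypothetical, and the failure probability per hour is assumed to come from the model checker's quantitative result.

```python
# Pb/h upper bounds from the DAL table, strictest first.
# Level E has no numeric bound in the table, so it is not listed here.
DAL_BOUNDS = [("A", 1e-9), ("B", 1e-7), ("C", 1e-5), ("D", 1e-3)]

def strictest_dal_met(p_fail_per_hour: float) -> str:
    """Return the most stringent level whose Pb/h bound is satisfied.
    Falls back to "E" when none of the numeric bounds is met."""
    for level, bound in DAL_BOUNDS:
        if p_fail_per_hour < bound:
            return level
    return "E"

def dal_met(p_fail_per_hour: float, required_level: str) -> bool:
    """True if the computed failure probability satisfies the required level."""
    bounds = dict(DAL_BOUNDS)
    if required_level == "E":
        return True  # no quantitative requirement at level E
    return p_fail_per_hour < bounds[required_level]

if __name__ == "__main__":
    p = 3e-8  # hypothetical result for one scrub interval
    print(strictest_dal_met(p), dal_met(p, "A"), dal_met(p, "B"))
```

In a scrub-optimization loop, a check like `dal_met` would be evaluated once per candidate scrub interval, keeping the longest interval that still passes.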

Modeling and Parameters

Set of states

- The set of states can be classified into “fully operational”, “faulty with one or more faults”, and “failed” states.

Modeling parameters

- Number of critical bits: from the characterization library

- Environmental: design failure rate, $\lambda_{design} = \lambda_{bit} \times \#\text{critical bits}$

- Target system: SelectMAP bus width $B$ and configuration clock frequency $f_{cclk}$

- Mitigation: correction rate, $\mu_{design} = \dfrac{B \times f_{cclk}}{\#\text{configuration bits}}$
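The two rate formulas above are simple enough to evaluate directly. The following Python sketch does so; the function names and every numeric value in the example are hypothetical placeholders, not figures from the presentation.

```python
def design_failure_rate(lambda_bit: float, critical_bits: int) -> float:
    """lambda_design = lambda_bit x (number of critical bits).
    The result has the same time unit as lambda_bit."""
    return lambda_bit * critical_bits

def correction_rate(bus_width_bits: int, f_cclk_hz: float, config_bits: int) -> float:
    """mu_design = (B x f_cclk) / (number of configuration bits).
    With f_cclk in Hz this is the number of full scrub passes per second."""
    return bus_width_bits * f_cclk_hz / config_bits

if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the formulas.
    lam = design_failure_rate(lambda_bit=1e-10, critical_bits=1000)
    mu = correction_rate(bus_width_bits=8, f_cclk_hz=50e6, config_bits=20_000_000)
    print(lam, mu)
```

Note that the two rates must be converted to a common time unit before they are used together in one CTMC.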

Deterministic Delay Modeling

Erlang process: a single exponential transition, $S_0 \xrightarrow{\lambda} S_1$, compared with a chain of $k$ exponential stages, each with rate $k/\tau$, so that the total delay has mean $\tau$:

$$S_0 \xrightarrow{k/\tau} S_1 \xrightarrow{k/\tau} S_2 \;\cdots\; S_{k-1} \xrightarrow{k/\tau} S_k$$

[Figure: probability plotted against time, with $\tau$ marked on the time axis.]
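A short numerical check shows why an Erlang chain is used to approximate a fixed delay such as a scrub period. The Python sketch below assumes the standard Erlang-k construction, k stages each with rate k/τ, which has mean τ and variance τ²/k; the function names are illustrative.

```python
import math

def erlang_cdf(t: float, k: int, tau: float) -> float:
    """P(T <= t) for the sum of k exponential stages, each with rate k/tau."""
    if t <= 0.0:
        return 0.0
    x = k * t / tau
    term = math.exp(-x)   # Poisson(x) probability of 0 completed stages
    total = term
    for n in range(1, k):
        term *= x / n     # Poisson(x) probability of n completed stages
        total += term
    return 1.0 - total

def mass_near_tau(k: int, tau: float, rel_width: float = 0.2) -> float:
    """Probability that the delay lies within +/- rel_width of tau."""
    return erlang_cdf((1 + rel_width) * tau, k, tau) - erlang_cdf((1 - rel_width) * tau, k, tau)

if __name__ == "__main__":
    for k in (1, 10, 100):
        print(k, mass_near_tau(k, tau=1.0))
```

As k grows, more of the probability mass concentrates around τ, at the cost of k extra states in the model, so k is a trade-off between accuracy and state-space size.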

Modeling Periodic Blind Scrub

Modeling TMR & Blind Scrub

Triple Modular Redundancy (TMR)

TMR Partitioning: Example

TMR Partitioning Methodology

[Figure: TMR partitioning flowchart. Boxes: high-level description, extracted CDFG, characterization library, user, number of partitions, number of components in each module, failure rate of each module, scrub rate, requirement specification, PRISM model (CTMC), reliability/availability properties, PRISM MC, quantitative results, and the decision "Req. met?" with Yes (Finish) and No branches.]

Markov Modeling of Single Bit Upset

A system with N partitions can be defined by a set of N partitions, where each partition is represented by a CTMC.

The final model of the system can be defined by the parallel composition (||) of the CTMCs of all the partitions.

Page 77: Early Dependability Assessment of FPGA-Based Space ...Early Dependability Assessment of FPGA-Based Space Applications Using Formal Verification Khaza Anuarul Hoque Mentor: Dr. Taylor

SBU Model for FIR: 2 Partitions

[Figure: SBU model for the FIR filter with 2 partitions]

Combined Model for FIR: 2 Partitions

[Figure: combined model for the FIR filter with 2 partitions]

Key Lessons

Extra reliability provided by redundancy does not always justify the additional area overhead.

Rescheduling with scrubbing is good enough to serve as a fault recovery and repair mechanism where optimization of reliability, area, and performance is required.

It is possible to find an appropriate scrub interval (the slowest acceptable scrub rate) that saves power while still meeting the dependability requirements, instead of choosing the highest scrub frequency.

Key Lessons (cont.)

There exists an optimal number of TMR partitions.

The larger the number of partitions (that is, the smaller the modules), the less frequently scrubbing is required to meet a target reliability.

For availability, the number of partitions matters mainly when the scrub interval is long.

With frequent scrubbing, increasing the number of partitions improves availability only minimally.

For longer scrub intervals, the availability improvement from an increased number of partitions is quite significant.

Lessons Learned

Brief experience from some of our previous research:

K. A. Hoque, O. A. Mohamed and Y. Savaria, “Applying Formal Verification to Early Assessment of FPGA-based Aerospace Applications: Methodology and Experience”, 10th IEEE Systems Conference (SysCon 2016), Orlando, USA, 2016.

K. A. Hoque, O. A. Mohamed and Y. Savaria, “Towards an accurate reliability, availability and maintainability analysis approach for satellite systems based on probabilistic model checking”, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, 2015.

K. A. Hoque, O. A. Mohamed, Y. Savaria and C. Thibeault, “Probabilistic Model Checking Based DAL Analysis to Optimize a Combined TMR-Blind-Scrubbing Mitigation Technique for FPGA-Based Aerospace Applications”, 12th ACM/IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE), IEEE, 2014, Lausanne, Switzerland.

K. A. Hoque, O. A. Mohamed, Y. Savaria and C. Thibeault, “Early Analysis of Soft Error Effects for Aerospace Applications using Probabilistic Model Checking”, 2nd International Workshop on Formal Techniques for Safety-Critical Systems (FTSCS13), CCIS, Springer-Verlag, 2013, Queenstown, New Zealand.

New modeling results related to TMR partitioning have not been published yet.

Summary

The use of FPGAs in aerospace is common, so early dependability analysis helps save:

Time

Effort

Cost

We proposed a methodology based on formal verification techniques for:

Design options analysis

Scrub interval optimization with DAL analysis

Optimal partitioning of TMR

Probabilistic model checking can be used for SEU analysis as a complementary approach alongside other techniques.

Future Directions

Inclusion of other fault models, such as design failures due to aging, electromigration, hot electron effects, Negative-Bias Temperature Instability (NBTI), and Single-Event Functional Interrupts (SEFI).

Inclusion of an adaptive mitigation model based on radiation sensitivity.

Extension of the models to handle three or more bit upsets.

Extension of the models to support read-back scrubbing and other customizable scrubbing schemes.

Acknowledgement

This research work is part of the AVIO-403 project, financially supported by the Consortium for Research and Innovation in Aerospace in Quebec (CRIAQ), the Fonds de Recherche du Québec - Nature et Technologies (FRQNT), and the Natural Sciences and Engineering Research Council of Canada (NSERC). The presenter would also like to thank Bombardier Aerospace, MDA Space Missions, and the Canadian Space Agency (CSA) for their technical guidance and financial support.

The presenter would also like to thank Dr. Taylor T. Johnson of the VeriVITAL group, University of Texas at Arlington, for his financial support to attend the S5 Symposium.

Thank You

Questions?

EXTRA SLIDES

Quantitative Analysis: 16-tap FIR Filter

Table: Design options to evaluate

No.   Configuration   Spare           Scrubbing   Rescheduling
C1    2A 2M           None            ✓           ✓
C2    2A 3M           1 Mul           ✓           ✓
C3    3A 2M           1 Add           ✓           ✓
C4    3A 3M           1 Add, 1 Mul    ✓           ✓

Expected Throughput: Overall Reward

Interval (days)   Configuration   Overall reward
1                 C1              1.432
1                 C2              1.045
1                 C3              1.326
1                 C4              0.993
4                 C1              1.216
4                 C2              0.940
4                 C3              1.166
4                 C4              0.931
9                 C1              0.942
9                 C2              0.769
9                 C3              0.932
9                 C4              0.790

R{"Expected throughput"}=? [ S ]
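
To make the comparison across configurations easier to read, the reward values from the table above can be ranked per interval. The snippet below is a small sketch: the dictionary layout and the helper name `rank_configurations` are illustrative choices, and it assumes that a higher overall reward (expected throughput) is better.

```python
# Overall reward values copied from the table above: {interval_days: {config: reward}}.
REWARDS = {
    1: {"C1": 1.432, "C2": 1.045, "C3": 1.326, "C4": 0.993},
    4: {"C1": 1.216, "C2": 0.940, "C3": 1.166, "C4": 0.931},
    9: {"C1": 0.942, "C2": 0.769, "C3": 0.932, "C4": 0.790},
}


def rank_configurations(rewards):
    """Return configuration names sorted from highest to lowest reward."""
    return sorted(rewards, key=rewards.get, reverse=True)


for interval, rewards in REWARDS.items():
    print(interval, rank_configurations(rewards))
```

Under that assumption, C1 (no spare) ranks first at every interval, and C3 ranks second.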

DAL Verification

“For a given scrub interval, the probability that the FIR filter will fail in the last 20 minutes of the flight is less than 0.01”

P<0.01 [ F[344400,345600] "failure" ]

[Figure: Fault tree of the system]

Table: Verification of DAL requirement

Scrub interval (s)   DAL-A met, scrub only (A = 0.0001, B = 0.001)   DAL-A met, scrub & TMR (B = 0.001)
0.5                  False                                           True
1.0                  False                                           True
1.5                  False                                           True
2.0                  False                                           True
2.5                  False                                           True
3.0                  False                                           True
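
The time bounds in the property can be checked against the "last 20 minutes" wording with a short calculation. This sketch assumes the model's time unit is seconds, which matches the scrub interval column but is not stated on the slide; the helper name `window_summary` is illustrative.

```python
def window_summary(start_s, end_s):
    """Summarize a time-bounded window given in seconds."""
    return {
        "length_min": (end_s - start_s) / 60,  # window length in minutes
        "end_h": end_s / 3600,                 # end of the window in hours
    }


summary = window_summary(344400, 345600)
print(summary)
```

Under that assumption the window is 20 minutes long and ends at 96 hours, so the property refers to the final 20 minutes of a 96-hour mission time.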

Quantitative Analysis: Availability Verification

“For a given scrub interval, does the system meet the requirement for five 9s availability in the long run?”

S>=0.99999 [ "operational" ]

Table: Verification of availability requirements

Scrub interval (s)   Availability requirement met? (FIR)   Availability requirement met? (EWF)
0.5                  True                                  True
1.0                  False                                 True
1.5                  False                                 False
2.0                  False                                 False
2.5                  False                                 False
3.0                  False                                 False
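
Selecting the slowest scrub rate that still meets the requirement amounts to picking the longest passing interval from the verification results. The sketch below does this over the outcomes in the table above; the data layout and the helper name `longest_passing_interval` are illustrative, and in practice each outcome would come from a model-checker run rather than a hard-coded table.

```python
# Verification outcomes copied from the availability table above:
# {design: {scrub_interval_s: requirement_met}}.
RESULTS = {
    "FIR": {0.5: True, 1.0: False, 1.5: False, 2.0: False, 2.5: False, 3.0: False},
    "EWF": {0.5: True, 1.0: True, 1.5: False, 2.0: False, 2.5: False, 3.0: False},
}


def longest_passing_interval(outcomes):
    """Return the longest scrub interval that meets the requirement, or None."""
    passing = [interval for interval, met in outcomes.items() if met]
    return max(passing) if passing else None


for design, outcomes in RESULTS.items():
    print(design, longest_passing_interval(outcomes))
```

For these results the longest acceptable scrub interval is 0.5 s for the FIR design and 1.0 s for the EWF design.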

Quantitative Analysis: 64-tap FIR Filter

No. of partitions   No. of states   No. of transitions (SBU)   No. of transitions (SBU+DBU)
0                   3               6                          N/A
2                   9               26                         30
4                   81              361                        578
8                   6561            478858                     129506

Property 1: Reliability: P=? [ G[0,T] "operational" ], T = 1 month

Property 2: Availability: R{"up time"}=? [ C<=T ]/T, T = 1 month

Reliability/Availability: Combined Model

Optimal number of partitions: 4