
Detecting soft errors by a purely software approach: method, tools and experimental results

B. Nicolescu, R. Velazco
TIMA Laboratory, “Circuit Qualification” research group

46, Av. Félix Viallet, 38031, Grenoble, France

Presented By: MD. Hasibur Rashid // MSc. In CSE, KUET, Bangladesh


Abstract

This paper describes a software technique for detecting soft errors in processor-based digital architectures. The detection mechanism relies on a set of rules that transform the target application into a new one with the same functionality, but able to identify bit-flips arising in memory areas as well as those perturbing the processor’s internal registers. Experimental results from fault injection sessions and preliminary radiation test campaigns performed on a complex DSP processor provide objective figures on the efficiency of the proposed error detection technique.


Introduction

• Technological progress in microelectronics has increased the sensitivity of circuits to environmental effects (e.g. radiation, electromagnetic compatibility (EMC), …). In particular, processors operating in the space environment are subject to various radiation phenomena, whose effects can be permanent or transient.

• This paper focuses strictly on transient effects, also called SEUs (Single Event Upsets), which occur when charged particles strike sensitive areas of integrated circuits. SEUs modify the content of memory cells, with consequences for the running application ranging from erroneous results to system crashes. The consequences of an SEU depend on both the nature of the perturbed information and the instant at which the bit-flip occurs.


• SET (Single Event Transient) could constitute a serious source of errors not only for circuits operating in space, but also for digital equipment operating in the Earth’s atmosphere at high altitudes (avionics) and even at ground level.

• Several approaches have been proposed in the past to achieve fault tolerance (or just safety) by modifying only the software. The proposed methods fall mainly into two groups: those that replicate the program execution and check the results (e.g., Recovery Blocks and N-Version Programming) and those that introduce some control code into the program (e.g., Algorithm Based Fault Tolerance (ABFT), Assertions, Code Flow Checking).


Software Based Fault Tolerance

• This section describes the investigated methodology to provide error detection capabilities through a purely software approach

• Transformation rules
  * A.1. Error affecting data
  * A.2. Error affecting basic instructions
  * A.3. Error affecting control instructions

• Transformation tool - C2C Translator


Error affecting data

This group of rules aims at detecting those faults affecting the data. The idea is to define the interdependence relationships between the variables of the program and to classify them in two categories according to their role in the program:

• intermediary variables: they are used for the calculation of other variables
• final variables: they do not take part in the calculation of any other variable


The proposed rules are then:

• Identification of the relationships between the variables
• Classification of the variables according to their role in the program: intermediary variable and final variable
• Every variable x must be duplicated: let x1 and x2 be the names of the two copies
• Every operation performed on x must be performed on x1 and x2
• After each write operation on the final variables, the two copies x1 and x2 must be checked for consistency, and an error detection procedure is activated if an inconsistency is detected


• In this example, the interdependence relationships between the variables are a = f(b,c) and d = f(a,b), where a is itself f(b,c). Only d is therefore considered a final variable, while a, b and c are intermediary variables. Figure 1.b shows the transformations resulting from the set of rules presented.
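The data rules can be illustrated with a minimal C sketch. It is not the code of Figure 1: the concrete operations (additions), the function name compute_d_hardened, the error counter and the flip_mask parameter used to simulate a bit-flip are all assumptions made for illustration. Every variable is duplicated, every operation is performed on both copies, and the two copies of the final variable d are compared after it is written.

```c
/* Hypothetical error handler: counts detections instead of halting. */
static int errors_detected = 0;

static void error_detected(void)
{
    errors_detected++;
}

/*
 * Original (unhardened) code, assumed for illustration:
 *     a = b + c;
 *     d = a + b;
 * flip_mask is XORed into one copy of a to simulate a bit-flip;
 * pass 0 for a fault-free run.
 */
int compute_d_hardened(int b, int c, int flip_mask)
{
    /* volatile keeps the compiler from merging the two copies. */
    volatile int a1, a2, d1, d2;
    volatile int b1 = b, b2 = b;
    volatile int c1 = c, c2 = c;

    /* Every operation is performed on both copies. */
    a1 = b1 + c1;
    a2 = b2 + c2;

    /* Simulated upset in the second copy of the intermediary variable a. */
    a2 = a2 ^ flip_mask;

    d1 = a1 + b1;
    d2 = a2 + b2;

    /* d is the final variable: check its copies after the write. */
    if (d1 != d2)
        error_detected();

    return d1;
}
```

In this sketch the intermediary variables are never compared directly: a corrupted copy of a is caught only when it propagates into d, which is what the classification into intermediary and final variables is meant to allow.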


A.2. Error affecting basic instructions


According to these modifications, the studied rules become the following:

• A boolean flag status_block is associated with every basic block i in the code; 1 for the inactive state and 0 for the active state
• An integer value ki is associated with every basic block i in the code
• A global execution check flag (gef) variable is defined
• A statement assigning to gef the value of (ki & (status_block = status_block + 1) mod 2) is introduced at the beginning of every basic block i; a test on the value of gef is also introduced at the end of the basic block
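The idea behind these rules can be sketched in C under a loud simplification. The operator grouping of the gef expression is ambiguous in the slide text, so the sketch does not reproduce it: it simply stores ki in gef and marks the block active at entry, then tests both at the end. The identifier value 371, the helper names block_begin and block_end, the error counter and the skip_entry parameter (which simulates an erroneous jump into the middle of the block) are hypothetical.

```c
#define K_BLOCK 371   /* hypothetical identifier ki of the basic block */

static volatile int gef = 0;           /* global execution check flag */
static volatile int status_block = 1;  /* 1 = inactive, 0 = active */
static int errors_detected = 0;        /* hypothetical detection counter */

/* Simplified entry statement: record the block identifier, mark active. */
static void block_begin(int k)
{
    gef = k;
    status_block = 0;
}

/* Test at the end of the basic block, then return to the inactive state. */
static void block_end(int k)
{
    if (gef != k || status_block != 0)
        errors_detected++;
    status_block = 1;
}

/*
 * One basic block computing 2 * (x + y).
 * skip_entry != 0 simulates a faulty branch that lands after the
 * entry statement, so block_begin is never executed.
 */
int double_sum(int x, int y, int skip_entry)
{
    int r;

    if (!skip_entry)
        block_begin(K_BLOCK);
    r = x + y;
    r = r * 2;
    block_end(K_BLOCK);
    return r;
}
```

In a second, faulty pass through the same block, gef may still hold the matching identifier from the earlier correct pass; in this sketch it is the status flag, left inactive, that reveals the skipped entry.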


A.3. Error affecting control instructions

• a) Rules targeting errors affecting the conditional control instructions


• b) Rules targeting errors affecting the unconditional control instructions


In summary, the rules are defined as follows:

• For every test statement the test is repeated at the beginning of the target basic block of both the true and (possible) false clause. If the two versions of the test (the original and the newly introduced) produce different results, an error is signalled

• A flag ctrl_branch is defined in the program
• An integer value kj is associated with any procedure j in the code
• At the beginning of every procedure, the value kj is assigned to ctrl_branch; a test on the value of ctrl_branch is introduced before and after any call to the procedure
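A minimal C sketch of the control rules follows. The procedure abs_hardened, the identifier value 17, the error counter and the skip_call parameter (which simulates a corrupted branch that never reaches the procedure) are hypothetical. The slides do not say what the pre-call test compares against, so resetting ctrl_branch to 0 and checking that value is an assumption.

```c
#define K_ABS 17   /* hypothetical identifier kj of abs_hardened */

static volatile int ctrl_branch = 0;
static int errors_detected = 0;   /* hypothetical detection counter */

static int abs_hardened(int x)
{
    volatile int v = x;   /* volatile forces each test to re-read the value */

    ctrl_branch = K_ABS;  /* kj is assigned at the beginning of the procedure */

    if (v < 0) {
        /* Test repeated at the start of the true clause. */
        if (!(v < 0))
            errors_detected++;
        return -v;
    } else {
        /* Test repeated at the start of the false clause. */
        if (v < 0)
            errors_detected++;
        return v;
    }
}

/* skip_call != 0 simulates a faulty branch that bypasses the procedure. */
int call_abs_checked(int x, int skip_call)
{
    int r = 0;

    /* Assumption: the pre-call test verifies a neutral value (0). */
    ctrl_branch = 0;
    if (ctrl_branch != 0)
        errors_detected++;

    if (!skip_call)
        r = abs_hardened(x);

    /* Post-call test: the procedure must have written its identifier. */
    if (ctrl_branch != K_ABS)
        errors_detected++;

    return r;
}
```

The volatile qualifiers matter in a sketch like this: without them an optimizing compiler is free to remove the repeated tests as redundant, which would silently undo the hardening.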


B. Transformation tool - C2C Translator

The C2C Translator accepts C source code as input and produces as output the C code of a hardened program, according to a set of options. The assembly language code for a targeted processor can then be obtained from the resulting C code using an ad hoc compiler.


Experimental Results

Main characteristics of the studied program

• The application considered for the experiments was a Constant Modulus Algorithm (CMA), used in space communications. In the following, this application is called CMA Original. The two sets of rules described and discussed above were applied automatically to the CMA Original program, yielding two new programs, called CMA Hardened Old and CMA Hardened New in the following. The main features of these programs are summarized in the table.


Software Fault Injection Results

The following categories were considered:

• Effect-less: The injected fault does not affect the program behavior.
• Software Detection: The implemented rules detect the injected fault.
• Hardware Detection: The fault triggers some hardware mechanism (e.g., illegal instruction exception).
• Loss Sequence: The program under test triggers some time-out condition (e.g., endless loop).
• Incorrect Answer: The fault was not detected in any way and the result is different from the expected one.

To quantify the error detection capabilities, two metrics were introduced: the detection efficiency (e) and the failure rate (T).


B.1. Fault injection in the DSP32C registers

B.2. Fault injection in the code of the programs & data memory area


Preliminary Radiation Testing Campaign

Where:

+ Flux represents the number of particles reaching the processor per unit area and unit time
+ Time exposure is the duration of the experiment
+ Estimated Upsets represents the number of upsets expected during the whole radiation experiment


Conclusions

In this paper we presented a software error detection method and a tool for the automatic generation of hardened applications. The technique is based exclusively on modifying the application code and requires no special hardware. Consequently, the method is suitable for low-cost safety-critical applications, where the high constraints it involves in terms of memory overhead (about 4 times) and speed decrease (about 2.6 times) can be balanced by the low cost and high reliability of the resulting code.


Thank You

MD. Hasibur Rashid, MSc. In CSE, KUET, Bangladesh