Advanced Encryption Standard Secured Architecture and its Auto-Test Capabilities
Paolo MAISTRI

Advanced Encryption Standard Secured Architecture and its ...async/CCIS/talks_08/Paolo_Maistri.pdf · instead of a linear feedback shift register Built In Self Test Additional logic



Outline

• Introduction
• Error detection by redundancy
• A new approach for temporal redundancy: DDR
  - Principles
  - Error coverage
  - Costs
• Testability issues and comparisons
• Conclusion

Introduction

• Fault attacks are one of the most effective ways to break a cryptosystem
  - AES can be broken with 2 well-located byte faults (Piret-Quisquater, CHES 2003)
• Faults can be easily injected by glitches, spikes, drops in power supply, lasers, ...
  - Laser injection allows controlling the timing, location and focus (i.e., approximate size) of the fault, but not the value
• Offline error detection cannot guarantee enough protection against the attacks
  - Integrity checks after decryption are not enough!
• Different error detection schemes must be compared with respect to their costs and validated with realistic models

The Advanced Encryption Standard

[Figure: AES block diagram. Labels: Secret Key, Plain Text, Key Schedule Round, 128-bit Round Key, Encryption Round (SubBytes, ShiftRows, MixColumns), 4x4-byte State, Ciphered Text]

• 128-bit data input
• 128- to 256-bit secret key
• Round-based:
  - SubBytes: non-linear byte substitution
  - ShiftRows: word rotation
  - MixColumns: linear word transformation
  - AddRoundKey: modulo-2 addition with round key
• Encryption and Key Schedule use the same basic operations
• Decryption uses inverse operations of encryption
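The two simplest round operations listed above can be sketched in a few lines. This is only an illustration, assuming the usual column-major layout of the 4x4-byte state (byte i sits at row i % 4, column i // 4); the function names are hypothetical, and SubBytes and MixColumns are left out.

```python
def shift_rows(state):
    """Rotate row r of the column-major 4x4 state left by r positions."""
    out = [0] * 16
    for col in range(4):
        for row in range(4):
            # The byte landing at (row, col) comes from column (col + row) mod 4.
            out[4 * col + row] = state[4 * ((col + row) % 4) + row]
    return out


def add_round_key(state, round_key):
    """Modulo-2 addition (bitwise XOR) of the round key, byte by byte."""
    return [s ^ k for s, k in zip(state, round_key)]


state = list(range(16))
shifted = shift_rows(state)
# Print the state row by row to make the rotation visible.
for r in range(4):
    print(shifted[r::4])
```

Adding the same round key twice returns the original state, which is why AddRoundKey is its own inverse in decryption.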

Fault Attacks against AES

• Inject an error towards the end of the computation
• Compare the correct and the faulty result
• Build an equation system in the key, the error value, and the intermediate variables
  - Most values are UNKNOWN!
• Use different fault injections to drop some solutions and reduce the number of key guesses
• Fault injection allows the application of differential cryptanalysis to ciphers of any length, since the differences can be injected at any time during the computation
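The filtering step, where each new injection discards key guesses, can be illustrated on a deliberately tiny model. The sketch below is not the Piret-Quisquater attack: it assumes a one-byte "last round" (substitution followed by key addition), a randomly generated S-box rather than the AES one, and a fault model restricted to single-bit errors before the S-box. All names are hypothetical.

```python
import random

rng = random.Random(2008)          # fixed seed so the run is repeatable
SBOX = list(range(256))
rng.shuffle(SBOX)                  # hypothetical S-box, NOT the AES one
INV = [0] * 256
for i, v in enumerate(SBOX):
    INV[v] = i

SINGLE_BIT = {1 << b for b in range(8)}   # assumed fault model


def last_round(x, key, fault=0):
    """Toy last round on one byte: substitution, then key addition."""
    return SBOX[x ^ fault] ^ key


def key_candidates(good, bad):
    """Keys for which the difference before the S-box is a single-bit error."""
    return {k for k in range(256)
            if (INV[good ^ k] ^ INV[bad ^ k]) in SINGLE_BIT}


secret = 0x3C
candidates = set(range(256))
for _ in range(8):                           # eight independent injections
    x = rng.randrange(256)                   # unknown intermediate value
    e = 1 << rng.randrange(8)                # unknown error value
    good, bad = last_round(x, secret), last_round(x, secret, e)
    candidates &= key_candidates(good, bad)  # drop inconsistent guesses
    print(len(candidates))
```

The attacker only sees `good` and `bad`; the intermediate value and the error stay unknown. The correct key always survives the filter, while wrong guesses survive each round only by chance.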

Concurrent Error Detection Schemes

• Spatial redundancy
  - Duplication of (some) functional units
  - Dual rail
• Information redundancy
  - Single or multiple parity bits
  - Linear + nonlinear codes
  - Cubic networks
• Temporal redundancy
  - ...

Error detection by temporal redundancy

• To detect transient faults, additional time can be spent to compute the same (or an equivalent) operation
  - Compute the same operation twice
  - Compute the inverse operation and check whether the initial input is obtained (same or different hardware is used, depending on the cipher)
  - Use idle cycles of a pipelined implementation to redo the same computation and compare the results
• Throughput is often heavily sacrificed
  - Increasing throughput by pipelining is not possible due to security constraints (i.e., feedback modes)
  - Not every design allows these approaches without penalties
• Cryptographic systems often have a very short critical path, but they are usually embedded in slower systems (e.g., smart cards)
  - Can we improve the efficiency without multiplying the clock frequency?
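The first two checks above (repeat the operation, or undo it) can be sketched on a toy invertible byte operation. The operation itself (XOR with a key, then a 3-bit rotation) and the `glitch` parameter that models a transient fault are assumptions made for illustration only.

```python
MASK = 0xFF


def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & MASK


def rotr8(x, n):
    return ((x >> n) | (x << (8 - n))) & MASK


def op(x, key, glitch=0):
    """Toy invertible operation; `glitch` models a transient fault."""
    return rotl8(x ^ key, 3) ^ glitch


def op_inverse(y, key):
    return rotr8(y, 3) ^ key


def check_by_repetition(x, key, glitch=0):
    """Compute twice; the transient fault hits only the first run."""
    first, second = op(x, key, glitch), op(x, key)
    return first, first == second


def check_by_inverse(x, key, glitch=0):
    """Compute once, then invert and compare with the original input."""
    y = op(x, key, glitch)
    return y, op_inverse(y, key) == x


print(check_by_repetition(0x5A, 0xC3))               # fault-free run
print(check_by_repetition(0x5A, 0xC3, glitch=0x10))  # fault in first run
print(check_by_inverse(0x5A, 0xC3, glitch=0x10))     # fault caught by inversion
```

Both checks double the work per operation, which is the throughput cost the slide refers to.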

Double-Data-Rate Computation

• Twice the throughput at the same frequency
• Small area overhead for DDR logic
• Increased parallelism
• More complex routing, thus lower max frequency
• Error detection requires additional overhead
• Design may require data alignment and synchronization "bubbles"

[Figure: two Operation blocks, captioned "4 clock cycles to compute Operation for all input bytes" and "2 clock cycles to compute Operation for all input bytes"]
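The 4-versus-2 cycle counts in the figure follow from simple arithmetic once a resource count is fixed. The sketch below assumes 16 state bytes and 4 shared operation units (the numbers of the AES design in this talk); the function name is hypothetical.

```python
from math import ceil


def cycles_per_operation(n_bytes, n_units, edges_per_cycle):
    """Clock cycles needed to push n_bytes through n_units shared units."""
    return ceil(n_bytes / (n_units * edges_per_cycle))


sdr = cycles_per_operation(16, 4, edges_per_cycle=1)  # one active clock edge
ddr = cycles_per_operation(16, 4, edges_per_cycle=2)  # both clock edges
print(sdr, ddr)
```

Using both clock edges halves the cycle count without touching the clock frequency, which is the point of the DDR approach.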

AES Basic Architecture

• 32-bit data-path
• 4 Substitution Boxes
• 16 GF Multipliers for MixColumns
• 6 clock cycles per round
• On-the-fly key unrolling

[Figure: data path block diagram. Blocks: Input, Output, MixColumns, AddRoundKey and State, rotation (<<<), four SBOX units (2-stage SBox), From Key Unit, To Key Unit. Legend: register layer, combinatorial logic, 8-bit signal, 32-bit signal]

Data Alignment (in AES)

• The data alignment phase partitions the register space into two classes:
  - Registers triggered by the rising clock edge
  - Registers triggered by the falling clock edge
• Alignment can be done:
  - By columns: registers in the same column share the clock alignment
  - By rows: registers in the same row share the clock alignment
  - By checkers: elements of the partitions are interleaved in both columns and rows, like a chess board
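The three alignment options can be pictured as patterns over the 4x4 register matrix. The sketch below assumes that neighbouring rows (or columns) alternate between the two clock edges and arbitrarily marks the rising edge with "+"; both choices, and the function name, are illustrative assumptions.

```python
def alignment(kind):
    """Return a 4x4 grid of '+' (rising edge) and '-' (falling edge)."""
    def edge(row, col):
        if kind == "columns":
            key = col            # same column, same edge
        elif kind == "rows":
            key = row            # same row, same edge
        elif kind == "checkers":
            key = row + col      # interleaved like a chess board
        else:
            raise ValueError(kind)
        return "+" if key % 2 == 0 else "-"
    return [[edge(r, c) for c in range(4)] for r in range(4)]


for kind in ("columns", "rows", "checkers"):
    print(kind)
    for row in alignment(kind):
        print(" ".join(row))
```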

Synchronization

• DDR computation can be employed when we have scarce resources, high parallelism and no data dependency
  - In this AES design, SBoxes are the scarce resources
  - Row rotation is performed while moving data during non-linear substitution (collateral data-dependence)
  - Row-wise DDR alignment is thus chosen for the medium design
• In AES, all operations are independent on each byte, except for the MixColumns operation
  - MixColumns are not a scarce resource (each byte is computed locally), but values have to be stable (i.e., a latch is used)

AES DDR Architecture

• Row data alignment
• Substitution Boxes with double (DDR) registers
• Data Cells with SDR (complementary) FFs
• 3 clock cycles per round

[Figure: DDR data path block diagram. Register rows are marked +, -, +, -, followed by a row marked +/-. Blocks: Input, Output, MixColumns, AddRoundKey and State, rotation (<<<), four SBOX units (2-stage SBox), From Key Unit, To Key Unit. Legend: register layer, combinatorial logic, 8-bit signal, 32-bit signal]

DDR Round Computation

[Figure: timing diagram over clock cycles 0 to 4. Each step shows the 4x4 state (rows 0 4 8 12, 1 5 9 13, 2 6 10 14, 3 7 11 15) as its rows move through SBox Stage 1, SBox Stage 2 and ShiftRows]

DDR vs Pipeline Redundancy

[Figure: timing comparison of data blocks D1 to D4 in a regular pipeline, with pipeline redundancy, and with DDR redundancy. Annotations, in extraction order: Area +50%, Throughput -18%; Area +36%, Throughput -15/-55%; (RC6 implementation)]

Operation Modes

• Single: the unit uses the DDR computation to improve its throughput and no check is performed on data
  - Inputs (P1, P2) (P3, P4), outputs (C1, C2) (C3, C4)
• Double: the unit uses the DDR computation to compute each round twice, checking for inconsistencies
  - Inputs (P1, P1) (P2, P2), outputs (C1, C1) (C2, C2)
• Interleaved: like the Double mode, but the first and second repetition are processed together with two different (consecutive) blocks in ECB mode, which share the encryption key
  - Inputs (dummy, P1) (P1, P2), outputs (dummy, C1) (C1, C2)
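The three modes differ only in how plaintext blocks are paired on the two DDR half-cycles. The sketch below reproduces the pairings shown on the slide; the function name and the "dummy" placeholder value are hypothetical, Single mode is assumed to receive an even number of blocks, and the Interleaved schedule stops where the slide's picture does.

```python
def schedule(blocks, mode, dummy="dummy"):
    """Pair up blocks for the two DDR half-cycles in each operation mode."""
    if mode == "single":        # two different blocks per slot, no check
        return [(blocks[i], blocks[i + 1])
                for i in range(0, len(blocks) - 1, 2)]
    if mode == "double":        # same block twice, results compared
        return [(b, b) for b in blocks]
    if mode == "interleaved":   # repetition of a block rides with the next one
        previous = [dummy] + blocks[:-1]
        return list(zip(previous, blocks))
    raise ValueError(mode)


plain = ["P1", "P2", "P3", "P4"]
for mode in ("single", "double", "interleaved"):
    print(mode, schedule(plain, mode))
```

In Double and Interleaved mode every block is computed twice, so the two results for the same block can be compared before the ciphertext is released.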

Synthesis Comparison

[Table: synthesis results. Surviving column headers: Throughput*, Technology, Area]

SEU Injection Campaign

• Where?
  - Linear layer (multiplication, row rotation, and key addition)
  - Non-linear substitution layer: inner and outer pipeline stage
  - Control unit
  - AES is highly regular: only one target element for each data path location
• When?
  - Every computation clock cycle
  - Input and output phases were not considered
  - From 1 up to several clock cycles (twice the length of a round)
• What?
  - According to the fault model for laser attacks, the error value is not controllable
  - Exhaustive search of all error values is carried out for each targeted area (e.g., all byte values for a byte target)
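The where/when/what dimensions above define the experiment space of the campaign. A minimal sketch of such an enumeration follows; the target names, widths and cycle counts are made-up placeholders, not the figures of the actual campaign.

```python
from itertools import product


def campaign(targets, n_cycles, max_duration):
    """Yield (target, start_cycle, duration, error_value) experiments.

    `targets` maps a target name to its width in bits.
    """
    for name, width in targets.items():
        values = range(1, 2 ** width)          # every non-zero error pattern
        for start, duration, value in product(
                range(n_cycles), range(1, max_duration + 1), values):
            if start + duration <= n_cycles:   # stay inside the computation
                yield name, start, duration, value


targets = {"state_byte": 8, "fsm_bit": 1}      # hypothetical targets
experiments = list(campaign(targets, n_cycles=3, max_duration=2))
print(len(experiments))
```

The experiment count grows exponentially with the target width, which is one reason the regularity of AES (one representative target per data path location) matters.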

Hardware Emulation

• Fault injection was based on hardware emulation
  - 3-level environment: host PC, embedded CPU, hardware
• Injection software ran on the FPGA PowerPC
  - Reduced communication, thus faster execution of the campaign due to less wasted time
  - Different tasks can be distributed at any level: hw logic, FPGA PPC, host
  - Fault emulation is at hw level only
• Extra logic is added to the original AES description
  - For each targeted flip-flop, one XOR is inserted between the FF and the combinational block at its input

[Figure: emulation environment. A host PC connects to a board with memory and an FPGA; the FPGA contains a microprocessor, memory (RAM), a HW interface, and programmable logic hosting the instrumented AES encryption IP. Inset: combinational logic feeding an FF through the injected error, with the observed value at the FF output]

Acknowledgment: thanks to Pierre Vanhauwaert for his injection tool ☺
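The XOR instrumentation of a targeted flip-flop can be modelled in software as follows. This is a behavioural sketch only: the class name, the 8-bit default width and the single-cycle clearing of the error word are assumptions, not details of the injection tool.

```python
class InstrumentedFlipFlop:
    """D register with an XOR saboteur on its data input."""

    def __init__(self, width=8):
        self.mask = (1 << width) - 1
        self.q = 0         # observed value (register output)
        self.inject = 0    # error word driven by the injection controller

    def clock(self, d):
        """Latch the combinational output XORed with the injected error."""
        self.q = (d ^ self.inject) & self.mask
        self.inject = 0    # single-cycle fault: cleared after one edge
        return self.q


ff = InstrumentedFlipFlop()
clean = ff.clock(0x3C)     # no error requested
ff.inject = 0x81           # flip bits 7 and 0 on the next edge
faulty = ff.clock(0x3C)
print(hex(clean), hex(faulty))
```

With the error word at zero the XOR is transparent, so the instrumented design behaves like the original one between injections.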

1-Cycle Error Coverage

Legend: protected targets; non-protected targets

Instrumented target            Result class [%]
Location       Size (bits)     Silent    Undetected    False Pos    Detected
Linear layer   16*             66.10     0             0            33.90
SBox Output    16*             33.90     0             33.90        32.20
Inner SBox     24              1.88      0.06          50.72        47.34
Misc ctrls     22              0.24      2.45          27.45        69.86
Key ctrls      3               17.68     53.27         2.91         26.15
Main FSM       19              0         16.30         1.87         81.63
Aux FSM        9               4.36      0.20          4.92         90.52
FSM Synchr     6               15.74     84.26         0            0

* Full search on single byte (8-bit target) gave the same results

Protections beyond DDR: Controller

• Verify state transitions: if an FSM performs an erroneous transition (e.g., Idle > Output), the error signal is raised and the machines go back to their reset state
• Validate state encoding: if any FSM falls into a non-existing state encoding, computation stops and they both return to the reset state
• Protect sensitive targets: specific registers are duplicated to ensure correct behavior (e.g., counters, state registers)
• Reduce the number of possible targets: simplify the controller by removing redundant registers, which store signals that could instead be computed on the fly
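The first two protections (transition checking and encoding validation) can be sketched as a guarded state update. The state set, the one-hot codes, the table of allowed transitions and the choice of Idle as reset state are all assumptions for illustration; only the Idle > Output example comes from the slide.

```python
from enum import Enum


class State(Enum):
    IDLE = 0b001
    COMPUTE = 0b010
    OUTPUT = 0b100


ALLOWED = {
    (State.IDLE, State.IDLE), (State.IDLE, State.COMPUTE),
    (State.COMPUTE, State.COMPUTE), (State.COMPUTE, State.OUTPUT),
    (State.OUTPUT, State.IDLE),
}
VALID_CODES = {s.value for s in State}


def checked_step(current_code, next_code):
    """Return (new_code, error); fall back to the reset state on a violation."""
    if current_code not in VALID_CODES or next_code not in VALID_CODES:
        return State.IDLE.value, True      # non-existing state encoding
    if (State(current_code), State(next_code)) not in ALLOWED:
        return State.IDLE.value, True      # erroneous transition
    return next_code, False


print(checked_step(0b001, 0b010))   # Idle > Compute: legal
print(checked_step(0b001, 0b100))   # Idle > Output: error raised
print(checked_step(0b011, 0b010))   # invalid encoding: error raised
```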

AES DDR – Fault Injection Results

[Figure: "Undetected Faults". Undetection probability (log scale, 1E-08 to 1E+00) versus fault duration (1 to 9 cycles). Series: DC 1 Byte, DC 2B, DC 2B Low, SB Out, SB Int, CU 1 bit, CU Misc, CU Count, CU FSM Lo, CU FSM Hi]

• Linear layer (DC): not detected only when both copies are affected
• Negligible percentage of undetected faults in FSMs
• Miscellaneous control signals: 0.1-0.3% of faults not detected
• S-Box outputs not detected for faults longer than 5 cycles
• Internal S-Box detection affected by sharing with key unit and by double computation

A Similar (Smaller) Design

• 32-bit data-path
• 4 Substitution Boxes
• 4 GF Multipliers for MixColumns
• 9 clock cycles per round
• On-the-fly key unrolling

[Figure: data path block diagram. Blocks: Input, Output, State, MixColumns and AddRoundKey, rotation (<<<), four SBOX units (2-stage SBox), From Key Unit, To Key Unit. Legend: register layer, combinatorial logic, 8-bit signal, 32-bit signal]

Fault Injection in “Small” DU

Fault Injection in “Small” CU

Coverage versus Costs

[Figure: coverage versus cost chart. Data point labels:]
[20] Multiple parity bits, improved Sbox
[22] Cubic network
[14] Inverse operation
[18] Single parity bit per block
[19] Multiple parity bits

Taking stock so far…

• DDR is very effective against realistic fault models
  - Currently trying to break an AES implementation by glitch attacks and laser injections
• DDR (and parity) apply to data path only, control unit must be addressed with other protection means
• DDR
  - Coverage of the data path is almost 100% for multiple-bit errors
  - Multi-cycle faults are reasonably covered
  - Permanent faults are not detected, other solutions may be required (parity, revived?)
• Tailored attacks against DDR are not detected
  - The attacker must be able to inject the same error value in the same location at very specific time slots: very difficult and unlikely with current attack capabilities
• But what about testing such a complex circuit?

Testability in Cryptography

• Due to DFA, cryptographic architectures must be fully functional and thoroughly tested
• Testing techniques are severely limited due to security issues
  - No scan chains allowed, they are a potential security breach
• Testing and security are opposed concepts
  - Key issues are controllability and observability
• Alternative testing approaches are needed
  - Internal state of the circuit must be kept confidential!
• Built-in self-tests may be a good solution

Built-In Self Test

• AES as LFSR
  - The output of an encryption is highly uncorrelated and can be used instead of a linear feedback shift register
• Built In Self Test
  - Additional logic is used only to drive the device into test mode, feed the proper data, and analyze the results
  - Control path must be verified with dedicated BIST logic
  - Software-based self test (SBST) can be effective and save silicon area
  - Some concurrent error detection schemes may be used to perform online checking
• What if the cryptographic device acts as a coprocessor? Can the existing CPU be exploited to test the existence of faults in the cryptochip?

Software Based Self Test (SBST)

[Figure: Memory and CPU connected to the AES coprocessor, which contains the Key Unit, Data Unit and Control]

• Cryptoprocessor works as a coprocessor to aid main CPU
• No additional hardware: circuit as-is, just rely on encryption routines
• Self test is a small piece of software in memory
  - Initial test input values are stored in memory
  - AES processes data
  - The resulting signature is verified by the CPU

Test Methodology

• Transparent with respect to the cipher implementation
• Instruct the coprocessor to encrypt the provided data
• Encryption and decryption
  - A few enc/dec cycles for better coverage
  - First all encryptions, then all decryptions
  - Encryption key used also for decryption
• Check only the last result (i.e., signature)

[Figure: the CPU reads input values from memory; the cryptoprocessor runs ENC, ENC, ENC, then DEC, DEC, DEC (still using the enc key!); the CPU compares the final result with the stored signature]
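The test flow above can be sketched from the CPU's point of view. The coprocessor here is a toy stand-in with two arbitrary 32-bit bijections in place of AES encryption and decryption; its class, the constants and the fault-injection hook are hypothetical and only serve to show the enc/dec sequence and the final signature check.

```python
M32 = 0xFFFFFFFF


def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & M32


class ToyCoprocessor:
    """Stand-in for the cryptoprocessor; both operations are bijections."""

    def __init__(self, fault_at=None, fault_mask=0):
        self.fault_at, self.fault_mask = fault_at, fault_mask
        self.ops_done = 0

    def _finish(self, y):
        if self.ops_done == self.fault_at:   # transient fault on one result
            y ^= self.fault_mask
        self.ops_done += 1
        return y

    def enc(self, x, key):
        return self._finish(rotl32(((x ^ key) * 0x9E3779B1) & M32, 5))

    def dec(self, x, key):                   # driven with the enc key
        return self._finish((rotl32(x, 11) + key) & M32)


def run_sequence(dev, seed, key, n):
    x = seed
    for _ in range(n):                       # first all encryptions
        x = dev.enc(x, key)
    for _ in range(n):                       # then all decryptions, same key
        x = dev.dec(x, key)
    return x


def self_test(dev, seed, key, n, signature):
    """Only the last result is compared with the stored signature."""
    return run_sequence(dev, seed, key, n) == signature


SEED, KEY, N = 0x01234567, 0x0F1E2D3C, 3
SIGNATURE = run_sequence(ToyCoprocessor(), SEED, KEY, N)   # golden value
passes = self_test(ToyCoprocessor(), SEED, KEY, N, SIGNATURE)
fails = self_test(ToyCoprocessor(fault_at=1, fault_mask=0x80),
                  SEED, KEY, N, SIGNATURE)
print(passes, fails)
```

Because every step in this toy model is a bijection, a single corrupted intermediate result always propagates to the final value, which is why checking only the last result is enough here.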

Test Setup: Error Detection Schemes

Error detecting schemes are developed against fault attacks
  - Always ON, hence they can also be used during self test

Considered architectures
• Basic
  - No error detection mechanism, simple and plain
• Parity
  - Data path is extended with one parity bit for each byte
  - Prediction uses dedicated logic and is computed in parallel with normal computation
  - Simple, cheap, and effective against natural faults
• DDR
  - A few slides earlier…
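For the parity scheme, the prediction is easiest to see on a linear operation such as key addition, where the parity of the result is the XOR of the input parities. The sketch below covers only that case (parity prediction for the S-box and MixColumns needs dedicated logic and is not shown); function names and the sample bytes are illustrative.

```python
def parity(byte):
    """XOR of all bits of one byte."""
    return bin(byte).count("1") & 1


def add_round_key_with_parity(state, state_par, key, key_par):
    """XOR the key into the state and predict the parity of each result byte."""
    out = [s ^ k for s, k in zip(state, key)]
    # Prediction runs alongside the data path: parity is linear over XOR.
    predicted = [ps ^ pk for ps, pk in zip(state_par, key_par)]
    return out, predicted


def parity_errors(data, predicted):
    """Indices of bytes whose actual parity disagrees with the prediction."""
    return [i for i, (d, p) in enumerate(zip(data, predicted))
            if parity(d) != p]


state = [0x32, 0x88, 0x31, 0xE0]
key = [0x2B, 0x28, 0xAB, 0x09]
out, pred = add_round_key_with_parity(
    state, [parity(b) for b in state], key, [parity(b) for b in key])

single = out.copy()
single[2] ^= 0x04            # one flipped bit: parity changes
double = out.copy()
double[2] ^= 0x03            # two flipped bits in one byte: parity unchanged
print(parity_errors(out, pred), parity_errors(single, pred),
      parity_errors(double, pred))
```

The last case shows the known limit of one parity bit per byte: an even number of flipped bits in the same byte goes unnoticed.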

Results – Data Unit

• Undetected faults due to lack of observability and controllability
  - Two encryption/decryption cycles are enough to detect every fault in the functional part
  - Undetected faults are in the detection logic
• The DDR design halts when an error is detected
  - Results are worse when using several test cycles

[Figure: bar chart "Undetected faults in data unit" (0% to 6%) for the Basic, Parity and DDR designs with 1, 2 and 3 test cycles]

Results – Key Unit

• Key unit always contains untestable faults due to lack of observability
• Number of masked faults depends on the initial input text
• Checking several signatures may increase the fault coverage
• Parity bits may increase the observability of the key, thus increasing the fault coverage

[Figure: bar chart "Undetected faults in key unit" (0% to 25%) for the Basic, Parity and DDR designs with 1, 2 and 3 test cycles]

Results – Control Unit

[Figure: bar chart "Undetected faults in control path" (0% to 14%) for the Basic, Parity and DDR designs with 1, 2 and 3 test cycles]

• Control logic always contains untestable faults due to feedback loops and shared combinational logic
• The fault coverage is almost independent of the number of encryption/decryption cycles
• Fault coverage depends mainly on the encoding of the finite state machines

FSM encoding   # faults   Undet faults   Fault coverage
Binary         840        86             89.96%
One-Hot        630        40             93.65%
Two-Hot        1142       189            83.45%
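The fault coverage column can be recomputed from the other two as the share of faults that were detected. The sketch below uses the counts as they appear in the extracted table; the function name is hypothetical.

```python
# (total faults, undetected faults) per FSM encoding, as listed in the table
rows = {"Binary": (840, 86), "One-Hot": (630, 40), "Two-Hot": (1142, 189)}


def fault_coverage(total, undetected):
    """Percentage of injected faults that were detected."""
    return 100.0 * (total - undetected) / total


coverage = {name: round(fault_coverage(*counts), 2)
            for name, counts in rows.items()}
for name, value in coverage.items():
    print(f"{name}: {value:.2f}%")
```

The one-hot and two-hot rows recompute to the listed 93.65% and 83.45%. The binary row gives 89.76% from these counts rather than the listed 89.96%, so one of the binary figures may have been altered in extraction or on the original slide.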

To conclude…

• The main processor can be used to run a small testing algorithm; the cryptographic device is input generator, oracle, and DUT at the same time
• With no additional hardware, this approach allows testing the encryption data path almost completely
  - Lack of observability for the detection mechanisms
• Control logic and key scheduler are quite difficult to test, since they are not directly observable at the output
  - Observability of the key may be improved
• Different test methodologies are being explored
  - Different operation ordering
  - Different operation protocols
  - Multiple signatures
  - …

Questions