41
Indian Institute of Technology Kharagpur Telecom ParisTech From Theory to Practice: Private Circuit and Its Ambush Debapriya Basu Roy, Shivam Bhasin, Sylvain Guilley, Jean-Luc Danger and Debdeep Mukhopadhyay 20/01/2015 Debapriya Basu Roy, Weekly Presentation 1/20

From Theory to Practice: Private Circuit and Its Ambush—Debapriya

Embed Size (px)

Citation preview

Indian Institute of Technology Kharagpur

Telecom ParisTech

From Theory to Practice: Private Circuit and Its Ambush

Debapriya Basu Roy, Shivam Bhasin, Sylvain Guilley, Jean-LucDanger and Debdeep Mukhopadhyay

20/01/2015 Debapriya Basu Roy, Weekly Presentation 1/20

Introduction

Side Channel: Information leakage from the implementation

Probing Attack

Power attack and EM radiation attack

t-Private Circuit: Countermeasure design with sound theoretical proof

Overhead: O(nt2), n is the number of gates in the circuit.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20

Introduction

Side Channel: Information leakage from the implementation

Probing Attack

Power attack and EM radiation attack

t-Private Circuit: Countermeasure design with sound theoretical proof

Overhead: O(nt2), n is the number of gates in the circuit.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20

Introduction

Side Channel: Information leakage from the implementation

Probing Attack

Power attack and EM radiation attack

t-Private Circuit: Countermeasure design with sound theoretical proof

Overhead: O(nt2), n is the number of gates in the circuit.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20

Introduction

Side Channel: Information leakage from the implementation

Probing Attack

Power attack and EM radiation attack

t-Private Circuit: Countermeasure design with sound theoretical proof

Overhead: O(nt2), n is the number of gates in the circuit.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20

Introduction

Side Channel: Information leakage from the implementation

Probing Attack

Power attack and EM radiation attack

t-Private Circuit: Countermeasure design with sound theoretical proof

Overhead: O(nt2), n is the number of gates in the circuit.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 2/20

Related Works

Masking: by product of private circuit to protect against first orderdifferential attacks.

Reducing Overhead: From O(nt2) to O(nt), further reducing it todt/2e.Designing block ciphers with reduced number of AND operations, forexample: PICARO.

Modifying private circuit for efficient FPGA implementation.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 3/20

Related Works

Masking: by product of private circuit to protect against first orderdifferential attacks.

Reducing Overhead: From O(nt2) to O(nt), further reducing it todt/2e.

Designing block ciphers with reduced number of AND operations, forexample: PICARO.

Modifying private circuit for efficient FPGA implementation.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 3/20

Related Works

Masking: by product of private circuit to protect against first orderdifferential attacks.

Reducing Overhead: From O(nt2) to O(nt), further reducing it todt/2e.Designing block ciphers with reduced number of AND operations, forexample: PICARO.

Modifying private circuit for efficient FPGA implementation.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 3/20

Related Works

Masking: by product of private circuit to protect against first orderdifferential attacks.

Reducing Overhead: From O(nt2) to O(nt), further reducing it todt/2e.Designing block ciphers with reduced number of AND operations, forexample: PICARO.

Modifying private circuit for efficient FPGA implementation.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 3/20

Motivation

Private circuit is based on sound theoretical proof but with someinherent assumptions which may not be valid in practical scenario.

Theoretical analysis of private circuit for power analysis in presence ofglitches has been studied.

However, no practical evaluation of private circuit is present in theliterature

20/01/2015 Debapriya Basu Roy, Weekly Presentation 4/20

Motivation

Private circuit is based on sound theoretical proof but with someinherent assumptions which may not be valid in practical scenario.

Theoretical analysis of private circuit for power analysis in presence ofglitches has been studied.

However, no practical evaluation of private circuit is present in theliterature

20/01/2015 Debapriya Basu Roy, Weekly Presentation 4/20

Motivation

Private circuit is based on sound theoretical proof but with someinherent assumptions which may not be valid in practical scenario.

Theoretical analysis of private circuit for power analysis in presence ofglitches has been studied.

However, no practical evaluation of private circuit is present in theliterature

20/01/2015 Debapriya Basu Roy, Weekly Presentation 4/20

Contribution

We identify the practical scenarios in which private circuit may fail toprovide us the desired security.

We actually try to identify the lazy engineering practices which canrattle the security of private circuit.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 5/20

Contribution

We identify the practical scenarios in which private circuit may fail toprovide us the desired security.

We actually try to identify the lazy engineering practices which canrattle the security of private circuit.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 5/20

Contribution

We have implemented a lightweight block cipher SIMON usingprivate circuit methodology on SASEBO-GII board.

The implemented private circuits are analyzed against SCA using EMtraces and correlation power analysis.

Moreover, we have used Test Vector Leakage Assessment (TVLA)methodology based leakage detection to classify our design as sidechannel secure or not.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 6/20

Contribution

We have implemented a lightweight block cipher SIMON usingprivate circuit methodology on SASEBO-GII board.

The implemented private circuits are analyzed against SCA using EMtraces and correlation power analysis.

Moreover, we have used Test Vector Leakage Assessment (TVLA)methodology based leakage detection to classify our design as sidechannel secure or not.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 6/20

Contribution

We have implemented a lightweight block cipher SIMON usingprivate circuit methodology on SASEBO-GII board.

The implemented private circuits are analyzed against SCA using EMtraces and correlation power analysis.

Moreover, we have used Test Vector Leakage Assessment (TVLA)methodology based leakage detection to classify our design as sidechannel secure or not.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 6/20

Contribution

Optimized SIMON: SIMON is implemented according to the ISWscheme, but the design tool is free to optimize the circuit. This is anexample of lazy engineering approach,

2-input LUT based SIMON: Here, to mimic the private circuitmethodology exactly on the FPGA, we have constrained the designtool to map each two-input gate to a single LUT. In other words,though a LUT has six inputs, it is modeled as two-input gate andgate-level optimization is minimized.

Synchronized 2-input LUT based SIMON: This is nearly similar to theprevious methodology. The only difference is that each gate or LUT ispreceded and followed by flip-flops so that each and every input tothe gates is synchronized and glitches are minimized .

20/01/2015 Debapriya Basu Roy, Weekly Presentation 7/20

Contribution

Optimized SIMON: SIMON is implemented according to the ISWscheme, but the design tool is free to optimize the circuit. This is anexample of lazy engineering approach,

2-input LUT based SIMON: Here, to mimic the private circuitmethodology exactly on the FPGA, we have constrained the designtool to map each two-input gate to a single LUT. In other words,though a LUT has six inputs, it is modeled as two-input gate andgate-level optimization is minimized.

Synchronized 2-input LUT based SIMON: This is nearly similar to theprevious methodology. The only difference is that each gate or LUT ispreceded and followed by flip-flops so that each and every input tothe gates is synchronized and glitches are minimized .

20/01/2015 Debapriya Basu Roy, Weekly Presentation 7/20

Contribution

Optimized SIMON: SIMON is implemented according to the ISWscheme, but the design tool is free to optimize the circuit. This is anexample of lazy engineering approach,

2-input LUT based SIMON: Here, to mimic the private circuitmethodology exactly on the FPGA, we have constrained the designtool to map each two-input gate to a single LUT. In other words,though a LUT has six inputs, it is modeled as two-input gate andgate-level optimization is minimized.

Synchronized 2-input LUT based SIMON: This is nearly similar to theprevious methodology. The only difference is that each gate or LUT ispreceded and followed by flip-flops so that each and every input tothe gates is synchronized and glitches are minimized .

20/01/2015 Debapriya Basu Roy, Weekly Presentation 7/20

Preliminaries

SIMON

In 2013, NSA had introduced two ultra-lightweight block cipher SIMONand SPECK with a Feistel construction. Out of the two block ciphers,SIMON is more suited for hardware implementations. SIMON can encrypta block of 2k bits, with a key of m · k bits.

TVLA

TVLA consists in operating the device under test with a fixed and chosenkey. Then, a T-test is applied on both sets of measurements. Similardifference testing can be performed on intermediate values of the blockcipher and also on each bit of that intermediate value.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 8/20

Preliminaries

SIMON

In 2013, NSA had introduced two ultra-lightweight block cipher SIMONand SPECK with a Feistel construction. Out of the two block ciphers,SIMON is more suited for hardware implementations. SIMON can encrypta block of 2k bits, with a key of m · k bits.

TVLA

TVLA consists in operating the device under test with a fixed and chosenkey. Then, a T-test is applied on both sets of measurements. Similardifference testing can be performed on intermediate values of the blockcipher and also on each bit of that intermediate value.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 8/20

t-Private Circuit

Input Encoding: A vector of (a1, a2, ..., a2t , a2t+1)

a2t+1 = a⊕2t⊕i=1

ai .

NOT gate: a = (a1, a2, ..., a2t+1). `a = (a1, a2, ..., a2t+1).

Xor gate:ci = ai ⊕ bi , 1 ≤ i ≤ 2t .

20/01/2015 Debapriya Basu Roy, Weekly Presentation 9/20

t-Private Circuit

Input Encoding: A vector of (a1, a2, ..., a2t , a2t+1)

a2t+1 = a⊕2t⊕i=1

ai .

NOT gate: a = (a1, a2, ..., a2t+1). `a = (a1, a2, ..., a2t+1).

Xor gate:ci = ai ⊕ bi , 1 ≤ i ≤ 2t .

20/01/2015 Debapriya Basu Roy, Weekly Presentation 9/20

t-Private Circuit

Input Encoding: A vector of (a1, a2, ..., a2t , a2t+1)

a2t+1 = a⊕2t⊕i=1

ai .

NOT gate: a = (a1, a2, ..., a2t+1). `a = (a1, a2, ..., a2t+1).

Xor gate:ci = ai ⊕ bi , 1 ≤ i ≤ 2t .

20/01/2015 Debapriya Basu Roy, Weekly Presentation 9/20

t-Private Circuit

AND gate: Inputs a = (a1, a2, ..., a2t+1) and b = (b1, b2, ..., b2t+1),output c = (c1, c2, ..., c2t+1), which is calculated by following steps:

1 Generate random bits ri,j , where i 6= j and 1 ≤ i ≤ j ≤ 2t + 1.2 Compute rj,i = (ri,j ⊕ aibj)⊕ ajbi , where i 6= j and 1 ≤ i ≤ j ≤ 2t + 1.3 Compute ci = aibi ⊕

⊕j 6=i ri,j , where 1 ≤ i ≤ 2t and 1 ≤ j ≤ 2t.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 10/20

AND Gate Example

Inputs of the AND gate are two vectors a = (a1, a2, a3) andb = (b1, b2, b3), Output c = (c1, c2, c3) is calculated as follows:

c1 = a1b1 ⊕ r1,2 ⊕ r1,3 (1)

c2 = a2b2 ⊕ (r1,2 ⊕ a1b2)⊕ a2b1 ⊕ r2,3 (2)

c3 = a3b3 ⊕ (r1,3 ⊕ a1b3)⊕ a3b1 ⊕ (r2,3 ⊕ a2b3)⊕ a3b2 (3)

20/01/2015 Debapriya Basu Roy, Weekly Presentation 11/20

CAD Optimization

Figure: t = 1 private circuit for AND third coordinate on 4-input LUTs

4 input LUT

4 input LUT

4 input LUT

unconnected

a3

b3

a2

a1

r1,3

a3b3 ⊕ a3b1 ⊕ a3b2 = a3(b) => Leakage

r2,3

a1b3 ⊕ a2b3 ⊕ r1,3

c3

b3

b2

b1

{p(b = 0|x = 0) = 2/3,

p(b = 1|x = 0) = 1/3,and

{p(b = 0|x = 1) = 0,

p(b = 1|x = 1) = 1.(4)

20/01/2015 Debapriya Basu Roy, Weekly Presentation 12/20

CAD Optimization

Figure: t = 1 private circuit for AND third coordinate on 4-input LUTs

4 input LUT

4 input LUT

4 input LUT

unconnected

a3

b3

a2

a1

r1,3

a3b3 ⊕ a3b1 ⊕ a3b2 = a3(b) => Leakage

r2,3

a1b3 ⊕ a2b3 ⊕ r1,3

c3

b3

b2

b1

{p(b = 0|x = 0) = 2/3,

p(b = 1|x = 0) = 1/3,and

{p(b = 0|x = 1) = 0,

p(b = 1|x = 1) = 1.(4)

20/01/2015 Debapriya Basu Roy, Weekly Presentation 12/20

Delay in Random Variables

There are two ways in which random variables can be provided to theprivate circuit: as external input or from a Random Number generator(RNG). Generally, random numbers are provided to the circuit froman RNG.

c3 = a3b3 ⊕ (r1,3 ⊕ a1b3)⊕ a3b1 ⊕ (r2,3 ⊕ a2b3)⊕ a3b2 (5)

Delay in the arrival of random bits r1,3, r2,3, a1 and a2 lead toinformation leakage.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 13/20

Delay in Random Variables

There are two ways in which random variables can be provided to theprivate circuit: as external input or from a Random Number generator(RNG). Generally, random numbers are provided to the circuit froman RNG.

c3 = a3b3 ⊕ (r1,3 ⊕ a1b3)⊕ a3b1 ⊕ (r2,3 ⊕ a2b3)⊕ a3b2 (5)

Delay in the arrival of random bits r1,3, r2,3, a1 and a2 lead toinformation leakage.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 13/20

Experimental Setup

A parallel implementation SIMON32/64 crypto-core, running at clockfrequency of 24-MHz, along with a simple UART interface is used totest our design on the Xilinx Virtex XC5-VLX30 FPGA of theSASEBO-GII platform.

For t = 1, total number of random bits required by SIMON is 272,whereas for t = 2 and t = 3, number of required random bits become608 and 1008. Random numbers are generated by a maximal lengthLFSR.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 14/20

Experimental Setup

A parallel implementation SIMON32/64 crypto-core, running at clockfrequency of 24-MHz, along with a simple UART interface is used totest our design on the Xilinx Virtex XC5-VLX30 FPGA of theSASEBO-GII platform.

For t = 1, total number of random bits required by SIMON is 272,whereas for t = 2 and t = 3, number of required random bits become608 and 1008. Random numbers are generated by a maximal lengthLFSR.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 14/20

Result: Optimized Simon

(a) TVLA Plot

0 100 200 300 400 500 600 700 800 900 1000−20

−15

−10

−5

0

5

10

15

20

Sample Points

TV

LA

Valu

e

TVLA Value of Optimized Simon

Safe Value of TVLA

(b) Average KeyRanking

0 200 400 600 800 10001

2

3

4

5

6

7

Number of Traces/1000

Av

era

ge

Ke

y R

an

kin

g

(c) Correlation Value

0 10 20 30 40 50 60−0.015

−0.01

−0.005

0

0.005

0.01

0.015

Sample Points

Co

rre

lati

on

Va

lue

Wrong Key Guess

Correct Key Guess

20/01/2015 Debapriya Basu Roy, Weekly Presentation 15/20

Result: 2 input LUT based Simon

(a) TVLA Plot

0 200 400 600 800 1000−15

−10

−5

0

5

10

15

Sample Points

TV

LA

Valu

e

TVLA Value of 2 i/p LUT Simon

Safe Value of TVLA

(b) Average KeyRanking

0 200 400 600 800 10001

2

3

4

5

6

7

8

Number of Traces/1000

Av

era

ge

Ke

y R

an

kin

g

(c) Correlation Value

0 5 10 15 20 25 30 35 40 45 50−0.01

−0.008

−0.006

−0.004

−0.002

0

0.002

0.004

0.006

0.008

0.01

Sample Points

Co

rre

lati

on

Va

lue

Wrong Key Guess

Correct Key Guess

20/01/2015 Debapriya Basu Roy, Weekly Presentation 16/20

Synchronized 2-input LUT based SIMON

(a) TVLA Plot

0 200 400 600 800 1000 1200 1400 1600 1800 2000−15

−10

−5

0

5

10

15

Sample Points

TV

LA

Valu

e

TVLA Value of Synchronized Simon

Safe Value of TVLA

First Round Output

(b) Avg. KeyRank P)

0 200 400 600 800 10002

3

4

5

6

Number of Traces/1000

Avera

ge K

ey R

an

kin

g

(c) CorrelationValue P

0 5 10 15 20 25 30 35 40 45 50−0.015

−0.01

−0.005

0

0.005

0.01

0.015

Sample Points

Co

rrela

tio

n V

alu

e

Wrong Key Guess

Correct Key Guess

(d) Avg. KeyRank 1

0 200 400 600 800 10001

2

3

4

5

6

7

Number of Traces/1000

Avera

ge K

ey R

an

kin

g

(e) CorrelationValue 1

0 10 20 30 40 50 60−0.015

−0.01

−0.005

0

0.005

0.01

0.015

Sample Points

Co

rrela

tio

n V

alu

e

Wrong Key Guess

Correct Key Guess

Figure: Side Channel Analysis of Synchronized 2 input LUT SIMON

20/01/2015 Debapriya Basu Roy, Weekly Presentation 17/20

Attack Summary

Table: Summary of Side Channel Analysis

Design TVLA Avg. Key RemarksName Test Ranking

Optimized Fails, significant Key ranking is low, NotSIMON information leakage successful attack secure

2 input LUT Fails, but less Key ranking is high, Secure againstbased SIMON information leakage attack is not CPA, could be

compared to optimized successful broken bySIMON better model

Synchronized Passes: no leakage Key ranking is high, Secure2 input LUT at first round. Initial attack is notbased Simon peaks are caused by successful

plain-text loading

20/01/2015 Debapriya Basu Roy, Weekly Presentation 18/20

Resource Comparison

Name LUTs Registers Slices Freq. Clock(MHz) Cycles

Optimized 761 805 595 147 32SIMON (1×) (1×) (1×) (1×) (1×)

2 i/p LUT 1305 805 1241 88 32based SIMON (1.71×) (1×) (2.08×) (0.59×) (1×)Synchronized

2 i/p LUT 1309 2920 4090 104 288based SIMON (1.71×) (3.62×) (6.87×) (0.70×) (9×)

20/01/2015 Debapriya Basu Roy, Weekly Presentation 19/20

Conclusion

We analyzed private circuits at an implementation level on a SIMONcrypto-processor. Our results show that it is very easy for a CAD toolto override the basic requirements of private circuits.

Practical evaluations indicate that with proper constraints the leakagecan be reduced. Moreover, by synchronizing each gate, we removeglitches and delay and approach much closer to theoretical evaluationof private circuits, but at a huge overhead.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 20/20

Conclusion

We analyzed private circuits at an implementation level on a SIMONcrypto-processor. Our results show that it is very easy for a CAD toolto override the basic requirements of private circuits.

Practical evaluations indicate that with proper constraints the leakagecan be reduced. Moreover, by synchronizing each gate, we removeglitches and delay and approach much closer to theoretical evaluationof private circuits, but at a huge overhead.

20/01/2015 Debapriya Basu Roy, Weekly Presentation 20/20