
Arithmetic Tradeoffs on Performance/Cost/Security for Hardware Asymmetric Cryptography

Arnaud Tisserand

CNRS, Lab-STICC laboratory

CEA Seminar, July 2017

Hardware Security Group at Lab-STICC

8 faculty members and ≈ 12 PhD students / postdocs / ATER / engineers

• Hardware security for embedded systems:

  - memory and communication protection
  - secure OS with HW blocks, DIFT
  - multicore / manycore security

• Crypto implementations in hardware & embedded software:

  - asymmetric (RSA, (H)ECC, PQC)
  - arithmetic aspects (operators, libraries)
  - homomorphic encryption

• Secure hardware implementation:

  - side channel and fault injection attacks and protections
  - targets: FPGA and ASIC (reconfigurable, CGRA, ASIP)
  - high-level synthesis (HLS) for security

Lab-STICC: Laboratoire des Sciences et Techniques de l’Information, de la Communication et de la Connaissance

MOCS: Méthodes, Outils, Circuits et Systèmes

Arnaud Tisserand. CNRS – Lab-STICC. Arithmetic Tradeoffs for Hardware Asymmetric Cryptography 2/53

Skills (1/2)

• Optimized hardware arithmetic operators:
  - low-power operators ($x \pm y$, $x \cdot y$, $1/x$, $\sqrt{x}$, $1/\sqrt{x^2 + y^2}$, $\sum_{i=0}^{n} x_i y_i$, ...)
  - multiplication by constants (scalar, vector, matrix)
  - advanced computation algorithms
  - advanced representations of numbers
  - function approximation ($\sin(x)$, $\cos(x)$, $\exp(x)$, $\log(x)$, $\tan(x)$, ...)
  - modular and finite field arithmetic $\mathbb{F}_p$ and $\mathbb{F}_{2^m}$
  - fault tolerant (or fault detecting) operators
  - FPGA and ASIC targets

• Tools for hardware arithmetic circuits:
  - operator generators
  - arithmetic circuits with bounded errors

• Software arithmetic/computation libraries:
  - (public-key) cryptography
  - floating-point emulation on integer processors
  - multiprecision computations (up to millions of bits)
  - embedded processor, multi-core and GPU targets


Skills (2/2)

• Hardware accelerators for crypto. applications:
  - public-key crypto.: RSA, (H)ECC
  - private-key: AES, (3)DES
  - hash functions: SHAx (multi-mode)

• Crypto-processor for (Hyper)-Elliptic Curve Cryptography:
  - arithmetic operators over $\mathbb{F}_p$ and $\mathbb{F}_{2^m}$, typically 100–600 bits
  - optimized architectures, algorithms and number representations

• Software libraries for arithmetic and cryptography:
  - ECC library for GPUs and embedded processors
  - RNS library for homomorphic encryption on multicores

• Study and implementation of protections against physical attacks:
  - Passive: power consumption, electromagnetic radiation, timings
  - Active: fault injection (in progress)

• Levels: arithmetic algorithms, numbers/objects representations, operators, architectures, circuit optimizations

• Trade-offs between: performance, cost (area/energy), security

• True random number generators (TRNGs)


(Hyper-)Elliptic Curve Cryptography (H)ECC

• Finite field $\mathbb{F}_p$: integer arithmetic modulo a large prime $p$

• Elliptic curve over $\mathbb{F}_p$:

$E : y^2 = x^3 + ax + b$

• Points on the curve $E$:

$P = (x_1, y_1),\ Q = (x_2, y_2),\ R = (x_3, y_3)$

• Set of points on $E$:
  - finite (large, # about $p$)
  - "forms" an abelian group
  - group law: addition on points

• Two operations:
  - Point addition: $P + Q \to R$, denoted ADD
  - Point doubling: $P + P = [2]P \to R$, denoted DBL

$\mathbb{F}_p$ elements are very large: 100–600 bits!

[Figure: points of $y^2 = x^3 + 4x + 20$ over $\mathbb{F}_{1009}$]
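The group law above can be tried on the toy curve of the figure. The following is a minimal sketch in affine coordinates, not the accelerator's formulas: the function names, the `INF` encoding of the point at infinity and the starting point (1, 5) are assumptions made for illustration.

```python
# Toy curve from the slide: y^2 = x^3 + 4x + 20 over F_1009.
P_MOD, A, B = 1009, 4, 20
INF = None  # point at infinity (neutral element); this encoding is our own choice


def on_curve(pt):
    """Check the curve equation for an affine point (or the neutral element)."""
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def add(p, q):
    """Affine group law covering ADD, DBL and the neutral element."""
    if p is INF:
        return q
    if q is INF:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # P + (-P) = O
    if p == q:  # DBL: slope of the tangent
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:  # ADD: slope of the chord
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


P = (1, 5)      # 5^2 = 25 = 1 + 4 + 20, so P lies on the curve
R = add(P, P)   # DBL
S = add(R, P)   # ADD
```

Each call costs one field inversion (`pow(..., -1, P_MOD)`, Python 3.8+), which is why hardware designs usually prefer projective or Jacobian coordinates.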


Curve Level and Field Level Operations

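| point operation | addition | doubling | tripling | quintupling | septupling | ... |
|---|---|---|---|---|---|---|
| notation | ADD | DBL | TPL | QPL | SPL | ... |
| result | $P + Q$ | $[2]P$ | $[3]P$ | $[5]P$ | $[7]P$ | ... |
| definition | if $P \ne \pm Q$ | $= P + P$ | $= P + P + P$ | $P + \cdots + P$ | $P + \cdots + P$ | ... |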

An operation at curve level is a sequence of ≈10–20 $\mathbb{F}_q$ operations

Fq operations: add/sub, multiplication M, square S, inversion I

[Figure: hierarchy of operations: one scalar multiplication $[k]P$ → hundreds of curve operations (ADD, DBL, ...) → thousands of field operations (M, S, I in $\mathbb{F}_p$) → clock cycles; e.g. one DBL expands into M ... M S ...]

Costs of Curve Level Operations

Best computation costs from literature and curves over $\mathbb{F}_p$

Curve-level operations:

| a | refs. | ADD | mADD | DBL | TPL | QPL | SPL |
|---|---|---|---|---|---|---|---|
| ≠ −3 | EFD | 11M + 5S | 7M + 4S | 1M + 8S | 5M + 10S | n. a. | n. a. |
| ≠ −3 | [18] | n. a. | n. a. | 1M + 8S | 5M + 10S | 7M + 16S | 15M + 24S |
| ≠ −3 | [22] | 11M + 5S | 7M + 4S | 2M + 8S | 6M + 11S | 9M + 15S | 13M + 18S |
| = −3 | EFD | 11M + 5S | 7M + 4S | 3M + 5S | 7M + 7S | n. a. | n. a. |
| = −3 | [24] | 11M + 5S | 7M + 4S | 3M + 5S | 7M + 7S | 11M + 11S | 18M + 11S |
| = −3 | [23][22] | 11M + 5S | 7M + 4S | 3M + 5S | 7M + 8S | 10M + 12S | 14M + 15S |

| a | refs. | λDBL | λTPL |
|---|---|---|---|
| ≠ −3 | [14][15][20] | 4λM + (4λ + 2)S | (11λ − 1)M + (4λ + 2)S |

| a | refs. | λTPL / λ′DBL |
|---|---|---|
| ≠ −3 | [14][15] | (11λ + 4λ′ − 1)M + (4λ + 4λ′ + 3)S |

EFD: Explicit-Formulas Database http://hyperelliptic.org/EFD

mADD (mixed addition): A + J → J

ECC Scalar Multiplication

$Q = [k]P = \underbrace{P + P + \cdots + P}_{k \text{ times}}$

• main operation in ECC protocols

• $P \in E$

• $k = (k_{n-1} k_{n-2} \dots k_1 k_0)_2$

• $n$ = 160–600 bits

Double-and-add scalar multiplication algorithm:

  Q ← O
  for i from n − 1 to 0 do
    Q ← [2]Q   (DBL)
    if k_i = 1 then Q ← Q + P   (ADD)
  return Q

• Scans each bit of k and performs the corresponding curve-level operation

• Average cost: 0.5n ADD + n DBL (for security, k has ≈ 0.5n ones)

• Security: elliptic curve discrete logarithm problem (ECDLP): given P and Q = [k]P, it is computationally infeasible to obtain k
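The cost statement can be checked with a small operation counter. This is an illustrative sketch only: the integers under addition stand in for the curve group (so $[k]P$ is simply $k \cdot P$), and the function and variable names are hypothetical.

```python
def double_and_add(k, P, add, dbl, zero):
    """Left-to-right double-and-add; returns ([k]P, number of ADD, number of DBL)."""
    Q, n_add, n_dbl = zero, 0, 0
    for bit in bin(k)[2:]:          # k_{n-1} ... k_0
        Q = dbl(Q)
        n_dbl += 1
        if bit == "1":
            Q = add(Q, P)
            n_add += 1
    return Q, n_add, n_dbl


# Stand-in group (assumption): integers under addition instead of curve points.
result, n_add, n_dbl = double_and_add(
    0b10110, 7, add=lambda a, b: a + b, dbl=lambda a: 2 * a, zero=0
)
```

For the 5-bit scalar `0b10110`, the counter reports 5 DBL and 3 ADD (one per bit set to 1), in line with the n DBL plus roughly 0.5n ADD average.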


Basic Power Analysis Attack on ECC

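[Figure: abstraction levels (protocol level: encryption, signature, etc.; curve level: $[k]P$, ADD(P,Q), DBL(P); field level: x±y, x×y, ...) down to the circuit between VDD and GND, whose current I gives the measured traces.]

Operations visible in the trace: six DBL and two ADD; corresponding key bits: 0 0 0 1 1 0

Scalar multiplication operation:

  for i from 0 to t − 1 do
    if k_i = 1 then Q = ADD(P, Q)
    P = DBL(P)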
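The attack principle can be sketched without any measurement equipment. The toy model below assumes that each ADD and DBL is perfectly distinguishable in the trace; the function names are illustrative and no real power model is involved.

```python
def trace_right_to_left(k, t):
    """Sequence of curve-level operations executed by the loop on the slide."""
    ops = []
    for i in range(t):
        if (k >> i) & 1:
            ops.append("ADD")
        ops.append("DBL")
    return ops


def recover_key(ops):
    """Read the key bits (k_0 first) back from the ADD/DBL pattern."""
    bits, pending_add = [], False
    for op in ops:
        if op == "ADD":
            pending_add = True
        else:  # every DBL closes one loop iteration
            bits.append(1 if pending_add else 0)
            pending_add = False
    return sum(b << i for i, b in enumerate(bits))


ops = trace_right_to_left(0b011000, 6)   # key bits k_0..k_5 = 0 0 0 1 1 0
```

Because an ADD appears only in iterations where the key bit is 1, the sequence of operations alone is enough to rebuild k in this model.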


Accelerator Specifications

[Figure: levels handled by the accelerator and their HW/SW mapping: protocol level (encryption, signature, etc.), curve level ($[k]P$, ADD(P,Q), DBL(P), P + P), field level (x±y, x×y, ...).]

• Performance ⇒ hardware (HW)
  - dedicated functional units
  - internal parallelism

• Limited cost (embedded systems)
  - reduced silicon area
  - low energy (& power consumption)
  - large area used at each clock cycle

• Flexibility ⇒ software (SW)
  - curves, algorithms, representations (points/elements), k recoding, ...
  - at design time / at run time

• Security against SCAs ⇒ HW
  - secure units ($\mathbb{F}_{2^m}$, $\mathbb{F}_p$)
  - secure key storage/management
  - secure control


Accelerator Architecture

[Figure: accelerator block diagram: external interface, interconnect, CTRL with code memory, key management, register file, functional units FU1, FU2, FU3.]

Data: w-bit (32, ..., 128) except for k digits; control: a few bits per unit

Protected F2m Multipliers

[Figure: number of transitions vs. cycles for a Mastrovito 233 multiplier, unprotected and protected versions, with a zoom on cycles 200–250.]

Overhead: area/time < 10 %

References: PhD D. Pamula [29]; articles: [32], [31], [30]


Protected (Old) Accelerator for F2m

[Figure: five panels for the accelerator with Mastrovito multiplier: DBL operation, unprotected (activity trace: #transitions vs. cycles; current measures: current [mA]); DBL operation, protected (activity trace; current measures); ADD operation, protected (activity trace).]

Warning: old dedicated accelerator (similar behavior is expected for our new one)

Circuit-Level Protections for Arithmetic Operators

References: [12] and [13]


Comparison Architecture ECC 256 vs HECC 128 (1/2)

Implementations on Spartan 6 FPGAs without DSP slices

[Figure: area [slices] (600–2200) vs. time [ms] (5–30) for ECC and HECC architectures; each point is labeled by a configuration pair (1,1 to 5,4 for one family and 1,1 to 12,2 for the other).]

On average HECC is 40 % faster than ECC for a similar silicon cost


Comparison Architecture ECC 256 vs HECC 128 (2/2)

[Figure: bar charts of % usage, ×area and speedup for ECC configurations (1,1 1,2 1,4 2,4 3,4 4,4) and HECC configurations (1,1 1,2 2,1 3,1 3,2 5,2 8,2).]


ECC 256 vs Kummer-HECC 128 (similar theor. security)

Recent results presented at CryptArchi 2017 [17].

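| FPGA | Version | DSP | BRAM 18K | Slices | Freq. (MHz) | Nb. cycles | Time (ms) |
|---|---|---|---|---|---|---|---|
| V4 | ECC | 37 | 11 | 4655 | 250 | 109,297 | 0.44 |
| V4 | HECC 1u | 11 | 7 | 1413 | 330 | 183,051 | 0.55 |
| V4 | HECC 2u | 22 | 9 | 2356 | 330 | 115,211 | 0.35 |
| V5 | ECC | 37 | 10 | 1725 | 291 | 109,297 | 0.38 |
| V5 | HECC 1u | 11 | 7 | 873 | 360 | 183,051 | 0.51 |
| V5 | HECC 2u | 22 | 9 | 1542 | 360 | 115,211 | 0.32 |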

Gain 1u on V5: -70% DSPs, -30% BRAMs, -49% slices, +30% duration

Gain 2u on V5: -40% DSPs, -10% BRAMs, -10% slices, -15% duration

ECC results from [25]
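The time column follows from the cycle count and the clock frequency. The short check below only re-derives the table; the names `ROWS` and `duration_ms` are illustrative.

```python
# (FPGA, version, clock cycles, frequency in MHz, time in ms as printed in the table)
ROWS = [
    ("V4", "ECC", 109_297, 250, 0.44),
    ("V4", "HECC 1u", 183_051, 330, 0.55),
    ("V4", "HECC 2u", 115_211, 330, 0.35),
    ("V5", "ECC", 109_297, 291, 0.38),
    ("V5", "HECC 1u", 183_051, 360, 0.51),
    ("V5", "HECC 2u", 115_211, 360, 0.32),
]


def duration_ms(cycles, freq_mhz):
    """Duration in milliseconds: cycles / (f in Hz) * 1000 = cycles / (f in MHz * 1000)."""
    return cycles / (freq_mhz * 1e3)


computed = [round(duration_ms(c, f), 2) for _, _, c, f, _ in ROWS]
```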


Double-Base Number System

Standard radix-2 representation:

$k = \sum_{i=0}^{t-1} k_i 2^i$: $t$ explicit digits $k_{t-1}, k_{t-2}, \dots, k_2, k_1, k_0$ with implicit weights $2^{t-1}, 2^{t-2}, \dots, 2^2, 2^1, 2^0$

Digits: $k_i \in \{0, 1\}$, typical size: $t \in \{160, \dots, 600\}$

Double-Base Number System (DBNS):

$k = \sum_{j=0}^{n-1} k_j 2^{a_j} 3^{b_j}$: $n$ (2, 3)-terms, each with an explicit "digit" $k_j$ and explicit ranks $(a_j, b_j)$

$a_j, b_j \in \mathbb{N}$, $k_j \in \{1\}$ or $k_j \in \{-1, 1\}$, size $n \approx \log t$

DBNS is a very redundant and sparse representation: $1701 = (11010100101)_2$

$1701 = 243 + 1458 = 2^0 3^5 + 2^1 3^6 = (1, 0, 5), (1, 1, 6)$
$\phantom{1701} = 1728 - 27 = 2^6 3^3 - 2^0 3^3 = (1, 6, 3), (-1, 0, 3)$
$\phantom{1701} = 729 + 972 = 2^0 3^6 + 2^2 3^5 = (1, 0, 6), (1, 2, 5)$
$\phantom{1701} = \dots$
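The three DBNS expansions of 1701 can be verified mechanically. In this small sketch, a term is the triple $(k_j, a_j, b_j)$ used on the slide; the helper name `dbns_value` is an assumption.

```python
def dbns_value(terms):
    """Sum of k_j * 2**a_j * 3**b_j over terms (k_j, a_j, b_j)."""
    return sum(k * 2**a * 3**b for k, a, b in terms)


examples = [
    [(1, 0, 5), (1, 1, 6)],     # 243 + 1458
    [(1, 6, 3), (-1, 0, 3)],    # 1728 - 27
    [(1, 0, 6), (1, 2, 5)],     # 729 + 972
]
values = [dbns_value(t) for t in examples]
ones_in_binary = bin(1701).count("1")   # non-zero digits in radix 2
```

Each expansion needs two terms, against six non-zero digits in the radix-2 form, which illustrates the sparsity claimed above.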


Faster Scalar Multiplication Algorithms

Representation of k impacts #operations ⇒ recode k:

• non-adjacent forms (NAF/NAF-w): high-radix signed-digit representations ⇒ increase #0s

• double-base number systems (DBNS): $x = \sum_{i=1}^{n'} d_i b_1^{u_i} b_2^{v_i}$ with $d_i = \pm 1$
  - $b_1$ and $b_2$ co-prime integers (typically $(b_1, b_2) = (2, 3)$)
  - specific op.: point tripling $[3]P = P + P + P$, denoted TPL
  - decreasing exponents (Horner form) ⇒ higher speed

• multi-base number systems (MBNS): more than two bases (co-prime integers), e.g. (2, 3, 5) and (2, 3, 5, 7)

$x = \sum_{i=1}^{n'} \left( d_i \prod_{j=1}^{l} b_j^{e_{j,i}} \right)$ with $d_i = \pm 1$

BUT those recoding methods require pre-computations:

• NAF-w: pre-compute and store $P_j = [j]P\ \forall j \in \{3, 5, \dots, 2^{w-1} - 1\}$

• DBNS/MBNS recoding is performed off-line

Remark: point subtraction (SUB) is as efficient as point addition


MBNS Recoding

Work presented at ARITH 21

[11] T. Chabrier and A. Tisserand. On-the-fly multi-base recoding for ECC scalar multiplication without pre-computations. In A. Nannarelli, P.-M. Seidel, and P. T. P. Tang, editors, Proc. 21st Symposium on Computer Arithmetic (ARITH), pages 219–228, Austin, TX, U.S.A., April 2013. IEEE Computer Society. DOI: 10.1109/ARITH.2013.17

http://tel.archives-ouvertes.fr/hal-00772613 (PDF)


Notations

• $k = (k_{n-1} k_{n-2} \dots k_1 k_0)_2$, $k > 1$, the $n$-bit scalar stored into $t$ words of $w$ bits with $w(t - 1) < n \le wt$ (i.e. the last word may be 0-padded). $k^{(i)}$ is the $i$th word of $k$, starting from the least significant, for $0 \le i < t$

• $B$ the multi-base with $l$ base elements (co-prime integers), $B = (b_1, b_2, \dots, b_l)$

• predicate divisible(x, B) returns true if x is divisible by at least one base element in B (false otherwise)

• number $x$ represented as the sum of terms $x = \sum_{i=1}^{n'} \left( d_i \prod_{j=1}^{l} b_j^{e_{j,i}} \right)$ with $d_i \in \{0, \pm 1\}$ and in Horner form

• term $(d_i, e_{1,i}, e_{2,i}, \dots, e_{l,i})$ defined by $d_i \times \prod_{j=1}^{l} b_j^{e_{j,i}}$ in $B$ (index $i$ may be omitted when the context is clear)

• $Q, P$ curve points and $Q = [k]P$ scalar multiplication


Very Simple MBNS Unsigned Recoding Algorithm

Transforms k into a list of terms (LT) in Horner form

   1: LT ← ∅
   2: while k > 1 do
   3:   if not(divisible(k, B)) then   (divisibility test)
   4:     d ← 1
   5:     k ← k − 1
   6:   else
   7:     d ← 0
   8:   for j from 1 to l do
   9:     e_j ← 0
  10:     while k ≡ 0 mod b_j do   (divisibility test)
  11:       e_j ← e_j + 1
  12:       k ← k / b_j   (exact division)
  13:   LT ← LT ∪ (d, e_1, e_2, ..., e_l)
  14: return LT

Remark: divisibility tests at lines 3 and 10 are shared

Very Simple MBNS Scalar Multiplication Algorithm

• MBNS recoding works in a serial way starting with most significant

• each term can be immediately used in the scalar multiplication: recoded terms are processed and used on-the-fly

• multi-base adaptation of standard left-to-right scalar multiplication ([19, Sec. 3.3.1])

  Q ← O
  foreach t in LT do   (t = (d, e_1, e_2, ..., e_l))
    Q ← Q + d × P   (d ∈ {0, 1} ⇒ NOP/ADD)
    for j from 1 to l do
      P ← [b_j^{e_j}]P   (DBL, TPL, QPL, ...)
  Q ← Q + P
  return Q

Remark 1: recoding and curve-level operations are overlapped
Remark 2: P is modified over time, so we cannot use mADD (time penalty)
Remark 3: d = 0 is only possible for the very first term
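The loop can be exercised on the terms of k = 87. In the sketch below, integers under addition stand in for curve points (so the result should equal k × P); the operation names follow the slides, while the function name and the stand-in group are assumptions.

```python
def mbns_scalar_mul(terms, bases, P):
    """[k]P from Horner-form terms; returns (Q, list of curve-level operations)."""
    names = {2: "DBL", 3: "TPL", 5: "QPL", 7: "SPL"}
    Q, ops = 0, []                  # 0 plays the role of the point O
    for d, exps in terms:
        if d:                       # d = 0 is a NOP
            Q += d * P
            ops.append("ADD")
        for b, e in zip(bases, exps):
            for _ in range(e):      # P <- [b^e]P as e successive multiplications by b
                P = b * P
                ops.append(names[b])
    Q += P
    ops.append("ADD")
    return Q, ops


# Terms of 87 = 0 + 3^1 * (1 + 2^2 * 7^1) in the multi-base (2, 3, 5, 7)
terms_87 = [(0, (0, 1, 0, 0)), (1, (2, 0, 0, 1))]
Q, ops = mbns_scalar_mul(terms_87, (2, 3, 5, 7), 1)
```

The resulting operation sequence is TPL, ADD, DBL, DBL, SPL, ADD, with P updated in place as noted in Remark 2.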


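Implementation of Divisibility Tests (1/2)

We use Pascal's tapes, [33] (published in 1819), [35]; values are $2^i \bmod b_j$:

| $b_j$ \ $i$ | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 2 | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 2 | 1 |
| 5 | 3 | 4 | 2 | 1 | 3 | 4 | 2 | 1 | 3 | 4 | 2 | 1 |
| 7 | 4 | 2 | 1 | 4 | 2 | 1 | 4 | 2 | 1 | 4 | 2 | 1 |

For $b_j = 3$, the periodic sequence is $(2\ 1)^*$:

$k \bmod 3 = (\dots + 2^3 k_3 + 2^2 k_2 + 2^1 k_1 + k_0) \bmod 3$
$\phantom{k \bmod 3} = (\dots + 2 k_3 + k_2 + 2 k_1 + k_0) \bmod 3$
$\phantom{k \bmod 3} = \Big( \underbrace{\sum_i (2 k_{2i+1} + k_{2i})}_{\alpha} \Big) \bmod 3$

For $b_j = 5$, the periodic sequence is $(3\ 4\ 2\ 1)^*$

• use $3 = 1 + 2$: unsigned sum with additional inputs (FPGA)
• use $3 \equiv -2 \bmod 5$: signed sum with fewer operands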
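The tapes and the weighted bit sum can be reproduced as follows. This is only a software sketch of the principle (no operand sharing or signed-weight trick as in the FPGA unit), and the helper names are assumptions.

```python
def tape(b, length):
    """Weights 2**i mod b for i = 0 .. length-1 (a periodic sequence)."""
    return [pow(2, i, b) for i in range(length)]


def mod_by_tape(k, b, n_bits):
    """k mod b computed from a weighted sum of the bits of k."""
    weights = tape(b, n_bits)
    alpha = sum(w * ((k >> i) & 1) for i, w in enumerate(weights))
    return alpha % b      # alpha is small, so this final reduction is cheap


def divisible_by(k, b, n_bits):
    return mod_by_tape(k, b, n_bits) == 0
```

Reading the table rows from right to left (i = 0 first) gives `tape(3, 4) == [1, 2, 1, 2]`, `tape(5, 4) == [1, 2, 4, 3]` and `tape(7, 3) == [1, 2, 4]`.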


Implementation of Divisibility Tests (2/2)

To avoid complex decoding, we use w = lcm(2, 4, 3) = 12 and w = 24

[Figure: divisibility test unit: k memory (t words of w bits) read word by word as $k^{(i)}$ under CTRL (index i); for each $b_j \in \{3, 5, 7\}$, a sum over the w bits, a register R and a 1-bit "divisible by $b_j$" output.]

FPGA results for n = 160 (XC5VLX50T, ISE 12.4, std efforts S/P&R):

| w | area: slices (FF/LUT) | freq. (MHz) | clock cycles |
|---|---|---|---|
| 12 | 25 (40/81) | 543 | t + 3 |
| 24 | 67 (53/152) | 549 | t + 4 |


Implementation of Exact Division by bj Elements (1/2)

Exact division k/bj : we know that k is divisible by bj

Algorithm from [21] (LSWF), optimized for FPGA and $b_j \in \{3, 5, 7\}$:

  c ← 0
  for i from 0 to t − 1 do
    r ← k(i) − c
    r ← r × (b_j^{−1} mod 2^w)
    c ← 0
    for h from 1 to b_j − 1 do
      if r ≥ h × ⌈(2^w − 1)/b_j⌉ then
        c ← c + 1
    k(i) ← (r_{w−1} ··· r_0)
  return k

| $b_j$ | $b_j^{-1} \bmod 2^{12}$ | γ | $b_j^{-1} \bmod 2^{24}$ | γ |
|---|---|---|---|---|
| 3 | $(101010101011)_2$ | 3 | $(101010101010101010101011)_2$ | 4 |
| 5 | $(110011001101)_2$ | 3 | $(110011001100110011001101)_2$ | 4 |
| 7 | $(110110110111)_2$ | 3 | $(110110110110110110110111)_2$ | 4 |

We use our multiplication by constant algorithm [8]
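The word-serial principle (least significant word first, multiplication by $b_j^{-1} \bmod 2^w$) can be sketched in software. The carry below is computed with a plain product and an explicit borrow, which is an assumption chosen for clarity and not necessarily the comparison scheme of [21]; the helper names are illustrative.

```python
def to_words(k, w, t):
    """Split k into t words of w bits, least significant word first."""
    return [(k >> (w * i)) & ((1 << w) - 1) for i in range(t)]


def from_words(words, w):
    return sum(x << (w * i) for i, x in enumerate(words))


def exact_div_words(words, b, w):
    """Exact division by an odd b, assuming the input value is a multiple of b."""
    W = 1 << w
    inv = pow(b, -1, W)             # b^-1 mod 2^w (Python 3.8+)
    c, out = 0, []
    for ki in words:
        s = ki - c
        borrow = 1 if s < 0 else 0
        q = (s % W) * inv % W       # quotient word
        out.append(q)
        c = (q * b >> w) + borrow   # amount to remove from the next word
    return out


INVERSES_12 = {b: pow(b, -1, 1 << 12) for b in (3, 5, 7)}
```

`INVERSES_12` reproduces the 12-bit constants of the table: 2731, 3277 and 3511, i.e. $(101010101011)_2$, $(110011001101)_2$ and $(110110110111)_2$.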

Implementation of Exact Division by bj Elements (2/2)

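[Figure: exact division unit: k memory (t words of w bits, read port R and write port W) addressed by CTRL (index i); for each $b_j \in \{3, 5, 7\}$, a w-bit constant multiplier, a ± sequencer ("± seq.") and a comparator ("cmp."); MUX1 and MUX2 select the path for the chosen $b_j$ and the carry c.]

FPGA results for n = 160 (XC5VLX50T, ISE 12.4, std efforts S/P&R):

| w | area: slices (FF/LUT) | freq. (MHz) | clock cycles |
|---|---|---|---|
| 12 | 59 (138/171) | 291 | t + 4 |
| 24 | 152 (441/448) | 202 | t + 5 |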


Unsigned Multiple-Base Recoding Unit

[Figure: recoding unit: k memory (t words of w bits, read/write) with CTRL, a DTD-2 block, the exact division unit for 3, 5, 7, the divisibility test unit for 3, 5, 7, a MUX and a global control.]

FPGA results for n = 160 and B = (2, 3, 5, 7) (XC5VLX50T, ISE 12.4, std efforts S/P&R):

| w | area: slices (FF/LUT) | freq. (MHz) |
|---|---|---|
| 12 | 153 (301/412) | 232 |
| 24 | 323 (682/908) | 202 |

Remark: DTD-2 is the divisibility test and division by $2^{1 \dots v}$ with $v \le w$


Example

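$87 = 0 + 3^1 \times (1 + 2^2 7^1)$

Timeline (recoding steps overlapped with curve-level operations):

| recoding step (DT and result, division) | k after step | CLO | P | Q |
|---|---|---|---|---|
| (start) | 87 | | P | O |
| DT: 3, then /3 | 29 | TPL | 3P | O |
| DT: ∅, then −1 | 28 | ADD | 3P | 3P |
| DT: $2^2$, 7, then $/2^2$ | 7 | DBL, DBL | 12P | 3P |
| /7 | 1 | SPL | 84P | 3P |
| DT: ∅, then −1 | 0 | ADD | 84P | 87P |

Notations:

• "CLO" denotes curve-level operations

• DT denotes divisibility test, "res." their results

• "/$b_j$" exact division by $b_j$

Remark: very short latency at the very beginning (< 0.01 % of $[k]P$ for n = 160 and even less for larger fields)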


Signed Digits Version: $d \in \{0, \pm 1\}$

Add a selection function S in the recoding algorithm:

unsigned version:
  4: d ← 1
  5: k ← k − 1

signed version:
  4: d ← S(k)
  5: k ← k − d

We compared 4 heuristic selection functions:

• min [Figure: when k is not divisible, k − 1 (d = +1) and k + 1 (d = −1) both go through divisibility tests & exact divisions, giving k′ and k″; S then outputs d]

• approx: approximated minimum value selection function

• max nb div: maximum number of divisors selection function

• min2: 2 steps minimum value selection function
  1) $(k - 1, k + 1) \xrightarrow{\min} (k', k'')$
  2) $(k' - 1, k' + 1, k'' - 1, k'' + 1) \xrightarrow{\min2} (\zeta', \zeta'', \zeta''', \zeta'''')$


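approx Selection Function

[Figure: selection function S: when k is not divisible, k − 1 (d = +1) and k + 1 (d = −1) go through divisibility tests & exact divisions, giving k′ and k″; S outputs d.]

Computing (k′, k″) is expensive, so we try to get an approximation

$k' \approx \delta' = \underbrace{\lfloor \log_2(k - 1) \rfloor + 1}_{\text{MSB position of } k-1} - \sum_{j=1}^{l} e'_j \log_2(b_j)$

$k'' \approx \delta'' = \underbrace{\lfloor \log_2(k + 1) \rfloor + 1}_{\text{MSB position of } k+1} - \sum_{j=1}^{l} e''_j \log_2(b_j)$

1) Exponents $e'_j$ and $e''_j$ are produced by the divisibility tests

2) Approximate constants: $\log_2 3 \approx 1.59$, $\log_2 5 \approx 2.32$, and $\log_2 7 \approx 2.81$

$\delta' = \mathrm{MSB}(k - 1) - e_{b_1 = 2} - 1.5\, e_{b_2 = 3} - 2.25\, e_{b_3 = 5} - 2.75\, e_{b_4 = 7}$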
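The approximation can be sketched with the constants 1, 1.5, 2.25 and 2.75 of the last formula. How ties between δ′ and δ″ are resolved is not stated on the slide; sending them to d = +1 is an assumption of this sketch, as are the function names.

```python
BASES = (2, 3, 5, 7)
LOG2_APPROX = (1.0, 1.5, 2.25, 2.75)    # hardware-friendly constants from the slide


def exponents(k, bases=BASES):
    """Exponents e_j found by repeated exact division of k >= 1."""
    exps = []
    for b in bases:
        e = 0
        while k % b == 0:
            e, k = e + 1, k // b
        exps.append(e)
    return exps


def delta(k):
    """MSB position of k minus the approximated log2 of its base factors."""
    return k.bit_length() - sum(e * c for e, c in zip(exponents(k), LOG2_APPROX))


def select_approx(k):
    """Return d in {+1, -1} for k >= 2; ties go to d = +1 (assumption)."""
    return 1 if delta(k - 1) <= delta(k + 1) else -1
```

For k = 23, the candidates are 22 = 2 × 11 (δ′ = 4) and 24 = 2³ × 3 (δ″ = 0.5), so the sketch selects d = −1 and continues from k + 1 = 24.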


Comparison of Selection Functions

For curves over $\mathbb{F}_p$ with a = −3:

[Figure: computation time [M] (axis from 1600 to 2200) for the multi-bases (2,7), (2,5), (2,3), (2,5,7), (2,3,7), (2,3,5), (2,3,5,7), (2,3,5,7,11); series: unsigned, signed/max_nb_div, signed/min, signed/approx, signed/min2.]

Average computation time (in M) of 10 000 scalar multiplications with 160-bit values

Similar behavior for curves with a ≠ −3


Complete FPGA Implementation Results

Signed recoding unit with approx heuristic:

| w | area: slices (FF/LUT) | freq. (MHz) |
|---|---|---|
| 12 | 173 (326/466) | 232 |
| 24 | 345 (724/1 005) | 202 |

ECC processor (modification from [10], collab. UCC crypto group):

| version | memory type | area: slices (FF/LUT) | BRAM | freq. (MHz) |
|---|---|---|---|---|
| small | distributed | 2 204 (3 971/6 816) | 0 | 155 |
| small | BRAM | 1 793 (3 641/6 182) | 6 | 155 |
| large | distributed | 3 182 (4 668/7 361) | 0 | 142 |
| large | BRAM | 2 427 (4 297/6 981) | 6 | 142 |

• small: $\mathbb{F}_p$ curves, n = 160, Jacob. coord., NAF/MBNS, 1 unit/op.
• large: same with 4NAF/MBNS and 2 ±, 2 ×, 1 inv.


Timings Comparison

For n = 160 and a ≠ −3:

| refs. | methods | perfs | pre-comp. storage | pre-comp. operations | recoding |
|---|---|---|---|---|---|
| | dbl&add | 1 985.3M | ∅ | ∅ | ∅ |
| | NAF | 1 723.0M | ∅ | ∅ | on-the-fly & very cheap |
| | 3NAF | 1 583.7M | 1 pt | 49.4M | on-the-fly & very cheap |
| | 4NAF | 1 499.1M | 3 pts | 140.8M | on-the-fly & very cheap |
| [14] | DBNS | 1 863.0M | ∅ | ∅ | off-line & costly |
| [15] | DBNS | 1 722.3M | ∅ | ∅ | off-line & costly |
| [1] | DBNS | 1 558.4M | 7 pts | >150M | off-line & costly |
| [16] | DBNS | 1 615.3M | ∅ | ∅ | off-line & costly |
| our | (2, 3) MBNS | 1 746.2M | ∅ | ∅ | on-the-fly & small |
| our | (2, 3, 5) MBNS | 1 679.9M | ∅ | ∅ | on-the-fly & small |
| our | (2, 3, 5, 7) MBNS | 1 670.4M | ∅ | ∅ | on-the-fly & small |

For n = 160 and a = −3: about 15% slower than best DBNS/MBNS (theoretical) solutions


Addition Chains Recoding

Work presented at Compas 2015

[34] J. Proy, N. Veyrat-Charvillon, A. Tisserand, and N. Meloni. Full hardware implementation of short addition chains recoding for ECC scalar multiplication. In Actes Conférence d’informatique en Parallélisme, Architecture et Système (ComPAS), Lille, France, June 2015.

http://tel.archives-ouvertes.fr/hal-01171095 (PDF)


Addition Chain(s)

Definition

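An addition chain for $k \in \mathbb{N}$ is a sequence of integers $(a_0, a_1, \dots, a_l)$ satisfying: $a_0 = 1$, $a_l = k$, and $a_i = a_{i_1} + a_{i_2}$ for some $i_2 \le i_1 < i$

Definition

A Euclidean addition chain (EAC) satisfies the additional condition: for $i \ge 3$, $a_i = a_{i_1} + a_{i_2} \Rightarrow a_{i+1} = a_i + a_{i_1}$ or $a_{i+1} = a_i + a_{i_2}$

Example: for k = 14, possible EACs are

| time → | | | | | | | |
|---|---|---|---|---|---|---|---|
| chain | 1 | 2 | 3 | 4 | 5 | 9 | 14 |
| current terms | | 1 | 1 | 1 | 1 | 4 | 5 |
| steps | | | | 1 | 1 | 0 | 0 |

| time → | | | | | | | |
|---|---|---|---|---|---|---|---|
| chain | 1 | 2 | 3 | 5 | 8 | 11 | 14 |
| current terms | | 1 | 1 | 2 | 3 | 3 | 3 |
| steps | | | | 0 | 0 | 1 | 1 |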


Scalar Multiplication using EACs

EAC coding of $k \in \mathbb{N}$: $C = EAC(k) = (c_l c_{l-1} \dots c_1 c_0)$

• largest summand is added ⇒ $c_i = 0$ (big step)

• smallest summand is added ⇒ $c_i = 1$ (small step)

EAC based scalar multiplication algorithm:

EAC Scalar Multiplication: C = EAC(k), point P
  (U1, U2) ← (P, [2]P)
  for c_i ∈ C do
    if c_i = 0 then (U1, U2) ← (U2, U1 + U2)   (big step)
    else (U1, U2) ← (U1, U1 + U2)   (small step)
  return Q = U1 + U2   (Q = [k]P)

EAC: only ADD is used ⇒ natural protection against SPA (simple power analysis, see [26])
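The two chains given for k = 14 can be replayed with the algorithm above. In this sketch, integers under addition replace curve points and the steps are listed in processing order; the function name and the pluggable `add` argument are assumptions.

```python
def eac_scalar_mul(steps, P, add=lambda a, b: a + b):
    """EAC-based scalar multiplication; every loop iteration is exactly one ADD."""
    U1, U2 = P, add(P, P)               # (P, [2]P)
    for c in steps:
        if c == 0:
            U1, U2 = U2, add(U1, U2)    # big step
        else:
            U1, U2 = U1, add(U1, U2)    # small step
    return add(U1, U2)


q_first = eac_scalar_mul([1, 1, 0, 0], 1)    # chain 1 2 3 4 5 9 14
q_second = eac_scalar_mul([0, 0, 1, 1], 1)   # chain 1 2 3 5 8 11 14
```

Both step sequences give 14 × P with the same number of additions, and the operation performed does not depend on the value of c, only the operand selection does.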


State-of-the-Art Scalar Multiplication Algorithms

| Algorithm | SPA resilient |
|---|---|
| Double and Add (DA): basic algorithm | NO |
| Non-Adjacent Forms (NAF): recoding for faster schemes, see [19] | NO |
| Montgomery Ladder (ML): regular algorithm [28] | YES |
| Unified Formulas (UF): same formula for ADD and DBL [9] | YES |
| ... | |

Average cost per key/chain bit for several scalar multiplication methods:

| Method | DA | NAF-3 | NAF-4 | ML | UF | EAC |
|---|---|---|---|---|---|---|
| Source | [19] | [19] | [19] | [28] | [9] | [27] |
| Cost (m/bit) | 17 | 13.5 | 12.8 | 24 | 19 | 7 |

Warning: n = length(k) and l = length(C) are different!

EAC is efficient for l ≤ 2n


Short EAC Recoding Algorithm

Computing an EAC is easy but computing a short one is very hard!

Apply the subtractive version of Euclid’s algorithm to (k, g), where gcd(k, g) = 1 and g is well chosen (it strongly impacts l)

Random choice leads to $l \approx O(\ln(k)^2)$, which is too slow (see [28])

Random search process in a range of $2\varepsilon$ around $g = k/\varphi$ (with $\varphi = \frac{1 + \sqrt{5}}{2}$):

1. compute $k/\varphi$ (mult. by cst. $1/\varphi$)

2. get m, the minimal number of starting big steps (in a table)

3. in parallel:
  - start EAC scalar multiplication with m big steps
  - try all g in interval $[k/\varphi - \varepsilon,\ k/\varphi + \varepsilon]$ to find a short EAC of k

4. finish scalar multiplication using the shortest EAC found

(m: minimum number of big steps at the end of C)
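The search for a short chain can be sketched in software as follows. This version runs the subtractive Euclid steps backwards from the pair (k − g, g) and tries every g in the window sequentially; it ignores the table of starting big steps and the parallelism with the scalar multiplication, and all names are illustrative.

```python
from math import gcd, isqrt


def eac_recode(k, g):
    """Steps (in processing order) of the EAC ending with k = (k - g) + g, or None."""
    u1, u2 = k - g, g
    if not (0 < u1 < u2) or gcd(k, g) != 1:
        return None
    steps = []
    while (u1, u2) != (1, 2):
        d = u2 - u1                 # subtractive Euclid step
        if d < u1:
            steps.append(0)         # undo a big step
            u1, u2 = d, u1
        else:
            steps.append(1)         # undo a small step
            u2 = d
    return steps[::-1]


def shortest_eac(k, eps):
    """Shortest chain found for g in [k/phi - eps, k/phi + eps]."""
    g0 = (isqrt(5 * k * k) - k) // 2        # floor(k / phi)
    best = None
    for g in range(g0 - eps, g0 + eps + 1):
        steps = eac_recode(k, g)
        if steps is not None and (best is None or len(steps) < len(best)):
            best = steps
    return best


def eac_value(steps):
    """Scalar reached by a step sequence, starting from (1, 2)."""
    u1, u2 = 1, 2
    for c in steps:
        u1, u2 = (u2, u1 + u2) if c == 0 else (u1, u1 + u2)
    return u1 + u2
```

For k = 14, the choices g = 9 and g = 11 reproduce the two example chains (steps 1 1 0 0 and 0 0 1 1).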


Statistical Analysis of EAC Recodings

Simulations for measuring $l/n$ for $g \in [k/\varphi - \varepsilon,\ k/\varphi + \varepsilon]$ and various $\varepsilon$ values (averaged over 1000 random scalars)

[Figure: top panels: len(EAC) [bits] vs. scalar length (200–400 bits) for ε ≤ 10, ε ≤ 100 and ε ≤ 1000, each compared with y = 2x; bottom panel: len(EAC) / len(k) (axis from 1.8 to 2.3) vs. scalar length [bits] (200–500) for ε ≤ 10, 50, 100, 200, 500, 1000, with the "speed up zone" marked.]


Proposed EAC Recoding Unit

[Figure: EAC recoding unit: BRAM memory holding 1/φ, k, k/φ, a, b, C, C′ (address offsets generated by CTRL, read and write ports); Euclid datapath operating on words $a^{(j)}$ and $b^{(j)}$ (differences $a^{(j)} − b^{(j)}$ and $b^{(j)} − a^{(j)}$, carry outputs, LSBs); computation of C through a SIPO; computation of k/φ; legend: address, control signals, scalar word, digit, w-bit data word.]

w = 32 bits in this work

FPGA Implementation: Recoding Unit

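| FPGA | Rec. meth. | n | BRAM | optimization: area, slices (FF/LUT) | freq. (MHz) | optimization: speed, slices (FF/LUT) | freq. (MHz) |
|---|---|---|---|---|---|---|---|
| Spartan 6 | EAC | 160 | 1 | 209 (636/441) | 151 | 211 (662/476) | 151 |
| Spartan 6 | EAC | 192 | 1 | 195 (641/441) | 160 | 223 (634/476) | 157 |
| Spartan 6 | EAC | 256 | 1 | 203 (636/441) | 155 | 214 (662/476) | 154 |
| Spartan 6 | EAC | 384 | 1 | 228 (652/441) | 154 | 215 (682/476) | 159 |
| Spartan 6 | Bin. & ML | 160 | 1 | 26 (70/101) | 466 | 73 (194/237) | 388 |
| Spartan 6 | Bin. & ML | 192 | 1 | 26 (70/101) | 466 | 75 (231/270) | 387 |
| Spartan 6 | Bin. & ML | 256 | 1 | 26 (70/101) | 466 | 94 (300/336) | 377 |
| Spartan 6 | Bin. & ML | 384 | 1 | 26 (70/101) | 466 | 128 (446/475) | 379 |
| Spartan 6 | NAF-3 | 160 | 1 | 35 (104/122) | 328 | 34 (108/130) | 382 |
| Spartan 6 | NAF-3 | 192 | 1 | 35 (104/122) | 328 | 34 (108/130) | 382 |
| Spartan 6 | NAF-3 | 256 | 1 | 35 (104/122) | 322 | 34 (108/130) | 364 |
| Spartan 6 | NAF-3 | 384 | 1 | 42 (120/123) | 248 | 43 (123/131) | 332 |
| Spartan 6 | NAF-4 | 160 | 1 | 40 (109/125) | 333 | 36 (113/134) | 388 |
| Spartan 6 | NAF-4 | 192 | 1 | 40 (109/125) | 333 | 36 (113/134) | 388 |
| Spartan 6 | NAF-4 | 256 | 1 | 40 (109/125) | 333 | 36 (113/134) | 365 |
| Spartan 6 | NAF-4 | 384 | 1 | 45 (129/126) | 236 | 43 (132/135) | 320 |
| Virtex 5 | EAC | 160 | 1 | 278 (711/509) | 198 | 276 (709/476) | 206 |
| Virtex 5 | NAF-4 | 160 | 1 | 60 (148/161) | 418 | 61 (147/157) | 413 |
| Virtex 5 | MBNS [11] | 160 | 1 | 153 (301/412) | 232 | 323 (682/908) | 202 |

MBNS [11]: n/12 + 4 clock cycles (first configuration) and n/24 + 4 clock cycles (second configuration)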


FPGA Implementation: Complete Crypto-Processor

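[Figure: crypto-processor: arithmetic units AU1, AU2, AU3, points memory, CTRL, program memory (21-bit instructions), recoding unit producing digits $k_i$ from the scalar k; legend: instruction, address, control signals, scalar word, digit, w-bit data word.]

• n = 192 bits

• Spartan-6 FPGA

• Small configuration (1 mult., 1 add., 1 inv.)

| recoding method | BRAM | optim. target | area: slices (FF/LUT) | freq. (MHz) | dura. (ms) | SCA prot. |
|---|---|---|---|---|---|---|
| EAC | 3 | area | 534 (1813/1508) | 132 | 35.8 | Y |
| EAC | 3 | speed | 556 (1872/1523) | 137 | 34.5 | Y |
| DA | 2 | area | 429 (1243/1134) | 191 | 30 | N |
| DA | 2 | speed | 399 (1302/1222) | 177 | 32.5 | N |
| ML | 2 | area | 429 (1243/1134) | 191 | 42.5 | Y |
| ML | 2 | speed | 399 (1302/1222) | 177 | 45.8 | Y |
| UF | 2 | area | 429 (1243/1134) | 191 | 50.4 | Y |
| UF | 2 | speed | 399 (1302/1222) | 177 | 54.4 | Y |
| NAF-3 | 2 | area | 422 (1280/1157) | 181 | 25.2 | N |
| NAF-3 | 2 | speed | 423 (1321/1242) | 175 | 26.1 | N |
| NAF-4 | 2 | area | 420 (1277/1161) | 158 | 27.3 | N |
| NAF-4 | 2 | speed | 425 (1233/1246) | 177 | 24.4 | N |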

Arnaud Tisserand. CNRS – Lab-STICC. Arithmetic Tradeoffs for Hardware Asymmetric Cryptography 43/53
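To compare the recoding methods independently of clock frequency, the table rows can be converted to approximate clock-cycle counts. This is a small Python sketch, assuming that "dura." is the time of one scalar multiplication run at the reported frequency (the slide does not state this explicitly). The names `ROWS` and `clock_cycles` are hypothetical.

```python
# (recoding method, optimization target, frequency in MHz, duration in ms),
# copied from the crypto-processor table (n = 192 bits, Spartan-6).
ROWS = [
    ("EAC", "area", 132, 35.8), ("EAC", "speed", 137, 34.5),
    ("DA", "area", 191, 30.0), ("DA", "speed", 177, 32.5),
    ("ML", "area", 191, 42.5), ("ML", "speed", 177, 45.8),
    ("UF", "area", 191, 50.4), ("UF", "speed", 177, 54.4),
    ("NAF-3", "area", 181, 25.2), ("NAF-3", "speed", 175, 26.1),
    ("NAF-4", "area", 158, 27.3), ("NAF-4", "speed", 177, 24.4),
]


def clock_cycles(freq_mhz: float, duration_ms: float) -> int:
    """cycles = f [Hz] * t [s] = (f_MHz * 1e6) * (t_ms * 1e-3)."""
    return round(freq_mhz * duration_ms * 1000)


if __name__ == "__main__":
    for method, target, freq, dura in ROWS:
        print(f"{method:6s} {target:5s} {clock_cycles(freq, dura):>9d} cycles")
```

Under that assumption, the cycle count separates the algorithmic cost of a recoding from the frequency reached by a given synthesis target.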

Hardware Implementation of RNS for ECC (1/2)

RNS: Residue Number System

• Base $B = (m_1, m_2, \ldots, m_k)$ of $k$ relatively prime moduli

• Size of the base: $k$

$A = \{a_1, a_2, \ldots, a_k\}$, $\forall i,\ a_i = A \bmod m_i$

Operations:

$A \pm B = (|a_1 \pm b_1|_{m_1}, \ldots, |a_k \pm b_k|_{m_k})$

$A \times B = (|a_1 \times b_1|_{m_1}, \ldots, |a_k \times b_k|_{m_k})$
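The channel-wise operations above can be sketched in a few lines of Python. This is an illustrative sketch only: the base (13, 15, 16, 17) and all function names are assumptions chosen for the example, far smaller than the moduli used for ECC, and the conversion back to an integer uses the textbook Chinese remainder theorem rather than any hardware base-extension algorithm. The moduli must be pairwise relatively prime for the conversion to be well defined.

```python
from itertools import combinations
from math import gcd, prod
from typing import Sequence, Tuple

Residues = Tuple[int, ...]


def is_valid_base(base: Sequence[int]) -> bool:
    """True if all moduli are > 1 and pairwise relatively prime."""
    return all(m > 1 for m in base) and all(
        gcd(m1, m2) == 1 for m1, m2 in combinations(base, 2)
    )


def to_rns(x: int, base: Sequence[int]) -> Residues:
    """a_i = x mod m_i for every modulus of the base."""
    return tuple(x % m for m in base)


def rns_add(a: Residues, b: Residues, base: Sequence[int]) -> Residues:
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, base))


def rns_sub(a: Residues, b: Residues, base: Sequence[int]) -> Residues:
    return tuple((ai - bi) % m for ai, bi, m in zip(a, b, base))


def rns_mul(a: Residues, b: Residues, base: Sequence[int]) -> Residues:
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, base))


def from_rns(a: Residues, base: Sequence[int]) -> int:
    """Chinese remainder theorem: unique result in [0, M), M = prod(m_i)."""
    big_m = prod(base)
    x = 0
    for ai, m in zip(a, base):
        mi = big_m // m
        x += ai * mi * pow(mi, -1, m)  # pow(.., -1, m): modular inverse
    return x % big_m


if __name__ == "__main__":
    base = (13, 15, 16, 17)  # example base, M = 53040
    assert is_valid_base(base)
    a, b = to_rns(1234, base), to_rns(37, base)
    assert from_rns(rns_mul(a, b, base), base) == 1234 * 37
    assert from_rns(rns_add(a, b, base), base) == 1234 + 37
```

No carry propagates between channels, which is why each residue can be processed by an independent w-bit unit in hardware; results are only meaningful while they stay below the product of the moduli.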


Hardware Implementation of RNS for ECC (2/2)

[Figure: RNS architecture, labels only. Rower 1, Rower 2, ..., Rower n on w-bit channels 1 to n; Cox; registers; I/O; CTRL (30-state FSM, shared). Rower detail: Arithmetic Unit (6 pipeline stages), local reg., IN and OUT on w bits, cmp, control signals {@, en, r/w} and {rst, mode, ...}; memories: precomp. mult. ≈ 2n × w (@1), precomp. r_i (×2) with ⌈log2 r_i⌉ (@2), precomp. add. 38 × w (@3).]

Optimized algorithms and implementations for Fp operations:

• modular inversion: 10× speedup [3, 6]

• modular multiplication: 12× area-delay cost [5]

• operations patterns [4]

• PhD thesis Karim Bigou [2]

• hybrid position-residues (HPR) representation: 12× area-delay cost [7]


PAVOIS Integrated Circuit

ECC 256 bits, 65 nm CMOS, 1.5 mm²

Our Long Term Objectives

Study the links between:

• cryptosystems

• arithmetic algorithms

• representations of Fq elements and curve points

• architectures & units

• circuit optimizations

to ensure

• high security against
  - theoretical attacks
  - physical attacks

• low design cost

• low silicon cost

• low energy(/power)

• high performances

• high flexibility

area      1    1 + a
delay     1    1 + t
energy    1    1 + e
security  1    ×10, ×100

a, t, e ∈ {0%, 5%, 10%, ..., 100%}

References I

[1] D. J. Bernstein, P. Birkner, T. Lange, and C. Peters. Optimizing double-base elliptic-curve single-scalar multiplication. In Proc. 8th International Conference on Progress in Cryptology (INDOCRYPT), volume 4859 of LNCS, pages 167–182, Chennai, India, December 2007. Springer.

[2] K. Bigou. Étude théorique et implantation matérielle d’unités de calcul en représentation modulaire des nombres pour la cryptographie sur courbes elliptiques. PhD thesis, University Rennes 1, Lannion, France, November 2014.

[3] K. Bigou and A. Tisserand. Improving modular inversion in RNS using the plus-minus method. In G. Bertoni and J.-S. Coron, editors, Proc. 15th International Workshop on Cryptographic Hardware and Embedded Systems (CHES), volume 8086 of LNCS, pages 233–249, Santa Barbara, CA, USA, August 2013. Springer.

[4] K. Bigou and A. Tisserand. RNS modular multiplication through reduced base extensions. In H. Fu and D. Thomas, editors, Proc. 25th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP), pages 57–62, Zurich, Switzerland, June 2014. IEEE.

[5] K. Bigou and A. Tisserand. Single base modular multiplication for efficient hardware RNS implementations of ECC. In T. Guneysu and H. Handschuh, editors, Proc. 17th International Workshop on Cryptographic Hardware and Embedded Systems (CHES), volume 9293 of LNCS, pages 123–140, Saint-Malo, France, September 2015. Springer.

[6] K. Bigou and A. Tisserand. Binary-ternary plus-minus modular inversion in RNS. IEEE Transactions on Computers, 65(11):3495–3501, November 2016.

[7] K. Bigou and A. Tisserand. Hybrid position-residues number system. In J. Hormigo, S. Oberman, and N. Revol, editors, Proc. 23rd Symposium on Computer Arithmetic (ARITH), pages 126–133, Santa Clara, CA, U.S.A, July 2016. IEEE Computer Society.


References II

[8] N. Boullis and A. Tisserand. Some optimizations of hardware multiplication by constant matrices. IEEE Transactions on Computers, 54(10):1271–1282, October 2005.

[9] E. Brier, M. Joye, and I. Dechene. Embedded Cryptographic Hardware, chapter Unified Point Addition Formulae for Elliptic Curve Cryptosystems, pages 247–256. Nova Science, 2004.

[10] A. Byrne, E. Popovici, and W. P. Marnane. Versatile processor for GF(p^m) arithmetic for use in cryptographic applications. IET Computers & Digital Techniques, 2(4):253–264, July 2008.

[11] T. Chabrier and A. Tisserand. On-the-fly multi-base recoding for ECC scalar multiplication without pre-computations. In A. Nannarelli, P.-M. Seidel, and P. T. P. Tang, editors, Proc. 21st Symposium on Computer Arithmetic (ARITH), pages 219–228, Austin, TX, U.S.A, April 2013. IEEE Computer Society.

[12] J. Chen, A. Tisserand, E. M. Popovici, and S. Cotofana. Robust sub-powered asynchronous logic. In J. Becker and M. R. Adrover, editors, Proc. 24th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), pages 1–7, Palma de Mallorca, Spain, September 2014. IEEE.

[13] J. Chen, A. Tisserand, E. M. Popovici, and S. Cotofana. Asynchronous charge sharing power consistent Montgomery multiplier. In J. Sparso and E. Yahya, editors, Proc. 21st IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), pages 132–138, Mountain View, California, USA, May 2015.

[14] V. Dimitrov, L. Imbert, and P. K. Mishra. Efficient and secure elliptic curve point multiplication using double-base chains. In Proc. 11th International Conference on the Theory and Application of Cryptology and Information Security (ASIACRYPT), volume 3788 of LNCS, pages 59–78, Chennai, India, December 2005. Springer.

References III

[15] V. Dimitrov, L. Imbert, and P. K. Mishra. The double-base number system and its application to elliptic curve cryptography. Mathematics of Computation, 77(262):1075–1104, April 2008.

[16] C. Doche and L. Imbert. Extended double-base number system with applications to elliptic curve cryptography. In Proc. 7th International Conference on Progress in Cryptology (INDOCRYPT), volume 4329 of LNCS, pages 335–348, Kolkata, India, December 2006. Springer.

[17] G. Gallin and A. Tisserand. Hardware architectures for HECC. 15th International Workshop on Cryptographic Architectures Embedded in Reconfigurable Devices (CryptArchi), June 2017. Smolenice, Slovakia.

[18] P. Giorgi, L. Imbert, and T. Izard. Optimizing elliptic curve scalar multiplication for small scalars. In Proc. Mathematics for Signal and Information Processing, volume 7444, pages 74440N:1–10, San Diego, CA, USA, September 2009. SPIE.

[19] D. Hankerson, A. Menezes, and S. Vanstone. Guide to Elliptic Curve Cryptography. Springer, 2004.

[20] K. Itoh, M. Takenaka, N. Torii, S. Temma, and Y. Kurihara. Fast implementation of public-key cryptography on a DSP TMS320C6201. In Proc. Cryptographic Hardware and Embedded Systems (CHES), volume 1717 of LNCS, pages 61–72, Worcester, MA, USA, August 1999. Springer.

[21] T. Jebelean. An algorithm for exact division. Journal of Symbolic Computation, 15(2):169–180, February 1993.

References IV

[22] P. Longa and C. Gebotys. Setting speed records with the (fractional) multibase non-adjacent form method for efficient elliptic curve scalar multiplication. Technical Report 118, Cryptology ePrint Archive, 2008.

[23] P. Longa and C. Gebotys. Fast multibase methods and other several optimizations for elliptic curve scalar multiplication. In Proc. Public Key Cryptography (PKC), volume 5443 of LNCS, pages 443–462, 2009.

[24] P. Longa and A. Miri. New multibase non-adjacent form scalar multiplication and its application to elliptic curve cryptosystems. Technical Report 52, Cryptology ePrint Archive, 2008.

[25] Y. Ma, Z. Liu, W. Pan, and J. Jing. A high-speed elliptic curve cryptographic processor for generic curves over GF(p). In Proc. 20th International Workshop on Selected Areas in Cryptography (SAC), volume 8282 of LNCS, pages 421–437, Burnaby, BC, Canada, August 2013. Springer.

[26] S. Mangard, E. Oswald, and T. Popp. Power Analysis Attacks: Revealing the Secrets of Smart Cards. Springer, 2007.

[27] N. Meloni. New point addition formulae for ECC applications. In Proc. 1st International Workshop on Arithmetic of Finite Fields (WAIFI), volume 4547 of LNCS, pages 189–201, Madrid, Spain, June 2007. Springer.

[28] P. L. Montgomery. Speeding the Pollard and elliptic curves methods of factorisation. Mathematics of Computation, 48(177):243–264, January 1987.

[29] D. Pamula. Arithmetic Operators on GF(2^m) for Cryptographic Applications: Performance - Power Consumption - Security Tradeoffs. PhD thesis, University of Rennes 1 and Silesian University of Technology, December 2012.

References V

[30] D. Pamula, E. Hrynkiewicz, and A. Tisserand. Analysis of GF(2^233) multipliers regarding elliptic curve cryptosystem applications. In 11th IFAC/IEEE International Conference on Programmable Devices and Embedded Systems (PDeS), pages 271–276, Brno, Czech Republic, May 2012.

[31] D. Pamula and A. Tisserand. GF(2^m) finite-field multipliers with reduced activity variations. In 4th International Workshop on the Arithmetic of Finite Fields, volume 7369 of LNCS, pages 152–167, Bochum, Germany, July 2012. Springer.

[32] D. Pamula and A. Tisserand. Fast and secure finite field multipliers. In Proc. 18th Euromicro Conference on Digital System Design (DSD), pages 653–660, Madeira, Portugal, August 2015.

[33] B. Pascal. Œuvres complètes, volume 5, chapter De Numeribus Multiplicibus, pages 117–128. Librairie Lefèvre, 1819.

[34] J. Proy, N. Veyrat-Charvillon, A. Tisserand, and N. Meloni. Full hardware implementation of short addition chains recoding for ECC scalar multiplication. In Actes Conférence d’informatique en Parallélisme, Architecture et Système (ComPAS), Lille, France, June 2015.

[35] J. Sakarovitch. Elements of Automata Theory, chapter Prologue: M. Pascal’s Division Machine, pages 1–6. Cambridge, 2009.


The end. Questions?

Contact:

• mailto:[email protected]

• http://www-labsticc.univ-ubs.fr/~tisseran

• CNRS, Lab-STICC Laboratory, University South Brittany (UBS), Centre de recherche C. Huygens, rue St Maude, BP 92116, 56321 Lorient cedex, France

Thank you
