Mixing Layers in Symmetric Crypto · MDSmatrices Codingtheoryhasmaximum distance separable (MDS)codes “ReachesSingletonbound” Given[ n, kd] codeoverF q,besterrorcorrectionwhen

Mixing Layers in Symmetric CryptoKo Stoffelen

Joint work with. . .

Thorsten Kranz, Gregor Leander, Ko Stoffelen, Friedrich Wiemer. ShorterLinear Straight-Line Programs for MDS Matrices. ToSC 2017 Issue 4.

Ko Stoffelen, Joan Daemen. Column Parity Mixers. ToSC 2018 Issue 1.

2/16

Diffusion

S

S

S

S

S

S

S

S

S

S

S

S

S

S

S

S

3/16

Diffusion in AES

4/16

Diffusion in AES

b0

b1

b2

b3

=

02 03 01 01

01 02 03 01

01 01 02 03

03 01 01 02

a0

a1

a2

a3

5/16

MDS matrices

• Coding theory has maximum distance separable (MDS) codes

• “Reaches Singleton bound”• Given [n, k, d ] code over Fq, best error correction when d = n− k + 1• For example, Reed-Solomon codes (DVD, Blu-ray)• Use this idea for diffusion in crypto!• An m × n matrix A over Fq is MDS when the set{

(x1, . . . , xn, (Ax)1, . . . , (Ax)m)|x ∈ Fnq}forms an MDS code

• MixColumns matrix is MDS

6/16

MDS matrices

• Coding theory has maximum distance separable (MDS) codes• “Reaches Singleton bound”

• Given [n, k, d ] code over Fq, best error correction when d = n− k + 1• For example, Reed-Solomon codes (DVD, Blu-ray)• Use this idea for diffusion in crypto!• An m × n matrix A over Fq is MDS when the set{



6/16

MDS matrices

• Coding theory has maximum distance separable (MDS) codes• “Reaches Singleton bound”• Given [n, k, d ] code over Fq, best error correction when d = n− k + 1

• For example, Reed-Solomon codes (DVD, Blu-ray)• Use this idea for diffusion in crypto!• An m × n matrix A over Fq is MDS when the set{



6/16

MDS matrices

• Coding theory has maximum distance separable (MDS) codes• “Reaches Singleton bound”• Given [n, k, d ] code over Fq, best error correction when d = n− k + 1• For example, Reed-Solomon codes (DVD, Blu-ray)

• Use this idea for diffusion in crypto!• An m × n matrix A over Fq is MDS when the set{



6/16

MDS matrices

• Coding theory has maximum distance separable (MDS) codes• “Reaches Singleton bound”• Given [n, k, d ] code over Fq, best error correction when d = n− k + 1• For example, Reed-Solomon codes (DVD, Blu-ray)• Use this idea for diffusion in crypto!

• An m × n matrix A over Fq is MDS when the set{(x1, . . . , xn, (Ax)1, . . . , (Ax)m)|x ∈ Fn

q}forms an MDS code


6/16

MDS matrices

• Coding theory has maximum distance separable (MDS) codes• “Reaches Singleton bound”• Given [n, k, d ] code over Fq, best error correction when d = n− k + 1• For example, Reed-Solomon codes (DVD, Blu-ray)• Use this idea for diffusion in crypto!• An m × n matrix A over Fq is MDS when the set{



6/16

MDS matrices

• Coding theory has maximum distance separable (MDS) codes• “Reaches Singleton bound”• Given [n, k, d ] code over Fq, best error correction when d = n− k + 1• For example, Reed-Solomon codes (DVD, Blu-ray)• Use this idea for diffusion in crypto!• An m × n matrix A over Fq is MDS when the set{



6/16

Lightweight MDS matrices

7/16


7/16


7/16


7/16


7/16


7/16


• Cauchy and Vandermonde matrices are MDS, but don’t exist for allparameters

• Add structure to reduce search space (Hadamard, circulant, Toeplitz,subfield)

• Have to test for MDS-ness, but the probability is higher than for arandom matrix

• XOR count of a: number of XORs to multiply a with an arbitrary b,a, b ∈ F2k

• Assumes ’summation’ has constant cost, local optimization• Core idea: an n × n matrix over F2k can be viewed as nk × nk matrix

over F2, do global optimization• Solving the shortest linear straight-line program (SLP) problem yields

the optimal number of XORs• Well-studied problem, known algorithms give better results• Remove common subexpressions, allow cancellation

8/16









8/16









8/16









8/16






• Assumes ’summation’ has constant cost, local optimization

• Core idea: an n × n matrix over F2k can be viewed as nk × nk matrixover F2, do global optimization

• Solving the shortest linear straight-line program (SLP) problem yieldsthe optimal number of XORs

• Well-studied problem, known algorithms give better results• Remove common subexpressions, allow cancellation

8/16







over F2, do global optimization

• Solving the shortest linear straight-line program (SLP) problem yieldsthe optimal number of XORs


8/16








the optimal number of XORs


8/16








the optimal number of XORs• Well-studied problem, known algorithms give better results

• Remove common subexpressions, allow cancellation

8/16









8/16

SLP example

1 1 0 0

1 1 1 0

1 1 1 1

0 1 1 1

x0

x1

x2

x3

⇒

x0 ⊕ x1

x0 ⊕ x1 ⊕ x2

x0 ⊕ x1 ⊕ x2 ⊕ x3

x1 ⊕ x2 ⊕ x3

v0 := x0 ⊕ x1v1 := v0 ⊕ x2v2 := v1 ⊕ x3v3 := v2 ⊕ x0

9/16

SLP example

1 1 0 0

1 1 1 0

1 1 1 1

0 1 1 1

x0

x1

x2

x3

⇒

x0 ⊕ x1

x0 ⊕ x1 ⊕ x2

x0 ⊕ x1 ⊕ x2 ⊕ x3

x1 ⊕ x2 ⊕ x3

v0 := x0 ⊕ x1

v1 := v0 ⊕ x2v2 := v1 ⊕ x3v3 := v2 ⊕ x0

9/16

SLP example

1 1 0 0

1 1 1 0

1 1 1 1

0 1 1 1

x0

x1

x2

x3

⇒

x0 ⊕ x1

x0 ⊕ x1 ⊕ x2

x0 ⊕ x1 ⊕ x2 ⊕ x3

x1 ⊕ x2 ⊕ x3

v0 := x0 ⊕ x1v1 := v0 ⊕ x2

v2 := v1 ⊕ x3v3 := v2 ⊕ x0

9/16

SLP example

1 1 0 0

1 1 1 0

1 1 1 1

0 1 1 1

x0

x1

x2

x3

⇒

x0 ⊕ x1

x0 ⊕ x1 ⊕ x2

x0 ⊕ x1 ⊕ x2 ⊕ x3

x1 ⊕ x2 ⊕ x3

v0 := x0 ⊕ x1v1 := v0 ⊕ x2v2 := v1 ⊕ x3

v3 := v2 ⊕ x0

9/16

SLP example

1 1 0 0

1 1 1 0

1 1 1 1

0 1 1 1

x0

x1

x2

x3

⇒

x0 ⊕ x1

x0 ⊕ x1 ⊕ x2

x0 ⊕ x1 ⊕ x2 ⊕ x3

x1 ⊕ x2 ⊕ x3

v0 := x0 ⊕ x1v1 := v0 ⊕ x2v2 := v1 ⊕ x3v3 := v2 ⊕ x0

9/16

Alternative mixing layers

• Good diffusion over a few rounds is more important

• MDS requirement can be dropped when there are good bounds ontrails (attacks)

• PRIDE uses a near-MDS matrix, with a few more zeroes• Keccak-f uses a column parity mixer (CPM)

10/16


• Good diffusion over a few rounds is more important• MDS requirement can be dropped when there are good bounds on

trails (attacks)

• PRIDE uses a near-MDS matrix, with a few more zeroes• Keccak-f uses a column parity mixer (CPM)

10/16



trails (attacks)• PRIDE uses a near-MDS matrix, with a few more zeroes

• Keccak-f uses a column parity mixer (CPM)

10/16



trails (attacks)• PRIDE uses a near-MDS matrix, with a few more zeroes• Keccak-f uses a column parity mixer (CPM)

10/16

Column parity mixers

For an m × n matrix A:

θ(A) = A +

1m

f (A)

Z

θ fully defined by m and Z

Some algebraic properties:

• If m even, CPMs are involutions (as (1mm)2 = 0), commutative,

∼= (Zn2

2 ,+)• If m odd, CPMs invertible iff Z + I is invertible, non-commutative,∼= GL(n, 2) if Z + I non-singular

11/16



θ(A) = A +

1m

1TmA

Z




∼= (Zn2


11/16



θ(A) = A +

1m

1TmAZ




∼= (Zn2


11/16



θ(A) = A + 1m1TmAZ




∼= (Zn2


11/16



θ(A) = A +

1m

1mmAZ




∼= (Zn2


11/16



θ(A) = A +

1m

1mmAZ




∼= (Zn2


11/16



θ(A) = A +

1m

1mmAZ




∼= (Zn2


11/16



θ(A) = A +

1m

1mmAZ


Some algebraic properties:• If m even, CPMs are involutions (as (1m

m)2 = 0), commutative,∼= (Zn2

2 ,+)

• If m odd, CPMs invertible iff Z + I is invertible, non-commutative,∼= GL(n, 2) if Z + I non-singular

11/16



θ(A) = A +

1m

1mmAZ


Some algebraic properties:• If m even, CPMs are involutions (as (1m

m)2 = 0), commutative,∼= (Zn2


11/16

Differential cryptanalysis in a nutshell

• If Pr[∆0, . . . ,∆r ] is high,cipher is not random

• Leads to key recovery!• Many variants exist

t0⊕

t ′0 = ∆0

t1⊕

t ′1 = ∆1

t2⊕

t ′2 = ∆2

f f

f f

12/16



• Leads to key recovery!

• Many variants exist

t0⊕

t ′0 = ∆0

t1⊕

t ′1 = ∆1

t2⊕

t ′2 = ∆2

f f

f f

12/16



• Leads to key recovery!• Many variants exist

t0⊕

t ′0 = ∆0

t1⊕

t ′1 = ∆1

t2⊕

t ′2 = ∆2

f f

f f

12/16

Diffusion with a CPM

• Goal: no low-weight differential trail

• How about a state like this?

• CPM-kernel issues can be avoided by some transposition(ShiftRows-like)

• How about this one?

13/16


• Goal: no low-weight differential trail• How about a state like this?



13/16





13/16





13/16

How to build a permutation with a CPM

1. Determine design goals

2. Pick m, n, and cell width3. Pick ‘good’ ‘efficient’ non-linear S-box4. Consider (truncated) trails in the kernel (independent of Z )5. Determine ‘good’ transposition6. Consider (truncated) trails outside the kernel7. Determine ‘good’ Z8. Pick ‘good’ round constants to beat all kinds of invariant attacks9. Do more analysis

10. Determine the number of rounds11. Implement it12. Give it a name

14/16


1. Determine design goals2. Pick m, n, and cell width

3. Pick ‘good’ ‘efficient’ non-linear S-box4. Consider (truncated) trails in the kernel (independent of Z )5. Determine ‘good’ transposition6. Consider (truncated) trails outside the kernel7. Determine ‘good’ Z8. Pick ‘good’ round constants to beat all kinds of invariant attacks9. Do more analysis


14/16


1. Determine design goals2. Pick m, n, and cell width3. Pick ‘good’ ‘efficient’ non-linear S-box

4. Consider (truncated) trails in the kernel (independent of Z )5. Determine ‘good’ transposition6. Consider (truncated) trails outside the kernel7. Determine ‘good’ Z8. Pick ‘good’ round constants to beat all kinds of invariant attacks9. Do more analysis


14/16


1. Determine design goals2. Pick m, n, and cell width3. Pick ‘good’ ‘efficient’ non-linear S-box4. Consider (truncated) trails in the kernel (independent of Z )

5. Determine ‘good’ transposition6. Consider (truncated) trails outside the kernel7. Determine ‘good’ Z8. Pick ‘good’ round constants to beat all kinds of invariant attacks9. Do more analysis


14/16


1. Determine design goals2. Pick m, n, and cell width3. Pick ‘good’ ‘efficient’ non-linear S-box4. Consider (truncated) trails in the kernel (independent of Z )5. Determine ‘good’ transposition

6. Consider (truncated) trails outside the kernel7. Determine ‘good’ Z8. Pick ‘good’ round constants to beat all kinds of invariant attacks9. Do more analysis


14/16


1. Determine design goals2. Pick m, n, and cell width3. Pick ‘good’ ‘efficient’ non-linear S-box4. Consider (truncated) trails in the kernel (independent of Z )5. Determine ‘good’ transposition6. Consider (truncated) trails outside the kernel

7. Determine ‘good’ Z8. Pick ‘good’ round constants to beat all kinds of invariant attacks9. Do more analysis


14/16


1. Determine design goals2. Pick m, n, and cell width3. Pick ‘good’ ‘efficient’ non-linear S-box4. Consider (truncated) trails in the kernel (independent of Z )5. Determine ‘good’ transposition6. Consider (truncated) trails outside the kernel7. Determine ‘good’ Z

8. Pick ‘good’ round constants to beat all kinds of invariant attacks9. Do more analysis


14/16


1. Determine design goals2. Pick m, n, and cell width3. Pick ‘good’ ‘efficient’ non-linear S-box4. Consider (truncated) trails in the kernel (independent of Z )5. Determine ‘good’ transposition6. Consider (truncated) trails outside the kernel7. Determine ‘good’ Z8. Pick ‘good’ round constants to beat all kinds of invariant attacks

9. Do more analysis10. Determine the number of rounds11. Implement it12. Give it a name

14/16


1. Determine design goals2. Pick m, n, and cell width3. Pick ‘good’ ‘efficient’ non-linear S-box4. Consider (truncated) trails in the kernel (independent of Z )5. Determine ‘good’ transposition6. Consider (truncated) trails outside the kernel7. Determine ‘good’ Z8. Pick ‘good’ round constants to beat all kinds of invariant attacks9. Do more analysis


14/16



10. Determine the number of rounds

11. Implement it12. Give it a name

14/16



10. Determine the number of rounds11. Implement it

12. Give it a name

14/16




14/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation

• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]• π: rotate rows down• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}• ι: add 0xF3485763� i in round i to even cells of top row• SAC after 3 rounds, full diffusion after 5• In the kernel: ≥ 52 active cells after 4 rounds• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP

2−92

• Outside the kernel: ≥ 40 active cells after 4 rounds (linear), LP 2−80

• 36.69 cyc/byte on ARM Cortex-M3/M4

15/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3

• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]• π: rotate rows down• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}• ι: add 0xF3485763� i in round i to even cells of top row• SAC after 3 rounds, full diffusion after 5• In the kernel: ≥ 52 active cells after 4 rounds• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP

2−92



15/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]

• π: rotate rows down• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}• ι: add 0xF3485763� i in round i to even cells of top row• SAC after 3 rounds, full diffusion after 5• In the kernel: ≥ 52 active cells after 4 rounds• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP

2−92



15/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]• π: rotate rows down

• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}• ι: add 0xF3485763� i in round i to even cells of top row• SAC after 3 rounds, full diffusion after 5• In the kernel: ≥ 52 active cells after 4 rounds• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP

2−92



15/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]• π: rotate rows down• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}

• ι: add 0xF3485763� i in round i to even cells of top row• SAC after 3 rounds, full diffusion after 5• In the kernel: ≥ 52 active cells after 4 rounds• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP

2−92



15/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]• π: rotate rows down• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}• ι: add 0xF3485763� i in round i to even cells of top row

• SAC after 3 rounds, full diffusion after 5• In the kernel: ≥ 52 active cells after 4 rounds• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP

2−92



15/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]• π: rotate rows down• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}• ι: add 0xF3485763� i in round i to even cells of top row• SAC after 3 rounds, full diffusion after 5

• In the kernel: ≥ 52 active cells after 4 rounds• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP

2−92



15/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]• π: rotate rows down• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}• ι: add 0xF3485763� i in round i to even cells of top row• SAC after 3 rounds, full diffusion after 5• In the kernel: ≥ 52 active cells after 4 rounds

• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP2−92



15/16

Mixifer

• 16 rounds (ι ◦ ρ ◦ π ◦ θ ◦ γ), 4× 16× 4 = 256 bits permutation• γ: rotational symmetric, b0 = a1 + a2 + a0a2 + a1a2 + a1a2a3• θ: Z is circulant, first row [0, 1, 1, 0, 0, 1, 0, 0, 0, . . . , 0]• π: rotate rows down• ρ: rotate rows cell-wise to the right by {14, 3, 10, 0}• ι: add 0xF3485763� i in round i to even cells of top row• SAC after 3 rounds, full diffusion after 5• In the kernel: ≥ 52 active cells after 4 rounds• Outside the kernel: ≥ 46 active cells after 4 rounds (differential), DP

2−92



15/16

Mixifer


2−92



15/16

Mixifer


2−92



15/16

Thanks. . .

. . . for your attention

16/16

Documents

Mixing Layers in Symmetric Crypto · MDSmatrices Codingtheoryhasmaximum distance separable (MDS)codes “ReachesSingletonbound” Given[ n, kd] codeoverF q,besterrorcorrectionwhen