Stream ciphers 2
Session 2
Contents
• PN generators with LFSRs
• Statistical testing of PN generator sequences
• Cryptanalysis of stream ciphers
PN generators with LFSRs
• Computational complexity of the Berlekamp-Massey algorithm is quadratic in the length of the minimum LFSR capable of generating the intercepted sequence.
• Thus, if the linear complexity is very high, then the task of predicting the next bits of the sequence is too complex.
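The following is a minimal sketch of the Berlekamp-Massey algorithm over GF(2), included to make the notion of linear complexity concrete. The function name `berlekamp_massey` and the list-of-bits input format are illustrative assumptions, not part of the slides.

```python
def berlekamp_massey(bits):
    """Return the linear complexity of a binary sequence given as a list of 0/1."""
    c = [1] + [0] * len(bits)   # current connection polynomial C(D)
    b = c[:]                    # connection polynomial before the last length change
    L, m = 0, -1                # current LFSR length, index of the last length change
    for i, bit in enumerate(bits):
        # Discrepancy between the observed bit and the bit the current LFSR predicts.
        d = bit
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(len(c) - shift):
                c[j + shift] ^= b[j]      # C(D) <- C(D) + B(D) * D^shift
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, previous
    return L


# Usage: a sequence produced by the recurrence s[t+4] = s[t+1] XOR s[t].
s = [1, 0, 0, 0]
for t in range(26):
    s.append(s[t + 1] ^ s[t])
print(berlekamp_massey(s))
```

The two nested loops over the sequence are what make the running time quadratic: each of the n steps costs at most O(n) bit operations.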
PN generators with LFSRs
• The linear complexity achievable with a single LFSR is small: it cannot exceed the length of the register.
• Therefore, to prevent the cryptanalysis of a pseudorandom sequence generator, we must design it so that its linear complexity is too high for the Berlekamp-Massey algorithm to be practical.
PN generators with LFSRs
• Since LFSRs have nice properties regarding the statistics of their output sequences, a good idea is to base PN generators on LFSRs.
• But to increase linear complexity, we have to combine the outputs of several LFSRs in a non-linear manner, through non-linear Boolean functions.
Algebraic normal form
• It is the form of a Boolean function that uses only the operations ⊕ (XOR, addition modulo 2) and · (AND, multiplication modulo 2).
• In the ANF, the largest number of variables appearing in a single product is called the nonlinear order of the function.
• Example: The nonlinear order of the function f(x1,x2,x3) = x1 ⊕ x1x3 ⊕ x2x3 is 2.
Algebraic normal form
• The ANF of a Boolean function can be determined from its truth table.
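The Möbius transform

$$f(x_0, x_1, \ldots, x_{n-1}) = \bigoplus_{u \in \{0,1\}^n} a_u \prod_{i=0}^{n-1} x_i^{u_i}, \qquad a_u \in \{0,1\}, \quad u = (u_0, u_1, \ldots, u_{n-1})$$

$$a_u = \bigoplus_{x:\, x \preceq u} f(x), \qquad x \preceq u \iff x_i \le u_i \text{ for all } i$$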
Algebraic normal form
• Example: n=3

x0 x1 x2 | f
 0  0  0 | 0
 0  0  1 | 1
 0  1  0 | 0
 0  1  1 | 1
 1  0  0 | 0
 1  0  1 | 1
 1  1  0 | 1
 1  1  1 | 0
Algebraic normal form
• u=000, u=001, u=010 (each coefficient is the sum modulo 2 of f over the inputs covered by u)
  a000 = f(0,0,0) = 0
  a001 = f(0,0,0) + f(0,0,1) = 0 + 1 = 1
  a010 = f(0,0,0) + f(0,1,0) = 0 + 0 = 0
Algebraic normal form
• u=011, u=100, u=101
  a011 = f(0,0,0) + f(0,0,1) + f(0,1,0) + f(0,1,1) = 0 + 1 + 0 + 1 = 0
  a100 = f(0,0,0) + f(1,0,0) = 0 + 0 = 0
  a101 = f(0,0,0) + f(0,0,1) + f(1,0,0) + f(1,0,1) = 0 + 1 + 0 + 1 = 0
Algebraic normal form
• u=110, u=111
  a110 = f(0,0,0) + f(0,1,0) + f(1,0,0) + f(1,1,0) = 0 + 0 + 0 + 1 = 1
  a111 = f(0,0,0) + f(0,0,1) + f(0,1,0) + f(0,1,1) + f(1,0,0) + f(1,0,1) + f(1,1,0) + f(1,1,1) = 0
• Then: f(x0,x1,x2) = a001·x2 + a110·x0x1 = x2 + x0x1
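The coefficient computation in the n=3 example can be sketched in a few lines of Python. This is an illustrative sketch: the function name `anf_coefficients` and the convention that the truth table is listed in the order 000, 001, ..., 111 (x0 as the most significant bit) are assumptions chosen to match the table in the example.

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform: a[u] = XOR of f(x) over all x covered by u.

    truth_table[x] holds f for the input whose bits, read as an integer,
    equal x (x0 is the most significant bit).
    """
    a = list(truth_table)
    n = len(a).bit_length() - 1          # number of variables
    for i in range(n):
        step = 1 << i
        for x in range(len(a)):
            if x & step:
                a[x] ^= a[x ^ step]      # fold in the entry with bit i cleared
    return a


f = [0, 1, 0, 1, 0, 1, 1, 0]             # truth table of the n=3 example
coefficients = anf_coefficients(f)
for u, a_u in enumerate(coefficients):
    if a_u:
        print(format(u, "03b"))           # indices u with a_u = 1
```

Only the indices 001 and 110 carry a nonzero coefficient, which corresponds to the monomials x2 and x0x1.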
Non-linear combiners
• In these generators, the keystream sequence is obtained by combining the output sequences of several LFSRs in a non-linear manner.
• Example – it is possible to use a memoryless Boolean function.
Non-linear combiners
• If F is a Boolean function of N periodic input sequences a1(t), a2(t), ..., aN(t), then the output sequence b(t) = F(a1(t), a2(t), ..., aN(t)) is a linear combination of various products of sequences.
• These products are given by the ANF of the function F.
Non-linear combiners
• Given the ANF of the function F, let F* be the function obtained from F by replacing the sum and product modulo 2 with the sum and product of integers. Then the linear complexity and the period of the output sequence satisfy:

$$LC(b) \le F^*\big(LC(a_1), LC(a_2), \ldots, LC(a_N)\big)$$

$$Per(b) \le \operatorname{lcm}\big(Per(a_1), Per(a_2), \ldots, Per(a_N)\big)$$
Non-linear combiners
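• Example (1)
  – Let

$$F(x_0, x_1, x_2) = x_0 \oplus x_0 x_1 \oplus x_0 x_2$$

$$F^*(x_0, x_1, x_2) = x_0 + x_0 x_1 + x_0 x_2$$

  – If the characteristic polynomials of the input sequences are:

$$a_0:\ 1 + X + X^4$$

$$a_1:\ 1 + X^3 + X^4$$

$$a_2:\ 1 + X^2 + X^5$$

  All these polynomials are primitive!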
Non-linear combiners
• Example (2)
  – Then

$$LC(b) \le 4 + 4 \cdot 4 + 4 \cdot 5 = 40$$

$$Per(b) \le \operatorname{lcm}(15, 15, 31) = 465$$
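The bounds of this example can be checked experimentally. The sketch below assumes the combining function F(x0,x1,x2) = x0 ⊕ x0x1 ⊕ x0x2 and the recurrences implied by the three characteristic polynomials; the helper names `lfsr` and `linear_complexity` and the initial states are illustrative choices, not taken from the slides.

```python
from math import lcm  # math.lcm requires Python 3.9+


def lfsr(taps, state, count):
    """Return `count` bits of the recurrence s[t+L] = XOR of s[t+k] for k in taps."""
    s = list(state)
    for t in range(count):
        bit = 0
        for k in taps:
            bit ^= s[t + k]
        s.append(bit)
    return s[:count]


def linear_complexity(bits):
    """Berlekamp-Massey over GF(2): length of the shortest LFSR generating bits."""
    c = [1] + [0] * len(bits)
    b = c[:]
    L, m = 0, -1
    for i, bit in enumerate(bits):
        d = bit
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            for j in range(len(c) - (i - m)):
                c[j + i - m] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, previous
    return L


count = 930                                   # two candidate periods of 465 bits
a0 = lfsr([0, 1], [1, 0, 0, 0], count)        # 1 + X + X^4
a1 = lfsr([0, 3], [1, 0, 0, 0], count)        # 1 + X^3 + X^4
a2 = lfsr([0, 2], [1, 0, 0, 0, 0], count)     # 1 + X^2 + X^5
b = [x0 ^ (x0 & x1) ^ (x0 & x2) for x0, x1, x2 in zip(a0, a1, a2)]

lc_bound = 4 + 4 * 4 + 4 * 5                  # F* evaluated at (4, 4, 5)
period_bound = lcm(15, 15, 31)
print(linear_complexity(b), "<=", lc_bound)
print(all(b[t] == b[t + period_bound] for t in range(count - period_bound)))
```

The measured linear complexity depends on the initial states chosen, so the sketch only compares it against the upper bound rather than predicting its value.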
Non-linear combiners
• The sum of N sequences in GF(q) (1)

$$LC(b) \le \sum_{i=1}^{N} LC(a_i)$$

  – Equality holds if the characteristic polynomials of the input sequences have no common factors.
Non-linear combiners
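• The sum of N sequences in GF(q) (2)

$$\text{If } LC(b) = \sum_{i=1}^{N} LC(a_i) \text{ then } Per(b) = \operatorname{lcm}\big(Per(a_1), Per(a_2), \ldots, Per(a_N)\big)$$

  – Obviously, if the periods of the input sequences are mutually prime, then

$$Per(b) = \prod_{i=1}^{N} Per(a_i)$$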
Non-linear combiners
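• The sum of N sequences in GF(q) (3)
  – Example:

$$f_1(X) = 1 + X^5 + X^6 + X^{10} + X^{61}$$

$$f_2(X) = 1 + X^3 + X^5 + X^6 + X^{89}$$

  Both polynomials are primitive, and the periods are Mersenne primes.

$$LC(b) = 61 + 89 = 150$$

$$Per(b) = (2^{61} - 1)(2^{89} - 1)$$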
Non-linear combiners
• The product of N sequences in GF(q) (1)
  – Theorem (Golić, 1989)
    • If Per(ai) are mutually prime, then

$$LC(b) = \prod_{i=1}^{N} LC(a_i)$$

  – Theorem (Lidl, Niederreiter)
    • If Per(ai) are mutually prime, then

$$Per(b) = \prod_{i=1}^{N} Per(a_i)$$
Non-linear combiners
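• Example

$$f_1(X) = 1 + X^5 + X^6 + X^{10} + X^{61}$$

$$f_2(X) = 1 + X^3 + X^5 + X^6 + X^{89}$$

  Both polynomials are primitive, and the periods are Mersenne primes.

$$LC(b) = 61 \cdot 89 = 5429$$

$$Per(b) = (2^{61} - 1)(2^{89} - 1)$$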
Non-linear combiners
• The general case (1)
  – Let $\hat{F}$ be the Boolean function obtained by removing all the products from the function F except those of the maximum order. Let $\hat{F}^*$ be the corresponding integer function.
Non-linear combiners
• The general case (2)
  – Theorem (Golić, 1989)
    • F depends on all the N input variables.
    • Per(ai) are mutually prime.
    • Then

$$LC(b) \ge \hat{F}^*\big(LC(a_1) - 1, \ldots, LC(a_N) - 1\big)$$

$$Per(b) = \prod_{i=1}^{N} Per(a_i)$$
Non-linear combiners
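• The general case (3)
  – Example (1)

$$F(x_0, x_1, x_2) = x_0 \oplus x_0 x_1 \oplus x_0 x_2$$

$$F^*(x_0, x_1, x_2) = x_0 + x_0 x_1 + x_0 x_2$$

$$\hat{F}(x_0, x_1, x_2) = x_0 x_1 \oplus x_0 x_2$$

$$\hat{F}^*(x_0, x_1, x_2) = x_0 x_1 + x_0 x_2$$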
Non-linear combiners
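• The general case (4)
  – Example (2)
    • If the characteristic polynomials of the input sequences are:

$$f_0(X) = 1 + X^5 + X^6 + X^{10} + X^{61}$$

$$f_1(X) = 1 + X^3 + X^5 + X^6 + X^{89}$$

$$f_2(X) = 1 + X^4 + X^7 + X^9 + X^{107}$$

    (primitive, with periods that are Mersenne primes)
    • Then

$$LC(b) \ge 60 \cdot 88 + 60 \cdot 106 = 11640$$

$$Per(b) = (2^{61} - 1)(2^{89} - 1)(2^{107} - 1)$$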
Non-linear combiners
• The general case (5)
  – Example – Geffe’s generator (1)

$$F(x_1, x_2, x_3) = x_1 x_2 \oplus x_2 x_3 \oplus x_3 = x_1 x_2 \oplus (1 \oplus x_2) x_3$$
Non-linear combiners
• The general case (6)
  – Example – Geffe’s generator (2)
    • Equivalent scheme
Non-linear combiners
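• The general case (7)
  – Example – Geffe’s generator (3)
    • If we choose primitive feedback polynomials, with periods that are Mersenne primes:

$$f_1(X) = 1 + X^5 + X^6 + X^{10} + X^{61}$$

$$f_2(X) = 1 + X^3 + X^5 + X^6 + X^{89}$$

$$f_3(X) = 1 + X^4 + X^7 + X^9 + X^{107}$$

    • Then

$$LC(b) \ge 60 \cdot 88 + 88 \cdot 106 = 14608$$

$$Per(b) = (2^{61} - 1)(2^{89} - 1)(2^{107} - 1)$$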
Statistical testing of PN generators
• The output sequence of a generator of pseudorandom sequences looks random, but it is not.
• Pseudorandom generators expand a truly random sequence (the key) to a much longer sequence, such that an adversary cannot distinguish between the pseudorandom sequence and a truly random sequence.
Statistical testing of PN generators
• In order to obtain a guarantee of the security of this type of generators, various statistical tests are applied, especially designed for this purpose.
• The fact that a generator passes a set of statistical tests should be considered a necessary condition, although not a sufficient one, for the security of the generator.
Statistical testing of PN generators
• If the result X of an experiment can take any real value, then X is a continuous random variable.
• The probability density function f(x) of a continuous random variable X can be integrated and the following holds:
  – $f(x) \ge 0$ for all $x \in \mathbb{R}$
  – $\int_{-\infty}^{\infty} f(x)\,dx = 1$
  – For all $a, b \in \mathbb{R}$ the following holds:

$$P(a < X \le b) = \int_a^b f(x)\,dx$$
Statistical testing of PN generators
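• A continuous random variable X has a normal distribution with mean $\mu$ and variance $\sigma^2$ if its probability density function is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \qquad -\infty < x < \infty$$

• We say that X is $N(\mu, \sigma^2)$.
• If X is $N(0, 1)$, then we say that X has a standard normal distribution.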
Statistical testing of PN generators
• If the random variable X is $N(\mu, \sigma^2)$, then the variable $Z = (X - \mu)/\sigma$ is $N(0, 1)$.
• Euler’s gamma function:

$$\Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\,dx, \qquad t > 0$$
Statistical testing of PN generators
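• A continuous random variable X has a $\chi^2$ distribution with $\nu$ degrees of freedom if its probability density function is

$$f(x) = \begin{cases} \dfrac{1}{\Gamma(\nu/2)\, 2^{\nu/2}}\, x^{(\nu/2) - 1} e^{-x/2}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$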
Statistical testing of PN generators
• A statistical hypothesis H is an assertion about the distribution of one or more random variables.
• A hypothesis test is a procedure based on the observed values of the random variable that leads to the acceptance or rejection of the hypothesis H.
Statistical testing of PN generators
• The test only provides a measure of the strength of the evidence given by the data against the hypothesis.
• The conclusion is probabilistic.
• The level of significance α of the test of the hypothesis H is the probability of rejecting the hypothesis H when it is true.
Statistical testing of PN generators
• The hypothesis to be tested is called the null hypothesis, H0.
• The alternative hypothesis is denoted by H1 or Ha.
• In cryptography:
  – H0 – the given generator is a random sequence generator.
  – α is between 0.001 and 0.05.
Statistical testing of PN generators
• A test:
  – Determines a statistic for the sample of the output sequence.
  – This statistic is compared with the expected value for a random sequence.
Statistical testing of PN generators
• How is the comparison carried out? (1)
  – The computed statistic, X0, (usually) follows a χ² distribution with ν degrees of freedom.
  – It is assumed that this statistic takes large values for non-random sequences.
Statistical testing of PN generators
• How is the comparison carried out? (2)
  – In order to achieve the significance level α, a threshold xα is chosen (by means of the corresponding table), such that P(X0 > xα) = α.
  – If the value of the statistic for the sample of the output sequence, Xs, satisfies Xs > xα, then the sequence fails the test.
Statistical testing of PN generators
• Basic tests for cryptographic use:
  – frequency test,
  – serial test,
  – poker test,
  – runs test,
  – autocorrelation test,
  – etc.
Statistical testing of PN generators
• Frequency test (1)
  – Purpose: determine if the number of zeros and ones in a sequence s is approximately the same.
  – n0 – number of zeros, n1 – number of ones, n = n0 + n1.
  – The statistic:

$$X_1 = \frac{(n_0 - n_1)^2}{n}$$
Statistical testing of PN generators
• Frequency test (2)
  – The statistic follows a χ² distribution with 1 degree of freedom.
  – The approximation is good enough if n ≥ 10.
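A minimal sketch of the frequency statistic, assuming the sequence is given as a Python list of 0/1 integers (the function name `frequency_statistic` is illustrative):

```python
def frequency_statistic(bits):
    """X1 = (n0 - n1)^2 / n for a list of 0/1 values."""
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    return (n0 - n1) ** 2 / n


# Usage: a sequence with 84 zeros and 76 ones.
sample = [0] * 84 + [1] * 76
x1 = frequency_statistic(sample)
print(x1)
```

The resulting value is then compared with the χ² threshold for 1 degree of freedom at the chosen significance level; the statistic depends only on the counts, not on the order of the bits.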
Statistical testing of PN generators
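• Serial test (1)
  – Tries to determine if the number of occurrences of 00, 01, 10 and 11 as (overlapping) subsequences of s is approximately the same.
  – n00, n01, n10, n11 – numbers of occurrences, with n00 + n01 + n10 + n11 = n − 1.
  – The statistic:

$$X_2 = \frac{4}{n - 1}\left(n_{00}^2 + n_{01}^2 + n_{10}^2 + n_{11}^2\right) - \frac{2}{n}\left(n_0^2 + n_1^2\right) + 1$$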
Statistical testing of PN generators
• Serial test (2)
  – The statistic follows a χ² distribution with 2 degrees of freedom.
  – The approximation is good enough if n ≥ 21.
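A minimal sketch of the serial statistic, assuming overlapping pairs are counted (so that the four counts add up to n − 1) and that the sequence is a list of 0/1 integers; the function name `serial_statistic` is illustrative:

```python
def serial_statistic(bits):
    """X2 computed from the counts of overlapping pairs 00, 01, 10, 11."""
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    pair_counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for first, second in zip(bits, bits[1:]):
        pair_counts[(first, second)] += 1
    sum_of_squares = sum(v * v for v in pair_counts.values())
    return 4 / (n - 1) * sum_of_squares - 2 / n * (n0 * n0 + n1 * n1) + 1


# Toy usage (far shorter than the n >= 21 the approximation needs).
x2 = serial_statistic([0, 1, 0, 1])
print(x2)
```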
Statistical testing of PN generators
• Poker test (1)
  – A positive integer m is considered such that

$$k = \left\lfloor \frac{n}{m} \right\rfloor \ge 5 \cdot 2^m$$

  – The sequence s is divided into k non-overlapping parts of size m.
  – ni is the number of occurrences of the type i of the sequence of length m, 0 ≤ i ≤ 2^m − 1 (that is, i is the value of the integer whose binary representation is the sequence of length m).
Statistical testing of PN generators
• Poker test (2)
  – The test determines if every sequence of length m appears approximately the same number of times.
  – The statistic:

$$X_3 = \frac{2^m}{k}\left(\sum_{i=0}^{2^m - 1} n_i^2\right) - k$$

  – The statistic follows approximately a χ² distribution with 2^m − 1 degrees of freedom.
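A minimal sketch of the poker statistic, assuming the blocks are indexed by their integer value from 0 to 2^m − 1 and the sequence is a list of 0/1 integers; the function name `poker_statistic` is illustrative:

```python
def poker_statistic(bits, m):
    """X3 for non-overlapping blocks of m bits."""
    k = len(bits) // m                      # number of complete blocks
    counts = [0] * (1 << m)
    for j in range(k):
        block = bits[j * m:(j + 1) * m]
        value = int("".join(str(bit) for bit in block), 2)
        counts[value] += 1
    return (1 << m) / k * sum(c * c for c in counts) - k


# Toy usage: every 2-bit pattern appears exactly once.
# The condition k >= 5 * 2^m is deliberately ignored in this tiny example.
x3 = poker_statistic([0, 0, 0, 1, 1, 0, 1, 1], 2)
print(x3)
```

A perfectly even spread of patterns gives the smallest possible value of the statistic, while a sequence made of a single repeated pattern gives the largest.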
Statistical testing of PN generators
• Runs test (1)
  – A run of length i – a subsequence of s formed by i consecutive zeros or i consecutive ones that are neither preceded nor followed by the same symbol.
  – A run of zeros – gap
  – A run of ones – block
Statistical testing of PN generators
• Runs test (2)
  – Purpose: determine if the number of runs of different lengths in the sequence s is that expected in a random sequence.
  – The expected number of gaps (or blocks) of length i in a random sequence of length n is

$$e_i = \frac{n - i + 3}{2^{i+2}}$$

  – It is considered that k is equal to the largest integer i for which ei ≥ 5.
Statistical testing of PN generators
• Runs test (3)
  – We denote by Bi and Hi the number of blocks and gaps of length i in s, for each i, 1 ≤ i ≤ k.
  – The statistic

$$X_4 = \sum_{i=1}^{k} \frac{(B_i - e_i)^2}{e_i} + \sum_{i=1}^{k} \frac{(H_i - e_i)^2}{e_i}$$

  – The statistic follows approximately a χ² distribution with 2k − 2 degrees of freedom.
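A minimal sketch of the runs statistic, assuming the expected count e_i = (n − i + 3)/2^(i+2) and a sequence given as a list of 0/1 integers. The function name `runs_statistic` is illustrative, and runs longer than k are simply ignored, which is one possible reading of the slides.

```python
from itertools import groupby


def runs_statistic(bits):
    """Return (X4, k) for blocks (runs of ones) and gaps (runs of zeros)."""
    n = len(bits)

    def expected(i):
        return (n - i + 3) / 2 ** (i + 2)

    k = 0
    while expected(k + 1) >= 5:             # largest i with e_i >= 5
        k += 1
    blocks = [0] * (k + 1)
    gaps = [0] * (k + 1)
    for symbol, group in groupby(bits):
        length = len(list(group))
        if length <= k:
            if symbol == 1:
                blocks[length] += 1
            else:
                gaps[length] += 1
    x4 = sum(
        (blocks[i] - expected(i)) ** 2 / expected(i)
        + (gaps[i] - expected(i)) ** 2 / expected(i)
        for i in range(1, k + 1)
    )
    return x4, k


# Usage: a strictly alternating sequence has only runs of length 1.
x4, k = runs_statistic([0, 1] * 80)
print(k, x4)
```

For the alternating sequence the statistic is very large, because all runs have length 1 while a random sequence of this length is expected to contain runs of lengths 2 and 3 as well.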
Statistical testing of PN generators
• Autocorrelation test (1)
  – Checks the correlation between s and shifted versions of s.
  – An integer d, 1 ≤ d ≤ ⌊n/2⌋, is considered.
  – The number of bits in s that are not equal to their d-shifts is

$$A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$$
Statistical testing of PN generators
• Autocorrelation test (2)
  – The statistic

$$X_5 = \frac{2\left(A(d) - \dfrac{n - d}{2}\right)}{\sqrt{n - d}}$$

  – The statistic follows approximately a N(0,1) distribution.
  – The approximation is good enough if n − d ≥ 10.
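A minimal sketch of the autocorrelation statistic, assuming a sequence given as a list of 0/1 integers (the function name `autocorrelation_statistic` is illustrative):

```python
from math import sqrt


def autocorrelation_statistic(bits, d):
    """X5 for shift d, with A(d) = number of positions where s[i] != s[i+d]."""
    n = len(bits)
    a_d = sum(bits[i] ^ bits[i + d] for i in range(n - d))
    return 2 * (a_d - (n - d) / 2) / sqrt(n - d)


# Usage: an alternating sequence disagrees with its 1-shift everywhere
# and agrees with its 2-shift everywhere.
alternating = [0, 1] * 10
print(autocorrelation_statistic(alternating, 1))
print(autocorrelation_statistic(alternating, 2))
```

Because the statistic can be large in either direction, both strongly positive and strongly negative values indicate non-randomness, so a two-sided comparison against the N(0,1) distribution is appropriate.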
Cryptanalysis of stream ciphers
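[Figure: communication model. A enciphers the plaintext with the key; the ciphertext is sent to B, who deciphers it with the same key to recover the plaintext. Cryptanalysis attempts to decrypt the intercepted ciphertext.]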
Cryptanalysis of stream ciphers
• The problem of cryptanalysis
  – Given some information related to the cryptosystem (at least the ciphertext), determine the plaintext and/or the key.
• The goal of the designer is to make this problem as difficult as possible for the cryptanalyst.
Cryptanalysis of stream ciphers
• General assumption – all the details of the cryptosystem are known to the cryptanalyst.
• The only unknown is the key.
• Types of attack
  – Ciphertext-only attack
  – Known plaintext attack
  – Chosen plaintext attack
  – Chosen ciphertext attack
Cryptanalysis of stream ciphers
• The ciphertext-only attack is the most difficult one for the cryptanalyst (in general).
• The more information known to the cryptanalyst, the easier the attack.
Cryptanalysis of stream ciphers
• The “brute force attack”
  – Elementary attack – no knowledge about cryptanalysis is necessary.
  – Assumptions
    • The cryptosystem is known
    • The ciphertext is known
  – The goal
    • Determine the key/plaintext
  – The means
    • Trying all the possible keys
Cryptanalysis of stream ciphers
• Complexity of the brute force attack
  – Extremely high, if there are many possible keys – impractical
• Key space – the total number of keys possible in a cryptosystem
Cryptanalysis of stream ciphers
• Examples of key space size

Quantity | Approximate value
Key space – 40 bits | 1×10^12
Key space – 56 bits (DES) | 7×10^16
Key space – 128 bits | 3×10^38
Key space – 256 bits | 1×10^77
Number of 256-bit primes | 1×10^72
Age of the Sun in seconds | 1×10^16
Number of clock pulses of a 3 GHz computer clock through the Sun’s age | 5.4×10^26
Cryptanalysis of stream ciphers
• A cryptosystem’s security is ultimately determined by the size of its key space
• However, this is the upper limit of that security measure
• There may be a problem in the system design that may cause a significant reduction of the effective key space
• The task of the cryptanalyst – to find this pitfall and to use it to attack the system
Cryptanalysis of stream ciphers
• Basic attack methods against stream (and block) ciphers
  – Algebraic
  – Statistical
• Algebraic attacks (1)
  – The key symbols (e.g. bits) are the unknowns in the system of equations assigned to the PRNG
Cryptanalysis of stream ciphers
• Algebraic attacks (2)
  – Given all the details of the PRNG to be cryptanalyzed (except the key bits), determine the system of equations that relates the bits of the output sequence with the bits of the key
  – The designer’s goal
    • To make this system as non-linear as possible
    • The reason
      – non-linear systems are difficult to solve – there is no general method other than trying all the possible values of the variables: 2^n possibilities for a system with n variables.
Cryptanalysis of stream ciphers
• Algebraic attacks (3)
  – The problem of solving a non-linear system in GF(2) – the satisfiability problem (SAT)
  – Cook’s theorem (1971)
    • SAT is NP-complete
  – However, some instances of the SAT problem may be easier to solve
  – The designer should check the system assigned to the PRNG
Cryptanalysis of stream ciphers
• Algebraic attacks (4)
  – Example – LFSR with $f(x) = 1 + x + x^4$
  – The output sequence: 1110…
  – The initial state: a0, a1, a2, a3
  – The output bits: y0=1, y1=1, y2=1, y3=0
  – The equations

$$y_0 = a_3 \oplus a_0$$

$$y_1 = y_0 \oplus a_1$$

$$y_2 = y_1 \oplus a_2$$

$$y_3 = y_2 \oplus a_3$$

a  | 3 2 1 0
y0 | 1 1 0 0
y1 | 1 1 1 0
y2 | 1 1 1 1
y3 | 0 1 1 1

  Linear system – easy to solve!
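To illustrate why a linear system is easy to solve, here is a minimal sketch of Gaussian elimination over GF(2). It assumes the four equations read y0 = a0 ⊕ a3, y1 = y0 ⊕ a1, y2 = y1 ⊕ a2, y3 = y2 ⊕ a3, expanded so that every row contains only the unknowns a0..a3; the function name `solve_gf2` is illustrative, and a unique solution is assumed.

```python
def solve_gf2(rows, rhs):
    """Solve A x = b over GF(2) by Gauss-Jordan elimination (unique solution assumed)."""
    n = len(rows[0])
    aug = [row[:] + [b] for row, b in zip(rows, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a 1 in this column.
        pivot = next(r for r in range(col, len(aug)) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(len(aug)):
            if r != col and aug[r][col]:
                aug[r] = [u ^ v for u, v in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


# Columns are a0, a1, a2, a3; each row is one expanded equation.
equations = [
    [1, 0, 0, 1],   # y0 = a0 + a3
    [1, 1, 0, 1],   # y1 = a0 + a1 + a3
    [1, 1, 1, 1],   # y2 = a0 + a1 + a2 + a3
    [1, 1, 1, 0],   # y3 = a0 + a1 + a2
]
outputs = [1, 1, 1, 0]
initial_state = solve_gf2(equations, outputs)
print(initial_state)   # values of a0, a1, a2, a3
```

The elimination needs on the order of n³ bit operations for n unknowns, in contrast with the 2^n trials of an exhaustive search.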
Cryptanalysis of stream ciphers
• Algebraic attacks (5)
  – Example (1): consider the non-linear PRNG below
Cryptanalysis of stream ciphers
• Algebraic attacks (6)
  – Example (2): The system of equations
    • (1) y1 = (x1+x4)(x5+x7) = x1x5+x1x7+x4x5+x4x7
    • (2) y2 = (x1+x4+x3)(x5+x7+x6) = x1x5+x1x7+x1x6+x4x5+x4x7+x4x6+x3x5+x3x7+x3x6
    • … (we need 7 independent equations)
Cryptanalysis of stream ciphers
• Algebraic attacks (7)
  – Example (3): Methods of solving the system
    • The brute force method: try all the 2^7 − 1 possible solutions (the all-zero solution is not permitted)
    • The linearization method
      – Replace all the products by new variables
      – Solve the obtained linear system (e.g. by Gaussian elimination)
      – Try to guess the variables that were included in the products, given the values of the new variables, in such a way that the overall system is consistent
Cryptanalysis of stream ciphers
• Algebraic attacks (8)
  – Example (4): The linearized system
    • y1 = z1+z2+z3+z4
    • y2 = z1+z2+z5+z3+z4+z6+z7+z8+z9
    • ...
Cryptanalysis of stream ciphers
• Algebraic attacks (9)
  – Other methods of solving non-linear systems, applied in cryptanalysis
    • Linear consistency test (LCT)
    • Methods of computational commutative algebra (Gröbner bases etc.)
    • etc.
  – No matter how sophisticated the method of solving the system is, cryptanalysis of a seriously designed system always includes search
Cryptanalysis of stream ciphers
• Statistical methods (1)
  – In the previous example, the majority of the output symbols will be zero, due to the AND combining function
  – The non-linearity of the assigned system of equations is the highest possible
  – However, it is possible to make use of bad statistical properties of the output sequence to determine the plaintext sequence
Cryptanalysis of stream ciphers
• Statistical methods (2)
  – Example
• With the AND output combiner, the probability of zero in the output sequence will be ¾.
• This means that, upon enciphering with this sequence as the keystream, the probability that the plaintext bit is equal to the ciphertext bit is ¾.
• Consequence – easy reconstruction of the plaintext.
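The ¾ figure can be checked by enumerating the four equally likely input pairs of the AND combiner. This is a small sketch that assumes the two combined bits are independent and uniformly distributed; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All equally likely pairs of input bits to the AND combiner.
pairs = list(product((0, 1), repeat=2))

# Probability that the keystream bit is 0.
p_keystream_zero = Fraction(sum(1 for x, y in pairs if (x & y) == 0), len(pairs))

# With ciphertext = plaintext XOR keystream, the ciphertext bit equals the
# plaintext bit exactly when the keystream bit is 0, whatever the plaintext is.
p_cipher_equals_plain = Fraction(
    sum(1 for x, y in pairs for p in (0, 1) if (p ^ (x & y)) == p),
    2 * len(pairs),
)
print(p_keystream_zero, p_cipher_equals_plain)
```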
Cryptanalysis of stream ciphers
• Statistical methods (3)
  – Correlation – the output sequence coincides too much with one or more internal sequences – this enables correlation attacks, a kind of statistical attack.
  – Correlation attacks
    • It is possible to divide the task of the cryptanalyst into several less difficult tasks – “Divide and conquer”
Cryptanalysis of stream ciphers
• Statistical methods (4)
  – Typical example – the Geffe’s generator

$$F(x_1, x_2, x_3) = x_1 x_2 \oplus x_2 x_3 \oplus x_3 = x_1 x_2 \oplus (1 \oplus x_2) x_3$$

  F balanced – good statistical properties
Cryptanalysis of stream ciphers
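• Statistical methods (5)
  – Problem: Correlation! With $s_n$ the output bit and $s_n^{(i)}$ the output bit of the i-th LFSR:

$$\Pr\big(s_n = s_n^{(1)}\big) = \Pr\big(s_n^{(2)} = 1\big) + \Pr\big(s_n^{(2)} = 0\big)\,\Pr\big(s_n^{(3)} = s_n^{(1)}\big) = \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{4}$$

$$\Pr\big(s_n = s_n^{(3)}\big) = \Pr\big(s_n^{(2)} = 0\big) + \Pr\big(s_n^{(2)} = 1\big)\,\Pr\big(s_n^{(1)} = s_n^{(3)}\big) = \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} = \frac{3}{4}$$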
Cryptanalysis of stream ciphers
• Statistical methods (6)
  – Since the output sequence is correlated with two of the input sequences, we can guess the bits of each of these input sequences independently, with high probability, if the output sequence is known.
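The correlation of the Geffe combiner can be verified by exhaustive enumeration. The sketch below assumes the combining function F(x1,x2,x3) = x1x2 ⊕ (1 ⊕ x2)x3 and independent, uniformly distributed input bits; the names `geffe` and `agreement` are illustrative.

```python
from fractions import Fraction
from itertools import product


def geffe(x1, x2, x3):
    """Geffe combiner: x2 selects between x1 (x2 = 1) and x3 (x2 = 0)."""
    return (x1 & x2) ^ ((1 ^ x2) & x3)


inputs = list(product((0, 1), repeat=3))


def agreement(index):
    """Probability that the output equals the input with the given index (0, 1 or 2)."""
    matches = sum(1 for x in inputs if geffe(*x) == x[index])
    return Fraction(matches, len(inputs))


p_one = Fraction(sum(geffe(*x) for x in inputs), len(inputs))
print(p_one)                                       # balance of the output
print(agreement(0), agreement(1), agreement(2))    # agreement with x1, x2, x3
```

The output agrees with x1 and with x3 three times out of four, but with the selector x2 only half of the time, so a divide-and-conquer attack would target the first and third registers separately.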