Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
Fall 2018 CS 222: Discrete Structures1
Cryptography Intro and RSA
Well, a gentle intro to cryptography, followed by a description of public key crypto and RSA.
Fall 2018 CS 222: Discrete Structures2
Definition
• Cryptology is the study of secret writing• Concerned with developing algorithms which may be
used:– To conceal the content of some message from all except the
sender and recipient (privacy or secrecy), and/or– Verify the correctness of a message to the recipient
(authentication or integrity)• The basis of many technological solutions to computer
and communication security problems
Fall 2018 CS 222: Discrete Structures3
Terminology• Cryptography: The art or science encompassing the
principles and methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back to its original form
• Plaintext: The original intelligible message • Ciphertext: The transformed message
• Cipher: An algorithm for transforming an intelligible message into one that is unintelligible
Fall 2018 CS 222: Discrete Structures4
Terminology (cont).
• Key: Some critical information used by the cipher, known only to the sender & receiver– Or perhaps only known to one or the other
• Encrypt: The process of converting plaintext to ciphertext using a cipher and a key
• Decrypt: The process of converting ciphertext back into plaintext using a cipher and a key
• Cryptanalysis: The study of principles and methods of
transforming an unintelligible message back into an intelligible message without knowledge of the key!
Fall 2018 CS 222: Discrete Structures5
Concepts
• Encryption: The mathematical operation mapping plaintext to ciphertext using the specified key:
C = EK(P)
• Decryption: The mathematical operation mapping ciphertext to plaintext using the specified key: P = EK
-1(C) = DK (C)
• Cryptographic system: The family of transformations from which the cipher function EK is chosen– It is a family of transformations since each key K
effectively creates a different transformation
Fall 2018 CS 222: Discrete Structures6
Concepts (cont.)
• Key: Is the parameter which selects which individual transformation is used, and is selected from a keyspace K
• Usually assume the cryptographic system is public, and only the key is secret information – Why? Because we don’t want to rely on “security through
obscurity”
Fall 2018 CS 222: Discrete Structures7
Rough Classification
• Symmetric-key encryption algorithms • Public-key encryption algorithms • Digital signature algorithms • Hash functions • Cipher Classes
– Block ciphers– Stream ciphers
Fall 2018 CS 222: Discrete Structures8
Symmetric-Key Encryption System
Message SourceM
Adversary
Message Dest.M
Encrypt M withKey K
C = EK(M)
Decrypt C withKey K
M = DK(C)
Key K savedKey source
Random key Kproduced
K
C
K
K
C
Insecure communication channel
Secure key channel
Fall 2018 CS 222: Discrete Structures9
Symmetric-Key Encryption Algorithms
• A Symmetric-key encryption algorithm is one where the sender and the recipient share a common, or closely related, key – Managing this key is nontrivial– Plus there is the question: how does the key come
to be shared?• Historically, symmetric-key algorithms were developed
first– They are generally good at efficiently encrypting
large amounts of data• As of Feb. 2017, an Intel i7 with integrated AES
instruction set can encrypt almost 12 GB/s
Fall 2018 CS 222: Discrete Structures10
Exhaustive Key Search• Always theoretically possible to simply try every key • Most basic attack, directly proportional to key size
• Typically, key is large enough so that exhaustive search is not computationally feasible– Do the math: Consider a 128-bit key. Key space is
roughly 3.4 x 1038 keys. one billion machines each testing one billion keys each second requires (3.4 x 1038)/(1018) seconds to test them all. That’s 3.4 x 1020 seconds, or 10.7 trillion years
Fall 2018 CS 222: Discrete Structures11
The Caeser Cipher
• 2000 years ago Julius Caesar used a simple substitution cipher, now known as the Caesar cipher – First attested use in military affairs (e.g., Gallic Wars)
• Concept: replace each letter of the alphabet with another letter that is k letters after original letter
• Example: replace each letter by 3rd letter after
L FDPH L VDZ L FRQTXHUHG I CAME I SAW I CONQUERED
Fall 2018 CS 222: Discrete Structures12
The Caeser Cipher
• Can describe this mapping (or translation alphabet) as:
Plain: ABCDEFGHIJKLMNOPQRSTUVWXYZ Cipher: DEFGHIJKLMNOPQRSTUVWXYZABC
Fall 2018 CS 222: Discrete Structures13
General Caesar Cipher
• Can use any shift from 1 to 25 – I.e. replace each letter of message by a letter a fixed distance
away • Specify key letter as the letter a plaintext A maps to
– E.g. a key letter of F means A maps to F, B to G, ... Y to D, Z to E, I.e. shift letters by 5 places
• Hence have 26 (25 useful) ciphers – Hence breaking this is easy. Just try all 25 keys one by one.
Fall 2018 CS 222: Discrete Structures14
Mathematics
• If we assign the letters of the alphabet the numbers from 0 to 25, then the Caesar cipher can be expressed mathematically as follows:
For a fixed key k, and for each plaintext letter p, substitute the ciphertext letter C given by
C = (p + k) mod(26)Decryption is equally simple:
p = (C – k) mod (26)
Fall 2018 CS 222: Discrete Structures15
Mixed Monoalphabetic Cipher
• Rather than just shifting the alphabet, could shuffle (jumble) the letters arbitrarily
• Each plaintext letter maps to a different random ciphertext letter, or even to 26 arbitrary symbols
• Key is 26 letters long
Fall 2018 CS 222: Discrete Structures16
Security of Mixed Monoalphabetic Cipher
• With a key of length 26, now have a total of 26! ~ 4 x 1026 keys– A computer capable of testing a key every ns would take
more than 12.5 billion years to test them all.– On average, expect to take more than 6 billion years to find
the key.
• With so many keys, might think this is secure…but you’d be wrong
Fall 2018 CS 222: Discrete Structures17
Security of Mixed Monoalphabetic Cipher
• Variations of the monoalphabetic substitution cipher were used in government and military affairs for many centuries into the middle ages
• The method of breaking it, frequency analysis was discovered by Arabic scientists
• All monoalphabetic ciphers are susceptible to this type
of analysis
Fall 2018 CS 222: Discrete Structures18
Language Redundancy and Cryptanalysis
• Human languages are redundant• Letters in a given language occur with different
frequencies.– Ex. In English, letter e occurs about 12.75% of time, while
letter z occurs only 0.25% of time. • In English the letters e is by far the most common
letter
Fall 2018 CS 222: Discrete Structures19
Language Redundancy and Cryptanalysis
• t,r,n,i,o,a,s occur fairly often, the others are relatively rare
• w,b,v,k,x,q,j,z occur least often • So, calculate frequencies of letters occurring in
ciphertext and use this as a guide to guess at the letters. This greatly reduces the key space that needs to be searched.
Fall 2018 CS 222: Discrete Structures20
Language Redundancy and Cryptanalysis
• Tables of single, double, and triple letter frequencies are available
Fall 2018 CS 222: Discrete Structures21
Public Key Cryptography
Fall 2018 CS 222: Discrete Structures22
Terminology• Asymmetric cryptography• Public key (known to entire world)• Private key (kept secret)• Encryption process (P to C with public key)• Decryption Process (C to P with private key)
• Can also do this in reverse: encrypt with private key, decrypt with public key
• This doesn’t keep info secret, but does verify who sent it! (called a digital signature - Only holder of private key can sign, so can’t be forged)
Fall 2018 CS 222: Discrete Structures23
Uses
• Orders of magnitude slower than symmetric key crypto, so usually used to initiate symmetric key session
• Much easier to configure, so used widely in network protocols to establish temporary shared key that is used to transmit secret (symmetric) key
Fall 2018 CS 222: Discrete Structures24
Uses
• Transmitting over insecure channel• Alice <Apu, Apr> , Bob <Bpu, Bpr>• Alice to Bob encrypt m with Bpu
• Bob to alice encrypt m with Apu
• Accurately knowing public key of other person is one of biggest challenges of using public key crypto.
Fall 2018 CS 222: Discrete Structures25
The General Idea
• We use two one-way functions– Multiplication vs factoring– modular exponentiation vs modular logarithm
• Both can be one way trap door processes
Fall 2018 CS 222: Discrete Structures26
The General Idea
• Multiplication• Relatively easy, even if you are multiplying two huge
numbers• Factoring
• Difficult: No matter how it is done, need to check many possible factors
• Think of it as finding the combination for a lock (prime factorization)
• Here: n = pq, where p and q are both (very) large primes
Fall 2018 CS 222: Discrete Structures27
The General Idea• Modular exponentiation
• Relatively easy: Think of a clock face with the requisite number of numbers on it
• Modular multiplication like winding a length of rope around it and seeing where it stops.
Thanks toKahnAcademy
46 mod 12 = 10
Fall 2018 CS 222: Discrete Structures28
The General Idea
• Computing modular logarithm (discrete logarithm) is difficult• modular exponentiation distributes values in manner
close to uniformly random around clock face• Finding discrete log means testing many possible
values• For large numbers, this is a prohibitively expensive
operation
Fall 2018 CS 222: Discrete Structures29
The General Idea
Fall 2018 CS 222: Discrete Structures30
The General Idea
• We use two tools to make this work• First, the Euler totient function
• This is a one-way trap door function!• We use Euler’s Theorem
Fall 2018 CS 222: Discrete Structures31
Totient Function
• Allegedly from total and quotient• How many numbers less than n are relatively prime to
n?• Totient function, φ(n) gives this.• If n is prime, φ(n) = n-1 (1,2,…n-1)• If p and q are prime, φ(pq) = (p-1)(q-1)
– p, 2p, … (q-1)p q, 2q, … (p-1)q not rel. prime so have – pq – 1 – [(p-1) + (q-1)] = (p-1)(q-1)
Fall 2018 CS 222: Discrete Structures32
Totient Function• This is trap door
• difficult, in general, to determine value• easy if you know the prime factorization
Fall 2018 CS 222: Discrete Structures33
Euler’s Theorem
Fall 2018 CS 222: Discrete Structures34
Euler’s Theorem
Fall 2018 CS 222: Discrete Structures35
Euler’s Theorem
Fall 2018 CS 222: Discrete Structures36
Euler’s Theorem
Fall 2018 CS 222: Discrete Structures37
Euler’s Theorem
Upshot: We can do exponentiation mod the totient function
Fall 2018 CS 222: Discrete Structures38
RSA
• Key length variable (but should now be at least 1024 bits)
• Plaintext block must be smaller than key length• Ciphertext block will be length of key
Fall 2018 CS 222: Discrete Structures39
RSA• Choose two large primes (around 1024 bits each) p and
q. Let n = pq (very difficult to factor)• Choose number e that is relatively prime to φ(n). Can do
this since you know p and q and thus φ(pq) and from the derivation know exactly which numbers are relatively prime!
• Public key is <e, n>• To make private key, find d that is the multiplicative
inverse of e mod φ(n) (so ed = 1 mod φ(n)) (use Euclid’s algorithm)
• Private key is <d,n>• To encrypt a number m, compute c = me mod n.• To decrypt: m = cd mod n.
Fall 2018 CS 222: Discrete Structures40
RSA Example
1. Select primes: p=17 & q=112. Compute n = pq =17×11=1873. Compute ø(n)=(p–1)(q-1)=16×10=1604. Select e : gcd(e,160)=1; choose e=75. Determine d: de=1 mod 160 and d < 160 Value
is d=23 since 23×7=161= 10×160+16. Publish public key KU={7,187}7. Keep secret private key KR={23,17,11}
Fall 2018 CS 222: Discrete Structures41
RSA Example cont
• sample RSA encryption/decryption is: • given message M = 88 (nb. 88<187)• encryption:
C = 887 mod 187 = 11 • decryption:
M = 1123 mod 187 = 88
Fall 2018 CS 222: Discrete Structures42
Questions
• Why does it work?• Why is it secure?• Are operations sufficiently efficient?• How do we find big primes?
Fall 2018 CS 222: Discrete Structures43
Why Does It Work?
• We chose d and e so that de = 1 mod φ(n), so for any x,
• x(ed) mod n = x(ed mod φ(n)) mod n = x1 mod n = x mod n. • And (xe)d = x(ed)
Fall 2018 CS 222: Discrete Structures44
Why Is It Secure?
• We’re not sure it is, but it seems to be• Based on premise that factoring a big number is difficult.
– Semiprimes, the product of two (not necessarily distinct) primes, are most difficult numbers to factor.
– Largest such semiprime yet factored is RSA-768, 768 bits, 232 decimal digits.
• Took two years, hundreds of machines, several research institutions, and highly optimized code.
• Equivalent of 2000 CPU years on a single-core 2.2 GHz AMD Opteron
Fall 2018 CS 222: Discrete Structures45
Why Is It Secure?
• If you can factor n, you’re golden:– Problem is one of finding modular log (i.e. inverse of
exponential) – Why? Adversary knows <e,n>. So for message m, knows
ciphertext is c = me mod n. – So if adversary can reverse the exponentiation (that is, find
the number x s.t. xe mod n = c), she’s got the original message m!
– Remember how we originally find this inverse: By knowing φ(n). Which is difficult to know if you can’t factor n
Fall 2018 CS 222: Discrete Structures46
Why Is It Secure?
• We don’t know that there are not easier ways to break it (we do know that breaking it is no harder than factoring)
• We do know that it can be broken with a quantum computer using Shor’s Algorithm (1994) which has cubic time and linear space complexity in the number of bits of the number being factored– So if quantum computers become practical...
Fall 2018 CS 222: Discrete Structures47
Finding Big Primes
Fall 2018 CS 222: Discrete Structures48
Finding Big Primes• No nice way of absolutely determining that a huge
number is prime, but we can guess pretty accurately• Fermat’s Theorem: If p is prime, and 0 < a < p, then
a^(p-1) mod p = 1 mod p. – Works because though it’s possible for a^(n-1) = 1 mod n for a
non-prime, it’s not likely. For a randomly generated number of about 100 digits, probability that n is not prime but relation holds is about 1 in 10^(13).
– Other similar probabilistic algorithms for finding large primes
Fall 2018 CS 222: Discrete Structures49
Finding Big Primes• Update: usually Fermat test with base 2 is applied
• because it can be optimized• Then several Miller-Rabin tests applied
• How many depends on how small you want the probability of being wrong
• Typically somewhere around 20 tests run• Gets probability of being wrong down to around 2-100
• See FIPS Pub 186-4 for details
Fall 2018 CS 222: Discrete Structures
UPDATE!!! The AKS Algorithm!• The Agrawal-Keyal-Saxena Primality Test
– Published in 2002 (after previous slide created)– A deterministic polynomial time primality-proving algorithm– Developed by three researchers at the Indian Institute of
Technology Kanpur– Answered a centuries old question (and in a surprising way)!– Won 2006 Godel Prize and 2006 Fulkerson Prize
• Unfortunately the “constants” involved in the computational complexity estimates are very large
– So not yet practical for identifying large primes (but making this competitive with probabilistic algorithms is a ongoing research area)
50
Fall 2018 CS 222: Discrete Structures
2016 UPDATE!!! The AKS Algorithm!• The Agrawal-Keyal-Saxena Primality Test
– Still not used– Real difficulty: probability of a hardware error running this
algorithm is higher than the probability of accidentally choosing a non-prime with the earlier methods!
51
Fall 2018 CS 222: Discrete Structures52
Diffie-Hellman• Oldest public key cryptosystem still in use• Does neither encryption nor digital signatures.• Used because it is fastest at what it does: allow two
individuals to agree on a symmetric key even though they can only communicate over insecure channels.
• Remarkable because neither Alice nor Bob need any apriori information, yet after the exchange of two messages, they share a secret number.
• One bad thing: no authentication, so Alice may be setting up a key with Trudy!
Fall 2018 CS 222: Discrete Structures53
The Process
• Alice and Bob agree on two primes, p and g, where p is a large prime and g is a number less than p (with some restrictions)
• Each chooses a random 1024 bit number (SA for Alice, SB for Bob).
• Alice computes TA = gSA mod p. Bob computes TB = gSB mod p.
• They exchange their T values• Alice computes TBSA mod p, Bob computes TASB mod
p.• Done: TBSA = (gSB)SA = g(SB*SA) = g(SA*SB) = (gSA)SB =
TASB mod p.
Fall 2018 CS 222: Discrete Structures54
Why It Is Secure
• Whole world knows gSA and gSB, but getting g(SA*SB) means having to do a modular logarithm– If can find y such that gy = gSA, then know SA.
• And well, it’s not exactly secure -- it has a problem with a person-in-the-middle attack (the lack of authentication of endpoints)