Lecture Three
Today’s Topics: Definitions and Review, Cryptography Overview, Cryptanalysis
Next class: Symmetric Key Ciphers
Definitions review
Cryptography: the study of mathematical techniques related to information security that have the following objectives:
Authentication: corroboration of the identity of an entity.
Confidentiality: ensuring information is accessible only by authorized persons.
Data integrity: ensuring information has not been altered by unauthorized or unknown means.
Non-repudiation: preventing the denial of previous commitments or actions.
Definitions review
Cryptography is one tool (not the only) useful for providing security services such as:
Authorization: conveyance of official sanction to do or be something to another entity.
Access Control: restricting access to resources to privileged entities.
Availability: ensuring a system is available to authorized entities when needed.
Anonymity: concealing the identity of an entity involved in some process.
Certification: endorsement of information by a trusted entity.
Revocation: retraction of certification or authorization.
The most widely used tool for securing information and services is cryptography.
Cryptography relies on ciphers: mathematical functions used for encryption and decryption of a message.
Encryption: the process of disguising a message in such a way as to hide its substance.
Ciphertext: an encrypted message.
Decryption: the process of returning an encrypted message back into plaintext.
Cryptography
[Figure: Plaintext -> Encryption -> Ciphertext -> Decryption -> Original Plaintext]
Ciphers
The security of a cipher may rest in the secrecy of its restricted algorithm.
Whenever a user leaves a group, the algorithm must change.
The algorithm can’t be scrutinized by people smarter than you.
Unfortunately, secrecy is a popular approach.
Modern cryptography relies on keys: a key is a value selected from a large set (a keyspace), e.g., a 1024-bit number. That is 2^1024 possible values!
Security is based on secrecy of the key, not the details of the algorithm.
A change of authorized participants requires only a change in key.
Ciphers: Terminology and notation
For some message M, let’s denote the encryption of that message into ciphertext as {M}Kab = C, where Kab is the key shared by participants A and B.
The decryption into plaintext is written as {C}Kab = M.
Notice that {{M}Kab}Kab = M. Algorithms with this property are symmetric key algorithms.
Some algorithms use different keys for each operation: {{M}K+}K- = M. These are public-key algorithms.
Shift cipher: each plaintext character is replaced by the character k positions to its right in the alphabet, wrapping around at the end. (When k=3, it’s a Caesar cipher.)
With k=13: “Watch out for Brutus!” => “Jngpu bhg sbe Oehghf!”
Only 25 choices of k! Not hard to break by brute force.
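The shift cipher is small enough to sketch in a few lines. The function names below are made up for illustration; the sketch assumes only ASCII letters are shifted and everything else passes through unchanged. It also shows why 25 keys offer no protection: the attacker simply tries them all.

```python
def shift_encrypt(plaintext: str, k: int) -> str:
    """Replace each ASCII letter by the letter k positions to its right, wrapping around."""
    out = []
    for ch in plaintext:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + k) % 26 + base))
        else:
            out.append(ch)  # spaces and punctuation are left alone
    return "".join(out)


def shift_decrypt(ciphertext: str, k: int) -> str:
    """Decryption is a shift by k in the opposite direction."""
    return shift_encrypt(ciphertext, -k)


def brute_force(ciphertext: str) -> list[tuple[int, str]]:
    """Try all 25 non-trivial shifts; a human (or a dictionary) picks the readable one."""
    return [(k, shift_decrypt(ciphertext, k)) for k in range(1, 26)]


if __name__ == "__main__":
    c = shift_encrypt("Watch out for Brutus!", 13)
    print(c)
    for k, guess in brute_force(c):
        print(k, guess)
```

Scanning the 25 candidate lines by eye is enough to spot the plaintext.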
Substitution cipher: each character in the plaintext is replaced by a corresponding character of ciphertext. E.g., cryptograms in newspapers.
plaintext code:  a b c d e f g h i j k l m n o p q r s t u v w x y z
ciphertext code: m n b v c x z a s d f g h j k l p o i u y t r e w q
There are 26! possible substitution alphabets. Is it really that hard to break?
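As a rough sketch, the substitution table above can be applied with Python's built-in translation tables. The names PLAIN, CIPHER, ENC, DEC and substitute are chosen here for illustration, and the sketch assumes case is preserved and non-letters are left alone.

```python
import math

PLAIN = "abcdefghijklmnopqrstuvwxyz"
CIPHER = "mnbvcxzasdfghjklpoiuytrewq"  # the ciphertext alphabet from the table above

# Translation tables for both directions, covering lower and upper case.
ENC = str.maketrans(PLAIN + PLAIN.upper(), CIPHER + CIPHER.upper())
DEC = str.maketrans(CIPHER + CIPHER.upper(), PLAIN + PLAIN.upper())


def substitute(text: str, table: dict) -> str:
    """Apply a substitution table; characters not in the table pass through."""
    return text.translate(table)


if __name__ == "__main__":
    c = substitute("Don't attack. We aren't ready.", ENC)
    print(c)
    print(substitute(c, DEC))
    # Size of the keyspace: the number of permutations of 26 letters.
    print(math.factorial(26))
```

The keyspace is about 4 x 10^26, far too large to enumerate by hand, yet the cipher still falls to the statistical attacks discussed under cryptanalysis.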
Example Ciphers
Common Tools
The most common cryptographic tools are:
Symmetric key ciphers
  DES, 3DES, AES, Blowfish, Twofish, IDEA
  Fast and simple (based on addition, masks, and shifts)
  One key shared and kept secret
  Typical key lengths are 40, 128, 256, 512 bits
Asymmetric key ciphers
  RSA, El Gamal
  Two keys
  Slow, but versatile (usually requires exponentiation)
  Typical key lengths are 512, 1024, 2048 bits
Keys
Symmetric key algorithms have a separate key for each pair of entities sharing a key.
Public-key algorithms use a public-key and private-key pair over a message.
Only the public key can decrypt a message encrypted with the private key.
Similarly, only the private key can decrypt a message encrypted with the public key.
Often, a symmetric session key is generated by one of the participants and encrypted with the other’s public key. Further communication occurs with the symmetric key.
Symmetric key example
Alice and Bob would like to pass a secret message.
They have a shared secret, e.g., “fooB@r32”.
They separate.
Alice sends “Meet me at 3pm” to Bob encrypted with the key they share, Kab. Let the message be M=“meet me at 3pm”.
A -> B: {M}Kab
Bob decrypts the message by running it through the decryption algorithm. Usually the algorithm is run in reverse with the same key.
Advantage: typically a fast algorithm!
Disadvantage: Alice can’t use the same key with Carol.
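The exchange above can be mimicked with a toy cipher to show the property {{M}Kab}Kab = M. The sketch below is an assumption-laden illustration, not DES or AES: it builds a keystream by hashing the key with a counter and XORs it with the message. The names keystream and toy_crypt are hypothetical, and the construction should not be used to protect real data.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || counter) blocks, concatenated. Illustration only."""
    blocks = []
    produced = 0
    counter = 0
    while produced < length:
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        blocks.append(block)
        produced += len(block)
        counter += 1
    return b"".join(blocks)[:length]


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream. The same call both encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


if __name__ == "__main__":
    k_ab = b"fooB@r32"          # the shared secret from the example
    m = b"meet me at 3pm"
    c = toy_crypt(k_ab, m)      # A -> B: {M}Kab
    print(c.hex())
    print(toy_crypt(k_ab, c))   # Bob applies the same key to recover M
```

Because the same key drives both directions, anyone who learns Kab can read every message, which is why Alice needs a different key for Carol.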
Asymmetric Key Example
Alice and Bob would like to pass a secret message.
Each generates a public and private key pair.
Alice tells Bob her public key Ka+.
Bob tells Alice his public key Kb+.
They separate.
Alice sends “Meet me at 3pm” to Bob encrypted with her private key Ka-.
A -> B: {M}Ka-
Bob decrypts the message by running it through the decryption algorithm with her public key.
B: {{M}Ka-}Ka+ = “meet me at 3pm”
Note that anyone who knows Ka+ can also decrypt this message, so this step shows that the message came from Alice. To keep M secret, Alice would encrypt with Bob’s public key: A -> B: {M}Kb+.
Advantage: Alice can use the same key with Carol.
Disadvantage: typically slow slow slow.
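To make the two-key idea concrete, here is a sketch using textbook RSA with tiny primes. The primes, the exponent, and the function name apply_key are assumptions chosen for readability; real keys are hundreds of digits long and real systems add padding, so treat this as arithmetic practice only.

```python
# Textbook RSA with toy parameters (assumed for illustration, far too small to be secure).
p, q = 61, 53
n = p * q                  # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent  (plays the role of K+)
d = pow(e, -1, phi)        # private exponent (plays the role of K-), Python 3.8+


def apply_key(block: int, exponent: int) -> int:
    """Both encryption and decryption are modular exponentiation."""
    return pow(block, exponent, n)


if __name__ == "__main__":
    message = b"meet me at 3pm"

    # {M}K- then {..}K+ : the direction used in the example above.
    sent = [apply_key(b, d) for b in message]
    print(bytes(apply_key(c, e) for c in sent))

    # {M}K+ then {..}K- : the direction that keeps M secret from everyone but the key owner.
    sent = [apply_key(b, e) for b in message]
    print(bytes(apply_key(c, d) for c in sent))
```

Processing one byte at a time keeps every block below n; it also makes the toy trivially breakable, which is fine for seeing that the two exponents undo each other.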
Public Key Infrastructure (PKI)
The impractical dream of security is a PKI.
In a PKI, every person or entity on the Internet is put in a (distributed) database that holds their public key.
Want to talk to me? Just look up “Brian Levine” in the PKI!
Problems:
How do you know you’ve got the correct Brian Levine? (name collision)
Who should you trust? And why should you trust the people they trust? And so on…
Implementation of revocation has been poor.
Universal interoperability is tough.
Hashes
Hashes are going to be a tool we use primarily for authentication.
While related, these are not the same hashes you would use as the function in a hash table.
They have stricter requirements.
Hash Functions
A hash H is a one-way function that operates on an arbitrary-length message m and returns a fixed-length value h.
h=H(m)
Hashes provide a fingerprint of m (e.g., 128 bits).
Typically, hash functions are known for their speedy computation: usually a bunch of shifts and substitutions in a tight loop.
Three fundamental rules of a good cryptographic hash:
1. Given h, it is hard to compute m such that H(m)=h.
2. Given a specific m, it is hard to find another message m’ such that H(m)=H(m’).
3. Given a large set of messages M, it’s hard to find any pair (mi,mj) that hash to the same value.
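A short sketch of these properties, using SHA-256 from Python's standard library as a stand-in for H (the lecture does not name a specific hash). The helper names H, truncated and find_collision are made up here. The second half deliberately weakens the hash to 16 bits to show what rule 3 is guarding against: with a short output, a colliding pair turns up after a few hundred tries.

```python
import hashlib


def H(m: bytes) -> str:
    """Fixed-length fingerprint of an arbitrary-length message (SHA-256, as hex)."""
    return hashlib.sha256(m).hexdigest()


def truncated(m: bytes, bits: int = 16) -> int:
    """Keep only the top `bits` bits of the hash, to simulate a weak hash."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") >> (256 - bits)


def find_collision(bits: int = 16) -> tuple[bytes, bytes]:
    """Hash the messages b'0', b'1', b'2', ... until two share a truncated hash."""
    seen = {}
    i = 0
    while True:
        m = str(i).encode()
        h = truncated(m, bits)
        if h in seen:
            return seen[h], m
        seen[h] = m
        i += 1


if __name__ == "__main__":
    print(len(H(b"a")), len(H(b"a" * 1_000_000)))  # same length for any input size
    print(H(b"meet me at 3pm"))
    print(H(b"meet me at 4pm"))                    # one character changed
    print(find_collision(16))
```

Each extra output bit makes the collision search harder, which is one reason real hashes use 128 bits or more.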
Digital Signatures
Real signatures provide a number of features:
A signature provides authenticity for a document.
Signatures are “hard” to forge.
Signatures, as parts of the document, aren’t reusable.
Signatures can’t be altered or erased.
Signatures can’t be repudiated.
In reality there are ways around all of these for real signatures.
Signing with Hash Functions and a key
1. Alice produces a one-way hash of the document. A: h=H(D)
2. Alice encrypts the hash. A: {h}KA-
3. Alice sends the document and the signed hash to Bob. A->B: D, {h}KA-
4. Bob verifies by producing the same hash and decrypting the hash Alice sent.
Our shorthand notation for signing things is A->B: [D]KA-
Typically the key is a private key; the recipient verifies with the public key.
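The four steps can be sketched with a standard-library hash and toy RSA numbers. Everything named here (toy_hash, sign, verify, and the key values n, e, d) is an assumption for illustration; in particular the hash is reduced modulo a tiny n so it fits the toy key, which no real scheme would do.

```python
import hashlib

# Toy RSA key pair (assumed values; far too small for real use).
n, e, d = 3233, 17, 2753   # (n, e) is KA+, (n, d) is KA-


def toy_hash(document: bytes) -> int:
    """Step 1: h = H(D), squeezed into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % n


def sign(document: bytes) -> int:
    """Step 2: {h}KA-, the hash transformed with Alice's private key."""
    return pow(toy_hash(document), d, n)


def verify(document: bytes, signature: int) -> bool:
    """Step 4: Bob recomputes the hash and compares it with the decrypted signature."""
    return pow(signature, e, n) == toy_hash(document)


if __name__ == "__main__":
    D = b"I owe Bob ten dollars."
    sig = sign(D)                   # step 3: A -> B: D, {h}KA-
    print(verify(D, sig))
    print(verify(D, (sig + 1) % n)) # a tampered signature does not verify
```

With a modulus this small, a different document has roughly a 1 in 3233 chance of hashing to the same value, so the toy only hints at the unforgeability a full-size hash and key provide.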
Digitally Signing Documents
1. Alice encrypts the hash of the document with her private key.
2. Alice sends the document plus the encrypted hash to Bob.
3. Bob hashes the document and compares the result to what he decrypted, thereby verifying the signature.
- The sig is authentic (the hashes match)
- The sig is unforgeable (as long as no one has the private key but Alice)
- The sig is not reusable (it’s a function of the document)
- The signed doc is unalterable (the hashes wouldn’t match)
- The document can’t be repudiated.
One Time Pads
The key is as long as your input.
The algorithm:
Print up a series of random numbers on a pad.
Make a copy. Give one to your correspondent.
Each bit of plaintext is XOR’d with the key.
As you use each bit of the key, cross it off the pad.
When you are done with a page, tear it off.
This technique was used extensively in the Cold War.
Requires perfect random number generation.
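A minimal sketch of the pad procedure, assuming the operating system's random source (via Python's secrets module) is an acceptable stand-in for the "perfect" randomness the scheme requires. The function names make_pad and otp_xor are hypothetical.

```python
import secrets


def make_pad(length: int) -> bytes:
    """Print up a page of random key material, one key byte per message byte."""
    return secrets.token_bytes(length)


def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR every bit of the data with the pad. Refuses to run with too little key."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the message")
    return bytes(d ^ p for d, p in zip(data, pad))


if __name__ == "__main__":
    m = b"meet me at 3pm"
    pad = make_pad(len(m))         # Alice and Bob each hold a copy
    c = otp_xor(m, pad)
    print(c.hex())
    print(otp_xor(c, pad))         # XOR with the same pad recovers the message
    # Both sides must now destroy this page: a pad is never used twice.
```

XOR is its own inverse, so one function serves for both encryption and decryption.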
Cryptanalysis
Cryptanalysis is the science of recovering the plaintext of a message without access to the key.
It doesn’t necessarily have to discover the key.
The loss of a key without cryptanalysis is called a compromise.
Ciphertext-only attack: the attacker has to recover the plaintext from only the ciphertext.
Known-plaintext attack: the plaintext for portions of the ciphertext is known. The rest may be easier to recover.
Chosen-plaintext attack: the attacker can choose what plaintext to encrypt, again making it easier to recover the plaintext of other ciphertext.
Chosen-ciphertext attack: the attacker chooses the ciphertext to decrypt with an unknown key. (Normally, the attacker then adapts and chooses another ciphertext based on previous results.)
Rubber-hose cryptanalysis: threats, blackmail, torture, pay-offs.
Cryptanalysis
Ideally, the attacker has to use brute force in an exhaustive search of the keyspace.
It is the complexity of launching the attack that secures us:
Data complexity: a large number of expected inputs (e.g., ciphertexts or plaintexts to analyze).
Storage complexity: a large number of storage units required (i.e., launching a dictionary attack is hard).
Processing complexity: a large number of operations required, i.e., we need until the heat death of the universe to get an answer… or perhaps only until after the information isn’t useful!
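Processing complexity is easy to put numbers on. The sketch below assumes an attacker who tests one billion keys per second (an arbitrary rate picked for illustration) and reports the worst-case time to exhaust keyspaces of several sizes. Integer arithmetic is used so that very large keyspaces such as 2^1024 do not overflow.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def seconds_to_search(key_bits: int, keys_per_second: int = 10**9) -> int:
    """Worst-case whole seconds to try every key in a keyspace of 2**key_bits."""
    return 2**key_bits // keys_per_second


if __name__ == "__main__":
    for bits in (40, 56, 128, 256, 1024):
        secs = seconds_to_search(bits)
        years = secs // SECONDS_PER_YEAR
        print(f"{bits:5d}-bit key: {secs:.3e} seconds, about {years:.3e} years")
```

On average the attacker finds the key after searching half the keyspace, which changes these figures by only a factor of two.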
Cryptanalysis
A simple substitution cipher over a natural language can be easy to break.
“Don’t attack. We aren’t ready.”
“Vkj’u muumbf. Rc mocj’u ocmvw.”
“e” and “t” are the most frequent letters in English, contractions generally end in “t” or “s”, etc.
Sure, you can take out spaces and punctuation, but an attacker can also analyze clusters of letters.
A digram is a two-letter combination; “th” and “he” are common.
Of the 26^2 digrams, the top 15 account for 27% of all occurrences.
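Counting letters and digrams is the first step of this kind of attack. The sketch below uses hypothetical helper names (letter_counts, digram_counts) and assumes digrams are counted within words only. Run on the ciphertext above, it shows “u” and “m” standing out, which matches the plaintext “t” and “a”.

```python
import re
from collections import Counter


def letter_counts(text: str) -> Counter:
    """How often each letter appears, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def digram_counts(text: str) -> Counter:
    """How often each two-letter combination appears inside runs of letters."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        counts.update(a + b for a, b in zip(word, word[1:]))
    return counts


if __name__ == "__main__":
    ciphertext = "Vkj'u muumbf. Rc mocj'u ocmvw."
    print(letter_counts(ciphertext).most_common(3))
    print(digram_counts(ciphertext).most_common(3))
    print(26**2)  # number of possible digrams
```

A message this short gives weak statistics; with a few paragraphs of ciphertext the frequency ranking lines up closely with English.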
Cryptanalysis
http://www.schneier.com/paper-self-study.pdf
More of a reference of references.
You are not responsible for this material, but take a look at the link to get an idea of the richness of the field of cryptanalysis…
Quiz on Tuesday 16 Sept
First three lectures:
Ethics, Policy
Security Basics and Definitions
Acceptable Use Policy
Introduction to cryptography
Symmetric/Asymmetric algorithms (what we complete)
First three readings: Bishop Ch. 1, 2, 3, 8
The readings are covered on the quiz…